ENSAE Paris - Engineering school for economics, data science, finance and actuarial science

Social Science Genetics



Department: Sociology


A growing number of social science data sources provide molecular genetic information, and researchers all over the world are interested in using this information to better understand various social phenomena. In this course, we will learn about the history of social science and behaviour genetics, as well as state-of-the-art research and cutting-edge methods applied across this field.


We will start with a general introduction to genetics in the social sciences and discuss potential research questions that can be answered using genetic data. Subsequently, we will review the theory behind twin and family models and how to estimate heritability, defined as the proportion of observed variance in an outcome that is explained by genetic effects. We will then explore how heritability is measured using molecular genetic data and discuss various challenges and possible applications. We will use PLINK software to prepare and analyse genetic data.
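As a rough illustration of the twin-based heritability estimate described above, the sketch below applies Falconer's formula, a classic approximation that compares the trait correlation of identical (MZ) twins with that of fraternal (DZ) twins. It is not necessarily the estimator used in the course, and the twin correlations and function name are hypothetical values chosen only for illustration.

```python
def falconer_decomposition(r_mz: float, r_dz: float) -> dict:
    """Decompose trait variance from twin correlations (Falconer's formula).

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    Assumes additive genetic effects and equal shared environments
    for both twin types (the classic twin-model assumptions).
    """
    h2 = 2.0 * (r_mz - r_dz)   # share of variance explained by genetic effects
    c2 = 2.0 * r_dz - r_mz     # share explained by shared environment
    e2 = 1.0 - r_mz            # share explained by unique environment and error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical twin correlations, not taken from any study.
result = falconer_decomposition(r_mz=0.6, r_dz=0.4)
print(result)
```

With these illustrative inputs, the three components sum to one, which reflects the model's assumption that genetic, shared environmental and unique environmental variance together account for all observed variance.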


Furthermore, we will discuss how genetic variants associated with social science outcomes of interest are discovered. We will then review how to use these results in social science research by controlling for confounding effects, dealing with genetic heterogeneity in social science models, estimating gene-environment interaction models and using genes as instrumental variables. Substantively, we will rely on recently published genetic discovery studies pertaining, e.g., to educational attainment, subjective well-being, fertility and risk behaviour, as well as health indicators such as BMI.


The syllabus is structured as follows. The first half of the course focuses on theory and concepts in Social Science Genetics and Quantitative Genetics. The second half builds up the students’ hands-on experience in preparing and analysing genetic data using R and PLINK software. Throughout the course, students are required to give presentations, which will be assessed according to the criteria described below (Outcomes & Assessment).

Participants require an interest in and a basic understanding of quantitative social science research and some experience with R software.

After attending this course, participants should have a basic understanding of the fundamental advantages of integrating genetic information into social science research. They should understand the basic technical terms from quantitative genetics literature and be able to read, interpret and criticize studies in social science genetics. They should be able to conduct basic quantitative genetics analyses and interpret their findings.

The assessment consists of three elements. The first is the student’s attendance and contribution to discussions (20%). The second is the presentation of a scientific paper or a topic in Social Science Genetics, either from the list below or researched by the student (40%). The third is an essay about the paper or topic presented. The essay should be approximately 3 pages in length and open with an introductory paragraph that summarizes at least one scientific study and highlights an aspect the student is intrigued by and wishes to discuss in more depth. This should be followed by an original discussion of that aspect of the study and a closing paragraph summarizing the discussion and the student’s conclusions.


1: Introduction to Social Science Genetics

2: The Human Genome and Human Evolution

3: Genome-Wide Association Studies (GWAS)

4: (Missing) Heritability

5: Polygenic Scores

6: Gene-Environment Interplay

7: Genetic Selection and Confounding

8: Practical: Working with Genomic Data

9: Practical: Creating Polygenic Scores

10: Practical: Working with Polygenic Scores

11: Practical: Working with GWAS Summary Results

12: Summary, Ethics and Q&A


Literature (required in bold)


1: Introduction to Social Science Genetics


Benjamin, D. J. et al. The promises and pitfalls of genoeconomics, Annu. Rev. Econ., 4 (2012), pp. 627-662


Cesarini, D., & Visscher, P. M. (2017). Genetics and educational attainment. Npj Science of Learning, 2(1), 4. http://doi.org/10.1038/s41539-017-0005-6


Conley, D. (2009). The promise and challenges of incorporating genetic data into longitudinal social science surveys and research. Biodemography and Social Biology, 55(2), 238–251.


Conley, D. and J. Fletcher. (2017). The Genome Factor: What the Social Genomics Revolution Reveals about Ourselves, Our History and the Future. Princeton: Princeton University Press.


Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020.


Mills, M. C., and F. C. Tropf. Sociology, Genetics, and the Coming of Age of Sociogenomics. Annual Review of Sociology 46 (2020).


Mills, M. C., & Tropf, F. C. (2016). The Biodemography of Fertility: A Review and Future Research Frontiers. Kölner Zeitschrift Für Soziologie Und Sozialpsychologie, 55(Special Issues Demography), 397–424.


Zimmer, C. (2018). She Has Her Mother’s Laugh: The Powers, Perversions, and Potential of Heredity. New York: MacMillan Publishers.


2: The Human Genome and Human Evolution


Courtiol, A., Tropf, F. C., & Mills, M. C. (2016). When genes and environment disagree: Making sense of trends in recent human evolution. Proceedings of the National Academy of Sciences, 113(28), 7693–7695. http://doi.org/10.1073/pnas.1608532113


Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020. Chapters 1+3


Mukherjee, S. (2016). The Gene: An Intimate History. New York: Simon & Schuster.


Tropf, F. C. et al. (2015). Human fertility, molecular genetics, and natural selection in modern societies. PloS One, 10(6), e0126821


3: Genome-Wide Association Studies (GWAS)


Barban, N. et al. (2016). Genome-wide analysis identifies 12 loci influencing human reproductive behavior. Nat. Genet. 48.


Duncan, L. E., M. C. Keller, A Critical Review of the First 10 Years of Candidate Gene-by-Environment Interaction Research in Psychiatry. Am. J. Psychiatry 168, 1041–1049 (2011).


Karlsson Linnér, R. et al. (2019). Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences. Nat. Genet. doi:10.1038/s41588-018-0309-3


Lee et al. (2018). Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nat. Genet. 50, 1112–1121.


Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020. Chapter 4


Mills, M. C. & Rahal, C. (2019). A Scientometric Review of Genome-Wide Association Studies. Commun. Biol. 2, doi:10.1038/s42003-018-0261-x.


Okbay, A. et al. (2016). Genome-wide association study identifies 74 loci associated with educational attainment. Nature, 533(7604), 539–542. http://doi.org/10.1038/nature17671


Okbay, A. et al. (2016). Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses. Nat. Genet. 1–13. doi:10.1038/ng.3552


Visscher, P. M. et al. (2017). 10 Years of GWAS Discovery: Biology, Function, and Translation. Am. J. Hum. Genet. 101, 5–22.


4: (Missing) Heritability


Neale, M. C., & Cardon, L. R. (1992). Methodology for genetic studies of twins and families. Dordrecht, the Netherlands: Kluwer Academic Publishers.


Polderman, T. J. C. et al. (2015). Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat. Genet. 47, 702–709.


Rietveld, C. A., et al. (2013). Molecular genetics and subjective well-being. Proceedings of the National Academy of Sciences, 110(24), 9692–9697.


Tropf, F. C., Barban, N., Mills, M. C., Snieder, H., & Mandemakers, J. J. (2015). Genetic influence on age at first birth of female twins born in the UK, 1919-68. Population Studies, 69(2), 129–145.


Tropf, F. C. et al. (2017). Hidden heritability due to heterogeneity across seven populations. Nat. Hum. Behav. 1, 757–765.


Turkheimer, E. Three Laws of Behavior Genetics and What They Mean. Curr. Dir. Psychol. Sci. 9, 160–164 (2000).


Turkheimer, E. et al. (2003). Socioeconomic status modifies heritability of IQ in young children. Psychol. Sci. 14, 623–628.


5: Polygenic Scores


Belsky, D. W. & Israel, S. (2014). Integrating genetics and social science: genetic risk scores. Biodemography Soc. Biol. 60, 137–55.


Belsky, D. W. et al. The Genetics of Success. Psychol. Sci. 27, 957–972 (2016).


Belsky, D. W. & Harden, K. P. (2019). Phenotypic Annotation: Using Polygenic Scores to Translate Discoveries From Genome-Wide Association Studies From the Top Down. Curr. Dir. Psychol. Sci. doi:10.1177/0963721418807729


Conley, D., et al. (2015). Is the Effect of Parental Education on Offspring Biased or Moderated by Genotype? Sociological Science, 2, 82–105. http://doi.org/10.15195/v2.a6


Conley, D. & Domingue, B. The Bell Curve Revisited: Testing Controversial Hypotheses with Molecular Genetic Data. (2016). doi:10.15195/v3.a23


Euesden, J., Lewis, C. M. & O’Reilly, P. F. (2014). PRSice: Polygenic Risk Score software. Bioinformatics 31, btu848-1468.


Harden et al. Genetic Associations with Mathematics Tracking and Persistence in Secondary School, bioRxiv, doi: https://doi.org/10.1101/598532


Liu, Hexuan. 2018. Social and Genetic Pathways in Multigenerational Transmission of Educational Attainment. American Sociological Review 83(2): 278–304. http://journals.sagepub.com/doi/10.1177/0003122418759651 (May 29, 2018)


Mehta, D., Tropf, F. C., Gratten, J., Bakshi, A., Zhu, Z., Bacanu, S.-A., … Wu, J. Q. (2016). Evidence for Genetic Overlap Between Schizophrenia and Age at First Birth in Women. JAMA Psychiatry, 73(5), 497–505. http://doi.org/10.1001/jamapsychiatry.2016.0129


Mills, M. C., Barban, N. & Tropf, F. C. (2018). The Sociogenomics of Polygenic Scores of Reproductive Behavior and Their Relationship to Other Fertility Traits. RSF Russell Sage Found. J. Soc. Sci. 4.


Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020. Chapters 5+11


Vilhjalmsson, B. J. et al. (2015). Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am J Hum Genet 97, 576–592.


6: Gene-Environment Interplay


Conley, D. et al. (2016). Changing Polygenic Penetrance on Phenotypes in the 20th Century Among Adults in the US Population. Sci. Rep. 6, 30348.


Domingue, B. W., H. Liu, A. Okbay, D. W. Belsky (2017). Genetic heterogeneity in depressive symptoms following the death of a spouse: Polygenic score analysis of the U.S. Health and retirement study. Am. J. Psychiatry, doi:10.1176/appi.ajp.2017.16111209.


Engzell, Per, and Felix C. Tropf. "Heritability of education rises with intergenerational mobility." Proceedings of the National Academy of Sciences 116.51 (2019): 25386-25388.


Tucker-Drob, E. M. & Bates, T. C. (2016). Large Cross-National Differences in Gene × Socioeconomic Status Interaction on Intelligence. Psychol. Sci. doi:10.1177/0956797615612727


Wedow, R. et al. Education, Smoking, and Cohort Change: Forwarding a Multidimensional Theory of the Environmental Moderation of Genetic Effects. Am. Sociol. Rev. (2018). doi:10.1177/0003122418785368


Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020. Chapters 6+11




Mills, Melinda C., Nicola Barban, and Felix C. Tropf. An Introduction to Statistical Genetic Data Analysis. MIT Press, 2020. Chapters 7-11







Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).


Conley, D. et al. Assortative mating and differential fertility by phenotype and genotype across the 20th century. Proc. Natl. Acad. Sci. 1523592113 (2016). doi:10.1073/pnas.1523592113

Domingue, B. W. et al. The social genome of friends and schoolmates in the National Longitudinal Study of Adolescent to Adult Health. Proc. Natl. Acad. Sci. (2018). doi:10.1073/pnas.1711803115


Fry et al. (2017). Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am. J. Epidemiol. 186, 1026–1034.


Grotzinger, A. D. et al. Genomic SEM Provides Insights into the Multivariate Genetic Architecture of Complex Traits. bioRxiv 305029 (2018). doi:10.1101/305029


Mardis, E. R. (2011). A decade’s perspective on DNA sequencing technology. Nature 470, 198–203.


Smith, G. D. & Ebrahim, S. ‘Mendelian randomization’: Can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology (2003). doi:10.1093/ije/dyg070


Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).


Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. (2018). doi:10.1038/s41467-017-02317-2