SSGAC EA3: genomic prediction of educational attainment and related cognitive phenotypes

Written by: Stephen Hsu

Primary Source: Information Processing, 07/23/2018.

Years ago I predicted that:

1. Cognitive ability would turn out to be influenced by many thousands of genetic variants, each of small effect.

2. With a large enough sample size, we would detect these variants and eventually construct genomic predictors.

The Nature Genetics paper (below) from the SSGAC collaboration takes a significant step in that direction.

Although the study used over a million genotypes, the data had to be aggregated across many sub-cohorts using summary statistics only. This does not permit the L1-penalized optimization we used to build our height predictor, which requires access to individual-level genotypes.
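
To make the distinction concrete: when only summary statistics are available, a predictor is typically a weighted sum of allele counts, with one weight per SNP taken from that SNP's estimated association. The sketch below is a minimal illustration in Python, not the SSGAC pipeline; the function name, the SNP weights, and the example genotype are all hypothetical.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of allele counts (0, 1 or 2 per SNP) for one individual.

    `weights` stands in for per-SNP effect estimates taken from GWAS
    summary statistics; the values used below are made up.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(g * w for g, w in zip(allele_counts, weights))


# Hypothetical example: three SNPs, one individual.
weights = [0.02, -0.01, 0.03]   # illustrative effect sizes, not real estimates
genotype = [2, 0, 1]            # copies of the effect allele at each SNP
print(f"{polygenic_score(genotype, weights):.2f}")  # prints 0.07
```

An L1-penalized fit would instead estimate all weights jointly from the individual-level genotype matrix, which is what pooling by summary statistics rules out.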

For out-of-sample validation of the results below, see this PNAS paper, which (unusually) appeared before the paper on which it is based.

The lead author James Lee is on the left below. Chris Chang, author of Plink 2.0, is on the right. The photo was taken in 2010 at BGI — they are standing in front of crates of Illumina sequencers.

Lead author James Lee and Chris Chang of Plink 2.0

Article | Published: 23 July 2018

Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals

James J. Lee, Robbee Wedow, […]David Cesarini
Nature Genetics (2018)

Here we conducted a large-scale genetic association analysis of educational attainment in a sample of approximately 1.1 million individuals and identify 1,271 independent genome-wide-significant SNPs. For the SNPs taken together, we found evidence of heterogeneous effects across environments. The SNPs implicate genes involved in brain-development processes and neuron-to-neuron communication. In a separate analysis of the X chromosome, we identify 10 independent genome-wide-significant SNPs and estimate a SNP heritability of around 0.3% in both men and women, consistent with partial dosage compensation. A joint (multi-phenotype) analysis of educational attainment and three related cognitive phenotypes generates polygenic scores that explain 11–13% of the variance in educational attainment and 7–10% of the variance in cognitive performance. This prediction accuracy substantially increases the utility of polygenic scores as tools in research.

A nice figure from the paper: Add Health (National Longitudinal Study of Adolescent to Adult Health) and HRS (Health and Retirement Study) are two longitudinal cohorts that have been genotyped; the horizontal axis is polygenic score. It appears that individuals with top-quintile polygenic scores are about 5 times more likely to complete college than bottom-quintile individuals.

Table of mean prevalence of college completion percentage

Here’s a comment on the paper I provided to a journalist:

The EA3 predictor correlates about 0.35 with educational attainment, and slightly less well with measured cognitive ability. While this is far from perfect prediction, it does allow identification of individuals, using DNA alone, who are at unusual risk of being well below average in cognitive ability or struggling in school. Standardized tests, such as SAT, ACT, GRE, LSAT, etc., typically also correlate roughly 0.35 with educational outcomes like grade point average, degree completion, etc. In this sense, the genomic predictor is comparable to widely used tests and it will certainly improve as more data are analyzed. See figure.


Table of correlation of EA3 and education attainment
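
The correlation of about 0.35 can be checked against the variance explained that the abstract reports. For a single predictor, the correlation with the outcome is the square root of the variance explained. A quick check in Python (the helper name is made up for illustration):

```python
import math


def correlation_from_r2(r2):
    """Correlation implied by a fraction of variance explained (single predictor)."""
    return math.sqrt(r2)


# Variance-explained ranges quoted in the abstract.
ea_low, ea_high = correlation_from_r2(0.11), correlation_from_r2(0.13)
cp_low, cp_high = correlation_from_r2(0.07), correlation_from_r2(0.10)

print(f"educational attainment: {ea_low:.2f} to {ea_high:.2f}")  # 0.33 to 0.36
print(f"cognitive performance:  {cp_low:.2f} to {cp_high:.2f}")  # 0.26 to 0.32
```

Both ranges agree with the comment: roughly 0.35 for educational attainment and somewhat lower for cognitive performance.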

Stephen Hsu
Stephen Hsu is vice president for Research and Graduate Studies at Michigan State University. He also serves as scientific adviser to BGI (formerly Beijing Genomics Institute) and as a member of its Cognitive Genomics Lab. Hsu’s primary work has been in applications of quantum field theory, particularly to problems in quantum chromodynamics, dark energy, black holes, entropy bounds, and particle physics beyond the standard model. He has also made contributions to genomics and bioinformatics, the theory of modern finance, and in encryption and information security. Founder of two Silicon Valley companies—SafeWeb, a pioneer in SSL VPN (Secure Sockets Layer Virtual Private Networks) appliances, which was acquired by Symantec in 2003, and Robot Genius Inc., which developed anti-malware technologies—Hsu has given invited research seminars and colloquia at leading research universities and laboratories around the world.