Machine Learning for Personalized Medicine: Heritability-based models for prediction of complex traits (David Balding)

Written by: Stephen Hsu

Primary Source: Information Processing

Highly recommended talk by David Balding on modern approaches to heritability, relatedness, etc. in statistical genetics. (I listened at 1.5x normal speed, which worked for me.)

MLPM (Machine Learning for Personalized Medicine) Summer School 2015
Monday 21st of September

Heritability-based models for prediction of complex traits
by David Balding

Complex trait genetics has been revolutionised over the past 5 years by developments related to the concept of heritability. Heritability is the fraction of phenotypic variation that can be attributed to genetic mechanisms (mostly we focus on narrow-sense heritability, which considers only additive genetic effects). Since we cannot identify and measure the causal genetic mechanisms, a traditional approach has been to use pedigree relatedness as a proxy for the sharing of causal alleles between individuals. Pedigree relatedness even came to be seen as central to the concept of heritability, which perhaps explains why it was not until 2010 that it became widely appreciated that genome-wide genetic markers (SNPs) offered at least a “noisy” way to directly measure causal alleles, and hence a new approach to assessing heritability. This approach is “noisy” because SNPs generally only tag causal variants imperfectly, depending on SNP density and linkage disequilibrium, and many SNPs may tag little or no causal variation. So genome-wide SNP-based heritability estimates are difficult to interpret, but they can provide a lower bound which was enough to show that SNPs usually tag much more causal variation than can be attributed to genome-wide significant SNPs. Another big step forward has been that heritability can be attributed to different genes, genomic regions or functional classes, and for many phenotypes it is found to be widely dispersed across the genome, with relatively little concentration in coding regions. Further, heritability has become a unit of common currency for gene-based tests and meta-analysis. I will review the ideas and the underlying mathematical models, and present some recent results.
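To make the idea of SNPs as a "noisy" measure of allele sharing concrete, here is a minimal sketch of one commonly used SNP-based relatedness matrix: the average allelic correlation over SNPs, with each SNP standardized by its sample allele frequency. This is not the modified definition Balding proposes in the talk. The function name `grm`, the plain-list input format, and the choice to apply the same formula on the diagonal are assumptions made for illustration.

```python
def grm(genotypes):
    """Allelic-correlation relatedness matrix (illustrative sketch).

    genotypes: list of individuals, each a list of minor-allele counts (0, 1 or 2),
    with the same SNP order for everyone. Hypothetical input format.
    Entry [j][k] is the mean over SNPs of
        (x_ij - 2 p_i) * (x_ik - 2 p_i) / (2 p_i (1 - p_i)),
    where p_i is the sample allele frequency of SNP i.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    totals = [[0.0] * n for _ in range(n)]
    used = 0  # number of polymorphic SNPs that contribute
    for i in range(m):
        column = [individual[i] for individual in genotypes]
        p = sum(column) / (2.0 * n)       # sample allele frequency
        var = 2.0 * p * (1.0 - p)         # expected genotype variance
        if var == 0.0:
            continue                      # monomorphic SNP carries no information
        used += 1
        centered = [x - 2.0 * p for x in column]
        for j in range(n):
            for k in range(n):
                totals[j][k] += centered[j] * centered[k] / var
    if used == 0:
        raise ValueError("no polymorphic SNPs in input")
    return [[value / used for value in row] for row in totals]


# Toy example: three individuals typed at three SNPs.
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
for row in grm(toy):
    print([round(value, 3) for value in row])
```

Every SNP gets equal weight here regardless of how well it tags causal variation, which is one reason estimates built on such a matrix are sensitive to the LD properties of the causal variants.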

Some comments:

1. He notes that after a few hundred years, it’s highly likely that a given descendant carries no actual DNA from a specific ancestor (e.g., most descendants of Shakespeare alive today have none of his DNA).

2. @18min or so: a request to Chris Chang to add a modified definition of SNP relatedness to PLINK (i.e., a new flag), with a different weighting for the heterozygous (1,1) case ;-)

3. @29min or so: finally, a discussion of systematic errors in GCTA due to LD characteristics of causal variants. As I said here:

I’ve always felt that the real weakness of GCTA is the assumption of random effects. A consequence of this assumption is that if the true causal variants are atypical (e.g., in terms of linkage disequilibrium) among common SNPs, the results could be biased. It is impossible to evaluate this uncertainty at the moment because we do not yet know the (full) genetic architectures of any complex traits.

See also Heritability Estimates from Summary Statistics, No Genomic Dark Matter, and HaploSNPs and missing heritability.

4. @35min: again, type 1 diabetes (T1D) stands out in terms of genetic architecture

5. @47min: predictive correlations of almost 0.6 for T1D
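The claim in comment 1 can be checked with a back-of-envelope calculation. The sketch below is not from the talk. It assumes 22 autosomes, roughly 33 crossovers per generation, and that each resulting block is inherited independently with probability 1/2^(k-1) from an ancestor k generations back, so the number of inherited blocks is treated as approximately Poisson. The function name and default parameters are illustrative assumptions.

```python
import math


def prob_no_dna(generations, chromosomes=22, crossovers_per_generation=33):
    """Approximate chance of carrying no autosomal DNA from one specific ancestor.

    Crude model (all assumptions, not from the talk):
      - after k generations the autosomes are cut into about
        chromosomes + crossovers_per_generation * (k - 1) blocks;
      - each block traces back to the given ancestor with probability 1 / 2**(k - 1);
      - the number of inherited blocks is approximately Poisson.
    """
    if generations < 1:
        raise ValueError("generations must be at least 1")
    blocks = chromosomes + crossovers_per_generation * (generations - 1)
    expected_blocks = blocks / 2 ** (generations - 1)
    return math.exp(-expected_blocks)  # Poisson probability of zero blocks


# Print the approximation for a range of generation counts.
for k in (5, 10, 15, 20):
    print(k, prob_no_dna(k))
```

Under these assumptions the probability of zero shared DNA climbs quickly once the ancestor is more than about ten generations back, which fits the Shakespeare example at roughly 25 to 30 years per generation.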

Slides for this talk. Slides for another Balding lecture: Introduction to Genomic Prediction.

Stephen Hsu
Stephen Hsu is vice president for Research and Graduate Studies at Michigan State University. He also serves as scientific adviser to BGI (formerly Beijing Genomics Institute) and as a member of its Cognitive Genomics Lab. Hsu’s primary work has been in applications of quantum field theory, particularly to problems in quantum chromodynamics, dark energy, black holes, entropy bounds, and particle physics beyond the standard model. He has also made contributions to genomics and bioinformatics, the theory of modern finance, and in encryption and information security. Founder of two Silicon Valley companies—SafeWeb, a pioneer in SSL VPN (Secure Sockets Layer Virtual Private Networks) appliances, which was acquired by Symantec in 2003, and Robot Genius Inc., which developed anti-malware technologies—Hsu has given invited research seminars and colloquia at leading research universities and laboratories around the world.