Sparse Linear Models Explain Phenotypic Variation and Predict Risk of Complex Disease

This video was recorded at NIPS Workshops, Sierra Nevada 2011. A central goal of medical genetics is to create models that accurately predict complex disease from genotype. To maximize predictive value and identify causal single-nucleotide polymorphisms (SNPs), all SNPs should be modeled simultaneously. Lasso-penalized models have proven to be a useful class of such models, both for detecting causal SNPs and for modeling disease risk. Here, we present a comprehensive analysis of real case/control data using lasso-penalized models. Our models accurately discriminated cases from controls in celiac disease and type 1 diabetes, and replicated strongly across independent datasets, with validation AUC of 0.84 for type 1 diabetes and 0.82–0.9 for celiac disease, the latter across four independent datasets of different European ethnicities. The models also explained substantial phenotypic variance in independent validation: 22% for type 1 diabetes and 21–38% for celiac disease. This study shows that supervised learning approaches can address missing phenotypic variance and reliably predict incidence of celiac disease and type 1 diabetes from genotype.
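The general idea of fitting all SNPs jointly under a lasso penalty and scoring the result by validation AUC can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the effect sizes, sample sizes, penalty `lam`, and step size are arbitrary assumptions, and every function name (`fit_lasso_logistic`, `simulate`, `run_demo`) is hypothetical. The fit uses plain proximal gradient descent (ISTA) on an L1-penalized logistic loss.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_lasso_logistic(X, y, lam=0.05, step=1.0, n_iter=150):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b is left unpenalized; lam and step are illustrative values.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            r = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += r
            for j, xj in enumerate(xi):
                grad_w[j] += r * xj
        b -= step * grad_b / n
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, p, effects, rng):
    """Toy genotypes: allele counts 0/1/2 centered to -1/0/1.

    `effects` maps a SNP index to an assumed log-odds change per allele.
    """
    X, y = [], []
    for _ in range(n):
        x = [rng.choice((-1, 0, 1)) for _ in range(p)]
        logit = sum(beta * x[j] for j, beta in effects.items())
        X.append(x)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


def run_demo(seed=0):
    rng = random.Random(seed)
    effects = {0: 1.5, 1: -1.5, 2: 1.0}  # three "causal" SNPs out of 30 (assumed)
    X_train, y_train = simulate(300, 30, effects, rng)
    X_val, y_val = simulate(300, 30, effects, rng)  # independent validation set
    w, b = fit_lasso_logistic(X_train, y_train)
    scores = [b + sum(wj * xj for wj, xj in zip(w, x)) for x in X_val]
    return w, auc(scores, y_val)


if __name__ == "__main__":
    weights, val_auc = run_demo()
    selected = [j for j, wj in enumerate(weights) if wj != 0.0]
    print("SNPs with non-zero weight:", selected)
    print("validation AUC: %.3f" % val_auc)
```

In this toy setting the penalty drives most non-causal weights to exactly zero, which is the property that makes lasso-type models attractive for picking out candidate SNPs while still producing a risk score. Real genome-wide data involve far more SNPs than samples, correlated markers, and a penalty chosen by cross-validation, none of which this sketch attempts.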
