Speaker:
Speaker Link:
Institution:
Time:
Host:
Location:
Genetic admixture estimation has been widely studied and has proven very useful in ancestry inference and genome-wide association studies. Existing methods and tools, such as ADMIXTURE and OpenADMIXTURE, often encounter identifiability issues and convergence problems due to the model complexity and the ultrahigh dimensionality of the parameter space.
In this talk, we provide analytical insights that characterize the convergence of admixture models and present an innovative, scalable statistical modeling framework to further improve the accuracy of genetic ancestry estimation. Specifically, we will discuss
(1) a novel transfer learning approach that leverages external biobank summary data using only allele frequencies; and
(2) a novel LASSO-penalized clustering model that can seamlessly rank and select important ancestry-informative markers.
These newly proposed approaches were applied to public GWAS data to demonstrate their competitive performance.