Laplacian Eigenfunctions Learn Population Structure
Open Access
- 1 December 2009
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 4 (12) , e7928
- https://doi.org/10.1371/journal.pone.0007928
Abstract
Principal components analysis has been used for decades to summarize genetic variation across geographic regions and to infer population migration history. More recently, with the advent of genome-wide association studies of complex traits, it has become a commonly-used tool for detection and correction of confounding due to population structure. However, principal components are generally sensitive to outliers. Recently there has also been concern about its interpretation. Motivated from geometric learning, we describe a method based on spectral graph theory. Regarding each study subject as a node with suitably defined weights for its edges to close neighbors, one can form a weighted graph. We suggest using the spectrum of the associated graph Laplacian operator, namely, Laplacian eigenfunctions, to infer population structure. In simulations and real data on a ring species of birds, Laplacian eigenfunctions reveal more meaningful and less noisy structure of the underlying population, compared with principal components. The proposed approach is simple and computationally fast. It is expected to become a promising and basic method for population genetics and disease association studies.
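The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the edge weights, so the k-nearest-neighbor rule, the heat-kernel weights, the median-based bandwidth, the symmetric normalized Laplacian, and the function name `laplacian_eigenfunctions` are all assumptions chosen for illustration, and the genotype data are simulated.

```python
import numpy as np


def laplacian_eigenfunctions(X, n_neighbors=10, n_components=2):
    """Return the leading nontrivial eigenpairs of a graph Laplacian.

    X is a (subjects x markers) matrix. The weighting scheme below is an
    assumption for illustration, not the scheme used in the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between subjects.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a subject is not its own neighbor

    # Connect each subject (node) to its n_neighbors closest subjects.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()

    # Heat-kernel edge weights; the median-distance bandwidth is an assumption.
    scale = max(np.median(d2[rows, cols]), 1e-12)
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / scale)
    W = np.maximum(W, W.T)  # symmetrize the neighbor graph

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order. For a connected graph the
    # first eigenpair is trivial (eigenvalue 0), so it is skipped.
    vals, vecs = np.linalg.eigh(L)
    return vals[1:n_components + 1], vecs[:, 1:n_components + 1]


# Simulated genotypes (0/1/2 allele counts) for two hypothetical populations
# whose allele frequencies differ slightly.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.1, size=200), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(40, 200)),
    rng.binomial(2, freq_b, size=(40, 200)),
])

vals, vecs = laplacian_eigenfunctions(genotypes, n_neighbors=10, n_components=2)
print(vals)        # two smallest nontrivial eigenvalues
print(vecs.shape)  # one row of coordinates per subject
```

The columns of `vecs` play the role that principal component scores play in PCA: plotting one against the other gives a low-dimensional view of the subjects. Because each subject is linked only to its close neighbors, a single outlying subject affects few edges, which is one intuition for the robustness the abstract claims.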