Detecting population structure using STRUCTURE software: effect of background linkage disequilibrium

Abstract
STRUCTURE is the most widely used clustering software for detecting population genetic structure. The latest version of this software (STRUCTURE 2.1) was recently enhanced to take into account linkage disequilibrium (LD) caused by admixture between populations. This version, however, still does not account for strong background LD caused by genetic drift, which may produce spurious results. The authors of STRUCTURE have therefore suggested a rough threshold distance (1.0 cM) between two loci, below which the pair of loci should not be used. Because LD is sensitive to demographic events, the distance between loci is not always a good indicator of the strength of LD. In this study, we examine the link between genomic distance and the strength of the correlation between loci (rLD) in a free-ranging population of mouflon (Ovis aries), and we present an empirical test of the effect of rLD on the clustering results provided by the linkage model in STRUCTURE. We show that a high rLD value increases the probability of detecting spurious clustering. We propose using rLD as an index for deciding whether or not to include a pair of loci in a clustering analysis.
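
As a rough illustration of screening locus pairs by correlation rather than by map distance, the following minimal sketch flags pairs whose between-locus correlation exceeds a cutoff. It rests on assumptions that are not taken from the study: genotypes are coded as biallelic allele counts (0, 1, 2), the correlation is a plain Pearson coefficient on those counts and not necessarily the rLD statistic used here, and the function names, toy data, and the 0.5 cutoff are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Monomorphic locus: correlation is undefined; treated as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def flag_correlated_pairs(dosages, threshold):
    """Return (locus_a, locus_b, r) for pairs with |r| above the threshold.

    dosages: dict mapping locus name -> list of allele counts (0, 1, 2),
             one entry per individual, in the same order for every locus.
    threshold: hypothetical cutoff chosen by the analyst.
    """
    flagged = []
    for a, b in combinations(sorted(dosages), 2):
        r = pearson_r(dosages[a], dosages[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical toy data: six individuals genotyped at three loci.
toy = {
    "locus_A": [0, 1, 2, 2, 1, 0],
    "locus_B": [0, 1, 2, 2, 1, 0],  # identical to locus_A
    "locus_C": [1, 1, 1, 0, 2, 1],
}
flagged = flag_correlated_pairs(toy, threshold=0.5)
```

With this toy input, only the locus_A/locus_B pair is flagged; in practice the cutoff would need to be chosen empirically, and multiallelic markers such as microsatellites would require a different correlation measure.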