Effect of heterogeneity and assumed mode of inheritance on lod scores

Abstract
Heterogeneity is a major factor in many common, complex diseases and can confound linkage analysis. Using computer‐simulated heterogeneous data, we tested what effect unlinked families have on a linkage analysis when heterogeneity is not taken into account. We created 60 data sets of 40 nuclear families each, with different proportions of linked and unlinked families and with different modes of inheritance. The ascertainment probability was 0.05, the disease had a penetrance of 0.6, and the recombination fraction for the linked families was zero. For the analysis we used a variety of assumed modes of inheritance and penetrances. Under these conditions we examined the effect of the unlinked families on the lod score, the evaluation of the mode of inheritance, and the estimates of penetrance and of the recombination fraction in the linked families. When the analysis was done under the correct mode of inheritance for the linked families, the mode of inheritance of the unlinked families had minimal influence on the highest maximum lod score (MMLS), that is, the maximum lod score maximized with respect to penetrance. Adding sporadic families decreased the MMLS less than adding recessive or dominant unlinked families. Mixtures of dominant linked families with unlinked families always led to a higher MMLS when analyzed under the correct (dominant) mode of inheritance than when analyzed under the incorrect mode of inheritance. In the mixtures with recessive linked families, assuming the correct mode of inheritance generally led to a higher MMLS, but we observed broad variation. The estimate of the recombination fraction became larger as the proportion of recessive or dominant unlinked families increased; in general, sporadic families had less influence on this estimate than did unlinked families with genetic disease.
The assumed penetrance that gave the highest maximum lod score was close to the value used to generate the data. This estimate varied more as the proportion of linked families in the data sets decreased.
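
The upward drift of the recombination fraction estimate under ignored heterogeneity can be illustrated with a deliberately simplified sketch. It is not the simulation described above. It assumes phase-known, fully informative meioses with complete penetrance, a hypothetical four meioses per family, and expected (not sampled) recombinant counts: none in linked families and half of all meioses in unlinked families. The function names `lod`, `max_lod`, and `pooled_counts` are illustrative only.

```python
import math

def lod(theta, recombinants, meioses):
    """Two-point lod score for phase-known, fully informative meioses (simplifying assumption)."""
    def log_term(count, prob):
        # Treat 0 * log10(0) as 0 so that theta = 0 is valid when no recombinants are observed.
        if count == 0:
            return 0.0
        if prob <= 0.0:
            return float("-inf")
        return count * math.log10(prob)

    nonrecombinants = meioses - recombinants
    # log10 of L(theta) / L(0.5)
    return (log_term(recombinants, theta)
            + log_term(nonrecombinants, 1.0 - theta)
            + meioses * math.log10(2.0))

def max_lod(recombinants, meioses):
    """Grid search over theta in [0, 0.5]; returns (maximum lod, theta at the maximum)."""
    grid = [i / 1000 for i in range(501)]
    return max((lod(t, recombinants, meioses), t) for t in grid)

def pooled_counts(n_linked, n_unlinked, meioses_per_family=4):
    """Pool all families as if homogeneous (heterogeneity ignored).

    Assumption: linked families show no recombinants (theta = 0) and unlinked
    families show recombinants in exactly half of their meioses.
    """
    meioses = (n_linked + n_unlinked) * meioses_per_family
    recombinants = n_unlinked * meioses_per_family // 2
    return recombinants, meioses

# 40 families in total, with a decreasing number of linked families.
results = {n: max_lod(*pooled_counts(n, 40 - n)) for n in (40, 30, 20)}
for n_linked, (score, theta_hat) in results.items():
    print(f"linked={n_linked:2d}  max lod={score:6.2f}  theta_hat={theta_hat:.3f}")
```

In this toy setting the pooled estimate of the recombination fraction rises from 0 toward 0.5 and the maximum lod score falls as unlinked families replace linked ones, which is the same direction as the effect reported above for recessive or dominant unlinked families.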