Proposal for an allele nomenclature system based on the evolutionary divergence of haplotypes
- 19 November 2002
- journal article
- research article
- Published by Hindawi Limited in Human Mutation
- Vol. 20 (6) , 463-472
- https://doi.org/10.1002/humu.10143
Abstract
The classical view of what constitutes an “allele” has been challenged by recent findings of a great deal of human genetic variability, i.e., we can expect, on average, one variant site every 100–250 bases of our haploid genome. The haplotype is defined as “the patterns of co-occurrence of variant sites on the same chromosome” (and therefore within each particular gene). Sufficient evidence exists for the divergence of haplotypes during evolution of Homo sapiens sapiens, and the total number of haplotypes per gene will reflect the amount of time any particular ethnic group has existed on the planet, e.g., greatest in Africans, fewer in East Asians, and still fewer in Caucasians. If the average gene spans 30 kb, we can expect ∼170 polymorphic variant sites per gene in the world population. We do not see 2¹⁷⁰ haplotypes, however; we might find only 10 to 200 haplotypes (depending on the gene's size and degree of conservation of the gene product). This finite number allows for a reasonable haplotype nomenclature system for each gene, based on evolutionary divergence. For polymorphic variants (i.e., frequency ≥ 0.01), I propose using Arabic numerals for the major clades (e.g., *1, *2, … *20, *21), capital letters for sublineages (e.g., *2A, *2B, *2C), and Arabic numerals for sub-sublineages (e.g., *22G12, *22G13); additional subcategories may be added, in an alternating number/letter/number/letter sequence, depending on the complexity of present-day haplotypes of a particular gene. Web sites with a web master and external advisory committee should be set up for each gene superfamily, family, or individual gene (depending on complexity), and an international haplotype nomenclature committee, perhaps composed of several dozen of these web masters, should oversee haplotype nomenclature for the entire human genome. The higher heterozygosity and multiallelic nature make haplotypes more informative than biallelic SNPs.
Ultimately, our knowledge of haplotype patterns, rather than single variant sites, of perhaps several hundred genes will likely be helpful in finding associations between genotype and any multiplex phenotype (e.g., complex diseases including cancer, and/or toxicity of pharmaceutical agents or environmental pollutants). Hum Mutat 20:463–472, 2002.
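The alternating number/letter naming scheme proposed in the abstract can be illustrated with a short sketch. This is not code from the article: the function names `parse_haplotype` and `lineage` are hypothetical, and the sketch assumes that a name starts with `*`, that numeric and capital-letter levels strictly alternate beginning with a number, and that a letter level may use more than one letter.

```python
import re
from typing import List

# Assumed grammar (not specified in the article): '*' followed by a number,
# then alternating capital-letter and number levels, e.g. *2, *2A, *22G12.
_NAME = re.compile(r"^\*(\d+(?:[A-Z]+\d+)*(?:[A-Z]+)?)$")
_LEVEL = re.compile(r"\d+|[A-Z]+")


def parse_haplotype(name: str) -> List[str]:
    """Split a haplotype name into its levels: clade, sublineage, and so on."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid haplotype name: {name!r}")
    return _LEVEL.findall(match.group(1))


def lineage(name: str) -> List[str]:
    """List the ancestors of a haplotype, from major clade down to itself."""
    levels = parse_haplotype(name)
    return ["*" + "".join(levels[:i]) for i in range(1, len(levels) + 1)]


if __name__ == "__main__":
    print(parse_haplotype("*22G12"))  # levels of the example from the abstract
    print(lineage("*22G12"))          # clade, sublineage, sub-sublineage
```

Under these assumptions, `*22G12` splits into the major clade `22`, the sublineage `G`, and the sub-sublineage `12`, so each name encodes its own position in the proposed evolutionary hierarchy.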