A Survey of Human Disease Gene Counterparts in the Drosophila Genome

Abstract
The core component of our survey is a list of 287 human disease genes representing several classes of disease, including cancer, neurological diseases, cardiovascular diseases, malformation syndromes, and hematological, immune, endocrine, renal, and metabolic disorders (Table 1). This list was compiled by scanning the Online Mendelian Inheritance in Man database (OMIM; http://www.ncbi.nlm.nih.gov/omim/) as well as medical textbooks and review articles listing classes of human disease genes. The criterion for inclusion in the final list was that the human gene must actually be mutated, altered, amplified, or deleted in human subjects with the disease. From our initial set of >800 human genes associated with diseases, over half were eliminated because they did not meet this criterion. Genes potentially linked to a human disease solely by cell culture experiments, yeast two-hybrid interaction screens, model organism studies, or similar approaches were excluded from our analysis. Each human disease gene on the final list was confirmed by checking OMIM or published literature sources and was placed in the most relevant disease category. For gene families in which different paralogs have been associated with disease, such as Ras family members, rhodopsins, and some HOX and PAX gene family members, a single example was chosen to represent the group and the redundant paralogs were eliminated from the list. In some cases, assignment of a gene to a particular category was somewhat arbitrary, since altered gene function may result in different diseases, or in a syndrome characterized by multiple organ involvement or complex pathophysiology. For example, human Notch gene mutations cause both cancer (Notch1 rearrangements in T-cell acute lymphoblastic leukemia) and neurological disease (Notch3 point mutations in CADASIL).
The final list of 287 human disease genes is not meant to be comprehensive; in fact, an estimated 1,000 human disease genes are currently defined by at least one allelic variant each (Antonarakis and McKusick 2000). However, our list does represent a large set of genes mutated in a wide variety of human diseases, adjusted to avoid biasing the survey toward certain common gene families.
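
The two curation rules described above (inclusion only when the gene itself is altered in patients, and one representative per paralog group) can be illustrated with a minimal sketch. This is not the procedure actually used to build the list, which was compiled by hand from OMIM and the literature. The record fields, the evidence labels, the gene and family names, and the choice to keep the first paralog encountered are all hypothetical and serve only to show the logic.

```python
from dataclasses import dataclass

# Evidence labels that satisfy the inclusion criterion: the gene itself is
# mutated, altered, amplified, or deleted in human subjects with the disease.
# (Label strings are assumptions made for this sketch.)
PATIENT_EVIDENCE = {"mutated", "altered", "amplified", "deleted"}


@dataclass(frozen=True)
class Candidate:
    symbol: str    # gene symbol (placeholder names below, not real genes)
    family: str    # hypothetical paralog-group label
    category: str  # disease category assigned to the gene
    evidence: str  # how the gene was linked to the disease


def curate(candidates):
    """Apply the inclusion criterion, then keep one gene per paralog group."""
    kept = []
    seen_families = set()
    for c in candidates:
        # Exclude genes linked to disease only indirectly, e.g. by cell
        # culture, yeast two-hybrid, or model organism studies.
        if c.evidence not in PATIENT_EVIDENCE:
            continue
        # Assumption: the first qualifying paralog represents its group.
        if c.family in seen_families:
            continue
        seen_families.add(c.family)
        kept.append(c)
    return kept


# Toy input; every entry is invented for illustration.
toy = [
    Candidate("GENE_A1", "FAM_A", "cancer", "mutated"),
    Candidate("GENE_A2", "FAM_A", "cancer", "amplified"),
    Candidate("GENE_B1", "FAM_B", "neurological", "yeast_two_hybrid"),
    Candidate("GENE_C1", "FAM_C", "metabolic", "deleted"),
]

final = curate(toy)
print([c.symbol for c in final])  # ['GENE_A1', 'GENE_C1']
```

In this toy input, GENE_A2 is dropped as a redundant paralog of GENE_A1, and GENE_B1 is dropped because its only link to disease is an interaction screen.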