Species Choice for Comparative Genomics: Being Greedy Works
Open Access
- 2 December 2005
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 1 (6) , e71
- https://doi.org/10.1371/journal.pgen.0010071
Abstract
Several projects investigating genetic function and evolution through sequencing and comparison of multiple genomes are now underway. These projects consume many resources, and appropriate planning should be devoted to choosing which species to sequence, potentially involving cooperation among different sequencing centres. A widely discussed criterion for species choice is the maximisation of evolutionary divergence. Our mathematical formalization of this problem surprisingly shows that the best long-term cooperative strategy coincides with the seemingly short-term “greedy” strategy of always choosing the next best single species. Other criteria influencing species choice, such as medical relevance or sequencing costs, can also be accommodated in our approach, suggesting our results' broad relevance in scientific policy decisions.

What would happen if sequencing centres around the world were to choose genomes without consulting each other and without devising long-term strategies? When several parties are involved in decisions with interacting consequences, experience teaches that cooperation and planning are usually necessary to guarantee the best result. Similarly, in computer science, “greedy” algorithms (which construct solutions by iteratively taking the best immediate choice) are rarely the best option to solve a problem. The authors show, however, that in the context of choosing species for comparative genomics, cooperation and planning can be kept to a minimum without affecting the quality of the global result: a greedy algorithm applied to the problem of maximising the evolutionary divergence among species chosen from a known phylogeny is proven to guarantee optimal solutions.
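The greedy strategy described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the article: evolutionary divergence is measured here as the total branch length of the subtree connecting the chosen species, the selection starts from the most divergent pair, and the tree `TREE`, its branch lengths, and the function names `leaves_below`, `divergence`, and `greedy_choice` are all hypothetical.

```python
from itertools import combinations

# Hypothetical rooted tree, written as child -> (parent, branch length).
# Node names and lengths are invented for illustration only.
TREE = {
    "x": ("root", 1.0),
    "a": ("x", 0.5),
    "b": ("x", 0.4),
    "c": ("root", 3.0),
    "y": ("root", 2.0),
    "d": ("y", 4.0),
    "e": ("y", 5.0),
}


def leaves_below(tree):
    """Map every non-root node to the set of leaves in its subtree."""
    internal = {parent for parent, _ in tree.values()}
    below = {node: set() for node in tree}
    for leaf in (n for n in tree if n not in internal):
        node = leaf
        while node in tree:  # climb until the root, which has no entry
            below[node].add(leaf)
            node = tree[node][0]
    return below


def divergence(tree, below, chosen):
    """Total length of the branches that connect the chosen leaves.

    A branch counts only if chosen leaves lie on both of its sides
    (assumed measure of divergence; see the lead-in).
    """
    chosen = set(chosen)
    total = 0.0
    for node, (_, length) in tree.items():
        k = len(below[node] & chosen)
        if 0 < k < len(chosen):
            total += length
    return total


def greedy_choice(tree, k):
    """Pick k leaves: best pair first, then the best single addition each round."""
    below = leaves_below(tree)
    leaves = sorted(n for n in tree if below[n] == {n})
    if not 2 <= k <= len(leaves):
        raise ValueError("k must be between 2 and the number of leaves")
    chosen = list(max(combinations(leaves, 2),
                      key=lambda pair: divergence(tree, below, pair)))
    while len(chosen) < k:
        rest = [s for s in leaves if s not in chosen]
        chosen.append(max(rest,
                          key=lambda s: divergence(tree, below, chosen + [s])))
    return chosen
```

On this toy tree, `greedy_choice(TREE, 3)` selects `c` and `e` first and then adds `d`, for a total connecting branch length of 14.0, which matches the best of all three-species subsets found by exhaustive search.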