The repetitive landscape of the chicken genome
Open Access
- 15 July 2004
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 15 (1), 126-136
- https://doi.org/10.1101/gr.2438004
Abstract
Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available.