Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi
Open Access
- 13 June 2007
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (7) , 1101-1110
- https://doi.org/10.1101/gr.5894107
Abstract
One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed that haplotypes can be inferred from genome assemblies by investigating phase conservation in sequenced reads. In this study, we seek to infer haplotypes, that is, a diploid consensus sequence, from the genome assembly of the organism Ciona intestinalis. The Ciona intestinalis genome is an ideal resource from which haplotypes can be inferred because of its high polymorphism rate (1.2%). The haplotype estimation scheme consists of polymorphism detection and phase estimation. The core step of our method is a Gibbs sampling procedure. The mate-pair information from two-end sequenced clone inserts is exploited to provide long-range continuity. We estimate the polymorphism rate of Ciona intestinalis to be 1.2% and 1.5%, according to two different polymorphism counting schemes. The distribution of heterozygosity number is well fit by a compound Poisson distribution. The N50 length of haplotype segments is 37.9 kb in our assembly, while the N50 scaffold length of the Ciona intestinalis assembly is 190 kb. We also infer diploid gene sequences from haplotype segments. According to our reconstruction, 85.4% of predicted gene sequences are continuously covered by single haplotype segments. Our results indicate 97% accuracy in haplotype estimation, based on a simulated data set. We conduct a comparative analysis with Ciona savignyi, and discover interesting patterns of conserved DNA elements in chordates.
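The N50 figures quoted in the abstract can be read against the usual definition of N50: the largest length L such that segments of length at least L together cover at least half of the total assembled length. The sketch below computes that quantity in Python. The function name `n50` and the example lengths are hypothetical, and the abstract does not state which tie-breaking convention the authors used, so this is an illustration of the common definition rather than their procedure.

```python
def n50(lengths):
    """Return the N50 of a collection of segment lengths.

    Assumes the common definition: sort lengths in descending order and
    return the length at which the running sum first reaches half of the
    total. This convention is an assumption, not taken from the paper.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical segment lengths (arbitrary units), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

Under this definition, a haplotype-segment N50 of 37.9 kb means that half of the phased sequence lies in segments of at least 37.9 kb.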