Genome-wide assembly and analysis of alternative transcripts in mouse
Open Access
- 2 May 2005
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 15 (5) , 748-754
- https://doi.org/10.1101/gr.3269805
Abstract
To build a mouse gene index with the most comprehensive coverage of alternative transcription/splicing (ATS), we developed an algorithm and a fully automated computational pipeline for transcript assembly from expressed sequences aligned to the genome. We identified 191,946 genomic loci, which included 27,497 protein-coding genes and 11,906 additional gene candidates (e.g., nonprotein-coding, but multiexon). Comparison of the resulting gene index with the TIGR, UniGene, DoTS, and ESTGenes databases revealed that it had a greater number of transcripts, a greater average number of exons and introns with proper splice sites per gene, and longer ORFs. The 27,497 protein-coding genes had 77,138 transcripts, i.e., 2.8 transcripts per gene on average. Close examination of the transcripts led to a combinatorial table of 23 types of ATS units (14 types of alternative splicing, seven types of alternative starts, and two types of alternative termination), only nine of which had been described previously. Of the 20,323 multiexon protein-coding genes with proper splice sites, 47%, 18%, and 14% had alternative splicing, alternative starts, and alternative terminations, respectively. The gene index with comprehensive ATS will provide a useful platform for analyzing the nature and mechanism of ATS, as well as for designing accurate exon-based DNA microarrays.
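The summary figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the per-category gene counts are approximations derived from the rounded percentages, not values reported by the authors.

```python
# Figures quoted in the abstract.
protein_coding_genes = 27_497
transcripts = 77_138
multiexon_genes = 20_323  # multiexon protein-coding genes with proper splice sites

# Average number of transcripts per protein-coding gene.
mean_transcripts = transcripts / protein_coding_genes

# ATS unit types by category, as listed in the abstract.
ats_types = {
    "alternative splicing": 14,
    "alternative start": 7,
    "alternative termination": 2,
}
total_ats_types = sum(ats_types.values())
newly_described = total_ats_types - 9  # nine types were previously described

# Approximate gene counts implied by the rounded percentages (assumption:
# the true counts may differ by up to about half a percentage point).
fractions = {
    "alternative splicing": 0.47,
    "alternative start": 0.18,
    "alternative termination": 0.14,
}
approx_counts = {k: round(multiexon_genes * f) for k, f in fractions.items()}

print(f"transcripts per gene: {mean_transcripts:.1f}")
print(f"ATS unit types: {total_ats_types} ({newly_described} newly described)")
for category, count in approx_counts.items():
    print(f"~{count} genes with {category}")
```

The three categories are not mutually exclusive, so the approximate counts should not be summed to estimate the number of genes with any form of ATS.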