ECgene: Genome-based EST clustering and gene modeling for alternative splicing
Open Access
- 1 April 2005
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 15 (4), 566-576
- https://doi.org/10.1101/gr.3030405
Abstract
With the availability of the human genome map and fast algorithms for sequence alignment, genome-based EST clustering became a viable method for gene modeling. We developed a novel gene-modeling method, ECgene (Gene modeling by EST Clustering), which combines genome-based EST clustering and the transcript assembly procedure in a coherent and consistent fashion. Specifically, ECgene takes alternative splicing events into consideration. The position of splice sites (i.e., exon–intron boundaries) in the genome map is utilized as the critical information in the whole procedure. Sequences that share any splice sites are grouped together to define an EST cluster in a manner similar to that of the genome-based version of the UniGene algorithm. Transcript assembly is achieved using graph theory that represents the exon connectivity in each cluster as a directed acyclic graph (DAG). Distinct paths along exons correspond to possible gene models encompassing all alternative splicing events. EST sequences in each cluster are subclustered further according to the compatibility with gene structure of each splice variant, and they can be regarded as clone evidence for the corresponding isoform. The reliability of each isoform is assessed from the nature of cluster members and from the minimum number of clones required to reconstruct all exons in the transcript.
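The two core ideas in the abstract, grouping sequences that share any splice site and reading gene models off as paths through an exon DAG, can be illustrated with a minimal Python sketch. This is not the ECgene implementation. The function names, the `(chromosome, strand, position, kind)` splice-site tuple, and the exon labels are all hypothetical, and the sketch assumes that alignment to the genome has already produced splice sites and exon-to-exon connections.

```python
from collections import defaultdict


def cluster_by_shared_splice_sites(alignments):
    """Group sequences that share at least one splice site (union-find).

    `alignments` maps a sequence id to a set of splice sites. Each site is
    a hypothetical (chromosome, strand, position, kind) tuple.
    """
    parent = {seq_id: seq_id for seq_id in alignments}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_owner = {}  # splice site -> first sequence seen using it
    for seq_id, sites in alignments.items():
        for site in sites:
            if site in first_owner:
                # Shared site: merge the two clusters.
                parent[find(seq_id)] = find(first_owner[site])
            else:
                first_owner[site] = seq_id

    clusters = defaultdict(set)
    for seq_id in alignments:
        clusters[find(seq_id)].add(seq_id)
    return sorted(sorted(members) for members in clusters.values())


def enumerate_isoforms(exon_edges):
    """Return every path from a start exon to an end exon in an exon DAG.

    `exon_edges` is a list of (upstream_exon, downstream_exon) pairs and is
    assumed to be acyclic. Each path is one candidate gene model.
    """
    successors = defaultdict(list)
    exons, has_incoming = set(), set()
    for upstream, downstream in exon_edges:
        successors[upstream].append(downstream)
        has_incoming.add(downstream)
        exons.update((upstream, downstream))

    paths = []

    def walk(exon, path):
        path = path + [exon]
        next_exons = successors.get(exon, [])
        if not next_exons:  # terminal exon: the path is complete
            paths.append(path)
            return
        for nxt in sorted(next_exons):
            walk(nxt, path)

    # Start from exons that no other exon connects into.
    for start in sorted(exons - has_incoming):
        walk(start, [])
    return paths


ests = {
    "est1": {("chr1", "+", 100, "donor"), ("chr1", "+", 200, "acceptor")},
    "est2": {("chr1", "+", 200, "acceptor"), ("chr1", "+", 300, "donor")},
    "est3": {("chr2", "+", 50, "donor")},
}
print(cluster_by_shared_splice_sites(ests))
# [['est1', 'est2'], ['est3']]

print(enumerate_isoforms([("E1", "E2"), ("E2", "E3"), ("E1", "E3")]))
# [['E1', 'E2', 'E3'], ['E1', 'E3']]
```

In the second call, the direct `E1` to `E3` edge models an exon-skipping event, so the DAG yields two candidate isoforms. The number of paths can grow quickly with the number of alternative events, which is one reason a reliability assessment such as the clone-evidence criterion described in the abstract is useful for ranking the resulting models.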