ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences
Open Access
- 8 May 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Web Server), W159-W162
- https://doi.org/10.1093/nar/gkm369
Abstract
We present a web-based server, called ESTpass, for processing and annotating sequence data from expressed sequence tag (EST) projects. ESTpass accepts a FASTA-formatted EST file and its quality file as inputs, and then executes a back-end EST analysis pipeline consisting of three consecutive steps. The first step cleanses the input EST sequences. The second step clusters and assembles the cleansed EST sequences using the d2_cluster and CAP3 programs, producing putative transcripts. From the CAP3 output, ESTpass detects chimeric EST sequences, which are confirmed through comparison with the nr database. The last step annotates the putative transcript sequences using the RefSeq, InterPro, GO and KEGG gene databases according to user-specified options. The major advantages of ESTpass are the integration of the cleansing and annotating processes, rigorous chimeric EST detection, exhaustive annotation, and email reporting that informs the user about progress and delivers the analysis results. The ESTpass results include three reports (summary, cleansing and annotation), a download function and graphic statistics. They can be retrieved and downloaded using a standard web browser. The server is available at http://estpass.kobic.re.kr/.
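The abstract does not say how the cleansing step works internally, so the following is only a minimal sketch of what cleansing a FASTA file together with its quality file could look like. It assumes that the quality file uses FASTA-style headers followed by space-separated integer scores, and that cleansing means trimming low-quality ends and discarding short reads. The function names and the `min_q` and `min_len` thresholds are hypothetical and are not taken from ESTpass.

```python
from typing import Dict, Iterator, List, Tuple


def parse_records(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, body) pairs from FASTA-style text.

    Body lines are joined with single spaces, so the same parser handles
    sequence files (letters) and quality files (space-separated integers).
    """
    name = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, " ".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, " ".join(chunks)


def trim_by_quality(seq: str, quals: List[int], min_q: int) -> str:
    """Strip bases from both ends while their quality score is below min_q."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]


def cleanse(fasta_text: str, qual_text: str,
            min_q: int = 20, min_len: int = 100) -> Dict[str, str]:
    """Return trimmed sequences that are still at least min_len bases long.

    Both thresholds are illustrative assumptions, not ESTpass defaults.
    """
    quals = {name: [int(q) for q in body.split()]
             for name, body in parse_records(qual_text)}
    cleaned: Dict[str, str] = {}
    for name, body in parse_records(fasta_text):
        seq = body.replace(" ", "")
        scores = quals.get(name)
        # Skip reads with no quality record or with mismatched lengths.
        if scores is None or len(scores) != len(seq):
            continue
        trimmed = trim_by_quality(seq, scores, min_q)
        if len(trimmed) >= min_len:
            cleaned[name] = trimmed
    return cleaned
```

In a pipeline of the kind the abstract describes, the output of such a step would then be passed to the clustering and assembly programs. That hand-off is not shown here because the abstract gives no detail on how the programs are invoked.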