ESTExplorer: an expressed sequence tag (EST) assembly and annotation platform
Open Access
- 8 May 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Web Server), W143-W147
- https://doi.org/10.1093/nar/gkm378
Abstract
The analysis of expressed sequence tag (EST) datasets offers a rapid and cost-effective approach to elucidating the transcriptome of an organism, but it requires several computational methods for assembly and annotation. ESTExplorer is a comprehensive workflow system for EST data management and analysis. The pipeline uses a ‘distributed control approach’ in which the most appropriate bioinformatics tools are implemented over different dedicated processors. Species-specific repeat masking and conceptual translation are built in. ESTExplorer accepts a set of ESTs in FASTA format, which can be analysed using programs selected by the user. After pre-processing and assembly, the dataset is annotated at the nucleotide level and, following conceptual translation, at the protein level. Users may optionally provide ESTExplorer with assembled contigs for annotation purposes. Functionally annotated contigs/ESTs can be analysed individually. The overall outputs are gene ontologies and protein functional identifications in terms of mapping to protein domains and metabolic pathways. ESTExplorer has been applied successfully to annotate large EST datasets from parasitic nematodes and to identify novel genes as potential targets for parasite intervention. ESTExplorer runs on a Linux cluster and is freely available for the academic community at http://estexplorer.biolinfo.org.
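To make two of the steps named in the abstract concrete (FASTA input and conceptual translation), here is a minimal Python sketch. It is an illustration only and not ESTExplorer's own code: the function names `parse_fasta`, `reverse_complement`, `translate` and `six_frame_translations` are hypothetical, the standard genetic code is assumed, and codons containing ambiguous bases are reported as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the organism uses the standard code; '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_fasta(text):
    """Return {sequence_id: sequence} from FASTA-formatted text."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first whitespace-delimited token of the header.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, frame=0):
    """Translate seq starting at offset frame (0, 1 or 2); unknown codons give 'X'."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(seq):
    """Return conceptual translations for the three forward and three reverse frames."""
    reverse = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


if __name__ == "__main__":
    example = ">est1 hypothetical example read\nATGGCC\nTAA\n"
    for seq_id, seq in parse_fasta(example).items():
        for frame, protein in six_frame_translations(seq).items():
            print(seq_id, frame, protein)
```

A production pipeline would normally delegate these steps to established tools; the sketch only shows what "conceptual translation" of an EST means in practice, namely reading the nucleotide sequence in each possible frame and mapping codons to amino acids.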