Gene2EST: a BLAST2 server for searching expressed sequence tag (EST) databases with eukaryotic gene-sized queries
- 15 March 2001
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 29 (6), 1272-1277
- https://doi.org/10.1093/nar/29.6.1272
Abstract
Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000-100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.