ForestTreeDB: a database dedicated to the mining of tree transcriptomes
Open Access
- 27 November 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Database), D888-D894
- https://doi.org/10.1093/nar/gkl882
Abstract
ForestTreeDB is intended as a resource that centralizes large-scale expressed sequence tag (EST) sequencing results from several tree species. It currently encompasses 344 878 quality sequences from 68 libraries, from diverse organs of conifer and hybrid poplar trees. It utilizes the Nimbus data model to provide a hosting system for multiple projects, and uses object-relational mapping APIs in Java and Perl for data access within an Oracle database designed to be scalable, maintainable and extendable. Transcriptome builds or unigene sets occupy the focal point of the system. Several of the five current species-specific unigenes were used to design microarrays and SNP resources. The ForestTreeDB web application provides the means for multiple combination database queries. It presents the user with a list of discrete queries to retrieve and download large EST datasets or sequences from precompiled unigene assemblies. Functional annotation assignment is not trivial in conifers, which are distantly related to angiosperm model plants. Optimal annotations are achieved through database queries that integrate results from several procedures based on open-source tools. ForestTreeDB aims to facilitate sequence mining of coherent annotations in multiple species to support comparative genomic approaches. We plan to continuously enrich ForestTreeDB with other resources through collaborations with other genomic projects.