New computational tools for Brassica genome research
Open Access
- 7 April 2004
- journal article
- Published by Wiley in Comparative and Functional Genomics
- Vol. 5 (3) , 276-280
- https://doi.org/10.1002/cfg.394
Abstract
With the increasing quantities of Brassica genomic data being entered into the public domain and in preparation for the complete Brassica genome sequencing effort, there is a growing requirement for the structuring and detailed bioinformatic analysis of Brassica genomic information within a user-friendly database. At the Plant Biotechnology Centre, Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data, to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA; a sequence processing pipeline incorporating annotation against GenBank, SwissProt and Arabidopsis Gene Ontology (GO) data; and tools for molecular marker discovery and comparative genome analysis. All sequences are mined for simple sequence repeat (SSR) molecular markers using ‘SSR primer’ and mapped onto the complete Arabidopsis thaliana genome by sequence comparison. The database may be queried using a text-based search of sequence annotation or GO terms, BLAST comparison against resident sequences, or by the position of candidate orthologues within the Arabidopsis genome. Tools have also been developed and applied to the discovery of single nucleotide polymorphism (SNP) molecular markers and the in silico mapping of Brassica BAC end sequences onto the Arabidopsis genome. Planned extensions to this resource include the integration of gene expression data and the development of an EnsEMBL-based genome viewer.
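To illustrate what mining a sequence for SSRs involves, the following is a minimal Python sketch of tandem-repeat detection using a regular expression. It is not the algorithm used by ‘SSR primer’, which the abstract does not describe; the function name `find_ssrs`, the motif length range (2 to 6 bases) and the minimum repeat count (4) are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    Hypothetical helper: thresholds are illustrative assumptions,
    not parameters taken from any published tool.
    """
    seq = seq.upper()
    # Lazy quantifier tries the shortest motif first, so "AGAGAGAG" is
    # reported as (AG)4 rather than (AGAG)2.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs such as "AAAAAAAA", which would otherwise
        # be reported as a dinucleotide repeat of "AA".
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    # Made-up example sequence containing an (AG)6 and a (CAT)5 repeat.
    example = "TTGACC" + "AG" * 6 + "CTTGCA" + "CAT" * 5 + "GGA"
    for start, motif, repeats in find_ssrs(example):
        print(start, motif, repeats)
```

In a marker discovery workflow, each reported position would then be used to pick flanking regions for PCR primer design; that step is outside the scope of this sketch.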