New computational tools for Brassica genome research

Open Access

7 April 2004

journal article
Published by Wiley in Comparative and Functional Genomics

Vol. 5 (3), 276-280
https://doi.org/10.1002/cfg.394

Abstract

With the increasing quantities of Brassica genomic data being entered into the public domain and in preparation for the complete Brassica genome sequencing effort, there is a growing requirement for the structuring and detailed bioinformatic analysis of Brassica genomic information within a user-friendly database. At the Plant Biotechnology Centre, Melbourne, Australia, we have developed a series of tools and computational pipelines to assist in the processing and structuring of genomic data and to aid its application to agricultural biotechnology research. These tools include a sequence database, ASTRA; a sequence processing pipeline incorporating annotation against GenBank, SwissProt and Arabidopsis Gene Ontology (GO) data; and tools for molecular marker discovery and comparative genome analysis. All sequences are mined for simple sequence repeat (SSR) molecular markers using ‘SSR primer’ and mapped onto the complete Arabidopsis thaliana genome by sequence comparison. The database may be queried by a text-based search of sequence annotation or GO terms, by BLAST comparison against resident sequences, or by the position of candidate orthologues within the Arabidopsis genome. Tools have also been developed and applied to the discovery of single nucleotide polymorphism (SNP) molecular markers and the in silico mapping of Brassica BAC end sequences onto the Arabidopsis genome. Planned extensions to this resource include the integration of gene expression data and the development of an EnsEMBL-based genome viewer.
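
The abstract does not describe how ‘SSR primer’ detects repeats internally, so the following is only a minimal sketch of the general idea behind SSR mining: scanning a sequence for short motifs repeated in tandem. The function name `find_ssrs` and the thresholds (motifs of 2 to 6 bases, at least 4 copies) are assumptions chosen for illustration, not parameters taken from the paper.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Illustrative only: the unit sizes and repeat threshold are assumed
    defaults, not values used by the 'SSR primer' tool.
    """
    seq = seq.upper()
    # Capture a motif of min_unit..max_unit bases (shortest first), then
    # require at least (min_repeats - 1) further copies directly after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip runs of a single base, which are not useful as SSR motifs here.
        if len(set(motif)) == 1:
            continue
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits


if __name__ == "__main__":
    example = "TTGC" + "AT" * 6 + "GGCA" + "CTT" * 5 + "A"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

In a pipeline like the one described, each reported repeat would then be passed, together with its flanking sequence, to a primer design step; that step is not shown here.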
