GAIA: Framework Annotation of Genomic Sequence
Open Access
- 1 March 1998
- journal article
- review article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 8 (3) , 234-250
- https://doi.org/10.1101/gr.8.3.234
Abstract
As increasing amounts of genomic sequence from many organisms become available, and as DNA sequences become a primary reagent in biologic investigations, the role of annotation as a prospective guide for laboratory experiments will expand rapidly. Here we describe a process of high-throughput, reliable annotation, called framework annotation, which is designed to provide a foundation for initial biologic characterization of previously unexamined sequence. To examine this concept in practice, we have constructed Genome Annotation and Information Analysis (GAIA), a prototype software architecture that implements several elements important for framework annotation. The center of GAIA consists of an annotation database and the associated data management subsystem that forms the software bus along which other components communicate. The schema for this database defines three principal concepts: (1) Entries, consisting of sequence and associated historical data; (2) Features, comprising information of biologic interest; and (3) Experiments, describing the evidence that supports Features. The database permits tracking of annotation results over time, as well as assessment of the reliability of particular results. New framework annotation is produced by CARTA, a set of autonomous sensors that perform automatic analyses and assert results into the annotation database. These results are available via a Web-based query interface that uses graphical Java applets as well as text-based HTML pages to display data at different levels of resolution and permit interactive exploration of annotation. We present results for initial application of framework annotation to a set of test sequences, demonstrating its effectiveness in providing a starting point for biologic investigation, and discuss ways in which the current prototype can be improved. The prototype is available for public use and comment at http://www.cbil.upenn.edu/gaia.
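The three schema concepts named in the abstract (Entries, Features, Experiments) can be illustrated with a minimal Python sketch. This is not GAIA's actual schema or code: every class name, field, the `score` reliability value, and the "best supporting Experiment" rule are assumptions introduced only to show how sensors might assert evidence-backed Features against an Entry and how results could then be filtered by reliability.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """A sequence plus its associated history (hypothetical fields)."""
    entry_id: str
    sequence: str
    history: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    """Evidence supporting a Feature, e.g. one run of an automatic sensor."""
    sensor: str      # hypothetical name of the analysis that produced the evidence
    run_date: str    # ISO date string, so results can be tracked over time
    score: float     # assumed reliability score in [0, 1]


@dataclass
class Feature:
    """A region of biologic interest on an Entry, with its supporting evidence."""
    entry_id: str
    kind: str
    start: int       # 1-based, inclusive
    end: int
    evidence: List[Experiment] = field(default_factory=list)

    def reliability(self) -> float:
        # Assumption: a Feature is as reliable as its best supporting Experiment.
        return max((e.score for e in self.evidence), default=0.0)


class AnnotationDB:
    """In-memory stand-in for the annotation database (illustrative only)."""

    def __init__(self) -> None:
        self.entries: Dict[str, Entry] = {}
        self.features: List[Feature] = []

    def add_entry(self, entry: Entry) -> None:
        self.entries[entry.entry_id] = entry

    def assert_feature(self, feature: Feature) -> None:
        # A sensor asserts a result; the Entry's history records the event.
        if feature.entry_id not in self.entries:
            raise KeyError(f"unknown entry: {feature.entry_id}")
        self.features.append(feature)
        self.entries[feature.entry_id].history.append(
            f"{feature.kind} {feature.start}-{feature.end} asserted"
        )

    def query(self, entry_id: str, min_reliability: float = 0.0) -> List[Feature]:
        # Return Features on one Entry at or above a reliability threshold.
        return [
            f for f in self.features
            if f.entry_id == entry_id and f.reliability() >= min_reliability
        ]


# Usage: two hypothetical sensors assert results against one toy Entry.
db = AnnotationDB()
db.add_entry(Entry("demo1", "ATGGCCTAA"))
db.assert_feature(
    Feature("demo1", "ORF", 1, 9, [Experiment("orf_sensor", "1998-03-01", 0.9)])
)
db.assert_feature(
    Feature("demo1", "repeat", 4, 6, [Experiment("repeat_sensor", "1998-03-01", 0.3)])
)
confident = db.query("demo1", min_reliability=0.5)
```

Keeping Experiments separate from Features, as the abstract describes, means a single Feature can accumulate evidence from several analyses, and a query can apply whatever reliability threshold suits the investigation at hand.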