QGB: A System for Querying Sequence Database Fields and Features
- 1 January 1994
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 1 (1) , 3-14
- https://doi.org/10.1089/cmb.1994.1.3
Abstract
We have developed a general system, QGB, for performing complex queries on the information in the DDBJ/EMBL/GenBank databases, including queries over the structural features of sequences implied in the FEATURE TABLE. Queries are formed in a Structured Query Language (SQL)-like syntax with language extensions to support complex types (e.g., sets, ordered sets, and records) appropriate for representing and querying sequence data. A novel aspect of QGB is its ability to deduce missing features and infer relationships among features as a consequence of constructing a parse tree of sequence structure from information described in the FEATURE TABLE. The grammar for the parse tree is implemented in a customized form of the Definite Clause Grammar syntax of the logic programming language Prolog. The logic grammar formalism was chosen because it provides a perspicuous representation for features and constraints, and Prolog provides an execution model for the grammar rules. Construction of the parse tree also identifies inconsistencies and errors in the FEATURE TABLE that can in some cases be corrected automatically and used to generate an augmented version of the table.
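As a rough illustration of what deducing missing features and flagging inconsistencies can look like, the following Python sketch fills in unannotated introns between consecutive exons and reports overlapping exons. It is not QGB's Prolog grammar or its query language: the `Feature` record, the `augment` function, and the coordinates are all hypothetical and chosen only for this example, and the single "gap between exons is an intron" rule stands in for the much richer grammar the abstract describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical, simplified feature-table entry (not QGB's data model)."""
    key: str                # feature key, e.g. "exon" or "intron"
    start: int              # 1-based inclusive start position
    end: int                # 1-based inclusive end position
    inferred: bool = False  # True when the entry was deduced, not annotated


def augment(features):
    """Return (augmented feature list, list of problem descriptions).

    Assumed rule for this sketch: the gap between two consecutive,
    non-overlapping exons is an intron. Overlapping exons are reported
    as inconsistencies rather than repaired.
    """
    features = list(features)
    exons = sorted((f for f in features if f.key == "exon"),
                   key=lambda f: (f.start, f.end))
    known_introns = {(f.start, f.end) for f in features if f.key == "intron"}

    added, problems = [], []
    for left, right in zip(exons, exons[1:]):
        if right.start <= left.end:
            problems.append(
                f"exons {left.start}..{left.end} and "
                f"{right.start}..{right.end} overlap"
            )
            continue
        gap = (left.end + 1, right.start - 1)
        if gap[0] > gap[1]:
            continue  # exons are directly adjacent; nothing lies between them
        if gap not in known_introns:
            added.append(Feature("intron", gap[0], gap[1], inferred=True))

    augmented = sorted(features + added, key=lambda f: (f.start, f.end))
    return augmented, problems


# Made-up table: one intron is annotated, the second one is missing.
table = [
    Feature("exon", 1, 100),
    Feature("intron", 101, 200),
    Feature("exon", 201, 300),
    Feature("exon", 401, 500),
]
augmented, problems = augment(table)

# A query over the augmented table, in the spirit of "select all inferred introns".
inferred_introns = [(f.start, f.end) for f in augmented
                    if f.key == "intron" and f.inferred]
```

Here the second intron (positions 301 to 400) is absent from the input and appears only in the augmented table, so a query over introns sees it even though it was never annotated.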