SubtiList: a relational database for the Bacillus subtilis genome

Abstract
Within the international collaborative project to sequence the whole Bacillus subtilis chromosome, we have created SubtiList, a relational database for managing and analysing information associated with the molecular genetics of this bacterium. It allows retrieval of non-redundant DNA sequences of the B. subtilis genome, together with related information such as genes and proteins. A logical structure has been designed with appropriate links between the different objects, and a set of procedures has been implemented for updating and managing the data. The database is organized around a core consisting of all known contigs of B. subtilis, i.e. sets of non-redundant sequences created from original entries in the EMBL data library. A user-friendly interface has been developed to make the database easy to consult. Sequence analysis tools have been integrated into the database, including a program for rapid similarity searching of protein data banks and a powerful DNA pattern searching program. Taking advantage of the consistency of SubtiList, we have performed a codon usage analysis by Factorial Correspondence Analysis and a study of the distribution of the isoelectric points of known B. subtilis proteins. The SubtiList database is available through anonymous ftp (address “ftp.pasteur.fr” or IP number 157.99.64.12, directory “/pub/GenomeDB/SubtiList”).
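
The organization described above (a core of contigs, with genes and proteins linked to it) can be illustrated with a minimal sketch. This is not the actual SubtiList schema: the table names, column names, coordinates and demo sequence below are all hypothetical, and SQLite is used only because it ships with Python. The sketch shows how a gene sequence might be recovered from its contig through a relational join.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical contig/gene/protein layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE contig (
            contig_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            sequence  TEXT NOT NULL          -- non-redundant DNA sequence
        );
        CREATE TABLE gene (
            gene_id   INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            contig_id INTEGER NOT NULL REFERENCES contig(contig_id),
            start_pos INTEGER NOT NULL,      -- 1-based, inclusive
            end_pos   INTEGER NOT NULL       -- 1-based, inclusive
        );
        CREATE TABLE protein (
            protein_id  INTEGER PRIMARY KEY,
            gene_id     INTEGER NOT NULL REFERENCES gene(gene_id),
            description TEXT
        );
    """)
    # Invented demo data, not real B. subtilis sequence.
    conn.execute(
        "INSERT INTO contig VALUES (1, 'demo_contig', 'ATGAAACGTTAAGGGATGCCCTGA')"
    )
    conn.execute("INSERT INTO gene VALUES (1, 'demoA', 1, 1, 12)")
    conn.execute("INSERT INTO gene VALUES (2, 'demoB', 1, 16, 24)")
    conn.execute("INSERT INTO protein VALUES (1, 1, 'hypothetical protein A')")
    conn.execute("INSERT INTO protein VALUES (2, 2, 'hypothetical protein B')")
    return conn


def gene_sequences(conn):
    """Return (gene name, DNA sequence, protein description) ordered by position."""
    return conn.execute("""
        SELECT g.name,
               substr(c.sequence, g.start_pos, g.end_pos - g.start_pos + 1),
               p.description
        FROM gene AS g
        JOIN contig  AS c ON c.contig_id = g.contig_id
        JOIN protein AS p ON p.gene_id   = g.gene_id
        ORDER BY g.start_pos
    """).fetchall()


if __name__ == "__main__":
    for name, seq, desc in gene_sequences(build_demo_db()):
        print(name, seq, desc)
```

Storing each sequence once at the contig level and deriving gene sequences from coordinates is one way to keep the data non-redundant; whether SubtiList does exactly this is an assumption of the sketch.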