DIAN: A Novel Algorithm for Genome Ontological Classification
Open Access
- 1 October 2001
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 11 (10), 1766-1779
- https://doi.org/10.1101/gr.183301
Abstract
Faced with the determination of many completely sequenced genomes, computational biology is now faced with the challenge of interpreting the significance of these data sets. A multiplicity of data-related problems impedes this goal: Biological annotations associated with raw data are often not normalized, and the data themselves are often poorly interrelated and their interpretation unclear. All of these problems make interpretation of genomic databases increasingly difficult. With the current explosion of sequences now available from the human genome as well as from model organisms, the importance of sorting this vast amount of conceptually unstructured source data into a limited universe of genes, proteins, functions, structures, and pathways has become a bottleneck for the field. To address this problem, we have developed a method of interrelating data sources by applying a novel method of associating biological objects to ontologies. We have developed an intelligent knowledge-based algorithm, DIAN, to support biological knowledge mapping, and, in particular, to facilitate the interpretation of genomic data. In this respect, the method makes it possible to inventory genomes by collapsing multiple types of annotations and normalizing them to various ontologies. By relying on a conceptual view of the genome, researchers can now easily navigate the human genome in a biologically intuitive, scientifically accurate manner.