Biomedical Discovery Acceleration, with Applications to Craniofacial Development
Open Access
- 27 March 2009
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 5 (3), e1000215
- https://doi.org/10.1371/journal.pcbi.1000215
Abstract
The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, are both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.

Recent technology has made it possible to do experiments that show hundreds or even thousands of genes that play a role in a disease or other biological phenomena. Interpreting these experimental results in the light of everything that has ever been published about any of those genes is often overwhelming, and the failure to take advantage of all prior knowledge may impede biomedical research. The computer program described in this paper “reads” the biomedical literature and molecular biology databases, “reasons” about what all that information means to this experiment, and “reports” on its findings in a way that makes digesting all of this information far more efficient than ever before possible. Analysis of a large, complex dataset with this tool led rapidly to the creation of a novel hypothesis about the role of several genes in the development of the tongue, which was then confirmed experimentally.
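To make the reasoning step concrete, the following is a minimal Python sketch of the two operations the abstract mentions: linking two genes that are annotated to the same ontology term, and merging several per-source reliabilities into one combined score for a gene pair. It is an illustration under stated assumptions, not the Hanalyzer's implementation. The function names, the gene and term identifiers, and the reliability values are all hypothetical, and the noisy-OR combination rule is assumed here because the abstract does not say how the combined score is calculated.

```python
from collections import defaultdict
from itertools import combinations


def shared_annotation_edges(annotations):
    """Link every pair of genes that share at least one annotation term.

    `annotations` maps a gene identifier to the set of terms it is
    annotated with. Returns {(gene_a, gene_b): set_of_shared_terms},
    with each pair stored in sorted order.
    """
    # Invert the mapping: term -> genes annotated with that term.
    term_to_genes = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            term_to_genes[term].add(gene)

    # Every pair of genes under the same term gets an inferred edge.
    edges = defaultdict(set)
    for term, genes in term_to_genes.items():
        for gene_a, gene_b in combinations(sorted(genes), 2):
            edges[(gene_a, gene_b)].add(term)
    return dict(edges)


def combine_reliabilities(reliabilities):
    """Combine per-source reliabilities in [0, 1] with a noisy-OR.

    ASSUMPTION: noisy-OR (1 - product of (1 - r)) is used only as an
    illustrative combination rule; the abstract does not name one.
    """
    remaining_doubt = 1.0
    for r in reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"reliability out of range: {r}")
        remaining_doubt *= 1.0 - r
    return 1.0 - remaining_doubt


if __name__ == "__main__":
    # Hypothetical annotations; identifiers are placeholders, not real data.
    annotations = {
        "geneA": {"term:1", "term:2"},
        "geneB": {"term:1"},
        "geneC": {"term:2", "term:3"},
    }
    print(shared_annotation_edges(annotations))

    # Two hypothetical sources that each support the same gene pair.
    print(combine_reliabilities([0.5, 0.5]))
```

Under the noisy-OR assumption, each additional independent source can only raise the combined score, so an edge supported by several weak sources can outrank an edge supported by one moderately reliable source.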