Characterizing environmental and phenotypic associations using information theory and electronic health records
Open Access
- 17 September 2009
- journal article
- conference paper
- Published by Springer Nature in BMC Bioinformatics
- Vol. 10 (S9), S13
- https://doi.org/10.1186/1471-2105-10-s9-s13
Abstract
Background: The availability of up-to-date, executable, evidence-based medical knowledge is essential for many clinical applications, such as pharmacovigilance, but executable knowledge is costly to obtain and update. Automated acquisition of environmental and phenotypic associations from biomedical and clinical documents using text mining has shown some success. The usefulness of this association knowledge is limited, however, because the specific relationships between clinical entities remain unknown. In particular, some associations are indirect relations that arise from interdependencies among the data.
Results: In this work, we develop methods that use mutual information (MI) and one of its properties, the data processing inequality (DPI), to help characterize associations. The associations were generated by using natural language processing to encode clinical information in narrative patient records and then applying statistical methods. An evaluation based on a random sample consisting of two drugs and two diseases indicates an overall precision of 81%.
Conclusion: This preliminary study demonstrates that the proposed method is effective for helping to characterize phenotypic and environmental associations obtained from clinical reports.
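The abstract does not give implementation details, so the following is only a minimal Python sketch of the two ideas it names, not the authors' method. It assumes each patient record has already been reduced to a set of encoded entities, estimates MI between the presence of two entities from simple counts, and then uses the DPI to flag the weakest association in each fully connected triple as a candidate indirect relation. The function names, the `tolerance` parameter, and the toy entity names are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(records, a, b):
    """Estimate MI (in bits) between the presence of entities a and b.

    `records` is a list of sets of entity names (an assumed input format).
    """
    n = len(records)
    # Joint counts over the four presence/absence combinations.
    joint = Counter((a in rec, b in rec) for rec in records)
    mi = 0.0
    for (has_a, has_b), count in joint.items():
        p_ab = count / n
        p_a = sum(c for (x, _), c in joint.items() if x == has_a) / n
        p_b = sum(c for (_, y), c in joint.items() if y == has_b) / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def indirect_by_dpi(mi, tolerance=0.0):
    """Flag candidate indirect associations using the DPI.

    `mi` maps frozenset({x, y}) to an MI value. For every triple whose three
    pairs are all present, the pair with the strictly smallest MI is flagged.
    `tolerance` (hypothetical) shrinks the threshold to absorb estimation noise.
    """
    entities = sorted(set().union(*mi))
    flagged = set()
    for triple in combinations(entities, 3):
        pairs = [frozenset(p) for p in combinations(triple, 2)]
        if not all(p in mi for p in pairs):
            continue
        weakest = min(pairs, key=lambda p: mi[p])
        others = [p for p in pairs if p != weakest]
        if mi[weakest] < min(mi[p] for p in others) * (1 - tolerance):
            flagged.add(weakest)
    return flagged


if __name__ == "__main__":
    # Toy records with made-up entity names, for illustration only.
    records = [
        {"drug_x", "disease_y", "symptom_z"},
        {"drug_x", "disease_y"},
        {"disease_y", "symptom_z"},
        {"drug_x"},
        set(),
        set(),
    ]
    names = ["drug_x", "disease_y", "symptom_z"]
    scores = {
        frozenset(pair): mutual_information(records, *pair)
        for pair in combinations(names, 2)
    }
    print(indirect_by_dpi(scores))
```

A flagged pair is only a candidate: the DPI guarantees that the two ends of a chain X, Y, Z share no more information than either adjacent pair, but a weak association is not necessarily indirect, so flagged pairs would still need review.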