Literature-Based Generation of Hypotheses on Chemical Composition Using Database Co-occurrence of Chemical Compounds
- 29 July 2005
- journal article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Modeling
- Vol. 45 (5), 1153-1158
- https://doi.org/10.1021/ci049716u
Abstract
Candidates for identification of unknown constituents in a sample to be chemically analyzed are hypothetical. It is proposed to generate these hypotheses according to the co-occurrence of different chemical compounds with a known sample constituent in the chemical literature. The efficiency of the co-occurrence approach for predicting chemical compositions was tested for 67 impurities in 17 chemical/pharmaceutical products. The relative co-occurrence of impurity compounds and these products in the Chemical Abstracts Service database was evaluated and compared with corresponding values for several reference groups of probability sampled compounds from the literature. Almost all impurities (97%) and only ≤8% randomly sampled compounds co-occurred with these chemical products. Mean and median values of relative co-occurrence for impurities are much higher than those of probability sampled compounds which co-occurred with the products. For the combination of impurities and the probability sample of 396 interfering compounds, the power to predict the chemical composition using the highest co-occurrences is 0.49−0.59. The co-occurrence value can also be considered as an “empiric” indicator of chemical similarity useful to generate new hypotheses on relationships both between compounds and between compounds and their properties.
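The ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for relative co-occurrence, so the normalization used here (records mentioning both the candidate and the known constituent, divided by all records mentioning the candidate) is an assumption made only for illustration. The function names and the example counts are hypothetical and are not taken from the paper or from the Chemical Abstracts Service database.

```python
from typing import Dict, List, Tuple


def relative_cooccurrence(joint_records: int, candidate_records: int) -> float:
    """Share of a candidate's records that also mention the known constituent.

    Assumed normalization for illustration only; the abstract does not
    state the exact formula.
    """
    if candidate_records <= 0:
        return 0.0
    return joint_records / candidate_records


def rank_candidates(counts: Dict[str, Tuple[int, int]]) -> List[Tuple[str, float]]:
    """Rank candidates by relative co-occurrence, highest first.

    `counts` maps a candidate name to (joint_records, candidate_records).
    """
    scored = [
        (name, relative_cooccurrence(joint, total))
        for name, (joint, total) in counts.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical counts, not taken from the paper or from any database.
example_counts = {
    "candidate_A": (40, 200),
    "candidate_B": (5, 1000),
    "candidate_C": (0, 350),
}
ranking = rank_candidates(example_counts)
```

Under these assumptions, candidates at the top of `ranking` would be the first hypotheses to test for an unknown constituent, and candidates with zero co-occurrence would fall to the bottom.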
This publication has 19 references indexed in Scilit:
- Using concepts in literature‐based discovery: Simulating Swanson's Raynaud–fish oil and migraine–magnesium discoveries. Journal of the American Society for Information Science and Technology, 2001
- Determination of oxytetracycline and some impurities in plasma by non-aqueous capillary electrophoresis using solid-phase extraction. Chromatographia, 2000
- Identification of chemical substances by testing and screening of hypotheses. Analytical and Bioanalytical Chemistry, 2000
- Determination of 1-benzo[b]thien-2-ylethanone and related impurities by high performance liquid chromatography. Journal of Pharmaceutical and Biomedical Analysis, 1996
- High-performance liquid chromatographic separation and determination of small amounts of process impurities of ciprofloxacin in bulk drugs and formulations. Journal of Chromatography A, 1995
- Evaluation of different injection techniques in the gas chromatographic determination of thermolabile trace impurities in a drug substance. Journal of Pharmaceutical and Biomedical Analysis, 1995
- Stability-indicating method for the determination of levodopa, levodopa-carbidopa and related impurities. Journal of Chromatography A, 1994
- HPLC Determination of Oxiracetam, Its Impurities, and Piracetam in Pharmaceutical Formulations. Analytical Letters, 1994
- Definition and role of similarity concepts in the chemical and physical sciences. Journal of Chemical Information and Computer Sciences, 1992
- Co‐citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 1973