Pattern recognition studies of complex chromatographic data sets
- 1 November 1985
- journal article
- Published by National Institute of Standards and Technology (NIST) in Journal of Research of the National Bureau of Standards
- Vol. 90 (6) , 543-549
- https://doi.org/10.6028/jres.090.059
Abstract
Chromatographic fingerprinting of complex biological samples is an active research area with a large and growing literature. Multivariate statistical and pattern recognition techniques can be effective methods for the analysis of such complex data. However, the classification of complex samples on the basis of their chromatographic profiles is complicated by two factors: 1) confounding of the desired group information by experimental variables or other systematic variations, and 2) random or chance classification effects with linear discriminants. We will treat several current projects involving these effects and methods for dealing with them. Complex chromatographic data sets often contain information dependent on experimental variables as well as information which differentiates between classes. The existence of these types of complicating relationships is an innate part of fingerprint-type data. ADAPT, an interactive computer software system, has the clustering, mapping, and statistical tools necessary to identify and study these effects in realistically large data sets. In one study, pattern recognition analysis of 144 pyrochromatograms (PyGCs) from cultured skin fibroblasts was used to differentiate cystic fibrosis carriers from presumed normal donors. Several experimental variables (donor gender, chromatographic column number, etc.) were involved in relationships that had to be separated from the sought relationships. Notwithstanding these effects, discriminants were developed from the chromatographic peaks that assigned a given PyGC to its respective class (CF carrier vs normal) largely on the basis of the desired pathological difference. In another study, gas chromatographic profiles of cuticular hydrocarbon extracts obtained from 179 fire ants were analyzed using pattern recognition methods to seek relations with social caste and colony. Confounding relationships were studied by logistic regression.
The data analysis techniques used in these two example studies will be presented. Previously, Monte Carlo simulation studies were carried out to assess the probability of chance classification for nonparametric and parametric linear discriminants. The level of expected chance classification was examined as a function of the number of observations, the dimensionality, and the class membership distributions. These simulation studies established limits on the approaches that can be taken with real data sets so that chance classifications are improbable.
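The chance-classification effect described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the simulation used in the paper: the function name `chance_classification_rate`, the Gaussian random features, the balanced class labels, and the least-squares linear discriminant are all assumptions chosen for illustration. The sketch fits a linear discriminant to pure noise with arbitrary labels and reports how often the training samples are nonetheless "correctly" classified.

```python
import numpy as np


def chance_classification_rate(n_obs, n_dim, n_trials=200, seed=0):
    """Mean training accuracy of a least-squares linear discriminant
    fitted to random features that carry no class information.

    Hypothetical helper for illustration only; the original study's
    discriminants and data distributions are not reproduced here.
    """
    rng = np.random.default_rng(seed)
    # Balanced, arbitrary class labels coded as +1 / -1 (an assumption).
    labels = np.where(np.arange(n_obs) < n_obs // 2, 1.0, -1.0)
    rates = np.empty(n_trials)
    for t in range(n_trials):
        # Pure-noise "peaks": no real relationship to the labels.
        X = rng.standard_normal((n_obs, n_dim))
        # Append a constant column so the discriminant has an intercept.
        A = np.hstack([X, np.ones((n_obs, 1))])
        # Least-squares weight vector (minimum-norm if underdetermined).
        w, *_ = np.linalg.lstsq(A, labels, rcond=None)
        predicted = np.where(A @ w >= 0, 1.0, -1.0)
        rates[t] = np.mean(predicted == labels)
    return rates.mean()


if __name__ == "__main__":
    n_obs = 20
    for n_dim in (1, 5, 10, 25):
        rate = chance_classification_rate(n_obs, n_dim)
        print(f"n_obs={n_obs} n_dim={n_dim} mean training accuracy={rate:.2f}")
```

Under these assumptions, the training accuracy on noise rises as the dimensionality approaches the number of observations, and separation becomes complete once there are at least as many adjustable weights as observations. This is the kind of limit on the ratio of observations to variables that the abstract refers to.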