Environmental, dietary, demographic, and activity variables associated with biomarkers of exposure for benzene and lead
- 1 November 2003
- journal article
- research article
- Published by Springer Nature in Journal of Exposure Science & Environmental Epidemiology
- Vol. 13 (6) , 417-426
- https://doi.org/10.1038/sj.jea.7500296
Abstract
Classification and regression tree methods represent a potentially powerful means of identifying patterns in exposure data that may otherwise be overlooked. Here, regression tree models are developed to identify associations between blood concentrations of benzene and lead and over 300 variables of disparate type (numerical and categorical), often with observations that are missing or below the quantitation limit. Benzene and lead are selected from among all the environmental agents measured in the NHEXAS Region V study because they are ubiquitous, and they serve as paradigms for volatile organic compounds (VOCs) and heavy metals, two classes of environmental agents that have very different properties. Two sets of regression models were developed. In the first set, only environmental and dietary measurements were employed as predictor variables, while in the second set these were supplemented with demographic and time–activity data. In both sets of regression models, the blood concentrations of the environmental agents were regressed on the predictor variables. Jack-knife cross-validation was employed to detect overfitting of the models to the data. Blood concentrations of benzene were found to be associated with: (a) indoor air concentrations of benzene; (b) the duration of time spent indoors with someone who was smoking; and (c) the number of cigarettes smoked by the subject. All these associations suggest that tobacco smoke is a major source of exposure to benzene. Blood concentrations of lead were found to be associated with: (a) house dust concentrations of lead; (b) the duration of time spent working in a closed workshop; and (c) the year in which the subject moved into the residence. An unexpected finding was that the regression trees identified time–activity data as better predictors of the blood concentrations than the measurements in environmental and dietary media.
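The approach described in the abstract (a regression tree checked by jack-knife, i.e. leave-one-out, cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the two predictors (`air`, a hypothetical indoor air benzene level, and `cigs`, a hypothetical cigarettes-per-day count) and every function name are assumptions made for illustration, and the sketch does not handle missing values or observations below the quantitation limit.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Return (feature_index, threshold) giving the lowest child SSE, or None."""
    best, best_cost = None, sse(targets)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if not left or not right:
                continue  # a split must put data on both sides
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (j, t), cost
    return best

def fit_tree(rows, targets, depth):
    """Grow a tree: a leaf is a float, a node is (feature, threshold, low, high)."""
    if depth == 0 or len(targets) < 4:
        return mean(targets)
    split = best_split(rows, targets)
    if split is None:
        return mean(targets)
    j, t = split
    lo = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    hi = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in lo], [y for _, y in lo], depth - 1),
            fit_tree([r for r, _ in hi], [y for _, y in hi], depth - 1))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        j, t, lo, hi = tree
        tree = lo if row[j] <= t else hi
    return tree

def loo_mse(rows, targets, depth):
    """Leave-one-out error: refit without subject i, then predict subject i."""
    errors = []
    for i in range(len(rows)):
        tree = fit_tree(rows[:i] + rows[i + 1:], targets[:i] + targets[i + 1:], depth)
        errors.append((predict(tree, rows[i]) - targets[i]) ** 2)
    return mean(errors)

# Synthetic subjects (assumed, not study data): [air, cigs] -> blood level.
rows = [[(7 * i % 10) / 10.0, 0 if i % 2 == 0 else 10 + i] for i in range(20)]
targets = [(0.1 if cigs == 0 else 0.5) + 0.02 * air for air, cigs in rows]

tree = fit_tree(rows, targets, depth=1)
print("root split: feature", tree[0], "threshold", tree[1])
for depth in (0, 1, 2):
    print("depth", depth, "leave-one-out MSE", loo_mse(rows, targets, depth))
```

Comparing the leave-one-out error across depths is one simple way to see whether deeper trees generalise or merely fit the training data more closely; a depth whose held-out error stops improving is a sign of overfitting.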