Identifying Decision Structures Underlying Activity Patterns: An Exploration of Data Mining Algorithms
- 1 January 2000
- journal article
- Published by SAGE Publications in Transportation Research Record: Journal of the Transportation Research Board
- Vol. 1718 (1), 1-9
- https://doi.org/10.3141/1718-01
Abstract
The utility-maximizing framework, in particular the logit model, is the dominant framework in transportation demand modeling. Computational process modeling has been introduced as an alternative approach to deal with the complexity of activity-based models of travel demand. Current rule-based systems, however, lack a methodology to derive rules from data. The relevance and performance of data-mining algorithms that potentially can provide the required methodology are explored. In particular, the C4 algorithm is applied to derive a decision tree for transport mode choice in the context of activity scheduling from a large activity diary data set. The algorithm is compared with both an alternative method of inducing decision trees (CHAID) and a logit model on the basis of goodness-of-fit on the same data set. The ratio of correctly predicted cases in a holdout sample is almost identical for the three methods. This suggests that for data sets of comparable complexity, the accuracy of predictions does not provide grounds for either rejecting or choosing the C4 method. However, the method may have advantages related to robustness. Future research is required to determine the ability of decision tree-based models to predict behavioral change.
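The two ideas the abstract relies on, inducing a tree by choosing informative splits and comparing methods by the share of correctly predicted holdout cases, can be illustrated with a minimal sketch. It assumes an entropy-based information gain criterion of the kind used by the ID3/C4 family of algorithms; the abstract does not state the exact criterion or settings used in the study. The attribute names (`car_available`, `distance`), the toy records, and the helper functions are hypothetical and are not taken from the activity diary data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting the cases on `attr`."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    # Weighted average entropy of the subsets created by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def hit_ratio(predicted, observed):
    """Share of holdout cases whose predicted class matches the observed one."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical toy records: attributes of a trip and the chosen mode.
rows = [
    {"car_available": "yes", "distance": "long"},
    {"car_available": "yes", "distance": "short"},
    {"car_available": "no", "distance": "long"},
    {"car_available": "no", "distance": "short"},
]
modes = ["car", "car", "transit", "bike"]

# A tree inducer would split first on the attribute with the largest gain.
gains = {a: information_gain(rows, modes, a) for a in rows[0]}
best_attribute = max(gains, key=gains.get)
```

In this toy example, splitting on `car_available` yields the larger gain, so it would be chosen as the first split. The same `hit_ratio` function could be applied to predictions from any of the compared methods on a common holdout sample, which is the sense in which the abstract reports nearly identical results.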
This publication has 8 references indexed in Scilit:
- Integrated Model System of Stop Generation and Tour Formation for Analysis of Activity and Travel Patterns. Transportation Research Record: Journal of the Transportation Research Board, 1999
- System for Logical Verification and Inference of Activity (SYLVIA) Diaries. Transportation Research Record: Journal of the Transportation Research Board, 1999
- Computer Simulation of Household Activity Scheduling. Environment and Planning A: Economy and Space, 1998
- Computational-process modelling of household activity scheduling. Transportation Research Part B: Methodological, 1994
- An Empirical Comparison of Pruning Methods for Decision Tree Induction. Machine Learning, 1989
- A Model-Free Approach for Analysis of Complex Contingency Data in Survey Research. Journal of Marketing Research, 1980
- An Exploratory Technique for Investigating Large Quantities of Categorical Data. Journal of the Royal Statistical Society Series C: Applied Statistics, 1980
- Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 1976