Capturing knowledge through top-down induction of decision trees
- 1 June 1990
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Expert
- Vol. 5 (3), 41-50
- https://doi.org/10.1109/64.54672
Abstract
TDIDT (top-down induction of decision trees) methods for heuristic rule generation lead to unnecessarily complex representations of induced knowledge and are overly sensitive to noise in training data. Practical alternatives to TDIDT approaches, which lead to more direct representations of the same knowledge, are examined. The alternatives are more immune to problems with spurious correlations in small data sets and to noise in initial training data. These knowledge representation problems and alternatives are examined in the context of chess, for which a TDIDT algorithm called the ID3 algorithm was originally devised. Modifications to the ID3 algorithm are proposed so that users can measure heuristically the information content of attributes to guide search. The program iteratively examines all positive instances remaining to be covered, along with negative training-set instances; search does not take place with irrelevant context restrictions. This algorithm is no more complex than TDIDT, is just as fast, is less sensitive to noise, and leads to clearer representations of the information present in training-set data.
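As an illustration of what "measuring the information content of attributes" can look like, the sketch below computes the standard entropy-based information gain commonly associated with ID3. It is not the modified algorithm proposed in the article, whose details the abstract does not give. The function names, the attribute names (`in_check`, `material`), and the four toy chess-like instances are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    # Group the class labels by the value each instance has for the attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the partitions after the split.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy training set: attribute names and values are made up.
rows = [
    {"in_check": "yes", "material": "ahead"},
    {"in_check": "yes", "material": "behind"},
    {"in_check": "no", "material": "ahead"},
    {"in_check": "no", "material": "behind"},
]
labels = ["lost", "lost", "won", "won"]

# Score every attribute and pick the one with the highest gain.
gains = {attr: information_gain(rows, labels, attr) for attr in rows[0]}
best = max(gains, key=gains.get)
```

In this toy data, `in_check` separates the two classes perfectly while `material` carries no information about the outcome, so a gain-guided search would choose `in_check` first.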
This publication has 5 references indexed in Scilit:
- Assessing Credit Card Applications Using Machine Learning. IEEE Expert, 1987
- Some comments on rule induction. The Knowledge Engineering Review, 1987
- Technology Lecture: The superarticulacy phenomenon in the context of software manufacture. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 1986
- Induction of decision trees. Machine Learning, 1986
- The Art of Artificial Intelligence. 1. Themes and Case Studies of Knowledge Engineering. Published by Defense Technical Information Center (DTIC), 1977