Computer-Assisted Diagnosis by a Model-Free System of Direct Data Analysis

Abstract
Newton Ressler*

Introduction
One of the more remarkable technical developments in contemporary society has been that of the computer [1]. Some of the impressive applications include the automatic monitoring and control of complicated processes, determinations of correlations, and the developing discipline of "artificial intelligence." In this light, it has been suggested that applications of computers in the field of medicine have tended to lag behind some of the accomplishments in other areas. Despite certain important developments, the general use of computers in medical practice is still quite restricted.

While a number of mathematical models have been proposed for the determination of medical diagnoses, they all tend to have a similar mathematical basis, that is, the use of discriminant functions for the recognition of particular patterns [2]. The following reasons have been cited for the relatively limited acceptance or use of these models: a lack of standard medical definitions, a lack of large and reliable medical data bases, difficulties when the patient has either none or more than one of the diseases being considered, and other limitations due to the assumptions of the model being used [2, 3]. Croft [2] has suggested that new directions which could lead to solutions of such problems, rather than more sophisticated mathematical methods, are most important for further progress. In this respect, it has been pointed out by Baron and Fraser [3] that some taxonomic approaches which do not depend upon any assumptions or models could be applied to clinical diagnoses. This freedom from assumptions would appear to obviate some of the difficulties of models in which certain assumptions must be inherent [3].
*Departments of Pathology and Biochemistry, University of Illinois Medical Center, 1853 West Polk Street, Chicago, Illinois 60612.
Perspectives in Biology and Medicine · Autumn 1975 | 101

In an approach consistent with this view, a procedure was described by Ressler and Whitlock for obtaining the most probable diagnosis for a given patient directly from medical data and test results [4, 5]. This data analysis method involves the computer compilation of separate frequency distributions for each disease population. This is automatically done by a data-processing system for each clinical test. The relation of test results to different disease populations is the essence of the nature or purpose of the clinical tests. No assumptions or restrictions are involved, since there is no model.

In this communication, further applications and considerations involved in this type of data treatment will be described. These include the evaluation of the accuracy of clinical tests and diagnostic ranges, the determination of the most probable diseases for a given patient (i.e., the numerical probabilities for each disease), and the determination of the additional tests (not yet done) which would be most useful in further decreasing the uncertainty of the diagnosis. This use of probabilities, besides providing a probe for maximizing the discrimination, also avoids the conclusion that clinical test results are always necessary or important for the diagnosis of every disease. When relatively specific clinical tests are available for the diagnosis of a particular disease, the resulting probabilities from such tests that a given patient has this disease may approach either 0 or 100 percent. If, on the other hand, there are no relatively specific tests, the resulting probabilities from the clinical tests may remain at intermediate values, and the establishment of the diagnosis will be more dependent upon other aspects of the physician's training and understanding.
Consequently, this system of analysis could assist but not replace the physician.

I. Method of Data Handling: The Use of Individual Frequency Distributions for Each Disease Classification
Suppose a data-processing system automatically compiles two frequency distributions for the hippuric acid test of liver function, as illustrated diagrammatically in figure 1. One of the frequency distributions is for patients with liver disease, while the other is for normals. In terms of the experience represented by this diagram, all individuals with hippuric acid values less than x have liver disease, and all with values over y are normal. If an individual has a value between x and y, it cannot be determined with certainty whether he is normal or has liver disease. The proportion of...
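
The compilation of a separate frequency distribution for each disease classification can be illustrated with a minimal Python sketch. It is not the data-processing system described in the paper: the function names, the bin width, and the test values are hypothetical and made up for illustration. It also assumes that the share assigned to each classification at a given test value is that classification's relative frequency in the matching bin, normalized across classifications, which weights the populations equally. The excerpt does not state the exact formula used.

```python
from collections import Counter


def compile_distributions(records, bin_width):
    """Build one frequency distribution (bin index -> count) per classification."""
    distributions = {}
    for classification, value in records:
        bin_index = int(value // bin_width)
        distributions.setdefault(classification, Counter())[bin_index] += 1
    return distributions


def class_proportions(distributions, value, bin_width):
    """Normalized relative frequency of each classification in the bin holding `value`.

    Assumption (not from the source): every population is weighted equally.
    Returns None when no recorded case of any classification falls in that bin.
    """
    bin_index = int(value // bin_width)
    relative = {}
    for classification, counts in distributions.items():
        total = sum(counts.values())
        relative[classification] = counts[bin_index] / total
    norm = sum(relative.values())
    if norm == 0:
        return None
    return {c: r / norm for c, r in relative.items()}


# Made-up test values in arbitrary units, for illustration only.
records = [
    ("liver disease", 0.5), ("liver disease", 1.2),
    ("liver disease", 1.8), ("liver disease", 2.4),
    ("normal", 2.1), ("normal", 2.9),
    ("normal", 3.3), ("normal", 3.8),
]

distributions = compile_distributions(records, bin_width=1.0)
low = class_proportions(distributions, 0.7, 1.0)      # only liver disease seen here
overlap = class_proportions(distributions, 2.2, 1.0)  # both populations seen here
high = class_proportions(distributions, 3.5, 1.0)     # only normals seen here
```

With these made-up records, a low value is attributed entirely to liver disease, a high value entirely to the normal population, and a value in the overlapping region is split between the two, which mirrors the three situations described for the two distributions above.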
