Automatic Coding of Diagnostic Reports

1 July 1998

journal article
research article
Published by Georg Thieme Verlag KG in Methods of Information in Medicine

Vol. 37 (03) , 260-265
https://doi.org/10.1055/s-0038-1634526

Abstract

A method is presented for assigning classification codes to pathology reports by searching for similar reports in an archive collection. The search key is textual similarity, which serves as an estimate of the true, semantic similarity. The method does not require explicit modeling and can be applied to any language or application domain that uses natural language reporting. A number of simulation experiments were run to assess the accuracy of the method and to indicate the role of archive size and of the transfer of document collections across laboratories. In at least 63% of the simulation trials, the most similar archive text offered a suitable classification on organ, origin and diagnosis. In 85 to 90% of the trials, the archive's best solution was found within the first five similar reports. The results indicate that the method is suitable for its purpose: suggesting potentially correct classifications to the reporting diagnostician.
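
The retrieval idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how textual similarity is computed, so the code below assumes TF-IDF term weighting with cosine similarity, which is a common choice and not necessarily the authors' measure. The function names (`tokenize`, `build_index`, `cosine`, `suggest_codes`), the toy archive and the codes `CODE-A` to `CODE-C` are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(archive):
    """Compute IDF weights and TF-IDF vectors for (text, code) archive pairs."""
    docs = [Counter(tokenize(text)) for text, _ in archive]
    n = len(docs)
    # Document frequency: in how many archive reports each term occurs.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumption; the abstract names no weighting scheme).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_codes(report, archive, idf, vectors, k=5):
    """Return up to k (similarity, code) pairs, most similar report first."""
    counts = Counter(tokenize(report))
    # Terms never seen in the archive carry no weight and are dropped.
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(query, vec), code)
              for vec, (_, code) in zip(vectors, archive)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy archive of (report text, classification code) pairs.
ARCHIVE = [
    ("skin biopsy showing benign intradermal nevus", "CODE-A"),
    ("colon biopsy with tubular adenoma low grade dysplasia", "CODE-B"),
    ("skin excision basal cell carcinoma margins free", "CODE-C"),
]

idf, vectors = build_index(ARCHIVE)
suggestions = suggest_codes("biopsy of skin with intradermal nevus",
                            ARCHIVE, idf, vectors, k=5)
for score, code in suggestions:
    print(f"{code}: {score:.3f}")
```

Returning a short ranked list rather than a single code mirrors the reported use case, in which the diagnostician chooses among the first few similar reports instead of accepting an automatic assignment.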
