Classification of Procedures in the Domain of Thoracic Surgery—A Study of Reliability in Coding

Abstract
This paper reports a study of the reliability of coding surgical procedures in the domain of thoracic surgery. Reliability is measured as inter-coder variability, expressed as the degree of agreement between coders. Four physicians applied four classifications to 100 patient cases. The classifications, which differ in granularity and structure, were compared using the kappa statistic. The results are discussed in relation to the differences between the classifications. One topic of discussion is how granularity affects the degree of agreement, and how this relates to the usefulness of a classification. The use of formal methods for representing classifications is also discussed, including how this will affect the way classifications are designed and used.
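
As an illustration of the kappa measure mentioned above, the following is a minimal sketch of Cohen's kappa for a single pair of coders, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The abstract does not state which kappa variant was applied to the four coders, so the pairwise form is an assumption here; the function name and the example codes are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assigned one code per case."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of cases")
    n = len(codes_a)
    # Observed agreement: share of cases where both coders chose the same code.
    p_o = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement, derived from each coder's marginal code frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        # Both coders used one and the same code for every case.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical procedure codes for six cases (illustration only).
coder_1 = ["A", "A", "B", "B", "C", "A"]
coder_2 = ["A", "B", "B", "B", "C", "A"]
kappa = cohen_kappa(coder_1, coder_2)
```

With more than two coders, one option is to average this value over all coder pairs; a multi-rater statistic such as Fleiss' kappa is another. A finer-grained classification offers more distinct codes, which tends to lower the observed agreement p_o for the same set of cases.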