Positive predictive value of computerized medical records for uncomplicated and complicated upper gastrointestinal ulcer

Abstract
Purpose: Computerized databases can be an efficient resource for studying the epidemiology of peptic ulcer (PU) and upper gastrointestinal complications (UGIC), provided that the outcome definitions achieve a high positive predictive value (PPV). We assessed the PPV of diagnosis codes in The Health Improvement Network (THIN), a primary‐care medical‐record database, for identifying individuals with uncomplicated PU, and for identifying UGIC and Helicobacter pylori infection status (HPIS) among these patients.

Methods: We identified (1) patients with codes suggesting a first episode of uncomplicated PU and (2) episodes of UGIC among them. The computerized profiles of these individuals, including free‐text comments, were reviewed and classified as definite, possible, or excluded cases. Dates and HPIS were also ascertained. For a sample of definite and possible PU cases, and for all UGIC cases, primary care physicians were sent a questionnaire for confirmation.

Results: The 5296 individuals with codes suggesting PU were classified as definite (49%), possible (25%), and excluded (26%) cases. The PPV for definite/possible PU was 94% (99% for definite, 84% for possible cases). Among the questionnaires with information on HPIS (62%), the PPV and the negative predictive value (NPV) were both 100%. The 97 individuals with codes suggesting UGIC were classified as definite (48%), possible (27%), and excluded (22%) cases; the PPV for definite/possible UGIC was 95% (100% for definite, 88% for possible cases). Code dates were generally later than medical‐record dates.

Conclusion: Identifying PU cases, their HPIS, and UGIC requires careful review of the computerized clinical information, including free‐text comments. Validation of a sample is needed to confirm the accuracy of the diagnoses.

Copyright © 2009 John Wiley & Sons, Ltd.
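
As a reading aid, the PPV figures reported above follow the usual definition: the number of code‐identified cases confirmed on validation divided by the number of cases validated. The sketch below illustrates this, and how a combined definite/possible PPV is pooled from the two strata. The function names and all counts are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def ppv(confirmed: int, refuted: int) -> float:
    """Positive predictive value: confirmed cases / all validated cases."""
    total = confirmed + refuted
    if total == 0:
        raise ValueError("no validated cases")
    return confirmed / total


def pooled_ppv(strata: dict[str, tuple[int, int]]) -> float:
    """PPV over several strata combined, pooling the raw counts."""
    confirmed = sum(c for c, _ in strata.values())
    refuted = sum(r for _, r in strata.values())
    return ppv(confirmed, refuted)


# Hypothetical validation counts per stratum: (confirmed, refuted).
# These are illustrative numbers, NOT the study's data.
strata = {"definite": (45, 5), "possible": (30, 10)}

for name, (c, r) in strata.items():
    print(f"{name}: PPV = {ppv(c, r):.2f}")
print(f"definite/possible: PPV = {pooled_ppv(strata):.2f}")
```

Pooling the counts, rather than averaging the two stratum PPVs, weights each stratum by the number of cases validated in it, which matters when the strata differ in size.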