Methods for Semi-automated Indexing for High Precision Information Retrieval

Abstract
Objective: To evaluate a new system, ISAID (Internet-based Semi-automated Indexing of Documents), and to generate textbook indexes that are more detailed and more useful to readers. Design: Pilot evaluation: simple, nonrandomized trial comparing ISAID with manual indexing methods. Methods evaluation: randomized, cross-over trial comparing three versions of ISAID and usability survey. Participants: Pilot evaluation: two physicians. Methods evaluation: twelve physicians, each of whom used three different versions of the system for a total of 36 indexing sessions. Measurements: Total index term tuples generated per document per minute (TPM), with and without adjustment for concordance with other subjects; inter-indexer consistency; ratings of the usability of the ISAID indexing system. Results: Compared with manual methods, ISAID decreased indexing times greatly. Using three versions of ISAID, inter-indexer consistency ranged from 15% to 65% with a mean of 41%, 31%, and 40% for each of three documents. Subjects using the full version of ISAID were faster (average TPM: 5.6) and had higher rates of concordant index generation. There were substantial learning effects, despite our use of a training/run-in phase. Subjects using the full version of ISAID were much faster by the third indexing session (average TPM: 9.1). There was a statistically significant increase in three-subject concordant indexing rate using the full version of ISAID during the second indexing session (p < 0.05). Summary: Users of the ISAID indexing system create complex, precise, and accurate indexing for full-text documents much faster than users of manual methods. Furthermore, the natural language processing methods that ISAID uses to suggest indexes contributes substantially to increased indexing speed and accuracy.