Finding themes in Medline documents - probabilistic similarity search

Abstract
Large on-line document databases, such as Medline, pose the major challenge of retrieving the few documents most relevant to the user's needs while minimizing the return rate of nonrelevant documents. Retrieval of documents similar to a user-provided example document is a promising query paradigm for meeting this goal. We present a new theme-based probabilistic approach for finding documents relevant to a given query document and for summarizing their contents. Preliminary experiments conducted over a subset of Medline documents related to AIDS demonstrate the effectiveness of our approach.
