Ovarian cancer detection by logical analysis of proteomic data

Abstract
A new type of efficient and accurate proteomic ovarian cancer diagnosis system is proposed. The system is developed by applying the combinatorics and optimization‐based methodology of logical analysis of data (LAD) to the Ovarian Dataset 8‐7‐02 (http://clinicalproteomics.steem.com), which updates the one used by Petricoin et al. in The Lancet 2002, 359, 572–577. This mass spectroscopy‐generated dataset contains expression profiles of 15 154 peptides defined by their mass/charge ratios (m/z) in the serum of 162 ovarian cancer and 91 control cases. Several fully reproducible models using only 7–9 of the 15 154 peptides were constructed, and shown in multiple cross‐validation tests (k‐folding and leave‐one‐out) to provide sensitivities and specificities of up to 100%. A special diagnostic system for stage I ovarian cancer patients is shown to have similarly high accuracy. Other results: (i) expressions of peptides with relatively low m/z values in the dataset are shown to be better at distinguishing ovarian cancer cases from controls than those with higher m/z values; (ii) two large groups of patients with a high degree of similarity among their formal (mathematical) profiles are detected; (iii) several peptides with a blocking or promoting effect on ovarian cancer are identified.