Image-based crystal detection: a machine-learning approach
Open Access
- 18 November 2008
- journal article
- research article
- Published by International Union of Crystallography (IUCr) in Acta Crystallographica Section D-Biological Crystallography
- Vol. 64 (12) , 1187-1195
- https://doi.org/10.1107/s090744490802982x
Abstract
The ability of computers to learn from and annotate large databases of crystallization-trial images provides not only the ability to reduce the workload of crystallization studies, but also an opportunity to annotate crystallization trials as part of a framework for improving screening methods. Here, a system is presented that scores sets of images based on the likelihood of containing crystalline material as perceived by a machine-learning algorithm. The system can be incorporated into existing crystallization-analysis pipelines, whereby specialists examine images as they normally would with the exception that the images appear in rank order according to a simple real-valued score. Promising results are shown for 319 112 images associated with 150 structures solved by the Joint Center for Structural Genomics pipeline during the 2006-2007 year. Overall, the algorithm achieves a mean receiver operating characteristic score of 0.919 and a 78% reduction in human effort per set when considering an absolute score cutoff for screening images, while incurring a loss of five out of 150 structures.
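The abstract does not spell out how the per-set metrics are computed, so the following is only a minimal sketch under stated assumptions: the receiver operating characteristic score is taken to be the usual area under the ROC curve for one image set, the "reduction in human effort" is taken to be the fraction of images whose score falls below the cutoff, and a structure is counted as lost when no crystal-bearing image in its set survives the cutoff. The function names, the toy scores and the cutoff value are all hypothetical and are not taken from the paper.

```python
def rank_order(scores):
    """Return image indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def roc_auc(scores, labels):
    """Area under the ROC curve for one image set.

    Computed as the probability that a randomly chosen crystal image
    outscores a randomly chosen non-crystal image (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative image")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def screen(scores, labels, cutoff):
    """Apply an absolute score cutoff to one image set.

    Returns (effort_reduction, crystal_retained), where effort_reduction
    is the fraction of images a specialist no longer needs to inspect
    (an assumed definition) and crystal_retained says whether at least
    one crystal image is still above the cutoff.
    """
    kept = [y for s, y in zip(scores, labels) if s >= cutoff]
    effort_reduction = 1 - len(kept) / len(scores)
    crystal_retained = any(kept)
    return effort_reduction, crystal_retained


# Toy image set: hypothetical scores, with 1 marking a crystal image.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print("viewing order:", rank_order(scores))
    print("ROC score:", roc_auc(scores, labels))
    print("screening at 0.5:", screen(scores, labels, cutoff=0.5))
```

Averaging `roc_auc` over all image sets would give a mean score comparable in kind to the one quoted above, and counting the sets for which `screen` reports no retained crystal would give the number of structures lost at a given cutoff, again assuming these definitions match the paper's.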