Automated annotation of Drosophila gene expression patterns using a controlled vocabulary
Open Access
- 16 July 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (17) , 1881-1888
- https://doi.org/10.1093/bioinformatics/btn347
Abstract
Motivation: Regulation of gene expression in space and time directs its localization to a specific subset of cells during development. Systematic determination of the spatiotemporal dynamics of gene expression plays an important role in understanding the regulatory networks driving development. An atlas for the gene expression patterns of fruit fly Drosophila melanogaster has been created by whole-mount in situ hybridization, and it documents the dynamic changes of gene expression pattern during Drosophila embryogenesis. The spatial and temporal patterns of gene expression are integrated by anatomical terms from a controlled vocabulary linking together intermediate tissues developed from one another. Currently, the terms are assigned to patterns manually. However, the number of patterns generated by high-throughput in situ hybridization is rapidly increasing. It is, therefore, tempting to approach this problem by employing computational methods.
Results: In this article, we present a novel computational framework for annotating gene expression patterns using a controlled vocabulary. In the currently available high-throughput data, annotation terms are assigned to groups of patterns rather than to individual images. We propose to extract invariant features from images, and construct pyramid match kernels to measure the similarity between sets of patterns. To exploit the complementary information conveyed by different features and incorporate the correlation among patterns sharing common structures, we propose efficient convex formulations to integrate the kernels derived from various features. The proposed framework is evaluated by comparing its annotation with that of human curators, and promising performance in terms of F1 score has been reported.
Contact: jieping.ye@asu.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
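The abstract names pyramid match kernels as the similarity measure between sets of patterns but does not give details. The following is a minimal sketch of the general pyramid match idea, not the authors' implementation: feature vectors are binned into grids of increasing cell size, and matches found at finer levels count more than matches that only appear at coarser levels. The function names, the doubling cell width, and the 1/2^i level weights are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def histogram(points, side):
    """Count points per grid cell when every axis is cut into bins of width `side`."""
    return Counter(tuple(int(c // side) for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: number of points that can be paired within the same cell."""
    return sum(min(n, h2[cell]) for cell, n in h1.items())


def pyramid_match(xs, ys, levels):
    """Unnormalized pyramid match score between two sets of non-negative feature vectors.

    Level i uses cells of width 2**i (assumed). Only matches that are new at
    level i are counted, and they are weighted by 1 / 2**i (assumed).
    """
    score, previous = 0.0, 0
    for i in range(levels + 1):
        side = 2 ** i
        matched = intersection(histogram(xs, side), histogram(ys, side))
        score += (matched - previous) / side
        previous = matched
    return score


def normalized_pyramid_match(xs, ys, levels):
    """Scale the score by the self-similarities so sets of different size are comparable."""
    self_x = pyramid_match(xs, xs, levels)
    self_y = pyramid_match(ys, ys, levels)
    return pyramid_match(xs, ys, levels) / sqrt(self_x * self_y)


if __name__ == "__main__":
    # Two hypothetical sets of 2-D local features, one set per group of images.
    group_a = [(0.5, 0.5), (3.2, 1.1)]
    group_b = [(0.7, 0.2), (6.0, 6.0)]
    print(normalized_pyramid_match(group_a, group_b, levels=3))
```

Because the score is defined on unordered sets of any size, it fits the setting described above, where annotation terms are attached to groups of images rather than to single images. A kernel matrix built this way for each feature type could then be passed to any kernel-based learner; how the kernels are combined in the article is not specified in the abstract beyond "convex formulations".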