Genome-Wide Expression Profiling of the Arabidopsis Female Gametophyte Identifies Families of Small, Secreted Proteins

Abstract
The female gametophyte of flowering plants, the embryo sac, develops within the diploid (sporophytic) tissue of the ovule. While embryo sac–expressed genes are known to be required at multiple stages of the fertilization process, the set of embryo sac–expressed genes has remained poorly defined. In particular, the set of genes responsible for mediating intercellular communication between the embryo sac and the male gametophyte, the pollen grain, is unknown. We used high-throughput cDNA sequencing and whole-genome tiling arrays to compare gene expression in wild-type ovules to that in dif1 ovules, which entirely lack embryo sacs, and myb98 ovules, which are impaired in pollen tube attraction. We identified nearly 400 genes that are downregulated in dif1 ovules. Seventy-eight percent of these embryo sac–dependent genes were predicted to encode secreted proteins, and 60% belonged to multigenic families. Our results define a large number of candidate extracellular signaling molecules that may act during embryo sac development or fertilization; fewer than half of these are represented on the widely used ATH1 expression array. In particular, we found that 37 out of 40 genes encoding Domain of Unknown Function 784 (DUF784) domains require the synergid-specific transcription factor MYB98 for expression. Several DUF784 genes were transcribed in synergid cells of the embryo sac, implicating the DUF784 gene family in mediating late stages of embryo sac development or interactions with pollen tubes. The coexpression of highly similar proteins suggests a high degree of functional redundancy among embryo sac genes.

During the sexual reproduction of flowering plants, a pollen tube delivers sperm cells to a specialized group of cells known as the embryo sac, which contains the egg cell. It is known that embryo sacs are active participants in guiding the growth of pollen tubes, in facilitating fertilization, and in initiating seed development.
However, the genes responsible for the complex biology of embryo sacs are poorly understood. The authors use two recently developed technologies, whole-genome tiling microarrays and high-throughput cDNA sequencing, to identify hundreds of genes expressed in embryo sacs of Arabidopsis thaliana. Most embryo sac–dependent genes have no known function, and include entire families of related genes that are only expressed in embryo sacs. Furthermore, most embryo sac–dependent genes encode small proteins that are potentially secreted from their cells of origin, suggesting that they may act as intercellular signals or modify the extracellular matrix during fertilization or embryo sac development. These results illustrate the extent to which our understanding of plant sexual reproduction is limited and identify hundreds of candidate genes for future studies investigating the molecular biology of the embryo sac.