Characterization of soybean genomic features by analysis of its expressed sequence tags
- 18 November 2003
- journal article
- research article
- Published by Springer Nature in Theoretical and Applied Genetics
- Vol. 108 (5) , 903-913
- https://doi.org/10.1007/s00122-003-1499-2
Abstract
We analyzed 314,254 soybean expressed sequence tags (ESTs), including 29,540 from our laboratory and 284,714 from GenBank. These ESTs were assembled into 56,147 unigenes. About 76.92% of the unigenes were homologous to genes from Arabidopsis thaliana (Arabidopsis). The putative products of these unigenes were annotated according to their homology with the categorized proteins of Arabidopsis. Genes corresponding to cell growth and/or maintenance, enzymes and cell communication belonged to the slow-evolving class, whereas genes related to transcription regulation, cell, binding and death appeared to be fast-evolving. Soybean unigenes with no match to genes within the Arabidopsis genome were identified as soybean-specific genes. These genes were mainly involved in nodule development and the synthesis of seed storage proteins. We also identified 61 genes regulated by salicylic acid, 1,322 transcription factor genes and 326 disease resistance-like genes among the soybean unigenes. SSR analysis showed that the soybean genome was more complex than the Arabidopsis and Medicago truncatula genomes. GC content in soybean unigene sequences was similar to that in Arabidopsis and M. truncatula. Furthermore, the combined analysis of the EST database and the BAC-contig sequences indicated that the total gene number in the soybean genome is about 63,501.
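The abstract reports GC content and SSR (simple sequence repeat) statistics for the unigene set but does not describe how they were computed. The following is a minimal Python sketch of how such per-sequence measures could be obtained. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the 2 to 6 base motif range, and the minimum of four tandem repeats are all hypothetical choices made for this example.

```python
import re

def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted

# Assumed thresholds (not from the paper): motifs of 2-6 bases,
# repeated in tandem at least MIN_REPEATS times.
MIN_REPEATS = 4
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (MIN_REPEATS - 1))

def find_ssrs(seq: str):
    """Return a list of (start, motif, repeat_count) for tandem repeats."""
    seq = seq.upper()
    hits = []
    for match in SSR_PATTERN.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAA..., which are not
        # di- to hexanucleotide repeats.
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

Applied to every unigene, `gc_content` gives a distribution that could be compared across species, and the counts and motif lengths returned by `find_ssrs` give one simple way to summarize SSR abundance. Real analyses typically use dedicated SSR-finding tools with motif-specific repeat thresholds.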