Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags
Open Access
- 2 April 2007
- journal article
- research article
- Published by Springer Nature in Genome Biology
- Vol. 8 (4), R45
- https://doi.org/10.1186/gb-2007-8-4-r45
Abstract
Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages.
Results: Using the Distiller package, the ESTs were assembled into roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high-confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high-confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories.
Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies.