Systematic genome-wide annotation of spliceosomal proteins reveals differential gene family expansion
Open Access
- 12 December 2005
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 16 (1) , 66-77
- https://doi.org/10.1101/gr.3936206
Abstract
Although more than 200 human spliceosomal and splicing-associated proteins are known, the evolution of the splicing machinery has not been studied extensively. The recent near-complete sequencing and annotation of distant vertebrate and chordate genomes provides the opportunity for an exhaustive comparative analysis of splicing factors across eukaryotes. We describe here our semiautomated computational pipeline to identify and annotate splicing factors in representative species of eukaryotes. We focused on protein families whose role in splicing is confirmed by experimental evidence. We visually inspected 1894 proteins and manually curated 224 of them. Our analysis shows a general conservation of the core spliceosomal proteins across the eukaryotic lineage, contrasting with selective expansions of protein families known to play a role in the regulation of splicing, most notably of SR proteins in metazoans and of heterogeneous nuclear ribonucleoproteins (hnRNP) in vertebrates. We also observed vertebrate-specific expansion of the CLK and SRPK kinases (which phosphorylate SR proteins), and the CUG-BP/CELF family of splicing regulators. Furthermore, we report several intronless genes amongst splicing proteins in mammals, suggesting that retrotransposition contributed to the complexity of the mammalian splicing apparatus.Keywords
This publication has 74 references indexed in Scilit:
- How did alternative splicing evolve?Nature Reviews Genetics, 2004
- A plethora of plant serine/arginine-rich proteins: redundancy or evolution of novel gene functions?Biochemical Society Transactions, 2004
- Splicing double: insights from the second spliceosomeNature Reviews Molecular Cell Biology, 2003
- The Draft Genome of Ciona intestinalis : Insights into Chordate and Vertebrate OriginsScience, 2002
- The genome sequence of Schizosaccharomyces pombeNature, 2002
- T-coffee: a novel method for fast and accurate multiple sequence alignment 1 1Edited by J. ThorntonJournal of Molecular Biology, 2000
- CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choiceNucleic Acids Research, 1994
- dbEST — database for “expressed sequence tags”Nature Genetics, 1993
- Basic Local Alignment Search ToolJournal of Molecular Biology, 1990
- Basic local alignment search toolJournal of Molecular Biology, 1990