A Complementary Bioinformatics Approach to Identify Potential Plant Cell Wall Glycosyltransferase-Encoding Genes
Open Access
- 1 September 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Plant Physiology
- Vol. 136 (1) , 2609-2620
- https://doi.org/10.1104/pp.104.042978
Abstract
Plant cell wall (CW) synthesizing enzymes can be divided into the glycan (i.e. cellulose and callose) synthases, which are multimembrane spanning proteins located at the plasma membrane, and the glycosyltransferases (GTs), which are Golgi localized single membrane spanning proteins, believed to participate in the synthesis of hemicellulose, pectin, mannans, and various glycoproteins. At the Carbohydrate-Active enZYmes (CAZy) database where e.g. glucoside hydrolases and GTs are classified into gene families primarily based on amino acid sequence similarities, 415 Arabidopsis GTs have been classified. Although much is known with regard to composition and fine structures of the plant CW, only a handful of CW biosynthetic GT genes—all classified in the CAZy system—have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics approach was adopted. First, the entire Arabidopsis proteome was run through the Transmembrane Hidden Markov Model 2.0 server and proteins containing one or, more rarely, two transmembrane domains within the N-terminal 150 amino acids were collected. Second, these sequences were submitted to the SUPERFAMILY prediction server, and sequences that were predicted to belong to the superfamilies NDP-sugartransferase, UDP-glycosyltransferase/glucogen-phosphorylase, carbohydrate-binding domain, Gal-binding domain, or Rossman fold were collected, yielding a total of 191 sequences. Fifty-two accessions already classified in CAZy were discarded. The resulting 139 sequences were then analyzed using the Three-Dimensional-Position-Specific Scoring Matrix and mGenTHREADER servers, and 27 sequences with similarity to either the GT-A or the GT-B fold were obtained. Proof of concept of the present approach has to some extent been provided by our recent demonstration that two members of this pool of 27 non-CAZy-classified putative GTs are xylosyltransferases involved in synthesis of pectin rhamnogalacturonan II (J. Egelund, B.L. Petersen, A. Faik, M.S. Motawia, C.E. Olsen, T. Ishii, H. Clausen, P. Ulvskov, and N. Geshi, unpublished data).Keywords
This publication has 55 references indexed in Scilit:
- SCOP: A structural classification of proteins database for the investigation of sequences and structuresPublished by Elsevier ,2006
- The Pfam protein families databaseNucleic Acids Research, 2004
- The MUR3 Gene of Arabidopsis Encodes a Xyloglucan Galactosyltransferase That Is Evolutionarily Related to Animal ExostosinsPlant Cell, 2003
- Three-dimensional structures of the Mn and Mg dTDP complexes of the family GT-2 glycosyltransferase SpsA: a comparison with related NDP-sugar glycosyltransferasesJournal of Molecular Biology, 2001
- Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structureJournal of Molecular Biology, 2001
- Predicting transmembrane protein topology with a hidden markov model: application to complete genomes11Edited by F. CohenJournal of Molecular Biology, 2001
- Enhanced genome annotation using structural profiles in the program 3D-PSSM 1 1Edited by J. ThorntonJournal of Molecular Biology, 2000
- The Protein Data BankNucleic Acids Research, 2000
- GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequencesJournal of Molecular Biology, 1999
- Biosynthesis of Pectins and GalactomannansPublished by Elsevier ,1999