GibbsCluster: unsupervised clustering and alignment of peptide sequences
Open Access
- 12 April 2017
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 45 (W1) , W458-W463
- https://doi.org/10.1093/nar/gkx248
Abstract
Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version that incorporates insertions and deletions to account for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0.
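To make the idea of clustering peptides by motif concrete, the following is a minimal toy sketch in Python. It is not the GibbsCluster algorithm or its code: it assumes equal-length peptides over the 20 standard amino acids, a uniform background, and a fixed number of clusters, and it omits alignment, insertions and deletions, the trash cluster, and the cluster penalties described in the abstract. All function names and parameters (`toy_gibbs_cluster`, `information_content`, `sweeps`, `temperature`) are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequencies


def position_frequencies(peptides, length, pseudocount=1.0):
    """Per-position amino acid frequencies, smoothed with a flat pseudocount."""
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        freqs.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return freqs


def log_odds_score(peptide, freqs):
    """Sum of per-position log-odds of the peptide against the background."""
    return sum(math.log(freqs[pos][aa] / BACKGROUND) for pos, aa in enumerate(peptide))


def toy_gibbs_cluster(peptides, k, sweeps=50, temperature=1.0, seed=0):
    """Assign each peptide to one of k clusters by Gibbs-style resampling."""
    rng = random.Random(seed)
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("this toy sketch requires equal-length peptides")
    labels = [rng.randrange(k) for _ in peptides]  # random initial assignment
    for _ in range(sweeps):
        for i, peptide in enumerate(peptides):
            scores = []
            for c in range(k):
                # Build the cluster profile without the peptide being resampled.
                members = [p for j, p in enumerate(peptides) if labels[j] == c and j != i]
                freqs = position_frequencies(members, length)
                scores.append(log_odds_score(peptide, freqs) / temperature)
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]  # shift for numerical stability
            labels[i] = rng.choices(range(k), weights=weights)[0]
    return labels


def information_content(peptides, length):
    """Shannon information in bits per position (no pseudocounts)."""
    bits = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        n = sum(counts.values())
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(math.log2(len(AMINO_ACIDS)) - entropy)
    return bits
```

A call such as `toy_gibbs_cluster(peptides, k=2)` returns one cluster label per input peptide, and `information_content` can then be applied to the members of each cluster to see which positions are conserved. Because the assignment is sampled, results depend on `seed`, and a real analysis would need the alignment and outlier handling that this sketch leaves out.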