Based Upon Repeat Pattern (BURP): an algorithm to characterize the long-term evolution of Staphylococcus aureus populations based on spa polymorphisms
Open Access
- 29 October 2007
- journal article
- Published by Springer Nature in BMC Microbiology
- Vol. 7 (1) , 98
- https://doi.org/10.1186/1471-2180-7-98
Abstract
For typing of Staphylococcus aureus, DNA sequencing of the repeat region of the protein A (spa) gene is a well-established discriminatory method for outbreak investigations. Recently, it was hypothesized that this region also reflects long-term epidemiology. However, no automated and objective algorithm existed to cluster different repeat regions. In this study, the Based Upon Repeat Pattern (BURP) implementation, a heuristic variant of the newly described EDSI algorithm, was investigated to infer the clonal relatedness of different spa types. For calibration of BURP parameters, 400 representative S. aureus strains with different spa types were characterized by MLST and clustered using eBURST as the "gold standard" for their phylogeny. Typing concordance analyses between eBURST and BURP clustering (spa-CC) were performed using all possible BURP parameters to determine their optimal combination. BURP was subsequently evaluated with a strain collection reflecting the breadth of diversity of S. aureus (JCM 2002; 40:4544). In total, the 400 strains exhibited 122 different MLST types. eBURST grouped them into 23 clonal complexes (CC; 354 isolates) and 33 singletons (46 isolates). BURP clustering of spa types using all possible parameter combinations and subsequent comparison with eBURST CCs resulted in concordances ranging from 8.2 to 96.2%. However, 96.2% concordance was reached only if spa types shorter than 8 repeats were excluded, which excluded 37% of the spa types. Therefore, the optimal combination of the BURP parameters was "exclude spa types shorter than 5 repeats" and "cluster spa types into spa-CC if cost distances are less than 4", exhibiting 95.3% concordance with eBURST. This algorithm identified 24 spa-CCs and 40 singletons, and excluded only 7.8% of the spa types.
When the natural population was analyzed with these parameters, the comparison of whole-genome micro-array groupings (at the level of 0.31 Pearson correlation index) and spa-CCs gave a concordance of 87.1%; BURP spa-CCs vs. manually grouped spa types resulted in 95.7% concordance. BURP is the first automated and objective tool to infer clonal relatedness from spa repeat regions. It is able to extract an evolutionary signal rather congruent with MLST and micro-array data.
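The two BURP parameters named in the abstract (a minimum repeat count and a cost-distance threshold) can be illustrated with a minimal sketch. It is not the BURP or EDSI implementation: the cost function here is a plain unit-cost edit distance over repeat identifiers, used only as a stand-in because the abstract does not describe the EDSI cost model. The function names, the single-linkage grouping, the strict "less than" comparison (taken from the abstract's wording), and the example spa types `x1` to `x4` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Unit-cost edit distance over repeat IDs (stand-in, NOT the EDSI cost model)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete a repeat
                            curr[j - 1] + 1,          # insert a repeat
                            prev[j - 1] + (x != y)))  # substitute a repeat
        prev = curr
    return prev[-1]


def cluster_spa_types(
    spa_types: Dict[str, Sequence[int]],
    min_repeats: int = 5,      # "exclude spa types shorter than 5 repeats"
    cost_threshold: int = 4,   # "cluster ... if cost distances are less than 4"
) -> Tuple[List[List[str]], List[str], List[str]]:
    """Return (clusters, singletons, excluded) using single-linkage grouping."""
    kept = {n: r for n, r in spa_types.items() if len(r) >= min_repeats}
    excluded = sorted(set(spa_types) - set(kept))

    # Union-find over the retained spa types.
    parent = {n: n for n in kept}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = sorted(kept)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if edit_distance(kept[a], kept[b]) < cost_threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    clusters = [g for g in groups.values() if len(g) > 1]
    singletons = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, singletons, excluded


# Hypothetical repeat successions, invented for this example only.
example = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1, 2, 3, 4, 5, 7],
    "x3": [9, 9, 8, 8, 7, 7, 6, 6],
    "x4": [1, 2, 3],
}
clusters, singletons, excluded = cluster_spa_types(example)
```

With these invented inputs, `x1` and `x2` differ by one substitution and fall into one group, `x3` is too distant from both and stays a singleton, and `x4` is excluded for having fewer than 5 repeats. Raising `min_repeats` excludes more spa types, which is the trade-off the abstract reports between concordance and the fraction of excluded types.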