PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences
- 1 May 2015
- journal article
- conference paper
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 22 (5) , 377-386
- https://doi.org/10.1089/cmb.2014.0156
Abstract
We introduce PASTA, a new multiple sequence alignment algorithm. Given a guide tree, PASTA uses a new technique to produce an alignment, which makes it both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments and improves on the accuracy and scalability of the leading alignment methods (including SATe). We also show that trees estimated on PASTA alignments are highly accurate: slightly better than SATe trees, and substantially better than trees from other methods. Finally, PASTA is faster than SATe, highly parallelizable, and requires relatively little memory.