Sequence length variation, indel costs, and congruence in sensitivity analysis
- 1 February 2005
- journal article
- Published by Wiley in Cladistics
- Vol. 21 (1), 15-30
- https://doi.org/10.1111/j.1096-0031.2005.00053.x
Abstract
The behavior of two topological and four character-based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation in the data set. Dominance was easily detected, as the character-based congruence measures approached their optimal value when indel costs were incremented. Dominance of a fragment or data partition was overwhelmed when new sequence length-variable fragments or data partitions were added.
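The two indel treatments contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce any program's cost scheme: the function names, the example sequence, and the cost values are all hypothetical, and the affine convention used here (one opening cost per run plus one extension cost for each additional gap position) is an assumption, since conventions differ between tools.

```python
def gap_runs(aligned):
    """Return the lengths of contiguous '-' runs in one aligned sequence."""
    runs, current = [], 0
    for ch in aligned:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def fifth_state_cost(runs, indel_cost):
    """Gap as a fifth state: every gap position is charged independently."""
    return sum(indel_cost * n for n in runs)


def affine_cost(runs, gap_open, gap_extend):
    """Affine treatment (assumed convention): each run is a single event,
    charged gap_open once plus gap_extend per additional position."""
    return sum(gap_open + gap_extend * (n - 1) for n in runs)


# Hypothetical aligned sequence with one 3-position run and one 1-position run.
runs = gap_runs("AC---GT-A")
print(runs)
print(fifth_state_cost(runs, indel_cost=2))
print(affine_cost(runs, gap_open=2, gap_extend=1))
```

Under these assumed costs the long run is cheaper in the affine treatment than in the fifth-state treatment, which is the sense in which a string of contiguous gaps is counted as a single event. A sensitivity analysis would repeat such a calculation over a grid of cost values.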