Identification and measurement of neighbor-dependent nucleotide substitution processes

Abstract
Motivation: Neighbor-dependent substitution processes generated specific pattern of dinucleotide frequencies in the genomes of most organisms. The CpG-methylation–deamination process is, e.g. a prominent process in vertebrates (CpG effect). Such processes, often with unknown mechanistic origins, need to be incorporated into realistic models of nucleotide substitutions. Results: Based on a general framework of nucleotide substitutions we developed a method that is able to identify the most relevant neighbor-dependent substitution processes, estimate their relative frequencies and judge their importance in order to be included into the modeling. Starting from a model for neighbor independent nucleotide substitution we successively added neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model describing given data. The analysis of neighbor-dependent nucleotide substitutions based on repetitive elements found in the genomes of human, zebrafish and fruit fly is presented. Availability: A web server to perform the presented analysis is freely available at: http://evogen.molgen.mpg.de/server/substitution-analysis Contact:arndt@molgen.mpg.de
Keywords