HIV-1 and HIV-2 LTR Nucleotide Sequences: Assessment of the Alignment by N-block Presentation, “Retroviral Signatures” of Overrepeated Oligonucleotides, and a Probable Important Role of Scrambled Stepwise Duplications/Deletions in Molecular Evolution
Open Access
- 1 July 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 18 (7) , 1231-1245
- https://doi.org/10.1093/oxfordjournals.molbev.a003909
Abstract
Previous analyses of retroviral nucleotide sequences suggest a so-called “scrambled duplicative stepwise molecular evolution” (many sectors with successive duplications/deletions of short and longer motifs) that could have stemmed from one or several starter tandemly repeated short sequence(s). In the present report, we tested this hypothesis by focusing on the long terminal repeats (LTRs) (and flanking sequences) of 24 human and 3 simian immunodeficiency viruses. By using a calculation strategy applicable to short sequences, we found consensus overrepresented motifs (often containing CTG or CAG) that were congruent with the previously defined “retroviral signature.” We also show many local repetition patterns that are significant when compared with simply shuffled sequences. First- and second-order Markov chain analyses demonstrate that a major portion of the overrepresented oligonucleotides can be predicted from the dinucleotide compositions of the sequences, but by no means can biological mechanisms be deduced from these results: some of the listed local repetitions remain significant against dinucleotide-conserving shuffled sequences; together with previous results, this suggests that interspersed and/or local mononucleotide and oligonucleotide repetitions could have biased the dinucleotide compositions of the sequences. We searched for suggestive evolutionary patterns by scrutinizing a reliable multiple alignment of the 27 sequences. A manually constructed alignment based on homology blocks was in good agreement with the polypeptide alignment in the coding sectors and has been exhaustively assessed by using a multiplied alphabet obtained by the promising mathematical strategy called the N-block presentation (taking into account the environment of each nucleotide in a sequence). Sector by sector, we hypothesize many successive duplication/deletion scenarios that fit our previous evolutionary hypotheses.
This suggests an important duplication/deletion role for the reverse transcriptase, particularly in inducing stuttering cryptic simplicity patterns.
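The abstract refers to two generic tests for oligonucleotide overrepresentation: comparison with simply shuffled sequences and prediction from dinucleotide composition with a first-order Markov chain. The sketch below illustrates both ideas in a minimal form; it is not the authors' calculation strategy. The function names, the toy sequence, and the choice of estimator (expected count of a word as the product of its dinucleotide counts divided by the counts of its inner nucleotides) are assumptions made for illustration.

```python
import random
from collections import Counter


def count_kmers(seq, k):
    """Count overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov1_expected(seq, word):
    """Approximate expected count of `word` under a first-order Markov
    model estimated from the sequence itself (assumed estimator):
    E = N(w1w2) * N(w2w3) * ... / (N(w2) * N(w3) * ...)."""
    mono = count_kmers(seq, 1)
    di = count_kmers(seq, 2)
    expected = float(di[word[:2]])
    for i in range(1, len(word) - 1):
        if mono[word[i]] == 0:
            return 0.0
        expected *= di[word[i:i + 2]] / mono[word[i]]
    return expected


def shuffle_p_value(seq, word, n_shuffles=1000, seed=0):
    """Empirical p-value of the observed count of `word` against simple
    (mononucleotide-conserving) shuffles of seq."""
    rng = random.Random(seed)
    observed = count_kmers(seq, len(word))[word]
    letters = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if count_kmers("".join(letters), len(word))[word] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    toy = "CTGCTGCTGAAACAGT"  # hypothetical toy sequence, not real LTR data
    print(count_kmers(toy, 3)["CTG"])
    print(markov1_expected(toy, "CTG"))
    print(shuffle_p_value(toy, "CTG"))
```

A word whose observed count is far above the simple-shuffle expectation but close to the Markov expectation is one that dinucleotide composition alone can account for, which is the distinction the abstract draws. A dinucleotide-conserving shuffle, as mentioned in the abstract, would need a dedicated shuffling algorithm and is not shown here.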