Volume 8 Supplement 2

Frontiers of Retrovirology 2011

Conservation of ancient full-length open reading frames in vertebrate endogenous retroviruses


Endogenous retroviruses (ERVs) are genetic remnants of exogenous retroviral infections that have endured selective processes and made their way into ancestral host genomes. While many of them have deteriorated beyond the ability to code for detectable proteins, some still retain activity or the potential for coding activity [1-3]. A few active ERVs have been co-opted into fully functional genes that play important roles in development or disease [1, 2, 4]. New high-throughput data have revealed that ERVs are far more common than previously thought, and several software packages have been created to search genomes for these structures. In this study, we build on one such tool to search several vertebrate genomes for ERV remnants, identify loci with long, intact open reading frames for the major retroviral genes, and date these loci of interest using the long terminal repeat (LTR) divergence method [5].

Materials and methods

The genomes of twelve vertebrates were acquired from the UCSC Genome Browser and analyzed with an in-house method based on the LTRharvest [6] software package, complemented with additional filtering and data retrieval. Candidate ERV loci were then probed for full-length open reading frames, each of which was assigned to one of the three major retroviral genes (gag, pol or env) by a BLASTX [7] search against a retroviral protein database. Each ERV locus was dated by comparing the genetic divergence between its two LTRs.
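The LTR comparison step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the two LTRs of a locus have already been aligned to equal length with `-` as the gap character, and the function names, the optional Jukes-Cantor correction, and the substitution rate passed to `integration_age` are all assumptions introduced here for illustration. The age formula T = K / (2r) is the usual form of the LTR divergence method [5]; the abstract itself reports similarity and divergence percentages rather than ages.

```python
import math


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Proportion of mismatched sites between two pre-aligned LTRs.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return mismatches / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K for an observed mismatch proportion p."""
    if p >= 0.75:
        raise ValueError("p must be below 0.75 for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def integration_age(k: float, rate: float) -> float:
    """Estimated age T = K / (2 * r).

    `rate` is substitutions per site per year and must be supplied by the
    caller; no value is taken from the abstract.
    """
    return k / (2.0 * rate)


# Example usage with a hypothetical pair of aligned LTR fragments.
p = ltr_divergence("ACGTACGTAC", "ACGTACGTAA")
similarity_percent = 100.0 * (1.0 - p)
corrected = jukes_cantor(p)
```

The factor of two reflects that both LTRs are identical at integration and accumulate substitutions independently afterwards. The similarity percentage is simply 100 × (1 − p), which is the quantity the thresholds in this abstract refer to.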


Our study reveals a few conserved full-length open reading frames in loci whose LTRs show less than 70% similarity, and many open reading frames occur across all genomes in loci with LTR similarity below 80%. Limitations of the current method may make these counts underestimates, but the results nonetheless reveal old retroviral infections that have retained some coding potential to this day. In the studied primate genomes, a select number of loci still retain long to full-length open reading frames in all three major genes, even in loci with LTR divergence under 95% (Table 1). Pol genes appear to have the highest number of conserved ORFs, and a high number of recent integrations with full conservation was found in the mouse genome.

Table 1 Number of ERV loci containing large intact open reading frames (at least 1000 uninterrupted nucleotides) for all three major retroviral genes: gag, pol and env, grouped by the locus's LTR divergence percentage
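The "at least 1000 uninterrupted nucleotides" criterion in the Table 1 caption can be sketched in Python as follows. This is an assumption-laden illustration, not the authors' implementation: it treats an open reading frame as the longest stretch without a stop codon in any of the six reading frames (no start codon required), uses the standard stop codons, and the names `longest_stop_free_run` and `has_large_orf` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_stop_free_run(seq: str) -> int:
    """Length in nucleotides of the longest stop-free stretch over six frames."""
    best = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        for frame in range(3):
            run = 0
            # Walk the strand codon by codon in this frame.
            for i in range(frame, len(strand) - 2, 3):
                if strand[i:i + 3] in STOP_CODONS:
                    run = 0  # a stop codon interrupts the current stretch
                else:
                    run += 3
                    best = max(best, run)
    return best


def has_large_orf(seq: str, min_len: int = 1000) -> bool:
    """True if the locus contains a stop-free stretch of at least min_len nt."""
    return longest_stop_free_run(seq) >= min_len


# Example usage with a synthetic sequence: ATG, 400 alanine codons, then a stop.
synthetic_locus = "ATG" + "GCT" * 400 + "TAA"
locus_passes = has_large_orf(synthetic_locus)
```

A stop-to-stop definition is used here because a start-codon requirement could miss genes that are translated as part of a longer polyprotein; whether the original analysis required a start codon is not stated. In a fuller pipeline, each stretch passing the threshold would then be classified as gag, pol or env by a protein-level similarity search, as described in the methods.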


  1. Simpson GR, et al: Endogenous D-type (HERV-K) related sequences are packaged into retroviral particles in the placenta and possess open reading frames for reverse transcriptase. Virology. 1996, 222: 451-456. 10.1006/viro.1996.0443.

  2. Mi S, et al: Syncytin, a captured retroviral envelope protein involved in human placental morphogenesis. Nature. 2000, 403: 785-789. 10.1038/35001608.

  3. Takeuchi Y, et al: Host range and interference studies of three classes of pig endogenous retrovirus. J Virol. 1998, 72: 9986-9991.

  4. Nexø BA, et al: The etiology of multiple sclerosis: genetic evidence for the involvement of the human endogenous retrovirus HERV-Fc1. PLoS One. 2011, 6 (2): e16652. 10.1371/journal.pone.0016652.

  5. SanMiguel P, et al: The paleontology of intergene retrotransposons of maize. Nat Genet. 1998, 20: 43-45. 10.1038/1695.

  6. Ellinghaus D, et al: LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinf. 2008, 9: 18. 10.1186/1471-2105-9-18.

  7. Altschul SF, et al: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cite this article

Martins, H., Villesen, P. Conservation of ancient full-length open reading frames in vertebrate endogenous retroviruses. Retrovirology 8, P47 (2011).

  • Long Terminal Repeat
  • UCSC Genome Browser
  • Code Activity
  • Vertebrate Genome
  • High Throughput Data