Natural history of the ERVWE1 endogenous retroviral locus
© Bonnaud et al; licensee BioMed Central Ltd. 2005
Received: 21 July 2005
Accepted: 22 September 2005
Published: 22 September 2005
The human HERV-W multicopy family includes a unique proviral locus, termed ERVWE1, whose full-length envelope ORF was preserved through evolution by the action of a selective pressure. The encoded Env protein (Syncytin) is involved in hominoid placental physiology.
In order to infer the natural history of this domestication process, a comparative genomic analysis of the human 7q21.2 syntenic regions in eutherians was performed. In primates, this region was progressively colonized by LTR-elements, leading to two different evolutionary pathways in Cercopithecidae and Hominidae, a genetic drift versus a domestication, respectively.
The preservation in Hominoids of a genomic structure consisting of the juxtaposition of a retrotransposon-derived MaLR LTR and the ERVWE1 provirus suggests a functional link between both elements.
The infectious retrovirus founding the contemporary HERV-W family entered the genome of a Catarrhine ancestor 25–40 million years ago [2, 3]. The spread of the HERV-W family into the genome essentially results from autonomous and non-autonomous events of intracellular retrotransposition of transcriptionally active copies [4, 5]. The HERV-W family contains a unique locus, termed ERVWE1, which encodes an envelope glycoprotein expressed in the placenta [3, 6]. This envelope, also dubbed Syncytin, exhibits fusogenic properties in vitro and is directly involved in trophoblast differentiation [6–8]. The functional conservation of the ERVWE1 locus among Hominoids and the identification of selective constraints on the env gene strongly suggest that this retroviral locus has been recruited to play a role in placental physiology. In order to decipher the natural history of the ERVWE1 locus, we performed a comparative genomic analysis of the eutherian chromosomal regions syntenic to a portion of human chromosome 7q21.2 containing the (H)ERVWE1 locus. We observe that the transposable element content of this region varies between species, notably with a progressive enrichment of LTR-elements in the Platyrrhine and Catarrhine lineages. Starting from an ancestral mosaic of LTR-elements, this retroviral cluster followed two opposed evolutionary pathways, genetic drift versus domestication, in the Cercopithecidae and Hominidae lineages, respectively.
Results and Discussion
The length of the PEX1-ODAG intergenic region varies among species (17.8 ± 7.9 kb), ranging from 2.6 kb in rat to 30.9 kb in human (Figure 1a). The length variation of the intergenic region is generally due to the presence of various transposable elements (TEs) (Figure 1b). The particularly short intergenic regions of rodents may result from the general deletion mechanisms previously proposed to account for the small genome size of rodents. The region described here suggests that the rodent deletion process shows no bias towards TEs (Figure 1b). In comparison, the length of the PEX1 and ODAG intronic regions is homogeneous (PEX1: 38.5 ± 13.4 kb; ODAG: 8.1 ± 2.5 kb), the variability relying mostly upon one species for each gene (Figure 1b). For example, the largest intronic region of the PEX1 orthologous gene is observed in Bos taurus and corresponds to the presence of about 40 kb of TEs, as compared to 10–20 kb in other species (Figure 1b).
TE content differs quantitatively and qualitatively between lineages and between intergenic and intronic regions (Figure 1b). In introns, SINEs, followed by LINEs, represent the majority of TEs in all species. The singularly large LINE content of Bos taurus PEX1 introns is compatible with the huge amount of specific LINE elements in the genome of this species. The absence of such specific LINE elements in Bos taurus ODAG introns may be due to the shorter length of this gene. Within the intergenic regions, LINEs, followed by SINEs, predominate in Carnivores, Artiodactyls and Rodents. In primates, the intergenic regions consist largely of LTR elements and Alus. The LTR-elements are clustered in a 20 kb region just downstream from the PEX1 gene and the Alu elements are spread within the 10 kb region upstream from the ODAG gene. This local LTR concentration in primates is particularly high as compared to previous comparative analyses over several megabases. The 30 kb human PEX1-ODAG intergenic region contains 11%, 2% and 64% of Alus, LINE-1s and LTR-elements, respectively.
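The percentages quoted above are coverage fractions: the number of bases annotated as a given TE class divided by the length of the region. A minimal Python sketch of that arithmetic follows. The function names and the toy coordinates are hypothetical and are not data from this study; overlapping annotations of the same class are merged so that no base is counted twice.

```python
from collections import defaultdict

def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]

def te_content(annotations, region_length):
    """Return the percentage of the region covered by each TE class.

    annotations: iterable of (te_class, start, end), 1-based inclusive.
    """
    by_class = defaultdict(list)
    for te_class, start, end in annotations:
        by_class[te_class].append((start, end))
    return {
        te_class: 100.0 * sum(e - s + 1 for s, e in merge_intervals(ivs)) / region_length
        for te_class, ivs in by_class.items()
    }

# Hypothetical annotations on a 1,000 bp toy region (invented coordinates).
toy = [("LTR", 1, 400), ("LTR", 350, 640), ("SINE/Alu", 700, 809), ("LINE/L1", 900, 919)]
content = te_content(toy, 1000)
print(content)
```

In this toy case the two overlapping LTR annotations merge into a single 640 bp block, so the LTR class covers 64% of the region rather than the 69% that a naive sum of lengths would give.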
Second, a 633 bp ERV-P element was acquired by the common ancestor of the Platyrrhines and Catarrhines more than 40 million years ago. As for the MaLR-e1 element, the absence of an obvious duplication of the integration site obscures the origin of the contemporary isolated ERV-P LTRs. In any case, the putative primary recombination between paired LTRs may have occurred rapidly after integration, as no ERV-P internal sequence can be detected in any of the studied species. The LTR sequence is complete with reference to the consensus sequence, although the first ten nucleotides at the 5' end have largely diverged.
Third, ERV-H and ERV-W proviruses integrated in the germ line of a Catarrhine ancestor, within the ERV-P and MaLR-e1 LTRs, respectively. Note that an ERV-H sequence is identified in the Platyrrhines (ERV-H(p)), distinct from the Catarrhine ERV-H provirus (ERV-H(c)) described above, as it is located about 2 kb upstream from the ERV-P LTR. The ERV-W element corresponds to the ERVWE1 provirus as it contains the locus-specific signature (a 12 bp deletion in the 3' end of the env gene) previously identified by comparing (H)ERVWE1 and paralogous HERV-W copies. The presence in several species of degenerate direct repeats at both ends of the ERV-H(c) [A(C/T)(G/A)AC] and ERVWE1 [CA(A/G)(C/T)] proviruses attests that retrovirus-like integration events occurred. Whether these proviral insertions derived from re-infection or from cis- or trans-retrotransposition processes remains unknown. Nevertheless, the duplication of the integration site indicates the existence at that time of functional H- and W-specific reverse transcriptases. The accumulation of independent substitutions in the 5' and 3' paired LTRs, which were identical when the provirus integrated, is informative about the chronology of these events. Thus, the comparison of paired-LTR distances between the ERV-H(c) and the ERVWE1 proviruses (0.84 and 0.65, respectively) suggests that ERV-H(c) integrated earlier than ERVWE1.
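The dating argument above rests on the divergence between the 5' and 3' LTRs of one provirus: both copies are identical at integration, so a larger pairwise distance implies an older insertion. The text does not state which distance measure or units underlie the 0.84 and 0.65 values, so the following Python sketch only illustrates the principle with two common measures, the uncorrected p-distance and the Jukes-Cantor correction. The function names and the toy sequences are hypothetical.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

def jukes_cantor(p):
    """Jukes-Cantor corrected distance; defined only for p < 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Toy aligned 5' and 3' LTR fragments (invented, 20 sites, 2 differences).
ltr5 = "ACGTACGTACGTACGTACGT"
ltr3 = "ACGTACGAACGTACGTACCT"
p = p_distance(ltr5, ltr3)
d = jukes_cantor(p)
print(f"p-distance = {p:.3f}, Jukes-Cantor distance = {d:.3f}")
```

Applied to two proviruses evolving at a comparable neutral rate, the one with the larger paired-LTR distance would be inferred to have integrated first, which is the reasoning used for ERV-H(c) and ERVWE1.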
The genomic structure of the Catarrhine ancestor then followed two divergent evolutionary pathways in Cercopitheques and Hominoids (Figure 2). A fragment of about 9 kb was deleted in the Cercopitheque lineage, consisting of a 3.8 kb pol-env-LTR ERV-H(c) sequence, a 4.3 kb LTR-gag-pol ERVWE1 sequence and the 0.9 kb inter-proviral region. This large deletion produced a hybrid ERV-(H/W) defective proviral structure. Surprisingly, as both the ERV-H(c) 5' and ERVWE1 3' flanking sequences were also deleted, the Cercopitheque lineage is devoid of the MaLR-e1 and ERV-P LTR elements. This global inactivation of all four LTR elements was followed by the genetic drift of the env gene, as revealed by the presence of different inactivating substitutions in the baboon and macaque ERVWE1 remnants, a stop codon in position 181 and a frameshift in position 498, respectively. In Hominoids, the overall 30 kb structure was preserved, as confirmed by overlapping LD-PCR amplification of gorilla, orangutan and gibbon genomic DNA (data not shown). In Hominoids, the ERV-H(c) element contains a locus-specific signature that consists of a unique pol-env junction. An accurate dating of this deletion event would require an extended panel of species, as the region of interest is absent from the Macaca mulatta and Papio anubis genomes. The presence of the env 12 bp deletion (crucial for the Env fusogenic activity) in Hominoid and Cercopitheque ERVWE1 proviruses suggests that this deletion occurred originally in an early Catarrhine ancestor, possibly soon after integration, early in the history of the ERV-W family. Furthermore, the ERVWE1 env signature was found to be unique in the human and chimpanzee genomes, which shows an absence of retrotransposition of this element. This suggests an absence of expression of the ERVWE1 locus in the Hominoid germ line, as opposed to many other HERV-W loci that were shown to have retrotransposed using mainly LINE-RT.
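The two kinds of inactivating changes mentioned above, a premature stop codon and a frameshift, can be screened for in an aligned env coding sequence. The Python sketch below is a simplified illustration under stated assumptions: positions are counted in codons (the text does not specify whether positions 181 and 498 refer to codons or nucleotides), the toy sequences are invented, and the function names are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_in_frame_stop(cds):
    """Return the 1-based codon position of the first premature stop codon, or None.

    The terminal codon is not examined, since an intact ORF ends with a stop.
    """
    cds = cds.upper()
    n_codons = len(cds) // 3
    for i in range(n_codons - 1):
        if cds[3 * i:3 * i + 3] in STOP_CODONS:
            return i + 1
    return None

def keeps_reading_frame(aligned_ref, aligned_query):
    """True if every gap run in either aligned sequence has a length divisible by 3."""
    runs = re.findall(r"-+", aligned_ref) + re.findall(r"-+", aligned_query)
    return all(len(run) % 3 == 0 for run in runs)

# Invented toy coding sequences, not ERVWE1 data.
intact = "ATGGCTTGGAAATAA"   # ATG GCT TGG AAA TAA
mutated = "ATGGCTTGAAAATAA"  # TGG -> TGA at codon 3
print(first_in_frame_stop(intact), first_in_frame_stop(mutated))
print(keeps_reading_frame("ATGGCTTGG", "ATG-CTTGG"))
```

A 12 bp deletion such as the ERVWE1 env signature is a multiple of three and therefore preserves the reading frame, whereas a single-base indel of the kind tested in the last line does not.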
ERVWE1 was shown to be a bona fide gene involved in hominoid placental physiology. The concomitant conservation in Hominoids of the surrounding LTR elements suggests that they were either required for ERVWE1 activity or hitchhiked during the purifying ERVWE1 selection process. The substitution profile along the whole region does not rule out either hypothesis. Nevertheless, it reveals the strict identity of the MaLR-e1 portion located upstream from ERVWE1 in human, chimpanzee and gorilla, as opposed to a MaLR-e1 3' part that differs in each species. The expression of ERVWE1 env was shown to be regulated by a bipartite element composed of (i) a cyclic AMP (cAMP)-inducible retroviral promoter, the ERVWE1 5' LTR, and (ii) a 436 bp upstream regulatory element (URE), encompassing the MaLR-e1 5' part, that contains the trophoblast specific enhancer (TSE) cited above, conferring a high level of expression and placental tropism. Although efficient, the cooperation between the URE and the LTR seemed complex due to an interference phenomenon, probably resulting from the presence of AP-2 and Sp-1 binding sites on the TSE and the cAMP-responsive elements of the LTR. Interestingly, the gibbon transcriptional regulatory elements show a biased behavior in vitro as compared to the human, chimpanzee, gorilla and orangutan orthologous elements, i.e. the ERVWE1 5' LTR exhibits a higher placental promoter activity and the URE is deficient in enhancer activity. This feature of the gibbon URE seems associated with two specific mutations in the AP-2 and Sp-1 binding sites, an enhancer activity equivalent to the human one being restored after the modification of the two corresponding residues. Although we cannot exclude the possibility that these observations are partially due to the specific context of a human trophoblastic cell line, this functional analysis supports the very recent recruitment of the ancient MaLR-e1 5' half as proposed in this work.
Thus, an LTR of a MaLR retrotransposon and an LTR of a (H)ERV-W proviral locus were co-opted to regulate syncytin expression in the placenta. Interestingly, the newly identified murine syncytin-B env gene, which triggers cell-cell fusion in vitro and is expressed specifically in placenta in vivo, displays an upstream MaLR LTR. Whether this represents an additional element to the puzzling convergent physiological role of primate and rodent syncytins remains to be determined.
We observe in the region syntenic to a portion of human chromosome 7q21.2 containing the (H)ERVWE1 locus a progressive enrichment of LTR-elements in the Platyrrhine and Catarrhine lineages. Based on an ancestral mosaic of LTR-elements, two opposed evolutionary pathways are followed, a genetic drift versus a domestication, in Cercopithecidae and Hominidae lineages, respectively. The domestication process includes the ERVWE1 locus in Hominoid species, and putatively a retrotransposon-derived MaLR LTR strictly conserved in the Homo/Pan/Gorilla subgroup. We propose that both elements were recruited to achieve the regulation of syncytin expression in placenta.
Sequences syntenic to the PEX1-ODAG intergenic regions are extracted from the high throughput genomic sequences (HTGS) division of GenBank using BLAST. The query sequence is composed of the exons of the PEX1 and ODAG genes, as described in the Ensembl repository http://www.ensembl.org as Vega transcripts OTTHUMT00000060247 and OTTHUMG00000023913, respectively. We obtain the following GenBank accession numbers: [GenBank:AC092510.2]: Papio anubis, [GenBank:AC148267.2] and [GenBank:AC148269.3]: Callithrix jacchus, [GenBank:AC148127.3] and [GenBank:AC149006.1]: Otolemur garnettii, [GenBank:AC147739.3]: Dasypus novemcinctus, [GenBank:AC148524.3]: Rhinolophus ferrumequinum, [GenBank:AC145009.2] and [GenBank:AC108896.2]: Bos taurus, [GenBank:AC105371.2]: Sus scrofa, [GenBank:AC147729.2]: Oryctolagus cuniculus, [GenBank:AC148352.2]: Sorex araneus, [GenBank:AC097829.7], [GenBank:AC079989.2], [GenBank:AC127809.3] and [GenBank:AC079998.2]: Rattus norvegicus, [GenBank:AC092872.2]: Pan troglodytes, [GenBank:AC114335.3]: Canis familiaris, [GenBank:AC148249.3]: Otolemur garnettii, [GenBank:AC148380.2] and [GenBank:AC148379.2]: Taeniopygia guttata, [GenBank:AC148423.3] and [GenBank:AC148421.2]: Meleagris gallopavo, [GenBank:AC138736.2]: Gallus gallus.
We use RepeatMasker (Smit, AFA, Hubley, R & Green, P. RepeatMasker Open-3.0. 1996–2004 http://www.repeatmasker.org) to identify transposable elements in all the studied species. Sequence alignments were computed with ClustalW and refined manually using Seaview.
We have sequenced the genomic PEX1-ODAG region of Ateles fusciceps robustus and Macaca mulatta. The sequences are available in genomic databases under the following accession numbers: [GenBank:AY925147] for Ateles fusciceps robustus and [GenBank:AY925148] for Macaca mulatta.
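For readers who wish to tabulate RepeatMasker annotations, the following Python sketch reads the whitespace-delimited rows of a RepeatMasker `.out` table. It is an illustration, not the procedure used in this work. It assumes the usual column order of that table (score, three divergence columns, query name, query begin, query end, bases left, strand, repeat name, repeat class/family, followed by repeat coordinates and an ID). The function name and the two example rows are hypothetical.

```python
def parse_repeatmasker_out(lines):
    """Extract query coordinates and repeat class from RepeatMasker .out rows.

    Header and blank lines are skipped: data rows start with an integer score.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        records.append({
            "query": fields[4],
            "start": int(fields[5]),
            "end": int(fields[6]),
            # RepeatMasker writes "C" for matches on the complementary strand.
            "strand": "-" if fields[8] == "C" else "+",
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return records

# Invented example table (hypothetical repeat names and coordinates).
toy_out = """\
   SW  perc perc perc  query       position in query      matching  repeat        position in repeat
score  div. del. ins.  sequence    begin   end  (left)    repeat    class/family  begin  end (left)  ID

 1306  15.6  6.2  0.0  toy_region      1   400   (600)  +  ToyLTR    LTR/ERV1          1  420    (0)   1
 2100  10.1  0.5  0.3  toy_region    700   809   (191)  C  ToyAlu    SINE/Alu       (10)  300    191   2
"""
records = parse_repeatmasker_out(toy_out.splitlines())
for rec in records:
    print(rec["class_family"], rec["start"], rec["end"], rec["strand"])
```

The resulting records give, for each annotated element, the interval on the query sequence and the class/family label, which is the information needed to compare TE content between intergenic and intronic regions.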
List of Abbreviations
HERV: human endogenous retrovirus
ORF: open reading frame
LTR: long terminal repeat
MaLR: mammalian apparent LTR-retrotransposon
SINE: short interspersed element
LINE: long interspersed element
LD-PCR: long distance PCR
BB is supported by a doctoral fellowship from bioMérieux and Centre National de la Recherche Scientifique and a grant from "La fondation pour la recherche médicale (FRM)". The work was partially supported by INTAS 01-0759. We thank G. Hunsmann for Ateles DNA samples.
- Blond JL, Beseme F, Duret L, Bouton O, Bedin F, Perron H, Mandrand B, Mallet F: Molecular characterization and placental expression of HERV-W, a new human endogenous retrovirus family. J Virol. 1999, 73: 1175-1185.
- Kim HS, Takenaka O, Crow TJ: Isolation and phylogeny of endogenous retrovirus sequences belonging to the HERV-W family in primates. J Gen Virol. 1999, 80: 2613-2619.
- Voisset C, Bouton O, Bedin F, Duret L, Mandrand B, Mallet F, Paranhos-Baccala G: Chromosomal distribution and coding capacity of the human endogenous retrovirus HERV-W family. AIDS Res Hum Retroviruses. 2000, 16: 731-740. 10.1089/088922200308738.
- Costas J: Characterization of the intragenomic spread of the human endogenous retrovirus family HERV-W. Mol Biol Evol. 2002, 19: 526-533.
- Pavlicek A, Paces J, Elleder D, Hejnar J: Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution. Genome Res. 2002, 12: 391-399. 10.1101/gr.216902. Article published online before print in February 2002.
- Blond JL, Lavillette D, Cheynet V, Bouton O, Oriol G, Chapel-Fernandes S, Mandrand B, Mallet F, Cosset FL: An envelope glycoprotein of the human endogenous retrovirus HERV-W is expressed in the human placenta and fuses cells expressing the type D mammalian retrovirus receptor. J Virol. 2000, 74: 3321-3329. 10.1128/JVI.74.7.3321-3329.2000.
- Mi S, Lee X, Li X, Veldman GM, Finnerty H, Racie L, LaVallie E, Tang XY, Edouard P, Howes S, et al: Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature. 2000, 403: 785-789. 10.1038/35001608.
- Frendo JL, Olivier D, Cheynet V, Blond JL, Bouton O, Vidaud M, Rabreau M, Evain-Brion D, Mallet F: Direct involvement of HERV-W Env glycoprotein in human trophoblast cell fusion and differentiation. Mol Cell Biol. 2003, 23: 3566-3574. 10.1128/MCB.23.10.3566-3574.2003.
- Mallet F, Bouton O, Prudhomme S, Cheynet V, Oriol G, Bonnaud B, Lucotte G, Duret L, Mandrand B: The endogenous retroviral locus ERVWE1 is a bona fide gene involved in hominoid placental physiology. Proc Natl Acad Sci U S A. 2004, 101: 1731-1736. 10.1073/pnas.0305763101.
- Bonnaud B, Bouton O, Oriol G, Cheynet V, Duret L, Mallet F: Evidence of Selection on the Domesticated ERVWE1 env Retroviral Element Involved in Placentation. Mol Biol Evol. 2004, 21: 1895-1901. 10.1093/molbev/msh206.
- Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, et al: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420: 520-562. 10.1038/nature01262.
- Thomas JW, Touchman JW, Blakesley RW, Bouffard GG, Beckstrom-Sternberg SM, Margulies EH, Blanchette M, Siepel AC, Thomas PJ, McDowell JC, et al: Comparative analyses of multi-species sequences from targeted genomic regions. Nature. 2003, 424: 788-793. 10.1038/nature01858.
- Wei W, Gilbert N, Ooi SL, Lawler JF, Ostertag EM, Kazazian HH, Boeke JD, Moran JV: Human L1 retrotransposition: cis preference versus trans complementation. Mol Cell Biol. 2001, 21: 1429-1439. 10.1128/MCB.21.4.1429-1439.2001.
- Esnault C, Maestre J, Heidmann T: Human LINE retrotransposons generate processed pseudogenes. Nat Genet. 2000, 24 (4): 363-367. 10.1038/74184.
- Smit AF: Identification of a new, abundant superfamily of mammalian LTR-transposons. Nucleic Acids Res. 1993, 21: 1863-1872.
- Prudhomme S, Oriol G, Mallet F: A retroviral promoter and a cellular enhancer define a bipartite element which controls env ERVWE1 placental expression. J Virol. 2004, 78: 12157-12168. 10.1128/JVI.78.22.12157-12168.2004.
- Goodman M, Porter CA, Czelusniak J, Page SL, Schneider H, Shoshani J, Gunnell G, Groves CP: Toward a phylogenetic classification of Primates based on DNA evidence complemented by fossil evidence. Mol Phylogenet Evol. 1998, 9: 585-598. 10.1006/mpev.1998.0495.
- Jurka J: Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000, 16: 418-420. 10.1016/S0168-9525(00)02093-X.
- Dupressoir A, Marceau G, Vernochet C, Benit L, Kanellopoulos C, Sapin V, Heidmann T: Syncytin-A and syncytin-B, two fusogenic placenta-specific murine envelope genes of retroviral origin conserved in Muridae. Proc Natl Acad Sci U S A. 2005, 102: 725-730. 10.1073/pnas.0406509102.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410. 10.1006/jmbi.1990.9999.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680.
- Galtier N, Gouy M, Gautier C: SEA VIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput Appl Biosci. 1996, 12: 543-548.
- Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, et al: Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science. 2001, 294: 2348-2351. 10.1126/science.1067179.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.