Identification of endogenous retroviral reading frames in the human genome
© Villesen et al; licensee BioMed Central Ltd. 2004
Received: 22 September 2004
Accepted: 11 October 2004
Published: 11 October 2004
Human endogenous retroviruses (HERVs) comprise a large class of repetitive retroelements. Most HERVs are ancient and invaded our genome at least 25 million years ago, except for the evolutionarily young HERV-K group. The vast majority of the encoded genes are degenerate due to mutational decay, and only a few non-HERV-K loci are known to retain intact reading frames. Additional intact HERV genes may exist, since retroviral reading frames have not been systematically annotated on a genome-wide scale.
By clustering hits from multiple BLAST searches using known retroviral sequences, we have mapped 1.1% of the human genome as retrovirus related. The coding potential of all identified HERV regions was analyzed by annotating viral open reading frames (vORFs), and we report 7836 loci as verified by protein homology criteria. Among 59 intact or almost-intact viral polyproteins scattered around the human genome we have found 29 envelope genes, including two novel gammaretroviral types. One encodes a protein similar to that of a recently discovered zebrafish retrovirus (ZFERV), while the other shows partial, C-terminal homology to Syncytin (HERV-W/FRD).
This compilation of HERV sequences and their coding potential provides a useful tool for pursuing functional analyses such as RNA expression profiling and studies of the effects of viral proteins, which may, in turn, reveal a role for HERVs in human health and disease. All data are publicly available through a database at http://www.retrosearch.dk.
It has become evident that the human genome harbors a fairly small number of genes, and exons account for a little over 1% of our DNA. This stands in stark contrast to various types of repetitive DNA, and it has been estimated that transposable elements alone take up almost half of our genome. Among such multi-copy elements are human endogenous retroviruses (HERVs). These represent stably inherited copies of integrated retroviral genomes (so-called provirus structures) that have entered our ancestors' genome. It has been estimated that HERVs and related sequences such as solitary long terminal repeat structures (solo-LTRs) and retrotransposon-like (env-deficient) elements constitute approximately 8% of the human genome.
Phylogenetic analyses of the retroviral polymerase gene (pol) and envelope genes (env) have identified at least 26 distinct HERV groups. However, less well-defined sequence comparisons suggest that there may be well over 100 different HERV groups [4, 5]. Within the family of Retroviridae most of the seven genera are represented by endogenous members, and HERVs are divided into class I, II and III depending on sequence relatedness to gammaretroviruses, betaretroviruses or spumaviruses, respectively. Many HERVs are named according to tRNA usage (e.g. HERV-K has a primer binding site that matches a lysine tRNA), while others have been more or less provisionally named by their discoverer. It seems increasingly clear that the nomenclature for endogenous retroviruses (ERVs) needs to be revised to accommodate such wide diversity. Furthermore, it is evident that many more ERVs are yet to be discovered, as retroviral elements are present in most, if not all, vertebrates and even in some invertebrates [6, 7].
With a single exception (HERV-K), all HERV groups are ancient: they entered our genome at least 25 million years ago [6, 8, 9], prior to human speciation, presumably through infection of the germ line. Alternatively, it is possible that ERVs have evolved from pre-existing genomic elements such as LTR-retrotransposons. After colonization most HERV groups have spread within the genome either by re-infection or intracellular transposition [11, 12] and have reached copy numbers ranging from a few to several hundred. The vast majority of these provirus copies are non-functional due to the accumulation of debilitating mutations. Indeed, no replication-competent HERVs have yet been described, although fully intact members of the HERV-K group have been reported. Other mammalian species such as mouse, cat and pig harbor modern replication-competent ERVs that to a large extent may interact with related exogenous viruses [15, 16].
The presence of endogenous retroviral sequences in our genome has several possible implications: i) replication and (random) insertion of new proviral structures, ii) effect on adjacent cellular genes, iii) long range genomic effects and iv) expression of viral proteins (or RNA). Since the majority of HERVs are highly defective no de novo insertions have been observed and presumably HERV mobilization very rarely results in spontaneous genetic disorders or gene knock-outs as seen with other active retrotransposons such as L1 elements. However, existing HERV loci have been shown to alter gene expression by providing alternative transcription initiation, new splice sites or premature polyadenylation sites. Moreover, the presence of enhancers and hormone-responsive elements in the LTR structure of existing HERVs may up- or down-regulate the transcription of flanking cellular genes. It has been speculated that transcription initiation from HERVs/solo-LTRs into neighboring genes in the antisense orientation might interfere with gene expression. Alternatively, gene transcripts encompassing antisense viral sequences could down-regulate HERV expression. The human C4 gene may provide an example of the latter, where antisense HERV-K sequences are generated and display an effect on a heterologous target. Such effects may possibly rely on formation of dsRNA and RNA interference. On a genome scale the presence of closely related sequences may trigger events of ectopic recombination and hence lead to chromosomal rearrangements. Sequence analysis of provirus flanking DNA suggests that this has occurred during primate evolution. The frequency and significance of such events in human disorders are not clear at present. Finally, HERVs may express viral proteins. The common retroviral genes, gag, (pro), pol and env, lead to expression of three viral polyproteins (Gag, Gag-Pol and Env) that are processed by a viral or host protease into the active structural and enzymatic subunits.
Although most HERV genes are no longer intact, a small fraction has escaped mutational decay. For a subgroup of HERV-K (HTDV) all proteins can apparently be expressed and particle formation has been detected in teratocarcinoma cell lines. Furthermore, HERV-K (HTDV) also directs expression of a small accessory protein, Rec (formerly cORF), that up-regulates nucleo-cytoplasmic transport of unspliced viral RNA [21, 22]. Loci from other HERV groups have maintained a single intact open reading frame, such as the env genes from HERV-H, HERV-W and HERV-R (ERV3). Conservation of an open reading frame during primate evolution clearly suggests some biological function. Animal studies have demonstrated that ERV proteins may in fact serve a useful role for the host either by preventing new retroviral infection or by adopting a physiological role. Syncytin, an Env-derived protein that mediates cell-cell fusion during human placenta formation, provides a striking example of the latter [26, 27]. Recently, a second Env protein, dubbed Syncytin 2, was identified and proposed to have a similar cell-fusion role. Env proteins may also inhibit cell entry of related exogenous retroviruses that use a common surface receptor, and a Gag-derived protein restricts incoming retroviruses in mice.
In the literature, expression of HERVs has frequently been linked with human disease, including various cancers and a number of autoimmune disorders. While causal links between disease and HERV activity have yet to be established, it is clear from animal models that expression of endogenous retroviral proteins can affect cell proliferation and invoke or modulate immune responses. A few recent examples include i) the possible association of Rec (HERV-K) with germ-cell tumors, ii) the immunosuppressive abilities of HERV-H Env in a murine cancer model resulting in disturbed tumor clearance and iii) the possible superantigenic (SAg) properties of envelopes from HERV-K and HERV-W [33, 34] and the increased activity of such proviruses in multiple sclerosis, rheumatoid arthritis, schizophrenia and type-1 diabetes. SAg expression from the HERV-K18 locus may furthermore be induced by IFN-α and thus by viral infections such as Epstein-Barr virus [37, 38]. One major problem in verifying putative disease associations is the multi-copy nature of HERVs and the ambiguous assignment to individual proviruses; a problem that can be solved by properly annotating the human genome.
Among Env-associated effects the mechanism of SAg-like activity is believed to involve true epitope-independent stimulation of T-cells, while the mechanism of action of the immunosuppressive CKS-17-like domain is still unknown. This immunosuppressive peptide region maps to the envelope gene and may significantly alter the pathogenic properties of retroviruses and even enhance cancer development. Phylogenetic analysis suggests that a CKS-17-like motif arose early in the evolution of retroviruses and is widespread in many current HERV lineages; thus identification of novel envelope genes attracts particular attention.
Computer-assisted identification of HERV loci has previously been reported. Approaches include searching for conserved amino-acid motifs within the pol gene [2, 40] and env gene, detection of full-length env genes by nucleotide similarity and compilation of LTR- or ERV-classified repeats as reported by RepeatMasker analysis [4, 5, 42]. Currently only Paces et al. [5, 42] provide a searchable database where individual loci are mapped as chromosomal coordinates. However, except for detection of 16 full-length env genes in a recent survey by de Parseval et al. and a detailed analysis of intactness of HERV-H-related proviruses, no one has systematically detected HERV regions and scanned them for content of viral open reading frames. In this paper we report mapping of 7836 regions in the human genome that show sequence resemblance to known retroviral genomes. These regions cover the majority of large proviral structures or HERV loci, and, importantly, we provide a detailed annotation of all viral open reading frames.
The average region size is 4300 nucleotides and the ~7800 HERV regions cover ~1.1% of the human genome. All data are publicly available as a searchable database at http://www.retrosearch.dk. Our data include i) chromosomal coordinates and sequence information of the 7836 HERV regions, ii) annotation of ~38000 retroviral ORFs within these regions and iii) graphical visualization of individual HERV regions (Figure 1C) or a larger chromosomal window. All DNA and predicted vORF sequences can be retrieved and are linked to external genome browsers for further analysis.
Skewed chromosomal distribution and few intragenic HERVs
Table 1. Genomic distribution of HERV regions (table body not preserved; one surviving column header reads "χ2 test within chr.").
Limited number of intact viral open reading frames
Table 2. Distribution of vORF lengths (stop codon to stop codon), binned by vORF size (aa/codons): 63–100, 100–200, 200–300, 300–400, 400–500, 500–600, 600–700, 700–800, 800–900 and 900–1000 (counts per bin not preserved).
If one extends the search criteria and scans the human genome for retroviral genes where a single mutation (one nucleotide insertion, deletion or substitution) either removes premature termination or restores the correct reading frame, the number of long Gag, Pol and Env proteins increases two-fold to 27, 23 and 43, respectively (Figure 3).
Novel envelope genes identified
Table 3. Previously and newly identified long Env ORFs in the human genome. For each row only the chromosomal position of the locus (NCBI release 34), the Env type and the structural note survive; a dash marks a field that is absent in the surviving text, and the remaining columns, including the EST match counts, were not preserved.
Chr. X 70307525–70316940 (+1); HERV-H-like Env; N-term unknown, minor C-term deletion
Chr. X 95868842–95875915 (+1); -; -
Chr. X 105067535–105070015 (-1); -; Minor N-term deletion
Chr. 1 75266332–75270814 (+1); HERV-K Env (type 1); In-frame pol-env fusion
Chr. 1 157878336–157885675 (+1); HERV-K Env (type 1); In-frame pol-env fusion
Chr. 2 155926784–155933168 (+1); -; -
Chr. 2 130813720–130815944 (-1); HERV-K Env (type 1); In-frame pol-env fusion
Chr. 2 166767087–166774769 (-1); -; -
Chr. 3 16781208–16788508 (+1); -; -
Chr. 3 114064939–114072223 (-1); HERV-K Env (type 1); In-frame pol-env fusion, C-term deletion
Chr. 3 167860265–167867997 (-1); -; -
Chr. 5 34507318–34513254 (-1); -; N- and C-term deletion
Chr. 6 11211667–11219905 (-1); -; -
Chr. 6 78422690–78431275 (-1); -; -
Chr. 7 4367317–4383401 (-1); -; -
Chr. 7 63862984–63871411 (-1); -; -
Chr. 7 91710047–91718755 (-1); -; -
Chr. 7 152498159–152502575 (-1); -; -
Chr. 8 7342682–7353583 (-1); -; -
Chr. 11 101104479–101112064 (+1); -; Minor C-term deletion
Chr. 12 104204746–104209814 (+1); -; Minor C-term deletion
Chr. 12 57008431–57016689 (-1); -; -
Chr. 14 91072914–91085655 (-1); -; -
Chr. 16 35312483–35314318 (+1); HERV-K Env (type 1); In-frame pol-env fusion
Chr. 19 20334642–20343232 (+1); -; -
Chr. 19 58210000–58211244 (+1); -; N-term unknown, minor C-term deletion
Chr. 19 58244133–58246051 (+1); -; -
Chr. 19 32821287–32829201 (-1); -; -
EST matching to HERV regions with long ORFs
We mapped 265 ESTs to one of the 42 HERV regions that encode a long Gag, Pol or Env ORF (Figure 3). The EST GenBank accession number, the matching HERV ID and the source organ and tissue type are provided as supplementary material (see Additional file 2). Briefly, 20 of the 42 HERV regions were found to have matching ESTs, suggesting transcriptional activity. For the long envelope genes we have included the number of EST matches in Table 3. Our analysis reveals that besides "activity" of members of the HERV-K group, only HERV-Fc(2), HERV-R (Erv3) and a few HERV-W/FRD members (including Syncytin-1 and -2) have unambiguous EST matches. Syncytin-1 dominates by far with 100 EST matches, followed by Syncytin-2 and HERV-R. Syncytin-1 and Syncytin-2 were predominantly found in placental EST libraries (see Additional file 2), which is also true for 5 of 17 HERV-R ESTs. Interestingly, for the two (partial) HERV-W/FRD-like env genes, four of six ESTs are also derived from placental tissues.
We report a mapping of 7836 loci in the human genome that show nucleotide sequence similarity to retroviral genomes and, importantly, we provide a detailed analysis of their coding potential by annotation of all viral ORFs (stop-codon to stop-codon fragments longer than 62 codons). This compilation of HERV regions and their corresponding viral ORFs is available as a searchable database. A graphical example is provided in Figure 1C. In total our HERV regions (which exclude flanking LTRs) amount to 1.1% of the human genome, a number that agrees well with previous reports [1, 42].
The vast majority of the mapped HERV regions contain several frame-shift mutations or in-frame stop codons that truncate the viral ORFs and thus testify to their old association with the human genome. In fact, we detect only 42 proviruses that have retained Gag, Pol or Env ORFs in the size range that approaches full-length proteins (Figure 3 and Table 2). As expected, the majority are part of the evolutionarily young HERV-K (HML-2) group. None of these HERV-K loci is completely intact, although one potentially replication-competent locus (HERV-K113, polymorphic in humans and not present in the NCBI34 genome) has been reported. Alternatively, complementation among HERV-K loci may allow infectious particle formation, and these loci are clearly interesting candidates for experimental investigation. Moreover, assuming a high error rate during transcription or retrotransposition, one cannot exclude that almost-intact loci may occasionally revert to their original functional state and become replication-competent. Based on our data, about 34 gag, pol or env genes can be restored by a single point mutation or a single insertion-deletion event.
Within our list of intact or almost-intact viral ORFs in the human genome, we detect only a single gag gene and two pol genes that are not from the HERV-K group. However, among the 29 long envelope genes 15 are gammaretroviral (Table 3). The fragmented, pseudogene nature of the gag and pol genes (small ORFs) in several of these provirus loci strongly suggests that selection has preserved the env genes. In the case of syncytin-1 and -2 (HERV-W and HERV-FRD members, respectively), evolutionary conservation can be understood in functional terms, since the encoded envelope proteins have been suggested to play an essential role in placental development by causing trophoblast syncytia formation [28, 48]. Compelling evolutionary evidence for purifying selection in these genes has recently been gathered to support this hypothesis [28, 49, 50].
Concerning other ancient loci such as HERV-R (erv3), no evidence for a physiological role has yet been established despite a remarkable conservation and expression of the env gene. Potential cellular roles for envelope genes that may drive purifying selection include i) protection from infection by related retroviruses through receptor interference, as demonstrated for the murine fv4 locus, ii) mediation of organized cell-cell fusion, as for the syncytin genes [26–28], and iii) a hypothesized role in preventing the immune response against the developing embryo by means of the immunosuppressive domain.
Two seemingly intact env genes not detected in the recent survey of intact human envelope genes are equally interesting in terms of possible functional conservation. One is located on chromosome 14q32.12, and this novel gene shows low but significant similarity to a recently reported endogenous retrovirus from zebrafish (ZFERV). BLAST analysis of the protein coding regions suggests that this HERV group belongs to the gammaretroviral genus. Whether this gene is still active, and whether the encoded protein still maintains function and/or plays a cellular role, is yet to be established. Although we were unable to detect any unambiguous EST matches to this gene (Table 3), RT-PCR analysis indicates low RNA abundance in a few human tissues including placenta (Kjeldbjerg AL, Aagaard L, Villesen P and Pedersen FS, unpublished). A second seemingly intact novel env gene is found on chromosome 19q13.41, and interestingly a C-terminally truncated "twin" gene is located just 40 kb away. Both genes appear to be active as judged by EST data (Table 3), mostly in placental tissue (see Additional file 2). We have been able to confirm this by RT-PCR analysis (unpublished), and ongoing expression analysis aims at clarifying the activity and function of these novel genes.
Among the long betaretroviral env genes, five turned out to carry a specific 292 bp deletion that fuses the pol and env reading frames. This deletion variant of the HERV-K (HML-2) group is indicative of the type 1 genomes that, despite the lack of functional proteins, have been mobilized quite efficiently. Alternatively, recombination or gene conversion may have conserved this HERV-K deletion variant [11, 54]. It is noteworthy that the Env protein from one of these Δ292-genes, HERV-K18, is reported to have SAg-like activity, and whether the other four K18 SAg-like genes have a similar function is an open question.
Although our analysis is extensive, it is most likely not exhaustive. The sensitivity is obviously limited by our query sequences, and some ancient HERVs may have suffered from mutational decay to a degree that makes it impossible to detect them by homology. For instance, the ZFERV-related env gene reported by us was only detected due to inclusion of the ZFERV sequence, and although available data such as HERVd also point to this region, it is reported there as a number of incomplete HERVs. Similarly, nucleotide-based searches (such as RepeatMasker and BLAST detection) only partially detect the novel HERV-W/FRD-like envelope genes and the intact envelope genes of the HERV-Fc family, even though these proviruses are fairly intact, as suggested by a recent mobilization of HERV-Fc in the primate lineage. Thus, inclusion of more retroviral query sequences, such as our vORF-validated HERV data, is likely to improve detection in an iterative manner ("phylogenetic walking") as previously applied by Tristem. Finally, screening the human genome in silico does not guarantee detection of polymorphic HERV loci in which the empty pre-integration site is still segregating in the human population. Indeed, an experimental survey has recently detected two such polymorphic loci in the human population (HERV-K113 and 115), and like HERV-K113 other recently acquired proviruses may escape our attention.
In general, our analysis of the genomic positions of our ~7800 HERV regions revealed three distinct patterns, all of which confirm earlier reports: i) HERVs are unequally distributed between chromosomes and along the genome. In particular, the Y chromosome stands out with a five-fold excess of our vORF-positive (internal) HERV sequences (Table 1), and it has thus been dubbed "a chromosomal graveyard". This agrees well with previous genome surveys of LTR/ERV-related elements, and the phenomenon is likely associated with the high level of heterochromatin and low levels of recombination [55–58]. ii) HERVs are underrepresented within genes, and iii) HERVs found in introns are predominantly oriented in the antisense direction (Figure 2). This pattern is well known [56, 58] and expected due to selection against gene disruption or interference by retroviral regulatory elements such as promoters, splice sites and polyadenylation signals. This selection may have counteracted a preference for proviral integration (and retrotransposition) near or inside genes, as suggested by recent studies for several retroviral genera [59, 60].
Initially, HERV discovery was driven by the search for replication-competent viruses and their possible association with human cancers as established in other species. Recent research has demonstrated that the presence of endogenous retroviral sequences in our genome has a number of complex functional and evolutionary consequences and that these sequences cannot simply be regarded as "junk" DNA. The increased complexity and diversity of HERVs, as testified by the identification of two novel env genes in this survey, make expression analysis and functional assessment a difficult task. To aid this process, our genome-wide HERV data as well as predictions of Gag, Pol and Env reading frames in these loci are a useful resource, and our data can be searched and visualized at http://www.retrosearch.dk. Clearly, the 42 HERVs encompassing intact or near-intact gag, pol and env genes as described here are interesting experimental objects, although less intact viral proteins may also hold biological activity. In the near future, use of comparative genomics and mapping of allele polymorphisms will most certainly enhance identification of endogenous retroviruses and reveal selection patterns that may eventually decipher a role for these genes in human health and/or disease.
In order to identify HERV regions in the human genome we performed BLAST searches using sensitive parameters. BLAST hits were saved in a database and subsequently clustered into putative HERV loci. These putative loci were then scanned for viral Open Reading Frames (vORFs) and the presence of flanking direct repeat sequences (putative LTRs). Subsequently, ORFs were categorized based on a library of known retroviral proteins and non-retroviral proteins.
Identifying HERV regions
In order to cover as many different HERV families as possible we compiled a query set of 237 publicly available sequences from GenBank, published papers and Repbase sequences. These sequences cover all known retroviral genera and include both endogenous and exogenous strains from various host organisms (the query set is available upon request). Each query sequence was manually edited, removing LTR elements in order to avoid detection of solo LTRs. BLAST searches against contigs from the NCBI release 34 of the human genome were performed using WU-BLAST (Gish, W. (1996–2003) http://blast.wustl.edu), with default parameters except for W = 8, E = 0.001, V = 1000000, B = 1000000. Search results were stored in a MySQL database and mapped to chromosomal positions using Ensembl Bioperl packages.
Overlapping BLAST hits were clustered into putative HERV regions, allowing a gap of 500 nucleotides between hits. A region score was calculated as the sum of e-value-weighted hit lengths divided by the region length. Only regions longer than 300 nucleotides with a region score > 3.0 (threshold based on empirical tests) were kept, resulting in 45658 putative HERV regions.
Detection of direct flanking repeats (putative LTRs) was done by comparing a window before and after the HERV region.
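The clustering and filtering step can be illustrated with a minimal Python sketch. It is not the original pipeline: the `Hit` and `Region` classes and the function names are hypothetical, and the e-value weight used here, -log10(E), is an assumption, since the text does not specify how hit lengths were weighted.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Hit:
    chrom: str
    start: int      # 1-based, inclusive
    end: int        # inclusive
    evalue: float


@dataclass
class Region:
    chrom: str
    start: int
    end: int
    hits: list = field(default_factory=list)

    @property
    def length(self):
        return self.end - self.start + 1


def cluster_hits(hits, max_gap=500):
    """Merge hits on the same chromosome that overlap or lie within max_gap nt."""
    regions = []
    for hit in sorted(hits, key=lambda h: (h.chrom, h.start, h.end)):
        last = regions[-1] if regions else None
        if (last is not None and last.chrom == hit.chrom
                and hit.start - last.end <= max_gap):
            last.end = max(last.end, hit.end)
            last.hits.append(hit)
        else:
            regions.append(Region(hit.chrom, hit.start, hit.end, [hit]))
    return regions


def region_score(region):
    """Sum of weighted hit lengths divided by region length.

    ASSUMPTION: each hit length is weighted by -log10(E); the paper only
    says the hit lengths were 'e-value weighted'.
    """
    weighted = sum(
        (h.end - h.start + 1) * -math.log10(max(h.evalue, 1e-300))
        for h in region.hits
    )
    return weighted / region.length


def passes_filters(region, min_length=300, min_score=3.0):
    """Keep regions longer than 300 nt with a region score above 3.0."""
    return region.length > min_length and region_score(region) > min_score


hits = [
    Hit("chr1", 100, 400, 1e-10),
    Hit("chr1", 800, 1200, 1e-20),   # 400 nt after the previous hit: merged
    Hit("chr1", 5000, 5100, 1e-5),   # far away: starts a new, short region
]
regions = cluster_hits(hits)
kept = [r for r in regions if passes_filters(r)]
```

In this toy input the first two hits merge into one region that passes both filters, while the isolated 101-nt hit forms a region that is discarded for being too short.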
ORF finding and categorization
For the 45658 putative HERV loci, we scanned the DNA sequence (including 1000 bases flanking the locus) for forward open reading frames (stop-codon to stop-codon) of lengths > 62 amino acids (aa). Stop-codon-to-stop-codon fragments were chosen to accommodate the use of non-conventional translational initiation by retroviruses at the internal pro and pol genes (by means of ribosomal frame-shifting and termination suppression). Therefore the predicted proteins, in particular for gag and env genes, may contain incorrect N-terminal regions that must be removed by looking for appropriate start codons. ORFs shorter than 63 aa were discarded because, for shorter lengths, the probability of finding such an ORF in a random sequence rises above 0.05 (assuming equal codon frequencies).
All ORFs were then assigned to a category by FASTA searching against a library of known retroviral proteins (RV) and known non-retroviral proteins (NON_RV). RV proteins were downloaded from NCBI and categorized into either GAG, POL, PRO, ENV, ACC (accessory protein) or UNWANTED (for unwanted or unknown proteins). NON_RV proteins consist of all human SwissProt proteins of length 400–700 aa not including any of the words "endogenous, virus, envelope, env-, env, gag-, gag, pol-, pol, reverse". The final library consisted of 6260 records (3454 RV proteins + 2806 NON_RV proteins). Each ORF was assigned to the same category as its highest scoring hit. All loci with a significant RV ORF (vORF; E < 0.0005) were flagged as HERVs; this data set consists of 7836 loci. Manual inspection of long ORFs above 400 codons revealed that two envelope ORFs (ORF ID 86185 and 312172) were (mis)categorized as non-significant (NonS) due to low sequence similarity to our retroviral protein library.
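A minimal Python sketch of the stop-to-stop scan and of the length cutoff is given below. The function names are hypothetical, and two simplifications are assumed: only the three forward frames of the supplied sequence are scanned, and runs that touch either end of the sequence (that is, not bounded by a stop codon on both sides) are ignored. The cutoff function reads the 0.05 criterion as the chance that a random run of n codons contains no stop codon, with 61 of 64 equally frequent codons being sense codons; that reading is an interpretation of the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def min_codons_for_alpha(alpha=0.05):
    """Smallest n for which (61/64)**n, the chance that n random codons
    contain no stop codon, falls below alpha."""
    p_sense = 61 / 64
    n = 1
    while p_sense ** n >= alpha:
        n += 1
    return n


def stop_to_stop_orfs(seq, min_codons=63):
    """Return (frame, start, end) for stop-to-stop fragments of at least
    min_codons codons in the three forward frames.

    start is the 0-based index of the first codon after the upstream stop;
    end is the 0-based index of the downstream stop codon (exclusive end
    of the fragment). Fragments touching a sequence edge are skipped.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        after_prev_stop = None
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if after_prev_stop is not None:
                    n_codons = (pos - after_prev_stop) // 3
                    if n_codons >= min_codons:
                        orfs.append((frame, after_prev_stop, pos))
                after_prev_stop = pos + 3
    return orfs


cutoff = min_codons_for_alpha(0.05)

# Toy sequence: stop, 70 sense codons, stop, 10 sense codons, stop.
toy = "TAA" + "GCT" * 70 + "TAG" + "GCT" * 10 + "TGA"
found = stop_to_stop_orfs(toy, min_codons=cutoff)
```

Under these assumptions the computed cutoff is 63 codons, since (61/64)^62 is about 0.051 and (61/64)^63 is about 0.049, which is consistent with the threshold stated above. In the toy sequence only the 70-codon fragment in frame 0 is reported.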
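The assignment rule can be sketched as follows. This is an illustration only: the tuple layout of the FASTA hits and the function names are hypothetical, and it is an assumption that GAG, POL, PRO, ENV and ACC are the categories that count as retroviral when flagging a locus (the text does not say how UNWANTED hits were treated).

```python
E_CUTOFF = 0.0005
# ASSUMPTION: these library categories count as retroviral (RV) hits.
RV_CATEGORIES = {"GAG", "POL", "PRO", "ENV", "ACC"}


def categorize_orf(hits):
    """Assign an ORF to the category of its highest scoring library hit.

    hits is a list of (category, score, evalue) tuples from a protein
    search of one ORF against the combined RV + NON_RV library. ORFs with
    no hit, or whose best hit is not significant, are labelled 'NonS'.
    """
    if not hits:
        return "NonS"
    category, _score, evalue = max(hits, key=lambda h: h[1])
    if evalue >= E_CUTOFF:
        return "NonS"
    return category


def is_herv_locus(orf_categories):
    """A locus is flagged as a HERV if at least one ORF is a significant RV ORF."""
    return any(cat in RV_CATEGORIES for cat in orf_categories)


locus_orfs = [
    [("ENV", 250.0, 1e-30), ("NON_RV", 90.0, 1e-4)],  # best hit is ENV
    [("POL", 40.0, 0.2)],                              # not significant
    [],                                                # no hit at all
]
categories = [categorize_orf(h) for h in locus_orfs]
flagged = is_herv_locus(categories)
```

A rule of this kind also shows how a genuine but divergent envelope ORF can end up as NonS: if its best library hit fails the E-value cutoff, it is never considered retroviral, which is why manual inspection of long ORFs remains useful.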
EST matching to individual proviruses
In order to match the human ESTs to the vORF-positive HERV regions we first performed an all-against-all search using NCBI MegaBLAST. The output was filtered so that only the best matching pairs (HERV-EST) were kept and stored in a database. The ESTs that matched the HERV regions encompassing a long ORF were subsequently assigned to a human genomic region using EST mapping data from the UCSC Genome Browser. ESTs that unambiguously mapped to the same genomic region as the HERV regions of interest were counted as positive EST matches.
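The two filtering steps can be sketched in Python as shown below. The data layout and function names are hypothetical, and "unambiguously mapped" is interpreted here as an EST having exactly one genomic placement that overlaps the HERV region; the original criterion may have been defined differently.

```python
def best_herv_per_est(blast_rows):
    """Keep, for each EST, only its highest scoring HERV match.

    blast_rows is an iterable of (est_id, herv_id, score) tuples.
    """
    best = {}
    for est_id, herv_id, score in blast_rows:
        if est_id not in best or score > best[est_id][1]:
            best[est_id] = (herv_id, score)
    return best


def count_positive_matches(best, est_positions, herv_regions):
    """Count ESTs whose single genomic placement overlaps their best HERV.

    est_positions maps est_id -> list of (chrom, start, end) placements.
    herv_regions maps herv_id -> (chrom, start, end).
    ASSUMPTION: an EST with zero or several placements is treated as
    ambiguous and is not counted.
    """
    counts = {}
    for est_id, (herv_id, _score) in best.items():
        placements = est_positions.get(est_id, [])
        if len(placements) != 1:
            continue
        chrom, start, end = placements[0]
        h_chrom, h_start, h_end = herv_regions[herv_id]
        if chrom == h_chrom and start <= h_end and end >= h_start:
            counts[herv_id] = counts.get(herv_id, 0) + 1
    return counts


rows = [
    ("EST1", "HERV_A", 900), ("EST1", "HERV_B", 400),
    ("EST2", "HERV_A", 300),
    ("EST3", "HERV_B", 500),
]
est_positions = {
    "EST1": [("chr7", 1000, 1500)],
    "EST2": [("chr7", 1200, 1600), ("chr3", 50, 450)],  # two placements
    "EST3": [("chr1", 10, 400)],                        # wrong chromosome
}
herv_regions = {"HERV_A": ("chr7", 800, 5000), "HERV_B": ("chr19", 100, 900)}

best = best_herv_per_est(rows)
positive = count_positive_matches(best, est_positions, herv_regions)
```

Requiring agreement between the sequence-level best match and the independent genomic placement is what guards against assigning an EST to the wrong member of a multi-copy HERV family.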
List of abbreviations used
EST: Expressed sequence tag
HERV: Human endogenous retrovirus
LTR: Long terminal repeat
vORF: Viral open reading frame
This work was supported by the Danish Medical and Technical Research Councils, The Karen Elise Jensen Foundation, The Danish Cancer Society and Aarhus University Research Foundation.
References
1. International Human Genome Sequencing Consortium (IHGSC): Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. doi:10.1038/35057062.
2. Tristem M: Identification and characterization of novel human endogenous retrovirus families by phylogenetic screening of the Human Genome Mapping Project database. J Virol. 2000, 74: 3715-3730. doi:10.1128/JVI.74.8.3715-3730.2000.
3. Benit L, Dessen P, Heidmann T: Identification, phylogeny, and evolution of retroviral elements based on their envelope genes. J Virol. 2001, 75: 11709-11719. doi:10.1128/JVI.75.23.11709-11719.2001.
4. Jurka J: Repbase update: A database and an electronic journal of repetitive elements. Trends Genet. 2000, 16: 418-420. doi:10.1016/S0168-9525(00)02093-X.
5. Paces J, Pavlicek A, Zika R, Kapitonov VV, Jurka J, Paces V: HERVd: the Human Endogenous RetroViruses Database: update. Nucleic Acids Res. 2004, 32: D50-. doi:10.1093/nar/gkh075.
6. Boeke JD, Stoye JP: Retrotransposons, endogenous retroviruses, and the evolution of retroelements. In Retroviruses. Edited by: Coffin JM, Hughes SH, Varmus HE. 1997, Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 343-435.
7. Herniou E, Martin J, Miller K, Cook J, Wilkinson M, Tristem M: Retroviral diversity and distribution in vertebrates. J Virol. 1998, 72: 5955-5966.
8. Barbulescu M, Turner G, Seaman MI, Deinard AS, Kidd KK, Lenz J: Many human endogenous retrovirus K (HERV-K) proviruses are unique to humans. Curr Biol. 1999, 9: 861-868. doi:10.1016/S0960-9822(99)80390-X.
9. Shih A, Coutavas EE, Rush MG: Evolutionary implications of primate endogenous retroviruses. Virology. 1991, 182: 495-502. doi:10.1016/0042-6822(91)90590-8.
10. Temin HM: Origin of retroviruses from cellular moveable genetic elements. Cell. 1980, 21: 599-600. doi:10.1016/0092-8674(80)90420-1.
11. Belshaw R, Pereira V, Katzourakis A, Talbot G, Paces J, Burt A, Tristem M: Long-term reinfection of the human genome by endogenous retroviruses. Proc Natl Acad Sci USA. 2004, 101: 4894-4899. doi:10.1073/pnas.0307800101.
12. Tchenio T, Heidmann T: High-frequency intracellular transposition of a defective mammalian provirus detected by an in situ colorimetric assay. J Virol. 1992, 66: 1571-1578.
13. Lower R, Lower J, Kurth R: The viruses in all of us: Characteristics and biological significance of human endogenous retrovirus sequences. Proc Natl Acad Sci USA. 1996, 93: 5177-5184. doi:10.1073/pnas.93.11.5177.
14. Turner G, Barbulescu M, Su M, Jensen-Seaman MI, Kidd KK, Lenz J: Insertional polymorphisms of full-length endogenous retroviruses in humans. Curr Biol. 2001, 11: 1531-1535. doi:10.1016/S0960-9822(01)00455-9.
15. Mikkelsen JG, Pedersen FS: Genetic reassortment and patch repair by recombination in retroviruses. J Biomed Sci. 2000, 7: 77-99. doi:10.1159/000025434.
16. Rasmussen HB: Interactions between exogenous and endogenous retroviruses. J Biomed Sci. 1997, 4: 1-8.
17. Kazazian HH: An estimated frequency of endogenous insertional mutations in humans. Nat Genet. 1999, 22: 130-130. doi:10.1038/9638.
18. Brosius J: RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene. 1999, 238: 115-134. doi:10.1016/S0378-1119(99)00227-9.
19. Schneider PM, Witzel-Schlomp K, Rittner C, Zhang L: The endogenous retroviral insertion in the human complement C4 gene modulates the expression of homologous genes by antisense inhibition. Immunogenetics. 2001, 53: 1-9. doi:10.1007/s002510000288.
20. Hughes JF, Coffin JM: Evidence for genomic rearrangements mediated by human endogenous retroviruses during primate evolution. Nat Genet. 2001, 29: 487-489. doi:10.1038/ng775.
21. Magin C, Lower R, Lower J: cORF and RcRE, the Rev/Rex and RRE/RxRE homologues of the human endogenous retrovirus family HTDV/HERV-K. J Virol. 1999, 73: 9496-9507.
22. Yang J, Bogerd HP, Peng S, Wiegand H, Truant R, Cullen BR: An ancient family of human endogenous retroviruses encodes a functional homolog of the HIV-1 Rev protein. Proc Natl Acad Sci USA. 1999, 96: 13404-13408. doi:10.1073/pnas.96.23.13404.
23. Lindeskog M, Mager DL, Blomberg J: Isolation of a human endogenous retroviral HERV-H element with an open env reading frame. Virology. 1999, 258: 441-450. doi:10.1006/viro.1999.9750.
24. Blond JL, Beseme F, Duret L, Bouton O, Bedin F, Perron H, Mandrand B, Mallet F: Molecular characterization and placental expression of HERV-W, a new human endogenous retrovirus family. J Virol. 1999, 73: 1175-1185.
25. Cohen M, Powers M, Oconnell C, Kato N: The nucleotide-sequence of the Env gene from the human provirus ERV3 and isolation and characterization of an ERV3-Specific cDNA. Virology. 1985, 147: 449-458. doi:10.1016/0042-6822(85)90147-3.
26. Blond JL, Lavillette D, Cheynet V, Bouton O, Oriol G, Chapel-Fernandes S, Mandrand B, Mallet F, Cosset FL: An envelope glycoprotein of the human endogenous retrovirus HERV-W is expressed in the human placenta and fuses cells expressing the type D mammalian retrovirus receptor. J Virol. 2000, 74: 3321-3329. doi:10.1128/JVI.74.7.3321-3329.2000.
27. Mi S, Lee X, Li X, Veldman GM, Finnerty H, Racie L, LaVallie E, Tang XY, Edouard P, Howes S, Keith JC, McCoy JM: Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis. Nature. 2000, 403: 785-789. doi:10.1038/35001608.
28. Blaise S, de Parseval N, Benit L, Heidmann T: Genomewide screening for fusogenic human endogenous retrovirus envelopes identifies syncytin 2, a gene conserved on primate evolution. Proc Natl Acad Sci USA. 2003, 100: 13013-13018. doi:10.1073/pnas.2132646100.
29. Best S, Le Tissier P, Towers G, Stoye JP: Positional cloning of the mouse retrovirus restriction gene Fv1. Nature. 1996, 382: 826-829. doi:10.1038/382826a0.
30. Lower R: The pathogenic potential of endogenous retroviruses: facts and fantasies. Trends Microbiol. 1999, 7: 350-356. doi:10.1016/S0966-842X(99)01565-6.
31. Boese A, Sauter M, Galli U, Best B, Herbst H, Mayer J, Kremmer E, Roemer K, Mueller-Lantzsch N: Human endogenous retrovirus protein cORF supports cell transformation and associates with the promyelocytic leukemia zinc finger protein. Oncogene. 2000, 19: 4328-4336. doi:10.1038/sj.onc.1203794.
- Mangeney M, de Parseval N, Thomas G, Heidmann T: The full-length envelope of an HERV-H human endogenous retrovirus has immunosuppressive properties. J Gen Virol. 2001, 82: 2515-2518.View ArticlePubMed
- Conrad B, Weissmahr RN, Boni J, Arcari R, Schupbach J, Mach B: A human endogenous retroviral superantigen as candidate autoimmune gene in type I diabetes. Cell. 1997, 90: 303-313. 10.1016/S0092-8674(00)80338-4.View ArticlePubMed
- Perron H, Jouvin-Marche E, Michel M, Ounanian-Paraz A, Camelo S, Dumon A, Jolivet-Reynaud C, Marcel F, Souillet Y, Borel E, Gebuhrer L, Santoro L, Marcel S, Seigneurin JM, Marche PN, Lafon M: Multiple sclerosis retrovirus particles and recombinant envelope trigger an abnormal immune response in vitro, by inducing polyclonal Vbeta16 T-lymphocyte activation. Virology. 2001, 287: 321-332. 10.1006/viro.2001.1045.View ArticlePubMed
- Gaudin P, Ijaz S, Tuke PW, Marcel F, Paraz A, Seigneurin JM, Mandrand B, Perron H, Garson JA: Infrequency of detection of particle-associated MSRV/HERV-W RNA in the synovial fluid of patients with rheumatoid arthritis. Rheumatology. 2000, 39: 950-954. 10.1093/rheumatology/39.9.950.View ArticlePubMed
- Karlsson H, Bachmann S, Schroder J, McArthur J, Torrey EF, Yolken RH: Retroviral RNA identified in the cerebrospinal fluids and brains of individuals with schizophrenia. Proc Natl Acad Sci USA. 2001, 98: 4634-4639. 10.1073/pnas.061021998.PubMed CentralView ArticlePubMed
- Stauffer Y, Marguerat S, Meylan F, Ucla C, Sutkowski N, Huber B, Pelet T, Conrad B: Interferon-alpha-induced endogenous superantigen. a model linking environment and autoimmunity. Immunity. 2001, 15: 591-601. 10.1016/S1074-7613(01)00212-6.View ArticlePubMed
- Sutkowski N, Conrad B, Thorley-Lawson DA, Huber BT: Epstein-Barr virus transactivates the human endogenous retrovirus HERV-K18 that encodes a superantigen. Immunity. 2001, 15: 579-589. 10.1016/S1074-7613(01)00210-2.View ArticlePubMed
- Cianciolo GJ, Copeland TD, Oroszlan S, Snyderman R: Inhibition of lymphocyte proliferation by a synthetic peptide homologous to retroviral envelope proteins. Science. 1985, 230: 453-455.View ArticlePubMed
- Jern P, Sperber GO, Blomberg J: Definition and variation of human endogenous retrovirus H. Virology. 2004, 327: 93-110. 10.1016/j.virol.2004.06.023.View ArticlePubMed
- de Parseval N, Lazar V, Casella JF, Benit L, Heidmann T: Survey of human genes of retroviral origin: Identification and transcriptome of the genes with coding capacity for complete envelope proteins. J Virol. 2003, 77: 10414-10422. 10.1128/JVI.77.19.10414-10422.2003.PubMed CentralView ArticlePubMed
- Paces J, Pavlicek A, Paces V: HERVd: database of human endogenous retroviruses. Nucleic Acids Res. 2002, 30: 205-206. 10.1093/nar/30.1.205.PubMed CentralView ArticlePubMed
- Human endogenous retrovirus database. [http://herv.img.cas.cz]
- Reus K, Mayer J, Sauter M, Zischler H, Muller-Lantzsch N, Meese E: Genomic organization of the human endogenous retrovirus HERV-K(HML-2.HOM) (ERVK6) on chromosome 7. Genomics. 2001, 72: 314-320. 10.1006/geno.2000.6488.View ArticlePubMed
- Benit L, Calteau A, Heidmann T: Characterization of the low-copy HERV-Fc family: evidence for recent integrations in primates of elements with coding envelope genes. Virology. 2003, 312: 159-168. 10.1016/S0042-6822(03)00163-6.View ArticlePubMed
- Shen CH, Steiner LA: Genome structure and thymic expression of an endogenous retrovirus in zebrafish. J Virol. 2004, 78: 899-911. 10.1128/JVI.78.2.899-911.2004.PubMed CentralView ArticlePubMed
- RetroSearch: database of ORF annotated human endogenous retroviruses. [http://www.retrosearch.dk]
- Frendo JL, Olivier D, Cheynet V, Blond JL, Bouton O, Vidaud M, Rabreau M, Evain-Brion D, Mallet F: Direct involvement of HERV-W Env glycoprotein in human trophoblast cell fusion and differentiation. Mol Cell Biol. 2003, 23: 3566-3574. 10.1128/MCB.23.10.3566-3574.2003.PubMed CentralView ArticlePubMed
- Bonnaud B, Bouton O, Oriol G, Cheynet V, Duret L, Mallet F: Evidence of Selection on the Domesticated ERVWE1 env Retroviral Element Involved in Placentation. Mol Biol Evol. 2004, 21: 1895-1901. 10.1093/molbev/msh206.View ArticlePubMed
- Mallet F, Bouton O, Prudhomme S, Cheynet V, Oriol G, Bonnaud B, Lucotte G, Duret L, Mandrand B: The endogenous retroviral locus ERVWE1 is a bona fide gene involved in hominoid placental physiology. Proc Natl Acad Sci USA. 2004, 101: 1731-1736. 10.1073/pnas.0305763101.PubMed CentralView ArticlePubMed
- Gardner MB, Kozak CA, O'Brien SJ: The Lake Casitas wild mouse: evolving genetic resistance to retroviral disease. Trends Genet. 1991, 7: 22-27. 10.1016/0168-9525(91)90017-K.View ArticlePubMed
- Harris JR: Placental endogenous retrovirus (ERV): structural, functional, and evolutionary significance. Bioessays. 1998, 20: 307-316. 10.1002/(SICI)1521-1878(199804)20:4<307::AID-BIES7>3.3.CO;2-6.View ArticlePubMed
- Lower R, Lower J, Tondera-Koch C, Kurth R: A general method for the identification of transcribed retrovirus sequences (R-U5 PCR) reveals the expression of the human endogenous retrovirus loci HERV-H and HERV-K in teratocarcinoma cells. Virology. 1993, 192: 501-511. 10.1006/viro.1993.1066.View ArticlePubMed
- Costas J: Evolutionary dynamics of the human endogenous retrovirus family HERV-K inferred from full-length proviral genomes. J Mol Evol. 2001, 53: 237-243. 10.1007/s002390010213.View ArticlePubMed
- Kjellman C, Sjogren HO, Widegren B: The Y-chromosome – a graveyard for endogenous retroviruses. Gene. 1995, 161: 163-170. 10.1016/0378-1119(95)00248-5.View ArticlePubMed
- Medstrand P, van de Lagemaat LN, Mager DL: Retroelement distributions in the human genome: variations associated with age and proximity to genes. Genome Res. 2002, 12: 1483-1495. 10.1101/gr.388902.PubMed CentralView ArticlePubMed
- Pavlicek A, Paces J, Elleder D, Hejnar J: Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution. Genome Res. 2002, 12: 391-399. 10.1101/gr.216902. Article published online before print in February 2002.PubMed CentralView ArticlePubMed
- Smit AF: Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr Opin Genet Dev. 1999, 9: 657-663. 10.1016/S0959-437X(99)00031-3.View ArticlePubMed
- Mitchell RS, Beitzel BF, Schroder AR, Shinn P, Chen H, Berry CC, Ecker JR, Bushman FD: Retroviral DNA Integration: ASLV, HIV, and MLV Show Distinct Target Site Preferences. PLoS Biol. 2004, 2: e234-10.1371/journal.pbio.0020234.PubMed CentralView ArticlePubMed
- Schroder AR, Shinn P, Chen H, Berry C, Ecker JR, Bushman F: HIV-1 integration in the human genome favors active genes and local hotspots. Cell. 2002, 110: 521-529. 10.1016/S0092-8674(02)00864-4.View ArticlePubMed
- Ensembl Genome Browser. [http://www.ensembl.org]
- Zhang Z, Schwartz S, Wagner L, Miller W: A greedy algorithm for aligning DNA sequences. J Computational Biology. 2000, 7: 203-214. 10.1089/10665270050081478.View Article
- UCSC Genome Browser. [http://genome.ucsc.edu]
This article is published under license to BioMed Central Ltd. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.