- Open Access
Identification and characterization of diverse groups of endogenous retroviruses in felids
© Mata et al.; licensee BioMed Central. 2015
- Received: 4 October 2014
- Accepted: 23 February 2015
- Published: 15 March 2015
Endogenous retroviruses (ERVs) are genetic elements with a retroviral origin that are integrated into vertebrate genomes. In felids (Mammalia, Carnivora, Felidae), ERVs have been described mostly in the domestic cat, and only rarely in wild species. To gain insight into the origins and evolutionary dynamics of endogenous retroviruses in felids, we have identified and characterized partial pro/pol ERV sequences from eight Neotropical wild cat species, belonging to three distinct lineages of Felidae. We also compared them with publicly available genomic sequences of Felis catus and Panthera tigris, as well as with representatives of other vertebrate groups, and performed phylogenetic and molecular dating analyses to investigate the pattern and timing of diversification of these retroviral elements.
We identified a high diversity of ERVs in the sampled felids, with a predominance of Gammaretrovirus-related sequences, including class I ERVs. Our data indicate that the identified ERVs arose from at least eleven horizontal interordinal transmissions from other mammals. Furthermore, we estimated that the majority of the Gamma-like integrations took place during the diversification of modern felids. Finally, our phylogenetic analyses indicate the presence of a genetically divergent group of sequences whose position in our phylogenetic tree was difficult to establish confidently relative to known retroviruses, and another lineage identified as ERVs belonging to class II.
Retroviruses have circulated in felids along with their evolution. The majority of the deep clades of ERVs exist since the primary divergence of felids’ base and cluster with retroviruses of divergent mammalian lineages, suggesting horizontal interordinal transmission. Our findings highlight the importance of additional studies on the role of ERVs in the genome landscaping of other carnivore species.
- Endogenous retroviruses
- Neotropical cats
- pro/pol region
Endogenous retroviruses (ERVs) are remnants of exogenous counterparts that have integrated into the nuclear DNA of a germ-line cell of vertebrates and are transmitted through generations in a typical Mendelian fashion . In the course of time, ERVs tend to become defective due to selection against functional retroviruses and also due to random mutations in the host genome [2,3]. Because they may have formed numerous copies in the host genome during or subsequent to the initial endogenization, behaving as transposable elements, ERVs can be represented by both mobile DNAs and remnants of ancient retroviral infections. ERVs and their exogenous counterparts are commonly grouped into three classes: Class I (ERV1, gammaretroviruses, and epsilonretroviruses), Class II (ERV2, alpharetroviruses, betaretoviruses, deltaretroviruses and lentiviruses) and Class III (ERV3 and spumaviruses) . In addition to these groups, a new class of ERV (ERV4) has been described, with no characterised exogenous counterpart .
Endogenous retroviruses have been found in a wide variety of vertebrate species, including amphibians, reptiles, birds and mammals, as well as within the genomes of large DNA viruses, such as herpesviruses and poxviruses . The potential of retroviruses to be horizontally transmitted between different species is shown by the close phylogenetic relationships inferred between viruses coming from distantly related hosts [6,7]. Among retroviruses, gammaretroviruses are able to infect a broad host range, contrasting with betaretroviruses that have a much narrower range . Phylogenetic hypotheses bearing on retrovirus-host relationships are important because several pathogenic retroviruses arose after being transmitted between different hosts , and the understanding of retroviral evolution may help to monitor and/or minimize their potential impact.
Felids are an important animal model for some viral diseases offering strong parallels to related viruses in humans . They can be infected by three known exogenous retroviruses, the feline immunodeficiency virus (FIV), the feline leukemia virus (FeLV) and the feline foamy virus (FFV). The former two are associated with serious illnesses, while FFV, like the human counterpart, causes no recognized disease. FIV induces an AIDS-like syndrome and is considered a model for HIV regarding lentiviral pathogenesis, therapeutic assays as well as for vaccine development. Therefore, the evolutionary history of Retroviridae in felids may help to clarify retrovirus evolution among mammals.
Endogenous retroviruses have been described mostly in the genome of the domestic cat, and only rarely in wild felids. Examples include the well-known endogenous feline leukemia virus (FeLV) and RD114 in Felis, Beta-like RV-domestic cat and RV-cougar , and the lineages FERV1-5 , FERVmlu1 and FERVmlu2  and ERV-DC  in domestic cats. In a recent study, Song et al.  have described several novel ERVs in domestic cats, and currently nine divergent ERV class I lineages are known in these animals.
Felids comprise nearly 40 living species, eleven of which occur in the Neotropics and nine in Brazil, including the recently recognized Leopardus guttulus . Neotropical cats belong to three of the major felid lineages, and one species, Panthera onca, is a member of the most basal extant group in the Felidae family . In this context, if ERV lineages diversified during (or subsequent to) the divergence of the main felid groups, one might hypothesize that Neotropical cats, by representing three of the main felid clades, would carry distinct and possibly exclusive ERV lineages. In contrast, if the main episodes of ERV diversification happened prior to the divergence of the main felid groups, it could be hypothesized that Neotropical cats belonging to different clades would share some of the same ERV lineages, which would also be shared with cats from other geographic regions. Testing these alternative hypotheses would shed light on the evolutionary history of ERVs in felids and possibly in other mammals, given the potential for interspecies transmission involving cats and their prey. A related question is whether retroviruses integrated into the genome of wild cats frequently in the past, as shown for the domestic cat , or if these processes represent unusual events. In spite of their relevance, such tests have not been performed in Neotropical cats or across any other group of wild felids so far, due to the lack of available sequence data.
To address those questions, we experimentally identified novel endogenous retroviral sequences present in eight Neotropical cat species sampled in Brazil. Additionally, we searched for novel ERV sequences in available genome sequences from the domestic cat (Felis catus) and the Siberian tiger (Panthera tigris altaica). We used these data to perform phylogenetic and molecular dating analyses aiming to assess the evolutionary relationships of ERVs in Neotropical cats and those found in Felis catus and Panthera tigris, and to gain broader insights into the pattern and timing of diversification of endogenous retroviral elements in felids.
Eight Neotropical wild cat species (P. concolor, P. yagouaroundi, L. geoffroyi, L. colocolo, L. guttulus, L. pardalis, L. wiedii and P. onca) were screened by PCR and 85 ERV-like clones (out of 236) were identified, containing the typical retroviral PR-RT motifs and showing significant matches (e-values < 10-10) to ERV sequences deposited in GenBank. Out of 85 sequences, 80 were similar to class I ERVs, 4 similar to class II, and one was unclassified. The majority of these sequences (n = 82) were defective, presenting nonsense and frameshift mutations. Using these identified sequences as references, ERVs from the domestic cat (Felis catus), the Siberian tiger (Panthera tigris altaica) and several additional mammalian genomes were identified in silico (Additional file 1: Table S1).
Phylogenetic analysis of the divergent sequence from Leopardus wiedii
Sequences from mammalian hosts retrieved from Blast searches represented in Figure 2 (Dataset 2)
Query coverage (%)
Accession # a
Panthera tigris altaica
Panthera tigris altaica
Canis lupus familiaris
Odobenus rosmarus divergens
Mustela putorius furo
Ceratotherium simum simum
In-depth phylogenetic analysis of Gamma-like ERVs
More specifically, we observed that the three major groups encompassed eleven distinguishable lineages of felid Gamma-like ERVs, most of which contained sequences from multiple cat species and were not obviously congruent with host phylogeny. The most diverse major group was G2, comprising seven felid ERV lineages, two of which correspond to the well-known FeLV and RD114. Groups G1 and G3 were less diverse, with the former containing three lineages and the latter containing a single lineage (see below). The sister groups of felids were very diverse, comprising sequences from other mammalian orders, such as cetartiodactylans (specifically cetaceans), primates, rodents, carnivores and bats, suggesting multiple separated transmission events among mammals (see Discussion).
Characterization of the eleven pol gene lineages of class I ERVs from felids
Information on Gamma -like lineages represented in Figure 3
Nonsense/frameshift mutations a
Within group p -distance b (SE)
Description of lineage and time c
Sequence ID d
Contig length (bp)
Presumed ERV (positions within contig)
Avg .LTR length (bp)
LTR dist e /No. of differences f
Time since integration g
Time reported in the literature
Felis, Pco, Py, Lco, Lgu, Lp, Lg, OC, tiger [10 MY]
12666 19424 35074
3998-12096 983-9298 24631-32736
166 294 298
0.28/27 0.131/32 0.084/22
60.8/28/11.7 28.5/13.1/5.5 18.26/8.4/3.5
gamma1 (30MY/8.2 MY/-) 
Felis, Lg, tiger [10 MY]
11 [3- 14]
Felis, Lg, tiger [10 MY]
AANG02100182.1 AANG02102504.1 ATCQ01007049.1
91034 54996 22449
82322-90764 41175-50607 291-8884
343 260 314
0.20/53 0.22/43 0.20/49
43.7/20.1/8.4 49.1-22.6-9.4 45.2/20.8/8.7
Felis, Py, Lwi, tiger [10 MY]
Felis [3.36 MY]
Felis [3.36 MY]
The oldest ERV-DC integration (2.8 MY) 
Felis, Py, Pco, Lwi, tiger [10 MY]
gamma 6 (16 to 3.8 MY) 
Felis, Pco, Py, tiger [10 MY]
12392 23123 16168
2915-9700 2586-11074 5551-14181
542 538 501
0.069/34 0.042/21 0.029/14
15/6.9/2.9 9.3/4.2/1.75 6.3-2.9-1.21
Felis, tiger [10 MY]
Felis, Lg, Lco, tiger [10 MY]
Felis, Py, Lgu, Lwi, LP, Pco, tiger [10 MY]
Lineage I is the most diverse group represented in the tree, comprising sequences of all species studied (experimentally and in silico), except for Leopardus wiedii. Additionally, our lineage-specific PCR recovered a sequence of L. wiedii 95% (e-value 3E-54) similar to ERV gamma1  through Blastn (Additional file 3: Table S2). This lineage also presents a high intragroup diversity (0.1; Table 2). The [GenBank: AANG02165009.1] sequence from the domestic cat (see Table 2) represents a very old integration event (with its age estimated between 11.7 and 60.8 million years (MY) falling into the Miocene or even earlier), while the [GenBank: ATCQ01141921.1] sequence from the Siberian tiger was inferred to be much younger, ranging between 18 and 3.5 MY (Miocene-Pliocene), with a high degree of pro-pol intactness, presenting only three nonsense/frameshift mutations.
In contrast, lineage II is the least diverse clade represented in the phylogenetic tree and only harbors sequences from L. geoffroyi, Siberian tiger and domestic cat in our first analysis. The lineage-specific PCR demonstrated that lineage II is a genomic component of all species studied (Additional file 3: Table S2, Additional file 4: Figure S2). In the Megablast search using the LgCT10007 sequence as query, we retrieved only three hits out of 125 with similarity > 85% and coverage > 10% which belong to Panthera tigris [GenBank: ATCQ01146145.1], similarity of 97%, e-value 0.0) and Felis catus ([GenBank: AANG02130535.1] - Abyssinian breed - and [GenBank: ACBE01580426.1]) - mixed breed, both presenting similarity of 95% and e-value of 0.0). It was not possible to recover both LTRs from the sequences mentioned above, which prevented the estimation of their time of insertion in the genome. Nonetheless, we observed that this lineage is related (albeit with low support) to sequences from the seal Halichoerus grypus (HaEV) and the walrus Odobenus rosmarus divergens, both of which are marine carnivores (pinnipeds), as well as to Meles meles (MeEVII) and Mustela vison (MiEVII), both mustelids.
Lineage III is closely related to sequences from bats and presents the highest intragroup distance analyzed in this study (0.144; Table 2). This lineage also displays the lowest degree of pro-pol intactness, as evaluated by nonsense/frameshift mutations (median of 11). Accordingly, LTR divergences of three sequences indicated very ancient insertion events (estimated between 8 and 49 MY falling into Miocene-Oligocene). Furthermore, the longer branch lengths in the tree demonstrate that this lineage has persisted in the host genomes for long periods. Lineage-specific PCR confirm that this lineage is present in the genome of all species analysed (Additional file 4: Figure S2).
Lineage IV is related to ERVs from cetaceans, but separated from them by a long branch. ERVs from this lineage are present inside the genome of all species of Brazilian wild cat (Additional file 3: Table S2, Additional file 4: Figure S2). The two sequences of this lineage exemplified in Table 2 show recent insertion times compared to others (from 9 to 1 MY), ranging from the end of Miocene into the Pleistocene). It is noteworthy that this lineage presents few nonsense/frameshift mutations (maximum of 5) and a relatively high degree of pro-pol intactness.
Lineage V (FeLV lineage) and lineage VI (RD114/ERV-DC) are well-known ERVs present only in felids from the domestic cat lineage (Felis). The samples from the South American cats and the Blast searches of the Siberian tiger genome did not reveal any sequences related to these lineages, confirming previous studies (see [13,19]).
Lineage VII is closely related to sequences from bats, while lineage VIII is related to canine sequences (DERV1 and VuEV). Lineage IX is related to sequences from primates and lineage X is basal to lineage IX and to the primate clade. These lineages show a very similar pattern, with low rates of intragroup distance and few nonsense/frameshift mutations when compared to the others, and they are also a genomic component of all species examined in this study (Additional file 3: Table S2, Additional file 4: Figure S2). Dating based on LTR divergence showed that the insertion of some ERV sequences ranged from the middle Miocene to the early Pleistocene (see Table 2), near the time of diversification of modern felids . The short branch lengths generally observed in the topologies (Figure 3, lineages VII to X) further support the inference of rapid diversification. Moreover, we detected short sequences by PCR (~325 bp of RT domain) from L. pardalis and L. colocolo that clustered within lineage VII in a ML tree (not shown), indicating that this lineage is very abundant in felid genomes.
Lineage XI is the only felid representative of the G3 group, inferred to be a sister clade to sequences from several other mammalian orders. Similar to the lineages described above, this fossil retrovirus is found in all species examined (Additional file 3: Table S2, Additional file 4: Figure S2). The LTR divergence of a sequence from domestic cat shows an insertion event ranging from Miocene (18.7 MY) to Pliocene (3.6 MY). The short branch lengths observed in the topology (Figure 3), the low intragroup diversity and the high degree of pro-pol intactness (Table 2) all indicate a recent diversification of this group.
Phylogenetic analysis of felid ERV class II sequences
Class II ERVs from felids grouped into three different clades which were well supported by bootstrap values (Figure 4). One group was formed by the Betaretrovirus-like viruses previously described . The second was described by Gifford et al. , and also included a sequence from Leopardus geoffroyi retrieved by PCR. The third clade is newly identified in this study, which grouped with a sequence from P. yagouaroundi. According to our phylogenetic estimates (Figures 1 and 4), this group is basal to exogenous alpha- and betaretroviruses, but there was no statistical support for this resolution. As a whole, we could observe that class II ERVs are present in at least five different felid species, belonging to four different evolutionary lineages within Felidae (Figure 4).
This study represents the first broad assessment of the genome landscape of ERVs in the Felidae family, since similar assessments reported previously were restricted to the domestic cat . Our study initially focused on the characterization of ERV sequences of Neotropical wild cats, and was expanded by further analyzing ERV sequences retrieved from the genomes Felis catus and Panthera tigris, thus unveiling general patterns of ERV diversity across the Felidae family. Firstly, our results indicate that different class I ERV lineages from felids are generally not congruent with their currently accepted host phylogenies (see Figure 3), possibly due to lack of resolution given the short segments analysed, or to variable sorting of ancestral polymorphisms. Furthermore, ERV sequences from Neotropical cats did not form a unique (monophyletic) clade, which could be expected from more recent infection events, for instance after they colonized South America. In contrast, phylogenetic reconstructions showed well supported ERV clusters from felids dispersed across the retroviral phylogenies, all presenting sequences of Felis catus and Panthera tigris, except for the RD114, FeLV and ERV-DCs gammaretroviruses, which are found only in Felis, but not in other closely related Felidae genera [13,19]. Secondly, in a recent study using only domestic cat samples, Song et al.  obtained a similar pattern on class I ERV diversity to the one shown here. Additionally, Cho et al.  showed that the cat and tiger genomes have very similar composition and ratios of repeat elements. However, it is unlikely that the present study exhausted the diversity of ERVs in Neotropical cats, considering the numerous retroviral sequences found during our searches using WGS of Felis catus and Panthera tigris. We expect additional, distinct lineages to be yet undiscovered.
We found sequences that could be classified as class II ERVs in felid genomes, but as expected we did not find any sequences similar to exogenous feline immunodeficiency viruses (FIV). Many retroviruses do not infect germ line cells, and the majority of integration events do not become fixed . In fact, no endogenous copy of FIV has been found so far, although some FIV strains found in felids are deeply divergent, consistent with a long permanence, and may have existed within the Felidae family since the late Pliocene .
We failed to recover class III ERVs (spumaviruses) in the analysed felids, except for one sequence from L. wiedii. However, Song et al.  reported that class III ERVs are very abundant in Felis catus, although these in silico identified sequences are very divergent to group with other retroviral sequences, such as the sequence from L. wiedii. Therefore, class III ERVs in felids are probably more divergent than class I and II elements, which may have contributed to our failure in retrieving them by PCR.
Our phylogenetic analyses provided evidence of two divergent ERV sequence groups in felids. The most divergent group is related to clone LwiJO7007 from Leopardus wiedii, whose position in our phylogenetic tree was difficult to establish confidently relative to known retroviruses possibly because we did not detect a longer proviral sequence to improve the resolution of our analyses. However, our results strongly indicate that it is related to a group of sequences that colonized multiple mammalian genomes, as they have also been detected in several species (see Figure 2). These ERVs are very ancient components of Felidae genomes since the topological resolutions of a tree reconstructed using sequences from our lineage-specific PCR strategy (Additional file 2: Figure S1) resembles a phylogenetic split at species-level. The isolated position of a sequence from Panthera tigris [GenBank: ATCQ01106762] relative to the other felids in Figure 2 suggests that felid genomes have been colonized at least twice by this retroelement.
The other divergent group is formed by sequences related to PyEM13001 from P. yagouaroundi, and grouped with class II ERVs in our analyses. This clade seems to be intermediate between alpha- and betaretroviruses, perhaps being categorized provisionally as an alphabeta group, as named by Bolisetty et al. in a study on avian ERVs . Further studies are thus warranted to better characterize and understand the evolutionary relationships within these two divergent groups and also their positioning among retroviruses.
An important motivation for more detailed analysis of Gamma-like ERVs was to gain some insight into viruses that infected wild cats in an ancient retroviral world. By means of phylogenetic reconstruction, we provide a broader evolutionary spectrum for Gamma-like viruses in felids, unveiling a frequent history of horizontal interorder transmissions among mammals. Interspecies transmission is suggested for the origin of RD114, ERV-DCs and FeLV [1,13,24]. The Gamma-like tree topology (Figure 3) indicated that the divergence between clusters (lineages) of felid elements reflects their origins from different mammal hosts and the different groups found within these lineages points to further diversification, after these viruses jumped into the Felidae. The split between the Panthera lineage and those of other moderns cats, estimated to have occurred ca. ten million years ago , suggests that the majority of retroviral lineages shown in this study have existed in a common ancestor at least since middle Miocene. Furthermore, LTR divergence of Gamma-like sequences provided support for considerably old times of ERV insertion in most cases, although some sequences seem to have integrated more recently into felid genomes. The insertions within lineages shown in Table 2 that dated after ten million years ago did not necessarily represent new infections of exogenous retroviruses such as FeLV, since they could also result from postintegration diversification. Moreover, the divergence time of several LTR sequences analysed here suggests that Gamma-like lineages may have integrated into common ancestors in periods close to/during the diversification of modern felids (Table 2). This may explain the monophyly of the eleven Gamma-like lineages from felids, which may be further scrutinized as additional genomes from this family and other mammals become available.
Although our data suggest that cats acquired Gamma-like retroviruses during multiple cross-species transmission events in the past, the exact origin of these proviruses cannot be strictly deduced based on current information. It is thought that humans became infected with HIV during butchery of primates hunted as bush meat . Competition for ecological resources is also an important factor for cross-species transmission, as exemplified for FIV . Felid ERVs are close to those of primates, rodents, carnivores and bats, all of which are common prey items of modern wild cats . Thus, it is possible that cats also acquired retroviruses by hunting and/or scavenging carcasses. Although this hypothesis may be considered speculative, mainly because we cannot recover the real scenarios in which the ancestral cats lived or their exact trophic interactions, the phylogenies reconstructed herein are consistent with that hypothesis. Alternatively, felids could have been the source of transmission of some Gamma-like lineages. Primate and felid lineage IX grouped in a well-supported clade, with the felid lineage X as a basal clade (albeit poorly supported). Similarly, marine carnivores showed disconnected lineages I and II. To confirm the hypothesis of cross-transmission events as opposed to an older integration, it would be necessary to investigate insertion regions among the different mammalian genomes. The detection of orthologous sequences could indicate a more ancient event, considering that new insertions from infection result in random integrations.
The evolutionary tree in Figure 3 shows that some felid ERV lineages are very closely related to their sister groups, while some lineages are consistently more divergent. For example, it is noticeable that lineages III and VII probably arose from a single cross-species transmission event due to common ancestry with sequences from bats. However, the longer branches connecting lineage III with bat sequences as well as the long internal branch lengths within the felid clade suggest old infection events. Consistent with this, we estimated integration events having occurred before the basal divergence of modern felids (see Table 2), around ten million years ago . On the other hand, the shorter branches splitting bat and felid sequences of lineage VII and the low genetic distances estimated among felid sequences suggest that this lineage arose more recently. Furthermore, the LTR divergence in the two ERV sequences of lineage VII (see Table 2) as well as in the ERV sequence found in the contig [GenBank: AAPE02061792.1] from the bat Myotis lucifugus (LTR K2P distance of 0.01, dating at 2.17 MY using the substitution rate of 2.3 E-9) supports the idea that these lineages have had recent activity.
Our results strongly indicate that retroviruses have circulated in wild cats throughout their evolution, with the majority of the detected ERV clades having arisen during or prior to the basal diversification of modern felid lineages. We hypothesize, based on our phylogenetic inferences, that modern cat species descend from survivors of several retroviral infections: Pol similarities indicate that they comprised at least eleven lineages of gammaretroviruses (including class I ERVs), three of betaretroviruses (including class II ERVs) and one divergent group whose position in our phylogeny was difficult to establish confidently relative to known retroviruses. We also inferred that horizontal inter-ordinal transmission among mammals has been quite common, particularly among class I ERVs. Our findings can assist in future studies that aim to identify and characterize ERVs in mammalian genomes, and highlight the importance of further studies assessing the evolutionary history of ERVs in the genomic landscape of other carnivore species.
Samples of Brazilian wild cats were used to capture major components of the ERV diversity in Neotropical felids. They were originally obtained from road-killed individuals or wild animals captured from field ecology studies at the Laboratory of Genomics and Molecular Biology of Pontifícia Universidade Católica do Rio Grande do Sul (PUCRS), Brazil. A sample of Leopardus geoffroyi was donated by the Museum of Sciences and Technology of PUCRS. Eight species (one specimen each) of Brazilian wild cats were included in this study: southern tigrina Brazil (Leopardus guttulus; formerly part of Leopardus tigrinus), margay (Leopardus wiedii), ocelot (Leopardus pardalis), jaguarundi (Puma yagouaroundi), pampas’ cat (Leopardus colocolo), geoffroy’s cat (Leopardus geoffroyi), cougar (Puma concolor) and jaguar (Panthera onca). Genomic DNA was extracted from whole blood or tissues using PureLink Genomic DNA Kit (Life Technologies, Carlsbad, USA) according to the manufacturer’s protocol. To avoid cross contamination, samples were handled separately from each other during DNA extraction, PCR amplification, cloning and sequencing.
PCR, cloning and sequencing
Four degenerate primers targeting a fragment of the protease-reverse transcriptase (pro-pol) genes were used for PCR amplification as previously described . These viral genomic regions have been shown to be conserved, allowing the reconstruction of robust phylogenetic hypotheses [29,30]. Furthermore, the targeted regions allowed the comparison with previously published ERV sequences. PCRs were conducted with 2 min at 80°C followed by 35 cycles of denaturation at 94°C for 30s, annealing at 45°C to 50°C for 30s, and polymerization at 74°C for 60s, and one final cycle at 94°C for 30s, 45°C for 3 min, and 74°C for 10 min. Reaction conditions were performed with 50 mM KCl, 10 mM Tris-HCl (pH 8.3), 1.5 mM MgCl2, 200 mM dNTPs, 150 pmol of each primer, 100 ng of template DNA, and 2 U of Platinum Taq polymerase (Invitrogen, São Paulo, Brazil).
PCR-amplified DNA products were electrophoresed into 1% low melting agarose gels stained with ethidium bromide. Products ranging between 600 and 1200 bp were excised from the gel and purified with the Wizard SV gel kit (Promega, Madison, USA). DNA fragments were then cloned into pGEM-T Easy (Promega) and transfected into E. coli JM109 cells, following the manufacturer’s recommendations. Colony plasmid miniprep DNA purification was performed with the method of lysis by alkali  or with the Wizard Plus SV Miniprep DNA Purification System kit (Promega). Sequencing of purified PCR products was performed with the BigDye Terminator kit v.3.1 (Life Technologies, Carlsbad, USA), using the universal primers T7 and SP6, and reactions were run on an ABI-PRISM 3100 Genetic Analyzer (Life Technologies) following the manufacturer’s protocol. A total of 236 clones comprising the eight felid species were evaluated, corresponding to an average of 31 clones (25- 40) for each species studied.
To confirm the results of degenerate PCR we designed lineage-specific primers (see Additional file 4: Figure S2) and performed PCR using aliquots of genomic DNA of three specimens of each Brazilian wild cat species. The expected products were around 200 bp for the Gamma-like lineage and around 400 bp for the LwiJO7007 group. Selected primers were subjected to in silico PCR searches through the UCSC In-Silico PCR (http://genome.ucsc.edu) using the Felis catus (ICGSC Felis_catus 6.2/felCat5; Sep. 2011) genome assembly. Additionally, PCR products were sequenced in order to confirm their specificity.
Sequence nomenclature and lineage definition
The ERV sequences obtained by PCR were named as follows: the first three letters correspond to the species’ name, while the latter two represent the primer used in the PCR; the numbers correspond to the size of the fragment followed by the clone number. For example: Lwi (Leopardus wiedii) JO (primer) 700 (fragment size) 7 (clone number). The retroviral sequences obtained in this study have been deposited in GenBank (accession numbers KJ632899-KJ632941 and KJ650005). The sequences Lg_EM_1000_3 [GenBank: JX406427] from L. geoffroyi, Pco_JO_1000_6 [GenBank: JX406428]) and Pco_JO_1200_1 [GenBank: JX406429] from P. concolor were published in a previous study . Sequences retrieved from GenBank and used in our phylogenetic analyses are listed in Additional file 1: Table S1.
A group of sequences identified in felids and closely related to class I ERVs (which includes gammaretroviruses) was defined as the Gamma-like lineage. Within this group, lineage classification was based on phylogenetic analyses and specifically delimited by host switches between Felidae and other mammalian groups.
Data mining and alignments
DNA sequences obtained by PCR in this study were compared to GenBank entries using the program Blastn to identify ERV sequences among the amplified retroelements. Additionally, we performed specific searches using the program RepeatMasker via the UCSC Genome Database (https://genome.ucsc.edu) and the program Censor . Database searches were performed online from April to September 2013.
We performed extensive phylogenetic analyses of the identified ERVs along with GenBank-retrieved sequences, using four different datasets constructed with distinct strategies. Dataset 1 (DS1) was the broadest, and aimed to maximize the number of included ERV sequences by focusing only on the RT fragment, which is more conserved, allowing the comparison of amino acid data among very divergent different retroviral genera. Dataset 2 (DS2) focused specifically on the group of 19 sequences that were most similar to our LwiJO7007 element, based on Blast searches against different databases. We analyzed only the segment contained in this ERV that could be reliably aligned to these other sequences (using the alignment strategy described below), i.e. 613 bp (93% of the original positions), of which 441 represented the RT fragment. Also, since we observed no DNA saturation in this dataset (given the more shallow sampled time span than DS1, and the conserved nature of the segment), we used nucleotides instead amino acids for this analysis.
For datasets 3 and 4 (DS3 and DS4), we maximized the number of molecular characters (amino acids) by using the pro/pol fragment and analysing divergent groups separately. DS3 focused on the Gamma-like elements (described in detail below), while DS4 focused on the class II ERVs (incorporating GenBank sequences retrieved with Blast searches, as well as elements previously reported in the literature).
To construct DS3, which comprised most of the sequences identified in this study, we initially defined putative felid Gamma-like groups through phylogenetic analysis using the neighbour-joining (NJ) algorithm implemented in MEGA 5 . The alignment used was 327 amino acids long and contained 39 sequences, revealing the existence of eight different ERV lineages in this dataset (with > 15% divergence between them). To further characterize these groups, we carried out MegaBlast searches out using one representative sequence of each lineage as a query against the whole genome shotgun (WGS) contigs from Felis catus publicly available. Furthermore, tBlastn searches in the nucleotide collection (nr/nt) and WGS databases were performed using the NCBI Blast suite (http://blast.ncbi.nlm.nih.gov/). Initially, we selected all sequences retrieved in the Blast results, removing those with less than 30% identity and 70% coverage, presenting an e-value > 0.001, identical sequences and those with large indels. Short interspersed elements (SINEs) inserted within ERV sequences were excluded from the alignments. Alignments composed of sequences retrieved both from the nr/nt and the WGS databases were merged to create DS3, removing very similar sequences (≥95% identity) belonging to the same species. To reduce computer time, only one or two sequences per host species were kept in the final alignment for each lineage. However, as our focus was on felids, the majority sequences of the domestic cat were kept. Overall, this effort resulted in the identification of three additional lineages contained in this Gamma-like group, making up a total of 11 lineages contained in DS3.
During the course of this study, the genome sequence from Panthera tigris altaica was released in NCBI. We performed a second round of Megablast searches against this genome, using one or two representatives of the 11 lineages described above as queries. The highest hits belonging to Panthera tigris altaica (all with e-values close to zero), covering at least 70% of the query sequences, were incorporated into DS3.
Amino acid alignments were constructed for DS1, DS3 and DS4 as described in , using as template the known amino acid sequences of previously described retroviruses and virtual translations of novel retroviral sequences that do not have nonsense mutations in the amplified region. Sequences were carefully edited and manually adjusted. Although many sequences contained nonsense and/or frameshift mutations, we considered only their full coding capacity by eliminating these mutational events.
In the final alignments, the deduced fragments of virus-like Pro/Pol proteins from Gamma-like sequences were aligned again using Mafft v.7.017  with the FTT-NS-2 option; the L-INS-I option was used to align RT sequences related to class II ERVs and nucleotide sequences related to LwiJO7007. Regions of high sequence diversity and hence uncertain alignment were also eliminated using Gblock v.0.91  with the following options: minimum length of block of 5, gap allowed with half (treated as gaps positions where 50% or more sequences have a gap), minimum number of sequences for a flanking position of 120 for DS1, 15 for DS2, 197 for DS3, 12 for DS4 and maximum number of contiguous non-conserved position of 10 for DS3 and of 8 for DS1, DS2 and DS4. DS1 consisted of 152 sequences and 147 amino acid residues (56% of the original positions). DS3 consisted of 263 sequences, including 198 amino acid residues (50% of the original positions). DS4 comprised 19 sequences and 236 amino acid residues (68% of the original positions).
Dating ERV insertions using long terminal repeats (LTR)
To estimate the age of felid ERVs, we focused on Gamma-like ERV lineages established in this study, estimating the age of proviral pro/pol sequences found in the domestic cat and tiger genomes that were related to the elements of Neotropical wild cats. Our analyses were based on nucleotide divergence between 5′- and 3′-LTR sequences at the same insert, assuming that all differences arising post-integration evolved neutrally . The analysis was based on the relation T = (D/R)/2, where T is the time of integration, D is the divergence between the LTRs and R is the nucleotide substitution rate. We estimated D using Kimura’s two-parameter (K2P) model  with MEGA5. Because the LTR mutation rates of from felid ERVs are unknown, we employed three alternative substitution rates reported in the literature: 2.3E-9 and 5E-9 substitutions per site per year, based on human ERVs , and 1.2E-8 substitutions per site per year, a general rate estimated for felid nuclear DNA  and previously used for domestic cat ERVs . The LTR sequence divergence in contigs that contained LTRs flanking putative proviral Gammaretrovirus sequences (~7-11 kb) was assessed. Typical LTR features such as the presence of conserved motifs corresponding to (i) short inverted repeat motifs TG and CA at both LTR ends, (ii) the polyadenylation signal (AATAAA motif) and (iii) a less conserved AT-rich stretch corresponding to the TATA box  were also examined. Furthermore, potential amino acid sequences of gag-pol-env genes were searched using Blastx and manually assessed to verify the structural organization of these genes within the putative ERV sequence. Only sequences that presented matches with high statistical significance (i.e. e–values ≤ 0.001) were analyzed. For each lineage identified in our Gamma-like phylogeny (DS3), at least one representative in which the conditions described above were present was selected. Therefore, the representative was a putative full-length ERV delimited by the identification of retroviral gag, pol and env sequences positioned next to each other and placed between 5′- and 3′-LTRs. We were then able to estimate the time of the insertion of that ERV sequence and further compared it with temporal inferences of felid speciation times described in the literature.
A phylogenetic approach was used to assess the evolutionary relationship among ERV sequences found by both PCR and in silico analyses. Maximum likelihood (ML) trees were performed using Randomized Accelerated Maximum Likelihood conducted with RAxML v.7.2.6  using the raxml GUI 1.3 interface  and performing 1,000 bootstrap replicates to evaluate branch support. The best substitution models were estimated using jModeltest v.0.1.1  for the nucleotide dataset and ProtTest v.2.4  for the amino acid datasets, both employing the Akaike information criterion (AIC). The best-fit substitution models selected were as follows: JTT + G (DS1 and DS3); RtREV + I + G (DS4); and GTR + G (DS2).
We are grateful to Carla Suertegaray Fontana (Museu de Ciências e Tecnologia- PUCRS) for providing the L. geoffroyi sample. This work was supported by the Brazilian Research Council (CNPq; grant no. 142425/2008-7).
- Weiss RA. The discovery of endogenous retroviruses. Retrovirology. 2006;3:67.View ArticlePubMed CentralPubMedGoogle Scholar
- Gifford R, Tristem M. The evolution, distribution and diversity of endogenous retroviruses. Virus Genes. 2003;26:291–315.View ArticlePubMedGoogle Scholar
- Blomberg J, Benachenhou F, Blikstad V, Sperber G, Mayer J. Classification and nomenclature of endogenous retroviral sequences (ERVs): problems and recommendations. Gene. 2009;448:115–23.View ArticlePubMedGoogle Scholar
- Chong AY, Kojima KK, Jurka J, Ray DA, Smit AFA, Isberg SR, et al. Evolution and gene capture in ancient endogenous retroviruses - insights from the crocodilian genomes. Retrovirology. 2014;11:71.View ArticlePubMed CentralPubMedGoogle Scholar
- Weiss RA. On the concept and elucidation of endogenous retroviruses. Philos Trans R Soc Lond B Biol Sci. 2013;368:20120494.View ArticlePubMed CentralPubMedGoogle Scholar
- Martin J, Herniou E, Cook J, Waugh O’Neill R, Tristem M. Interclass Transmission and Phyletic Host Tracking in Murine Leukemia Virus-Related Retroviruses. J Virol. 1999;73:2442–9.PubMed CentralPubMedGoogle Scholar
- Hayward A, Grabherr M, Jern P. Broad-scale phylogenomics provides insights into retrovirus-host evolution. PNAS. 2013;110:20146–51.View ArticlePubMed CentralPubMedGoogle Scholar
- Katzourakis A. Paleovirology: inferring viral evolution from host genome sequence data. Phil Trans R Soc B. 2013;368:20120493.View ArticlePubMed CentralPubMedGoogle Scholar
- O’Brien SJ, Troyer JL, Brown MA, Johnson WE, Antunes A, Roelke ME, et al. Emerging viruses in the Felidae: shifting paradigms. Viruses. 2012;4:236–57.View ArticlePubMed CentralPubMedGoogle Scholar
- Gifford RJ, Kabat P, Martin J, Lynch C, Tristem M. Evolution and distribution of class II-related endogenous retroviruses. J Virol. 2005;79:6478–86.View ArticlePubMed CentralPubMedGoogle Scholar
- Pontius JU, Mullikin JC, Smith DR, Agencourt Sequencing Team, Lindblad-Toh K, Gnerre S, et al. Initial sequence and comparative analysis of the cat genome. Genome Research. 2007;17:1675–89.View ArticlePubMed CentralPubMedGoogle Scholar
- Yuhki N, Mullikin JC, Beck T, Stephens R, O’Brien SJ. Sequences annotation and single nucleotide polymorphism of the major histocompatibility complex in the domestic Cat. PLoS ONE. 2008;3:e2674.View ArticlePubMed CentralPubMedGoogle Scholar
- Anai Y, Ochi H, Watanabe S, Nakagawa S, Kawamura M, Gojobori T, et al. Infectious endogenous retroviruses in cats and emergence of recombinant viruses. J Virol. 2012;86:8634–44.View ArticlePubMed CentralPubMedGoogle Scholar
- Song N, Jo H, Choi M, Kim J-H, Seo HG, Cha S-Y, et al. Identification and classification of feline endogenous retroviruses in the cat genome using degenerate PCR and in silico data analysis. J Gen Virol. 2013;94:1587–96.View ArticlePubMedGoogle Scholar
- Trigo TC, Schneider A, Oliveira TG, Lehugeur LM, Silveira L, Freitas TRO, et al. Molecular data reveal complex hybridization and a cryptic species of neotropical wild Cat. Curr Biol. 2013;23:2528–33.View ArticlePubMedGoogle Scholar
- Johnson WE, Eizirik E, Pecon-Slattery J, Murphy WJ, Antunes A, Teeling E, et al. The late Miocene radiation of modern felidae: a genetic assessment. Science. 2006;311:73–7.View ArticlePubMedGoogle Scholar
- Tristem M, Myles T, Hill F. A highly divergent retroviral sequence in the tuatara (Sphenodon). Virology. 1995;210:206–11.View ArticlePubMedGoogle Scholar
- Kohany O, Gentles AJ, Hankus L, Jurka J. Annotation, submission and screening of repetitive elements in repbase: repbase submitter and censor. BMC Bioinformatics. 2006;7:474.View ArticlePubMed CentralPubMedGoogle Scholar
- Reeves RH, O’Brien SJ. Molecular genetic characterization of the RD-114 gene family of endogenous feline retroviral sequences. J Virol. 1984;52:164–71.PubMed CentralPubMedGoogle Scholar
- Cho YS, Hu L, Hou H, Lee H, Xu J, Kwon S, et al. The tiger genome and comparative analysis with lion and snow leopard genomes. Nat Commun. 2013;4:2433.PubMed CentralPubMedGoogle Scholar
- Emerman M, Malik HS. Paleovirology-modern consequences of ancient viruses. PLoS Biol. 2010;8:e1000301.View ArticlePubMed CentralPubMedGoogle Scholar
- Pecon-Slattery J, Troyer JL, Johnson WE, O’Brien SJ. Evolution of feline immunodeficiency virus in Felidae: implications for human health and wildlife ecology. Vet Immunol Immunopathol. 2008;123:32–44.View ArticlePubMed CentralPubMedGoogle Scholar
- Bolisetty M, Blomberg J, Benachenhou F, Sperber G, Beemon K. Unexpected diversity and expression of avian endogenous retroviruses. MBio. 2012;3:e00344–12.View ArticlePubMed CentralPubMedGoogle Scholar
- Benveniste RE, Sherr CJ, Todaro GJ. Evolution of type C viral genes: origin of feline leukemia virus. Science. 1975;190:886–8.View ArticlePubMedGoogle Scholar
- Kalish ML, Wolfe ND, Ndongmo CB, McNicholl J, Robbins KE, Aidoo M, et al. Central African hunters exposed to simian immunodeficiency virus. Emerg Infect Dis. 2005;11:1928–30.View ArticlePubMed CentralPubMedGoogle Scholar
- VandeWoude S, Troyer J, Poss M. Restrictions to cross-species transmission of lentiviral infection gleaned from studies of FIV. Vet Immunol Immunopathol. 2010;134:25–32.View ArticlePubMed CentralPubMedGoogle Scholar
- Sunquist M, Sunquist F: Wild Cats of the World: University of Chicago Press; 2002.Google Scholar
- Tristem M. Amplification of divergent retroelements by PCR. BioTechniques. 1996;20:608–12.PubMedGoogle Scholar
- Herniou E, Martin J, Miller K, Cook J, Wilkinson M, Tristem M. Retroviral diversity and distribution in vertebrates. J Virol. 1998;72:5955–66.PubMed CentralPubMedGoogle Scholar
- Jern P, Sperber GO, Blomberg J. Use of endogenous retroviral sequences (ERVs) and structural markers for retroviral phylogenetic inference and taxonomy. Retrovirology. 2005;2:50.View ArticlePubMed CentralPubMedGoogle Scholar
- Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual: Cold Spring Harbour Laboratory; 1989.Google Scholar
- Mata H, Gongora J, Ravazzolo AP. Molecular characterization of SINEs integrated in endogenous retrovirus sequences from Leopardus geoffroyi and Puma concolor. Acta Sci Vet. 2013;41:1133.Google Scholar
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.View ArticlePubMed CentralPubMedGoogle Scholar
- Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.View ArticlePubMed CentralPubMedGoogle Scholar
- Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.View ArticlePubMedGoogle Scholar
- Johnson WE, Coffin JM. Constructing primate phylogenies from ancient retrovirus sequences. PNAS. 1999;96:10254–60.View ArticlePubMed CentralPubMedGoogle Scholar
- Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol. 1980;16:111–20.View ArticlePubMedGoogle Scholar
- Bannert N, Kurth R. Retroelements and the human genome: new perspectives on an old relation. PNAS. 2004;101:14572–9.View ArticlePubMed CentralPubMedGoogle Scholar
- Lopez JV, Culver M, Stephens JC, Johnson WE, O’Brien SJ. Rates of nuclear and cytoplasmic mitochondrial DNA sequence divergence in mammals. Mol Biol Evol. 1997;14:277–86.View ArticlePubMedGoogle Scholar
- Benachenhou F, Sperber GO, Bongcam-Rudloff E, Andersson G, Boeke JD, Blomberg J. Conserved structure and inferred evolutionary history of long terminal repeats (LTRs). Mob DNA. 2013;4:5.View ArticlePubMed CentralPubMedGoogle Scholar
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–90.View ArticlePubMedGoogle Scholar
- Silvestro D, Michalak I. raxmlGUI: a graphical front-end for RAxML. Org Divers Evol. 2012;12:335–7.View ArticleGoogle Scholar
- Posada D. ModelTest: Phylogenetic Model Averaging. Mol Biol Evol. 2008;25:1253–6.View ArticlePubMedGoogle Scholar
- Abascal F, Zardoya R, Posada D. ProtTest: selection of best-fit models of protein evolution. Bioinformatics. 2005;21:2104–5.View ArticlePubMedGoogle Scholar
- Roca AL, Pecon-Slattery J, O'Brien SJ. Genomically intact endogenous feline leukemia viruses of recent origin. J Virol. 2004;78:4370–5.View ArticlePubMed CentralPubMedGoogle Scholar
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.