Early steps of retrovirus replicative cycle
© Nisole and Saïb 2004
Received: 06 March 2004
Accepted: 14 May 2004
Published: 14 May 2004
During the last two decades, the profusion of HIV research driven by the urgent need to identify new therapeutic targets has led to a wealth of information on the retroviral replication cycle. However, while the late stages of the retrovirus life cycle, consisting of virus replication and egress, have been partly unraveled, the early steps remain largely enigmatic. These early steps consist of a long and perilous journey from the cell surface to the nucleus where the proviral DNA integrates into the host genome. Retroviral particles must bind specifically to their target cells, cross the plasma membrane, reverse-transcribe their RNA genome while uncoating their cores, find their way to the nuclear membrane and penetrate into the nucleus to finally dock and integrate into the cellular genome. Along this journey, retroviruses hijack the cellular machinery, while at the same time counteracting cellular defenses. Elucidating these mechanisms and identifying which cellular factors are exploited by retroviruses and which hinder their life cycle will certainly lead to the discovery of new ways to inhibit viral replication and to improve retroviral vectors for gene transfer. Finally, as proven by many examples in the past, progress in retrovirology will undoubtedly also provide some priceless insights into cell biology.
In the case of HIV entry, for example, while the mechanisms of receptor binding, conformational changes and fusion appear to be relatively well defined, the involvement of attachment molecules and the importance of lipid rafts in fusion or in recruitment of coreceptors remain uncertain. Similarly, though the molecular process of reverse transcription is well described, very little is known about the concurrent uncoating process. One of the most poorly understood steps is the trafficking of pre-integration complexes (PICs) from the cell surface to the vicinity of the nucleus, despite a growing body of knowledge arising from the study of other viral models such as adenoviruses (Ad) or Herpes simplex viruses (HSV). Much has been learned regarding nuclear entry, but the cellular proteins involved are still unknown and the exact role of each viral component remains controversial. Finally, the molecular mechanisms of integration, the last event of the early phase of the retroviral life cycle, are now well understood, but the choice of target site remains mysterious. Thus, while some of these steps have been characterized, we are still far from obtaining a complete picture of these processes.
Fully elucidating the early steps of retrovirus replication is therefore crucial not only for identifying new antiretroviral drugs, but also for improving the design of retroviral vectors for gene therapy. Cellular inhibitors that interfere with these steps can represent useful tools for better characterizing the molecular processes involved and, in this respect, the recent discovery of cellular factors that block the lentiviral cycle at an early stage in primates provides novel directions for AIDS research .
In this review, we will summarise our current understanding of the early steps of the retroviral cycle, focussing particularly on the most recent and controversial findings in the field.
HIV-1, HIV-2 and Simian Immunodeficiency Virus (SIV) are known to bind the surface of dendritic cells (DCs) through interaction of their envelope glycoproteins with the C-type mannose binding lectins DC-SIGN (Dendritic cell-specific intercellular adhesion molecule 3-grabbing nonintegrin) and DC-SIGNR (DC-SIGN related) [24, 25]. These molecules cannot be considered as receptors since they do not promote viral entry leading to productive infection. Instead, they allow DCs to bind and capture viral particles and should therefore be considered as efficient binding factors. In the case of HIV-1, it seems that high mannose structures on gp120 are recognized by DC-SIGN [26–28], but there may also be a direct interaction between the two proteins. This interaction allows HIV particles to use DCs as a Trojan horse. Indeed, DCs are thought to capture virions at peripheral sites of infection and carry them to the lymph nodes, thereby promoting efficient infection in trans of target cells expressing appropriate entry receptors [24, 25]. But the involvement of dendritic cells in lentivirus pathogenesis may be more complex, since various DC subsets express distinct arrays of receptors capable of binding HIV gp120. Interestingly, this strategy seems to be shared by many other viruses (for a recent review, see ) and even by non-viral pathogens such as Mycobacterium tuberculosis.
Following the initial step of binding, retroviral particles use cell-surface proteins as specific receptors to enter their target cells through interactions with the viral envelope glycoproteins. As illustrated by the growing list of receptors identified, retroviruses are able to utilize a variety of cellular proteins to initiate infection, such as the amino-acid transporter CAT-1 for ecotropic MLV [33, 34], the T-cell surface marker CD4 for HIV , the glucose transporter GLUT-1 for HTLV  or the phosphate transporters PIT-1 and PIT-2 used by Gibbon ape Leukemia Virus (GaLV)  and amphotropic MLV [38, 39], respectively. In the case of Foamy viruses (FVs), although the receptor is still unknown, it appears to be ubiquitous since these retroviruses can infect a very wide range of cell lines, although CD4+ and CD8+ lymphocytes appear to be the main in vivo reservoirs [40–42].
Retroviral entry is a complex multi-step mechanism that has been particularly well studied for HIV. Firstly, the envelope glycoprotein gp120, present on the surface of viral particles as gp41/gp120 trimers, recognises the primary receptor CD4. This interaction leads to conformational changes in both CD4 and gp120 and to the recruitment of coreceptors belonging to the chemokine receptor family, mainly CXCR4 and CCR5 (for a review, see ). A second interaction then takes place between gp120 and one of these coreceptors, which triggers new conformational shifts in the envelope glycoproteins. These sequential conformational changes finally lead to the dissociation of gp120 from gp41, and to the transition of gp41 to its fusogenic conformation. Entry of virions into the cell is achieved by insertion of the gp41 fusion peptide into the target membrane, resulting in the fusion of viral and cellular membranes and the release of the viral core into the cytoplasm (for recent reviews, see [45, 46]).
Although it has been suspected for some time that galactosyl ceramide (GalCer) may be used by HIV-1 as an alternative receptor to infect neural cells, until recently little else was known about the role of lipids in retroviral entry. The discovery that lipids are distributed heterogeneously within cell membranes has led to the proposal that sphingolipids and cholesterol tend to segregate into microdomains called lipid rafts. Several observations support the hypothesis that lipid rafts may be involved in the HIV entry process. Firstly, binding of HIV-1 to CD4 has been reported to result in a direct interaction between gp120 and certain glycosphingolipids in membrane microdomains. Furthermore, disruption of target cell membrane rafts by cholesterol depletion prevents HIV-1 infection, as does targeting CD4 to non-raft membrane domains. Finally, binding of virus to permissive cells induces the clustering of CD4, CXCR4 and CCR5 within lipid rafts [50, 52, 53]. Despite these lines of evidence, the contribution of lipid rafts to HIV entry remains controversial, as some studies have shown that the localization of CD4 and CCR5 to non-raft membrane domains may not prevent HIV entry [54, 55]. Interestingly, membrane microdomains also seem to be involved in late events of the retroviral cycle, since HIV-1 particles have been found to bud preferentially through raft microdomains of the plasma membrane. This explains the unusually high cholesterol and sphingomyelin content of HIV membranes, a composition that is thought to be important for fusion, since cholesterol-depleted virions fail to enter cells.
Most retroviruses, including HIV, enter target cells by direct fusion with the plasma membrane, as indicated by their resistance to drugs blocking the acidification of endosomes. Interestingly, although HIV entry is strictly pH independent, the majority of viral particles that bind to the cell surface enter by endocytosis. It seems that a balance exists between these two entry pathways of HIV-1 into T-lymphocytes, since the inhibition of one route increases entry of particles by the alternative mechanism. However, particles entering by endocytosis do not support productive infection as they are degraded by the proteasome, a conclusion supported by the observation that inhibition of endosomal/lysosomal degradation increases the infectivity of HIV-1. The only known exceptions in the retrovirus family are ecotropic and amphotropic MLV, and FVs, which seem to enter target cells by endocytosis, although in the case of FVs, the possibility of entry by direct fusion cannot be excluded. However, the route of penetration into the cytoplasm can depend on the type of cell being infected. Indeed, whereas the ecotropic MLV enters mouse NIH 3T3 cells by endocytosis, its entry into rat XC cells occurs by fusion at the cell surface. It is interesting to note that the involvement of pH in retroviral entry has been reconsidered, since the distinction between pH-dependence and independence has been shown to be more relative than initially thought. Indeed, while the entry mechanism of avian leukosis viruses (ALV) was originally classified as pH-independent in comparison to influenza virus (for a review, see ), it has been shown to involve a low pH step. In contrast to influenza virus, it is the interaction of ALV with its receptor that converts the envelope glycoprotein to a pH-sensitive form, capable of promoting fusion at low pH.
Finally, in the case of lentiviruses, there are some examples of direct infection from cell to cell. This is the case for dendritic cells, which can transmit HIV particles to T-cells by direct contact without themselves being infected [25, 67, 68]. The fact that most of the infectious HIV produced by primary macrophages is assembled on late endocytic membranes rather than at the plasma membrane suggests that a direct transmission of virions from infected macrophages to T-cells during antigen presentation could also occur.
The fusion of viral and cellular membranes delivers the viral core into the cytoplasm, where the viral RNA is reverse transcribed by the virion-packaged reverse transcriptase (RT), generating a linear double-stranded DNA molecule (for a review, see ). Although there is evidence for limited DNA synthesis in virions prior to infection [71–73], reverse transcription usually occurs after the release of the viral core into the cytoplasm of the target cell. The only exceptions are FVs, which also reverse transcribe their RNA during a late stage of their life cycle [74–76]. Although unique among retroviruses, this feature is shared with Hepadnaviruses, a viral family that has many other similarities with FVs (for a review see ). The trigger for the initiation of reverse transcription is not clearly understood, but exposure of the incoming viral ribonucleoprotein complex to a significant concentration of deoxyribonucleotides in the cytoplasm is thought to play an important role (for a review, see ).
Immediately after its release into the cytoplasm, the viral core undergoes a partial and progressive disassembly, known as uncoating, that leads to the generation of subviral particles called reverse-transcription complexes (RTCs) and pre-integration complexes (PICs). It seems that initiation of reverse transcription is coupled to the onset of uncoating of the viral core. It should be noted that the distinction between RTCs and PICs is somewhat arbitrary, since uncoating is believed to occur progressively, but PICs are usually defined as the integration-competent complexes, whereas reverse transcription is incomplete in RTCs. Attempts to define the composition of RTCs and/or PICs have not yielded a clear answer, since the nature of the viral and cellular components found to be associated with the viral genome depends on the technique used for purifying the complexes, which are very sensitive to detergents. Furthermore, it is known that the vast majority of viruses entering a cell will not lead to a productive infection, meaning that purified complexes may not necessarily represent those particles able to perform reverse transcription, nuclear import or integration. Indeed, in the case of HIV-1, it has been reported that the infectivity to particle ratio is as low as 1 in 60,000 [80, 81], even though some mathematical analyses suggest that more than 10% of particles in a viral stock are theoretically able to infect cells.
As a result of these practical constraints, it is still unclear which proteins remain associated with the viral genome in the RTCs/PICs. For HIV, RTCs have been shown to associate rapidly with the host cytoskeleton after infection, possibly through a direct interaction between the matrix protein and the actin network. They appear as large nucleoprotein structures by electron microscopy and have a sedimentation velocity of approximately 350 S and a density of 1.34 g/ml in equilibrium gradients [84, 85]. While most studies show that HIV PICs contain protease (PR), reverse-transcriptase (RT), integrase (IN) and Vpr, the presence of the structural proteins is more controversial. The capsid proteins (CA) are thought to be released soon after infection and only trace amounts are found in PICs. Whereas nucleocapsid (NC) and matrix (MA) were initially thought to be associated with PICs [86, 87], more recent studies revealed that the majority of these proteins are lost during the uncoating process. Interestingly, as some viral structural components are released, certain cellular proteins associate with the PICs during their journey to the nucleus, such as the high mobility group protein HMG I(Y), which has been proposed to be important for integration.
It seems that the MLV core persists longer than that of HIV since NC, MA and CA can all be detected in structures in the vicinity of the nuclear membrane by electron microscopy. However, whereas NC and IN can be detected in the nucleus, MA and CA were found only in the cytoplasm [89, 90]. Similarly, in the case of FVs, electron microscopy studies revealed that incoming capsids seem to retain an intact structure during their journey from the cell surface to the microtubule-organizing centre (MTOC). Interestingly, FV capsids were never detected either within the nucleus or close to nuclear pores, even later during the replication cycle, whereas unassembled Gag proteins and the viral genome are detected in the nucleus early after infection. Therefore, in contrast to viruses such as Adenovirus type 2 (Ad2) or Herpes Simplex Virus type 1 (HSV-1), whose capsids dock to the nuclear pore, triggering nuclear translocation of the viral genome [93–95], nuclear import of FV Gag and genome must be accompanied by disassembly or significant deformation of the core particle at the MTOC.
Some viral and cellular proteins appear to influence the uncoating and/or the reverse transcription of retroviruses. This has been exemplified by HIV-1 Nef and Vif and the cellular protein cyclophilin A. These three proteins, present in incoming virions by virtue of their association with the viral core, have been shown to modulate early events of the replicative cycle of HIV, but their mode of action is still unclear. Indeed, viral particles lacking one of these proteins are less infectious than wild-type and this defect seems to occur early in the viral cycle. Nef-defective viruses, for example, display a strong decrease in infectivity [96–98]. Since it does not appear to alter virion binding or entry but does enhance viral DNA synthesis, Nef has been proposed to act either at the level of viral uncoating or reverse transcription [99, 100]. Nef appears likely to modulate viral entry only when it occurs by fusion at the plasma membrane, as HIV-1 virions pseudotyped with the amphotropic MLV envelope [100, 102], but not with the envelope glycoprotein from the vesicular stomatitis virus (VSV-G), display Nef-mediated enhancement of infectivity. This mechanism, dependent on the route used by the virus to enter its target cell, may be related to the high content of cholesterol present in the viral particle membrane. Indeed, it has been proposed that Nef may enhance viral infectivity by increasing the synthesis and incorporation of cholesterol into progeny virions.
Vif, another HIV-1 accessory protein known to be incorporated into virions, also seems to play a role in an early step of the HIV replicative cycle, as Δ-Vif viruses are unable to complete viral DNA synthesis and their RTCs are less stable than those of wild-type viruses. These observations may now be explained by recent studies. Indeed, Vif has been shown to counteract the antiviral activity of CEM15/APOBEC3G by preventing its incorporation into progeny virions [106–110]. The fact that this cellular protein inhibits HIV replication at the step of reverse transcription is consistent with the observed phenotype of Δ-Vif viruses. This restriction will be discussed in more detail below.
Finally, the cellular protein cyclophilin A (CypA), which is incorporated into virions through its interaction with viral capsid [111–113], has been shown to play a critical role in the correct disassembly of the HIV-1 cores early after infection , since particles lacking CypA display a defect between entry and reverse-transcription. However, these observations are probably due to the failure of CA to bind CypA rather than the absence of the cellular protein in the virions. Indeed, some data suggest that CypA incorporation into virions is dispensable, since CypA can associate with the CA of incoming particles within the target cells . CypA is believed to protect the viral capsid from the human restriction factor Ref1, leading to an increase in HIV-1 infectivity . The mechanism of Ref1 restriction will be discussed below.
Additionally, it should be noted that early expression of viral genes from unintegrated viral cDNA has also been described [116–120]. Although the role of this early expression is not clear, it is enhanced in the presence of Vpr .
After penetration into the host cell, pathogens have to reach their sites of replication, the nucleus in the case of retroviruses. The cytoplasm, containing a high protein concentration in addition to organelles and the cytoskeleton, constitutes a medium in which incoming particles cannot rely on simple passive diffusion to move. Consequently, viruses have evolved numerous and specific mechanisms to hijack cellular machinery, and in particular the cytoskeleton, to facilitate their spread within infected cells. For example, microtubules (MT) are essential for HSV-1 and Ad to reach the nucleus of infected cells, while vaccinia virus exploits first the microtubule network for its intracellular movement, and then the actin cytoskeleton to enhance its cell-to-cell spread.
Initial studies have revealed that the use of specific drugs altering the integrity of the cytoskeleton can interfere with the retroviral cycle, either by directly affecting the intracellular trafficking of incoming viruses or by interfering with other steps of the early phase of infection such as reverse transcription. Indeed, it has been shown that an intact actin cytoskeleton is essential for efficient reverse transcription of HIV-1. Additional reports have described specific interactions between retroviral proteins and cytoskeleton components. For example, HIV-1 IN and NC have been shown to interact with yeast microtubule-associated proteins and actin [126–128], respectively, but the precise role of such interactions in intracellular trafficking of incoming viruses remains to be elucidated. In contrast, several reports have described the effect of retroviral proteins on the cytoskeleton, which might assist viral replication. This is exemplified by the effect of the HIV-1 Rev and Vpr proteins on the polymerisation of the microtubule network or on the nuclear membrane (see below), respectively, or the ability of Vif to alter the structure of the vimentin network. But once again, a direct link between these observations and intracellular trafficking remains to be clarified. Interestingly, the microtubule network has been reported to be implicated in the intracellular trafficking of incoming retroviruses. Such movement has been demonstrated for incoming FVs, which target the microtubule organizing centre (MTOC) prior to nuclear translocation. Centrosomal targeting of incoming viral proteins and subsequent viral replication were inhibited by treatment with nocodazole, demonstrating the involvement of the MT network in intracellular trafficking. Remarkably, the Gag protein by itself can target the MTOC in transfected cells through interaction with the cytoplasmic light chain 8 (LC8) of the minus-end directed MT motor dynein.
A similar role for LC8 has been described for ASFV (African Swine Fever Virus) and rabies virus, two other viruses which use the MT network to move within infected cells [131–134]. Interestingly, this evolutionarily conserved molecule has been shown to interact with numerous cellular complexes such as nitric oxide synthase, or myosin V, an actin-based motor mainly located at the plasma membrane which shuttles between the cell periphery and the MTOC along the MT network (for a review, see ). Therefore, interaction between incoming retroviral capsids and the multifunctional LC8 could provide a bridge to shuttle between an actin-based motor beneath the plasma membrane and the MT network within the cytoplasm. Remarkably, McDonald et al. have observed the migration of HIV-1 particles along MT toward the centrosome by following GFP-tagged viral particles in the cytoplasm of infected cells. A MT-dependent movement of retroviral Gag proteins from the MTOC has also been described during late stages of the life cycle for HTLV-I, the Mason Pfizer Monkey virus [137, 138] and also intracisternal type A particles [139, 140]. Although the viral and cellular protagonists involved in this transport were not determined, these observations suggest that distinct classes of retroelements may use the dynein-dynactin complex motor on the MT network to make their way to or from the nucleus, through the cytoplasm.
The retroviral life cycle requires the integration of the viral DNA into the host cell genome to form the so-called provirus. To achieve this, the reverse-transcribed DNA, associated with viral proteins to form PICs, must enter the nucleus (for a review, see ). PICs from most retroviruses are unable to enter intact nuclei and must therefore "wait" for the breakdown of the nuclear membrane occurring during mitosis [141, 142]. Consequently, these retroviruses, such as MLV, are dependent on the cell cycle and cannot replicate in non-dividing cells. In contrast, lentiviruses such as HIV-1 are able to productively infect non-dividing cells, such as macrophages or quiescent T lymphocytes, indicating that PICs are able to actively cross the nuclear membrane. Some other retroviruses seem to have an intermediate capacity to enter the nucleus, since the PICs of Rous sarcoma virus and FVs [92, 146] are able to penetrate intact nuclei with a low efficiency, but their replication is dramatically increased in dividing cells. HIV PICs, composed of the double-stranded linear DNA associated with the viral proteins MA, RT, IN and Vpr, have an estimated Stokes diameter of 56 nm. Since the central channel of the nuclear pore has a maximum diameter of 25 nm and the pore is known to be able to transport macromolecules up to 39 nm, HIV must have developed a strategy to meet the challenge of passing through these structures.
Nuclear pore complexes (NPCs) are large supramolecular protein structures that span the nuclear membrane and protrude into both cytoplasm and nucleoplasm (for a recent review, see ). Signal-mediated nuclear import involves the interaction of nuclear localization signals (NLS) in proteins with nucleocytoplasmic shuttling receptors, belonging to the karyopherin β family, also known as importins. NLSs are typically short stretches of amino acids, the best studied of which are basic amino acid-rich sequences that interact with the receptor importin β, either directly or through the adapter importin α. Importin β interacts with other classes of NLS using different adapters, including snurportin, RIP (for Rev interacting protein), and importin 7. The latter has recently been proposed to play a key role in nuclear import of HIV-1 PICs in primary macrophages. Four different viral components have been identified as contributing to the nuclear import of HIV-1. Among the constituents that are believed to form the PIC, IN, MA, Vpr and the viral DNA are suspected to play a significant role in this complex process, either directly or indirectly, although the exact function of each remains to be fully understood (for reviews, see [7, 149]).
Integrase has been considered to be the main mediator of HIV-1 nuclear translocation for some time, but its exact involvement is now being re-evaluated. This viral protein, which harbours a non-classical NLS, has been shown to be both necessary and sufficient to promote the nuclear accumulation of viral PICs [150, 151]. The nature of the pathway used by this NLS is not known, but interestingly, the nuclear import function of IN was found to be essential for productive infection of both non-dividing and dividing cells. This unexpected result suggests that nuclear entry of HIV-1 PICs during mitosis may not be a passive process. Supporting this finding, it has been reported that nuclear import of HIV-1 PICs might be mitosis-independent in cycling cells. However, new questions have been raised concerning the karyophilic properties of IN and the role of its NLS. Indeed, IN has been found to enter the nucleus even when the NLS has been mutated [153, 154], and some data suggest that nuclear accumulation of IN does not involve members of the karyopherin family. Furthermore, it has been proposed that the observed nuclear localization of IN may result from its ability to bind DNA, in combination with its degradation in the cytoplasm. Hence, more studies are required in order to elucidate the exact role of IN in PIC nuclear import.
Two other HIV-1 proteins have been proposed to possess karyophilic properties. The first of these is the MA, which has been found to contain a classical basic NLS in its N-terminal region (GKKKYK), responsible for targeting the PIC into the nucleus [157, 158]. The mutation of this signal has been found to block HIV replication in non-dividing cells , whereas it does not interfere with virus growth in replicating cells . However, the role of this NLS was later disputed, with several reports demonstrating its dispensability for infection in non-dividing cells [159–161]. A second NLS has been identified in the C-terminal region of MA , re-igniting the controversy surrounding the exact role of MA in nuclear import.
The third protein that has been proposed to be involved in nuclear import of HIV-1 PICs, Vpr [163, 164], is probably the most controversial. This small viral protein (11.7 kD) has been shown to be a component of PICs and, despite not containing a canonical NLS, various sequences have been reported to target fusion proteins to NPCs . Vpr has been found to interact directly with components of the NPC, such as importin α [163, 166] and nucleoporin hCG1 [167, 168]. These interactions are believed to enhance nuclear import efficiency . Interestingly, Vpr expression has been shown to induce transient bulges in the nuclear envelope, which sometimes burst, creating a channel between the nucleus and the cytoplasm . However, the precise role of these nuclear envelope disruptions in PIC nuclear import remains uncertain, since Vpr-deficient viruses can infect non-dividing cells efficiently [151, 159]. In contrast, the Vpx protein encoded by HIV-2 and SIV has been shown to be both necessary and sufficient for the nuclear import of PICs .
In addition to lentiviruses, other retro-elements possess a cPPT, such as FVs [176, 177], the yeast Ty1 retrotransposon and the fish retroviruses Walleye dermal sarcoma virus (WDSV) and Walleye epidermal hyperplasia virus (WEHV). Consequently, the reverse transcription process in these viruses generates a cDNA containing a single-stranded gap (Figure 3). However, the possible implications of this particular structure in nuclear import of the corresponding PIC have not yet been investigated. Another issue, which is still debated, concerns the role of the circular viral DNA forms arising during the replication cycle of many retroviruses. Firstly, the so-called 1- or 2-LTR circles, which were initially thought to be markers of a recent infection and dead-end complexes, may in fact be stable structures. Furthermore, whereas these circular DNA molecules have been used as a marker for PIC nuclear translocation and integration, 2-LTR circles can be detected in the cytoplasm of MLV infected cells as early as 2 hours post-viral entry, in dividing or non-dividing cells. Thus, these different observations indicate that the exact nature and function of circular viral DNA must be reconsidered.
Therefore, although several factors have been shown to regulate nuclear import of retroviral genomes, in particular in non-dividing cells, future work will likely clarify the role of each of them and implicate other proteins in this stage of the replication cycle, as recently suggested in the case of HIV-1 CA.
Although the process of proviral integration has been intensively studied in in vitro assays in the presence of recombinant integrase, the molecular basis of in vivo integration of animal retroviruses remains poorly understood. This unique property of retroviruses maintains the genetic information life-long in the cell genome and constitutes a major advantage for retroviral vectors when gene correction must be continuous. Initially, integration of retroviral vectors into the host genome was assumed to be random and the chance of accidental disruption or deregulated expression of a host gene was considered to be extremely low. MLV-derived vectors were used in the first definitive cure of a genetic disease by gene therapy. Children with SCID-X1 syndrome recovered a functional immune system following administration of their own haematopoietic stem cells transduced ex vivo with an MLV vector carrying the γc chain cytokine receptor gene. Unfortunately, two of the ten children developed a leukaemia-like disorder due to the integration of the retroviral vector near the lmo2 oncogene, leading to clonal expansion of the corresponding transduced T cells [185, 186]. This represents the first description of insertional mutagenesis following a clinical trial of a murine retroviral vector in humans, raising the old question of the potential danger of such viruses, which are known to cause somatic and germline mutations that lead to cancers and inherited disorders in their natural hosts. Indeed, this property of murine leukemia viruses is also successfully used for the identification of essential cellular genes involved in tumour development, a technique called provirus tagging (for a review, see ).
Initial studies on retrovirus integration have demonstrated that proviral insertion generally occurs in a non sequence-specific fashion but may be influenced by the structure of the neighbouring chromatin . In this respect, MLV integration was shown to occur within DNaseI-hypersensitive chromatin regions, suggesting that actively transcribed genes are preferred targets for provirus insertion , while HIV-1 integration was never observed in centromeric alphoid repeats . Conversely, transcriptionally active regions are not favoured as sites of integration for ALV . Gaining a global picture of the integration pattern of a given retrovirus has now become possible, thanks to the complete sequencing of the human genome. Schröder et al. have mapped over 500 integration events of HIV-1 and of derived retroviral vectors following infection of a human T cell line, revealing that integration preferentially occurs in genes highly transcribed by the RNA PolII . This specificity may therefore favour efficient HIV-1 gene expression, maximizing virus propagation whilst being deleterious to host survival. Similarly, Wu et al. have mapped 903 different integration sites of MLV, revealing preferential integration into highly transcribed genes . MLV integration events distribute evenly upstream and downstream of the transcriptional start site of actively transcribed genes, +/- 1 kb from the CpG islands, whereas HIV-1 proviruses are found on the entire length of the transcriptional unit. Such regional preferences along the host genome, in the absence of sequence specificity, suggest that integration may be influenced by specific interaction occurring between host proteins and viral components or by specific chromatin architecture in these regions.
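The contrast drawn above between MLV (integration within about 1 kb of transcriptional start sites) and HIV-1 (integration along the whole transcriptional unit) can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the cited mapping studies: the `Gene` record, the `classify_site` function, the gene coordinates and the 1 kb default window are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene annotation (coordinates are illustrative only)."""
    name: str
    chrom: str
    start: int   # leftmost coordinate of the transcriptional unit
    end: int     # rightmost coordinate of the transcriptional unit
    strand: str  # "+" or "-"

    @property
    def tss(self) -> int:
        # The transcriptional start site is the 5' end of the unit,
        # which depends on the strand.
        return self.start if self.strand == "+" else self.end


def classify_site(chrom, pos, genes, window=1000):
    """Label an integration site as 'near_TSS', 'gene_body' or 'intergenic'.

    'near_TSS' takes priority: a site within `window` bp of any start site
    is reported as such even if it also lies inside a transcriptional unit.
    """
    label = "intergenic"
    for gene in genes:
        if gene.chrom != chrom:
            continue
        if abs(pos - gene.tss) <= window:
            return "near_TSS"
        if gene.start <= pos <= gene.end:
            label = "gene_body"
    return label


# Toy annotation: two made-up genes on one chromosome.
GENES = [
    Gene("geneA", "chr1", 10_000, 20_000, "+"),
    Gene("geneB", "chr1", 50_000, 60_000, "-"),
]
```

Under these assumptions, a site such as `classify_site("chr1", 9_500, GENES)` falls in the near-TSS class that the text associates with MLV, whereas `classify_site("chr1", 15_000, GENES)` falls in the gene body, the pattern described for HIV-1. Comparing the proportion of sites in each class with that of randomly drawn positions would be the natural next step.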
Several studies have suggested that the integrase is a key factor in determining the site of integration and, in this respect, it is interesting to note that this protein can dock to mitotic chromosomes in the absence of other viral proteins or of the viral genome [194–196]. IN, which is a member of the D,D(35)E transposase/IN superfamily of proteins, mediates integration of the viral DNA into the host genome. We know, for example, that the integrases of FIV, HIV and Visna virus display distinct integration site preferences when given an identical DNA target in vitro [198–200]. In the case of HIV-1, several cellular DNA-binding proteins have been reported to interact with the integrase and may therefore constitute good candidates for directing the PIC to its target site. The integrase interactor 1 (Ini1, also called hSNF5), a subunit of the SWI/SNF chromatin-remodeling complex, was initially isolated in a yeast two-hybrid screen for human proteins interacting with IN and was proposed to stimulate the in vitro DNA-joining activity of IN and to target the viral genome to active genes in an as yet undetermined manner. Likewise, HMG-I(Y), a non-histone chromosomal protein important for transcriptional control and chromosome architecture, and the barrier-to-autointegration factor (BAF), a cellular protein involved in the reorganization of post-mitotic nuclei, have been identified as partners of the HIV-1 IN. Both proteins appear to be required for efficient integration in vitro, but their respective roles in directing the PIC to precise sites of the host genome have not been evaluated.
Two other IN-binding partners have been isolated which seem to be critical for directing the PIC to the host chromatin. The first is the EED protein, which is encoded by the human homologue of the mouse embryonic ectoderm development (eed) gene and of the Drosophila esc gene, and which also interacts with the matrix protein of HIV-1 [203–205]. These genes belong to the widely conserved Polycomb group of genes, involved in the maintenance of the silent state of chromatin and in the reduction of DNA accessibility. An interaction between EED and the viral proteins MA and IN might not only direct the PIC to the host chromatin but also trigger transcriptional activation. The second is the lens epithelium-derived growth factor (LEDGF/p75), a protein implicated in the regulation of gene expression and in the cellular stress response, which was found to interact with the HIV-1 IN. Interestingly, this interaction is not essential for nuclear accumulation of the HIV-1 IN, but seems to be absolutely required to dock the PIC to the host chromatin ( and S. Emiliani, personal communication).
Although the molecular basis of site specificity is unclear for retroviruses, much more is known about other retrovirus-like elements known to preserve the integrity of the host genome during their replication. Retrotransposons have a gene arrangement similar to that of mammalian retroviruses, are also flanked by direct repeats (LTRs), use similar mechanisms to replicate and share strong reverse transcriptase homologies. However, they differ in at least two major respects. First, an extracellular phase of the life cycle is not generally observed in the case of retrotransposons, since most of them do not encode an envelope glycoprotein. More importantly, some retrotransposons are non-randomly distributed along the genome they colonize. This is evidenced, for example, by the clustering of retrotransposons in intergenic regions of maize or the association of some retroelements with heterochromatin and telomeres in Drosophila. The pressure on target site selection is even more extreme in the case of yeast retrotransposons, as these elements must integrate their DNA into a gene-rich, densely packed and, at times, haploid genome without disrupting essential host genes. This is the case for Ty1, a yeast copia-like element, which integrates within a tight window of 1 to 4 nucleotides upstream of RNA pol III-dependent promoter start sites without deleterious effects on host survival. Similarly, Ty5, another yeast retrotransposon, specifically inserts into regions of silent chromatin. Such site selection is driven by specific interactions between the integration machinery of the element, especially the integrase, and host proteins, allowing a balance between the fitness of the host and the ability of the retrotransposon to propagate and survive in the host genome (for reviews, see [208–210]). A similar mechanism may also account for site selection of animal retroviruses [211, 212].
Understanding the stepwise molecular interactions occurring between cell components and the PIC proteins responsible for guiding the viral genome to its integration site will be essential to fully understand the risk factors and benefits of different retroviral gene-therapy systems. Moreover, this knowledge is clearly indispensable for the development of new generations of safer engineered retroviral vectors with a chosen site specificity. For that purpose, comparing the specificity of different retroviral integrases and of other PIC components that may influence chromatin docking, and defining the protein domains involved in determining site selection, should make it possible to engineer the integrase without loss of in vivo function. The yeast model has unexpectedly, but significantly, improved our understanding of the integration process of animal retroviruses [211, 212]. It provides an excellent system in which to study both the mechanics of retrotransposon integration and the influence of host genes, which can affect distinct steps of the retrotransposon life cycle [213, 214]. Indeed, functional genomics screens for host factors that influence Ty retrotransposition reveal that several gene products, identified as host defence factors able to limit Ty activity, are conserved in other organisms. This model should provide a useful starting point for identifying host factors implicated in the restriction of pathogenic retroviruses.
While providing all the molecules, proteins and machinery required by viruses to achieve their replicative cycle, mammalian cells have developed specific defences to protect themselves against viral infection. Among the array of antiretroviral genes, some act by interfering with early steps of the retroviral cycle. However, at the same time, retroviruses have found strategies to avoid or counteract many of these host defence mechanisms. For example, the human apolipoprotein B mRNA-editing enzyme-catalytic polypeptide-like-3G (APOBEC3G), also known as CEM15, has recently been reported to be an endogenous inhibitor of HIV-1 replication [107, 110, 216]. This cellular protein is a DNA deaminase that is incorporated into virions during egress and subsequently exerts antiviral activity during reverse transcription by triggering G-to-A hypermutation in the nascent retroviral DNA. It has been shown that APOBEC3G can inhibit a broad range of retroviruses, including HIV, SIV, and MLV, as well as the Hepatitis B Virus (HBV), a pararetrovirus whose life cycle also involves a reverse-transcription step . HIV-1 Vif was demonstrated to counteract this antiviral protein by preventing the encapsidation of APOBEC3G into virions, either through inhibition of its expression and packaging [218, 219] or by promoting its degradation by the proteasome [108, 109, 220]. The hypermutation of reverse transcripts catalyzed by APOBEC3G may be directly lethal or may result in instability of the RTCs, consistent with the described phenotypes of Δ-Vif viruses [104, 105].
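The G-to-A hypermutation signature described above can be made concrete with a minimal Python sketch that compares a proviral plus-strand sequence with a reference and counts G-to-A changes separately from all other substitutions. The function name `count_g_to_a` and the toy sequences are hypothetical, and the sketch assumes that the two sequences have already been aligned to the same length.

```python
def count_g_to_a(reference: str, sequence: str) -> tuple[int, int]:
    """Return (G-to-A substitutions, all other substitutions).

    Both inputs are assumed to be pre-aligned plus-strand sequences of
    equal length; alignment columns containing a gap ('-') are skipped.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")

    g_to_a = 0
    other = 0
    for ref_base, seq_base in zip(reference.upper(), sequence.upper()):
        if ref_base == seq_base or "-" in (ref_base, seq_base):
            continue  # identical position or gap: not a substitution
        if ref_base == "G" and seq_base == "A":
            g_to_a += 1
        else:
            other += 1
    return g_to_a, other


# Toy example (made-up sequences): two G-to-A changes and one G-to-T change.
REFERENCE = "ATGGGCTAGG"
PROVIRUS = "ATAGACTAGT"
```

With these toy inputs, `count_g_to_a(REFERENCE, PROVIRUS)` returns `(2, 1)`. A strong excess of G-to-A changes over other substitutions across many reverse transcripts would be consistent with the deaminase activity described in the text, although real analyses would also need to account for sequencing errors and for the background mutation rate of reverse transcriptase.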
The search for host genes affecting the susceptibility of mice to infection by MLV has been particularly extensive, starting in the early 1970s with the description of a series of genes controlling responses to Friend virus infection, known as Fv1-Fv6 (for Friend Virus susceptibility genes 1 to 6). Since then, many other murine genes have been described that affect the sensitivity of mice to other strains of MLV. While many of these genes influence the immunological response, others act directly on virus replication (for a review, see ). Most of these latter genes interfere with viral entry by one of two distinct mechanisms. The first group of genes encodes variant forms of the receptor used by viruses, such as Slc7a1, an allelic variant of the ecotropic CAT1 receptor, or Svx, a polymorphism of the polytropic/xenotropic receptor [223, 224]. The second group of resistance factors blocks MLV entry through an interference mechanism. The best-characterized of these genes, Fv4, expresses high levels of an envelope glycoprotein closely related to that of the ecotropic MLVs, interfering with receptor binding of exogenous ecotropic viruses [225–227]. Another gene, called Rmcf, has been shown to act in a similar manner, interfering with the binding of polytropic mink cell-focus forming (MCF) MLVs [228, 229].
While the MLV capsid protein was rapidly suspected to represent the viral target of Fv1 restriction [234, 235], the restriction specificity has been shown to be mainly determined by a single amino acid at position 110 in CA [236, 237]. This latter finding and the fact that Fv1 seems to be a CA-like protein are consistent with a mechanism in which Fv1 would interfere with an early event of the MLV cycle by competing with the capsid of incoming virions. This is supported by the observation that Fv1 can be saturated by an excess of restricted virus or by the pre-treatment of cells with inactive virion particles, a phenomenon referred to as abrogation. However, the facts that (i) Fv1 is expressed at extremely low levels, (ii) it is completely unrelated to MLV CA and (iii) Gag proteins had never previously been implicated in viral interference have led to the suggestion that Fv1 may act via a more subtle mechanism. So far, this mechanism is still unknown, but it is believed to involve a direct interaction between Fv1 and CA [231, 239].
Interestingly, similar restriction activities have recently been described in non-murine cells, and have been shown to be due to Fv1-like factors present in these cells (for a review, see ). The first factor, called Ref1 (for Restriction factor 1), interferes with N-MLV and Equine Infectious Anemia Virus (EIAV) infection in cells from humans and from other primate and non-primate species [240–242]. This factor shows many similarities with its murine counterpart, and in particular with Fv1b, since it can be abrogated by an excess of N-MLV but not by B-MLV. Surprisingly, the same residue 110 in MLV CA that confers the specificity of inhibition by Fv1 is also responsible for Ref1 specificity. However, Ref1 has been found to act at a stage between entry and reverse transcription, whereas the Fv1 block is subsequent to reverse transcription. Interestingly, cyclophilin A (CypA), which is known to be associated with HIV-1 Gag in virions [112, 244] and to facilitate an early step of infection, has been shown to modulate the sensitivity of HIV-1 to restriction factors. In human cells, its association with the viral CA prevents CA from being the target of the Ref1 restriction factor, whereas in certain non-human primates, this association may be responsible for the restriction of HIV-1 by Ref1. Surprisingly, the incorporation of CypA into virions is not a prerequisite for the protection of HIV-1 against Ref1 antiviral activity, as the relevant CA-CypA interaction takes place in the target cells.
The second Fv1-like factor is expressed in certain non-human primates and, depending on the species, can inhibit the replication of various retroviruses, including N-MLV, HIV-1, HIV-2, SIV and EIAV [245–247]. Because it shares many characteristics with Fv1, this factor was called Lv1, for Lentivirus susceptibility 1. As for Fv1 and Ref1, the viral determinant of Lv1 restriction maps to the viral capsid and, like Ref1, Lv1 blocks infection before reverse transcription occurs. The relationship between Fv1, Ref1 and Lv1 remains to be investigated.
Table: Examples of cellular factors inhibiting early steps of the retroviral cycle.
Viruses affected | Step being affected
N-MLV or B-MLV | between RT and integration
 | between entry and RT
N-MLV, HIV-1, HIV-2, SIVmac, EIAV | between entry and RT
 | between RT and nuclear entry
All these results illustrate the striking ability of retroviruses to counteract the antiviral mechanisms developed by their hosts, either by direct use of a viral protein, or by hijacking a cellular factor, thus allowing early steps of the replicative cycle to proceed.
It is interesting to note that the FV accessory protein Bet seems to display an activity similar to that of the cellular restriction factors described above. This protein is translated from a multispliced mRNA transcribed from an internal promoter (IP) located between the env gene and the 3' LTR, which also encodes Tas, the transactivator of gene expression from both the 5' LTR and the IP. Bet is highly expressed in infected cells, where it localizes to both the cytoplasm and the nucleus. Interestingly, Bet has also been shown to be secreted by infected cells and to be internalised by surrounding naive cells, where it targets the nucleus through its C-terminal bipartite NLS. Finally, this protein is believed to be implicated in the establishment and/or maintenance of viral persistence in vivo. Indeed, a Tas-defective genome (ΔHFV) has been described to behave like a defective interfering virus and to interfere with the replication of wild-type viruses through the production of Bet. Furthermore, expression of Bet has been shown to interfere with an early stage of FV replication, between virus entry and integration. The capacity of Bet to prevent up-regulation of basal IP activity might also be a factor in its ability to block superinfection of cells. Although its role and mechanism of action are still unclear, these observations suggest that Bet may help FVs to control their own spread in order to persist in their host. This protein therefore represents an atypical inhibitor of early steps of the retrovirus replicative cycle.
The stepwise events allowing retroviruses to enter the target cell, to move within the cytoplasm, to penetrate into the nucleus and to integrate their genomes into host chromosomes are beginning to be unravelled, but many questions remain unanswered. This is particularly evident concerning the uncoating of incoming viruses, a complex process that involves cellular and viral proteins and takes place all along this early journey. It is interesting to note that, among the PIC proteins, the viral protease, which is critical for the late phase of infection, could also be involved in the uncoating process, as already described for certain retroviruses [258–260] and other viral families. Similarly, post-translational modifications of viral components, such as phosphorylation, ubiquitination and/or sumoylation, could also influence and regulate these early steps. The way in which retroviruses activate signalling cascades, which might also regulate the behaviour of the incoming viral components, is still unknown. However, it has already been demonstrated that HIV-1 virions hijack many cellular proteins with signalling properties, such as CypA or mitogen-activated protein kinase, two pivotal proteins known to be implicated in signalling pathways (for a review, see ). An apparent block in HIV-1 replication has been described in resting CD4+ T cells prior to the integration of the viral genome into host cell chromosomes: in this state, called preintegration latency, the virus awaits cell stimulation and a transition to productive infection. Several studies have demonstrated that resting CD4+ T cells isolated from the blood of HIV-1-infected individuals contain fully reverse-transcribed, unintegrated viral DNA, likely constituting a latent reservoir (for reviews, see [264, 265]), since these forms of DNA were shown to be transcriptionally active.
Therefore, it will also be important to precisely define the intracellular compartments in which these unintegrated viral structures localize and how they are maintained. The fact that 2-LTR junctions can be detected in the cytoplasm of MLV infected cells soon after viral entry, whereas these structures were believed to appear in the nucleus only if integration had occurred , may provide a clue to unravel this unknown mechanism.
Interestingly, the discovery of host gene products that can interfere with early steps of retroviral infection has strengthened the idea that the incoming virus is not simply an inert cage protecting the viral genome but rather interacts widely with cellular components. The identification of restriction factors and the characterization of their mode of action may lead to new approaches for blocking retroviral replication.
Understanding the precise interactions between cellular and viral partners occurring during the early steps of infection will certainly open new fields of research leading to the discovery of new antiretroviral drugs. Towards this goal, the study of retroelements from distinct organisms, such as retrotransposons in the yeast model, will help us to define conserved and non-conserved cellular mechanisms involved in the early steps of infection. This will also allow the development of safer therapeutic long-term expression vectors, targeting the transgene to specific regions of the host genome without deleterious effects [210, 211]. Finally, one can also assume that the study of early steps of infection will certainly contribute to a better understanding of principal cell functions.
We thank Jonathan Stoye and Laura Burleigh for critical review of the manuscript. S.N. is supported by the European Molecular Biology Organization (EMBO Long-term Fellowship ALTF 343-2001). A.S. is supported by Ensemble contre le SIDA/SIDACTION and ANRS.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.