HIV-1 gene expression: lessons from provirus and non-integrated DNA
© Wu 2004
Received: 21 May 2004
Accepted: 25 June 2004
Published: 25 June 2004
Replication of HIV-1 involves a series of obligatory steps such as reverse transcription of the viral RNA genome into double-stranded DNA, and subsequent integration of the DNA into human chromatin. Integration is an essential step for HIV-1 replication; yet the natural process of HIV-1 infection generates both integrated DNA and high levels of non-integrated DNA. Although proviral DNA is the template for productive viral replication, the non-integrated DNA has been suggested to be active for limited viral gene expression. In this review, the regulation of viral gene expression from proviral DNA will be summarized and issues relating to non-integrated DNA as a template for transcription will be discussed, as will the possible function of pre-integration transcription in the HIV-1 replication cycle.
Intracellular parasites such as viruses depend on cellular machinery to disseminate their genetic information. Different viruses have evolved different strategies to utilize the host machinery. The human immunodeficiency virus (HIV), the prototype of the lentiviral subfamily of retroviruses, is one of the ultimate players in exploiting host mechanisms. Its RNA genome is first reverse transcribed into a DNA template, integrated into host chromatin, then transcribed as a cellular gene. Only one virally encoded transcription factor, Tat (trans-activator of transcription), is directly involved in the process of viral gene transcription. While HIV gene expression depends heavily on cellular machinery, it also has some unique features. This review will cover aspects of the regulation of HIV gene expression, with a focus on transcription from non-integrated HIV DNA.
Retroviral integration is a specific process mediated by virally encoded integrases, which are biochemically both necessary and sufficient for integration. Although integration occurs randomly under in vitro assay conditions, in vivo it preferentially occurs in the upstream portion of active genes or near DNase-hypersensitive sites. In addition, not all regions of the genome are equally favored for integration. Recent analyses of 524 HIV DNA integration sites confirmed these early findings and indicated that integration favors active genes and genes that are activated after HIV infection. Regional hotspots for integration were also found on cellular chromosomes. However, these findings contrast with one previous study on an onco-retrovirus, which suggested that active transcription inhibits viral integration. The discrepancy may be due to a difference in integration site selection between HIV and onco-retroviruses. Integration into active genes could be an advantage for viral replication: presumably, the local chromatin environment of transcribing genes would favor proviral transcription.
Regulation of HIV gene expression involves a complex interplay between chromatin-associated proviral DNA, cellular transcription factors and the virally encoded trans-activator of transcription, Tat. The process of viral transcription can be divided into two distinct phases. The first phase occurs early in transcription and is mediated by direct interaction between cellular transcription factors and cis-acting elements located in the HIV promoter region. The second phase immediately follows the first, and relies on the accumulation of sufficient amounts of Tat from the first phase. Following integration, the HIV promoter is under the control of the local chromatin environment, which determines the basal transcriptional activity. Independent of the site of integration, the HIV 5' LTR is assembled into three unique nucleosomes: nuc-0, nuc-1 and nuc-2. Nuc-1 is positioned immediately downstream of the transcription start site [13, 14], and is rapidly disrupted upon transcriptional activation of the HIV-1 promoter. Interestingly, the region between nuc-0 and nuc-1 appears to remain nucleosome-free although it is large enough to accommodate an additional nucleosome. Multiple cellular transcription factors constantly bind to this region [16, 17], which can induce significant DNA bending. As a result, these factors may affect nucleosome assembly, either by directly competing with histones or by rendering the nucleosome-free region a disfavored site for nucleosome assembly. This nucleosome-free region is also where the LTR core promoter and enhancer are located. The viral core or basal promoter (nt -78 to -1) contains a TATAA box and three consensus Sp1 binding sites. The enhancer (nt -105 to -79) carries a duplication of the 10-bp NF-κB binding site. Regions upstream from the NF-κB sites also influence viral gene expression and are designated the modulatory region (nt -454 to -104). This region has been proposed to contain a negative regulatory element (NRE) [18, 19].
Multiple cellular factors such as NF-AT, USF, AP-1, c-Myb and COUP have been proposed to interact with the modulatory region. For a comprehensive list of cellular transcription factors interacting with the HIV-1 LTR promoter, please refer to a recent review by Pereira et al. Sequences near the RNA initiation site also contain regulatory elements such as the putative inducer of short transcripts (IST) [21, 22], the initiator, and the trans-activation response (TAR) element (nt +1 to +60), which interacts with Tat and plays an important role in Tat-mediated trans-activation.
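The LTR coordinates quoted above can be collected into a small lookup table. The following is only an illustrative sketch: the class name `LtrRegion` and the function `regions_at` are hypothetical helpers introduced here, and only the four regions for which the text gives explicit nucleotide coordinates are included. Positions are numbered relative to the transcription start site (+1), assuming the usual convention that there is no position 0.

```python
from typing import List, NamedTuple


class LtrRegion(NamedTuple):
    """One annotated stretch of the HIV-1 5' LTR (hypothetical helper type)."""
    name: str
    start: int  # first nucleotide, relative to the transcription start site (+1)
    end: int    # last nucleotide, inclusive


# Coordinates exactly as quoted in the text; nothing else is assumed.
LTR_REGIONS = [
    LtrRegion("modulatory region", -454, -104),
    LtrRegion("enhancer", -105, -79),
    LtrRegion("core promoter", -78, -1),
    LtrRegion("TAR", 1, 60),
]


def regions_at(position: int) -> List[str]:
    """Return the names of all quoted regions that contain the given position."""
    if position == 0:
        raise ValueError("position 0 does not exist in this numbering scheme")
    return [r.name for r in LTR_REGIONS if r.start <= position <= r.end]


if __name__ == "__main__":
    for pos in (-300, -90, -30, 30):
        print(pos, regions_at(pos))
```

Because the lookup returns every matching region, it also exposes a small overlap in the quoted boundaries: positions -105 and -104 fall within both the modulatory region and the enhancer as the coordinates are given.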
In the absence of Tat and cellular stimulation, the nucleosome-packed LTR is almost silent. Low levels of transcription are mediated by available cellular transcription factors. Efficient activation of the LTR promoter is largely driven by Tat, and is concomitant with an acetylation-dependent rearrangement of the nucleosome positioned at the viral transcription start site [12, 23–25]. Tat has been suggested to be involved in remodeling nucleosomes to relieve the transcriptional block imposed by chromatin. It has been shown that Tat associates with the p300/CBP and P/CAF histone acetyltransferases (HATs) both in vitro and within cells [26–28]. A similar association has also been seen for the Tax protein of HTLV-1. Interestingly, although Tat needs both p300 and P/CAF to activate the HIV LTR promoter, only the HAT domain of P/CAF is essential; whereas in HTLV-1, the Tax protein also requires both p300 and P/CAF, but it is the HAT domain of p300 that is required, demonstrating evolutionary similarities and divergences between the two human retroviruses. Other HATs such as Tip60 and hGCN5 have also been reported to interact with the HIV Tat protein. It is possible that these HATs become components of the protein complex during activation of viral transcription initiation. Tat may interact with HATs directly or via another cellular factor, and act on the LTR promoter. Additionally, Tat appears to be able to interact directly with some transcription factors such as Sp1 and TBP to promote transcription.
One unique feature of Tat-mediated trans-activation is the ability of Tat to interact with RNA rather than with DNA. This interaction occurs between Tat and a specific 59-residue stem-loop structure, TAR, on the RNA leader sequence. Interactions among Tat, TAR and cellular cofactors have been the subject of intense investigation. For comprehensive reviews of this subject, please refer to Rana and Jeang, Karn, and Garber et al. In general, the current model suggests that Tat causes a dramatic increase in transcriptional levels upon binding to TAR. This effect is due to stimulation of a specific protein kinase called TAK (Tat-associated kinase), which hyperphosphorylates the carboxyl-terminal domain (CTD) of the large subunit of RNA polymerase II (RNAP II), and leads to promoter clearance and processive elongation. Multiple kinases can phosphorylate the RNAP II CTD, and evidence suggests that CDK9 is the kinase component of TAK [37–40]. The cyclin component of TAK has also been identified: it is the CDK9-associated cyclin T1. Cyclin T1 does not interact directly with TAR, but forms a ternary complex with Tat and TAR. It should be noted that the above model was developed from a cell-free transcription system. Certain in vivo conditions, such as a chromatin-configured proviral template, may not be accounted for. As a matter of fact, the nucleosome-free LTR is a highly active promoter even in the absence of Tat in the cell-free system. Tat responsiveness in that system was achieved not by imposing physiological restrictions but by specific assay conditions. Nevertheless, data from these in vitro systems have provided invaluable insight into the regulation of HIV gene transcription at the basic molecular level.
Successful transcription leads to the generation of approximately 30 different viral transcripts from the provirus. All these transcripts are derived from a single full-length transcript by alternative splicing, which generates mRNAs with common 5' and 3' ends. The viral RNAs can be grouped into three classes: the multiply spliced mRNAs encoding early regulatory proteins such as Tat, Nef and Rev; the singly spliced mRNAs encoding Vpu, Vpr, Vif and Env; and the unspliced, full-length mRNA encoding the Gag-Pol polyprotein. HIV gene expression is also regulated at a second level by the nuclear export of intron-containing transcripts. This process is mediated by the virally encoded Rev protein (for a comprehensive review, please see ). Both singly spliced and unspliced viral RNAs are intron-containing transcripts and carry a secondary structure called the Rev responsive element (RRE) within the 3' end intron region. Like most pre-spliced transcripts in eukaryotic cells, intron-containing viral transcripts are retained in the nucleus by the interaction of splicing factors until they are spliced to completion or degraded. However, specific interaction between Rev and the RRE permits nuclear export of incompletely spliced viral transcripts in infected cells. The current model suggests that Rev binds directly to the RRE and multimerizes upon RRE binding. Rev multimerization stabilizes the formation of a complex between Rev, cellular exportin-1 (CRM-1) and the GTPase Ran. This complex targets the mRNA complex to the nuclear pore complex for export. After translocation to the cytoplasm, Ran-GTP is converted to Ran-GDP and dissociates, along with exportin-1, from the mRNA complex. Rev is also dissociated from the mRNA by an unknown mechanism and recycled back into the nucleus by cellular importin-β. Rev interacts with importin-β in the cytoplasm and dissociates from it in the nucleoplasm due to the action of Ran-GTP.
Several other host cofactors have also been implicated in the Rev/RRE nuclear export process. These include eIF-5A, Rip/Rab, B23 and p32 (for a review, see ). However, their distinct roles in Rev/RRE-mediated nuclear export still need to be defined.
The shuttling of Rev between cytoplasm and nucleus and its interaction with the RRE are fundamentally important in the regulation of HIV gene expression. It has been shown that Rev function is nonlinear with respect to the intracellular concentration of Rev in transfection-based assays. A threshold amount of Rev, albeit still undefined, would be required for multimerization and for Rev to exert its function in infected cells. The requirement for Rev multimerization separates HIV gene expression into an early, Rev-independent phase of regulatory gene expression and a late, Rev-dependent phase of structural protein synthesis. A sub-threshold level of Rev would restrict viral gene expression to the early phase and may drive viral infection into a state of latency.
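The early/late logic described above can be written out as a minimal sketch. It assumes a deliberately simplified all-or-nothing threshold: the boolean flag `rev_above_threshold` and the function `exported_products` are hypothetical names introduced for illustration, and the actual threshold amount of Rev is, as noted in the text, still undefined. The transcript classes and their products are those listed in the text.

```python
from typing import Dict, List

# The three transcript classes named in the text. "intron_containing" marks
# the classes that carry the RRE and therefore depend on Rev for nuclear export.
TRANSCRIPT_CLASSES: Dict[str, dict] = {
    "multiply spliced": {"products": ["Tat", "Nef", "Rev"], "intron_containing": False},
    "singly spliced": {"products": ["Vpu", "Vpr", "Vif", "Env"], "intron_containing": True},
    "unspliced": {"products": ["Gag-Pol"], "intron_containing": True},
}


def exported_products(rev_above_threshold: bool) -> List[str]:
    """List the products whose mRNAs can leave the nucleus.

    Simplifying assumption (not a measured model): fully spliced mRNAs are
    always exported, while intron-containing mRNAs are exported only when
    Rev exceeds its (undefined) functional threshold.
    """
    products: List[str] = []
    for info in TRANSCRIPT_CLASSES.values():
        if not info["intron_containing"] or rev_above_threshold:
            products.extend(info["products"])
    return products


if __name__ == "__main__":
    print("early phase:", exported_products(rev_above_threshold=False))
    print("late phase: ", exported_products(rev_above_threshold=True))
```

With Rev below the threshold, only the regulatory proteins Tat, Nef and Rev are available, which corresponds to the early phase; above the threshold, the products of singly spliced and unspliced transcripts are added.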
Accumulation of non-integrated viral DNA is a feature of HIV infection. It occurs both in vivo in infected T cells, lymphoid and brain tissues, and in cell culture conditions [46–49]. During the asymptomatic phase of HIV infection, levels of non-integrated HIV DNA can reach 99% of total viral DNA. Likewise, in the brains of patients with AIDS and dementia, non-integrated viral DNA was found to be more than 10-fold higher than integrated DNA. These findings suggested a common feature shared by both HIV and other retroviruses. As in other retroviral infections, the non-integrated HIV DNA exists in three forms: the 1-LTR circle, the 2-LTR circle and the linear DNA. The circular forms of retroviral DNA were first demonstrated by Varmus and Guntaka as closed circular DNA (form I) in duck cells infected with Avian Sarcoma Virus (ASV) [51, 52], and by Gianni in Moloney Leukemia Virus (MLV) infection. Form I circular DNA was later purified exclusively from the nucleus of ASV-infected quail tumor cells, and was shown, within 24 to 48 hours after infection, to constitute as much as 50% of the nuclear viral DNA and 20–25% of viral DNA in whole cells. These early observations prompted the use of DNA circles as a standard marker for nuclear targeting of the HIV preintegration complex [55, 56]. Shank et al. further demonstrated that the form I DNA of Rous Sarcoma Virus actually consists of at least two forms of circular viral DNA: a larger one of the same size as the linear DNA (2-LTR circle) and a smaller one with a 300 bp deletion at the end (1-LTR circle). In addition, the smaller circle (1-LTR circle) is present in great excess over the larger circle (2-LTR circle) in infected cells. These findings were corroborated by a similar study by Yoshimura and Weinberg in Murine Leukemia Virus.
The precursor to the closed circles is the linear DNA synthesized in the cytoplasm of infected cells. However, it is not clear how the linear DNA is converted into the circular forms in the nucleus. It is believed that 2-LTR circles are the result of a simple ligation of the linear DNA [60–63] or auto-integration of the linear DNA into itself [60, 62, 64, 65]. The ligation reaction would generate 2-LTR circles with an LTR-LTR junction (simple 2-LTR circles), whereas auto-integration of linear DNA would generate heterogeneous defective genomes, either a single circle with two non-adjacent LTRs or two half-genomic circles each with one LTR [62, 64]. These defective LTR circles were also shown to exist in MLV- and HIV-infected cells and to carry processed LTR junctions typical of virally mediated integration [60, 62, 65]. These defective circles can also be regenerated in vitro from purified linear viral DNA in extracts of virus-infected cells [62, 64], but not of uninfected cells, suggesting that their formation is catalyzed by the viral integrase. In contrast, both the non-defective 1-LTR circles and the simple 2-LTR circles can be regenerated from linear DNA in extracts of uninfected cells, indicating that cellular factors can mediate the formation of these circles independent of viral factors. Indeed, mutant cells lacking proteins of the non-homologous DNA end joining (NHEJ) pathway, such as Ku, ligase IV and XRCC4, did not generate 2-LTR circles during HIV-1 infection. The 1-LTR circles have been proposed to arise either from homologous recombination between the LTRs on the linear DNA [57, 61, 62] or from the process of reverse transcription, as demonstrated by in vitro reverse transcription in permeabilized virion particles [67–69]. The actual process of 1-LTR circle generation in vivo remains to be defined.
Influenced by the Campbell model for integration of lambda bacteriophage, it was originally thought that the circular forms were the precursors for integration [60, 71]. Direct evidence from a cell-free in vitro integration system and from others [73, 74] conclusively demonstrated that the linear DNA is the precursor for retroviral integration. The cytoplasmic extract from MLV-infected cells contains predominantly linear DNA, and mediates efficient integration of the viral DNA into purified target DNA, suggesting that the linear DNA can function directly as a substrate for integration. In HIV infection, the circles have also been shown to be associated with discrete nuclear complexes, rather than with the viral integration complex, indicating that they might be separated from the viral integrase following circularization by cellular factors. Pauza et al. have suggested that these non-integrating circles of HIV-1 are labile in the nucleus and have a half-life of less than 16 hours in proliferating T cells. Based on this notion, the 2-LTR circles have been used as a marker of active viral replication in HIV-1 infected patients [76–79]. However, recent studies on the metabolism of 2-LTR circles indicated that these circles are actually highly stable and decrease in concentration only as a function of dilution resulting from cell division [80, 81]. It remains to be resolved whether the metabolism of viral DNA circles varies with cell type.
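The difference between degradation and dilution can be illustrated with a short calculation. This is a sketch under idealized assumptions that are not taken from the cited studies: circles are neither replicated nor degraded, all cells divide synchronously, and circles are split evenly between daughter cells. The function names and the example numbers are hypothetical.

```python
def circles_per_cell(initial_per_cell: float, divisions: int) -> float:
    """Average circle copies per cell after a number of synchronous divisions.

    Idealized dilution-only model: each division halves the per-cell average
    because the circles are stable but are not replicated.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_per_cell / (2 ** divisions)


def total_circles(initial_per_cell: float, initial_cells: int, divisions: int) -> float:
    """Total circle copies in the whole culture under the same assumptions."""
    cells = initial_cells * (2 ** divisions)
    return circles_per_cell(initial_per_cell, divisions) * cells


if __name__ == "__main__":
    # Hypothetical starting point: 8 circles per cell in 1000 cells.
    for n in range(4):
        print(n, circles_per_cell(8.0, n), total_circles(8.0, 1000, n))
```

In this model the per-cell concentration falls by half with each division while the total number of circles in the culture stays constant, which is the pattern that would distinguish dilution from true decay of the circles.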
The notion that non-integrated HIV DNA could be active for viral antigen production came from early studies by Stevenson et al. [82, 83]. It was demonstrated that some integration-negative viruses were fully competent for HIV-1 core and envelope antigen production, generating wild-type levels of extracellular viral p24 antigen in two HTLV-transformed T cell lines, MT-4 and Mo-T. Wiskerchen and Muesing also created a panel of 42 HIV-1 integrase mutants and found that a subset of replication-defective mutants, with mutations in the catalytic residues, were capable of mediating transactivation of an indicator gene linked to the viral LTR promoter. These studies suggested that the Tat protein could be expressed from the non-integrated DNA [4, 5]. Preintegration transcription has also been shown to occur in HIV infection of resting CD4 T cells cultured in vitro [83, 84]. As early as one hour post infection, HIV-1 tat transcripts were readily detectable in the absence of integration. Spina et al. have also shown that the HIV nef transcript was detectable three days after infection of resting CD4 T cells. We further demonstrated that the nef transcript was generated from non-integrated DNA, and that the Nef protein in resting CD4 T cells plays an important role in enhancing T cell activity and promoting viral infection. In a kinetic study of HIV infection of metabolically active T cells, we concluded that transcription from non-integrated DNA is a normal, early step in HIV replication, and that non-integrated DNA has the full capacity to synthesize all classes of viral transcripts: the early, multiply spliced transcripts as well as the late, singly spliced and unspliced transcripts. However, only the early, multiply spliced transcripts encoding Nef, Tat and Rev were measurably translated. This restriction on protein expression was due to a lack of Rev function in the absence of integration.
Recently, others have further demonstrated that in non-dividing or growth-arrested cells, unintegrated lentiviral vector DNA can persist and sustain reporter gene expression at a level equivalent to that of wild-type vectors, supporting the possibility that this early transcriptional activity from non-integrated viral DNA could be highly significant in certain cells.
Given that non-integrated viral DNA can be transcribed in infected cells, it is important to know which forms, the linear DNA or the 1-LTR and 2-LTR circles, are active for transcription. Early attempts to address this question used transfection of different DNA forms into HeLa cells. Not surprisingly, all forms of transfected DNA carrying the LTR promoter were found to be active in transcription. However, the efficiency differed among the various DNA forms. It was shown that the transcriptional activity of the circular forms, especially the 2-LTR circles, was an order of magnitude lower than that of transfected proviral DNA carrying flanking cellular sequences. These data suggested that non-integrated DNA can potentially function as a template for viral gene expression. The transfection experiment is reminiscent of early attempts to study viral integration by transfection of purified DNA into cells. It may not reflect the actual situation in vivo in infected cells, especially considering possible complexes of non-integrated DNA with viral or cellular factors [55, 75]. Direct evidence suggesting 2-LTR circles as active templates came from studies by Wiskerchen and Muesing and Engelman et al. It was shown that integrase mutants with mutations in the catalytic domain are capable of mediating expression of a reporter gene linked to the LTR promoter, suggesting possible expression of the Tat protein from these mutants. In correlation with this Tat-mediated transactivation, cells infected with these mutants contain elevated levels of 2-LTR circles, suggesting that these circles could be templates. We have also investigated transcriptional activity from one of the non-integrating HIV-1 mutants, D116N, and compared it with that of the wild-type virus. We found similar levels of transcriptional activity at early time points for both viruses in the absence of integration, although the levels of 2-LTR circles were two orders of magnitude higher in D116N infection.
These data indicated that transcription from non-integrated DNA correlates with total viral DNA, rather than only with 2-LTR circles. It is likely that even if 2-LTR circles can be transcribed, they are not the only templates. Other DNA forms such as the linear DNA or 1-LTR circles may also function as templates. The 2-LTR circles are a minor fraction of viral DNA early on, prior to integration, constituting about 5% of total viral DNA in SupT1 cells infected with an HIV-based vector and 0.03% in CEM cells infected with wild-type HIV-1 at 12 hours post infection. It remains to be determined which form or forms of non-integrated DNA function as templates for transcription.
The viral products generated from non-integrated DNA, prior to integration, are Nef, Tat and Rev (Figure 1). There is still no direct evidence to suggest that any of these proteins has a direct role in either stabilizing viral DNA or promoting integration, although Nef has been shown to enhance viral DNA synthesis or prevent DNA oligonucleosomal fragmentation in apoptotic cells. Another aspect of Nef is its effect on the state of T cells rather than on the virus itself. Our study has shown that Nef, synthesized prior to integration, can modulate resting T cells and promote viral replication when an activation stimulus arrives. Tat has a similar property in promoting T cell activation. The Tat protein is required not only for the processivity of RNA elongation, but also for the modulation of cellular chromatin to activate transcription from the integrated provirus. From this point of view, it is tempting to hypothesize that the small amount of Tat initially synthesized prior to integration would function as an "initiator" to relieve possible chromatin restriction on the LTR promoter. In this way, Tat could turn on viral gene expression immediately following integration without relying on transcription and translation from the newly integrated provirus. The Tat protein synthesized could further activate the LTR through its association with TAR RNA and P-TEFb to increase processive transcription (Figure 2). Indeed, it has been shown that there is a marked difference between non-integrated DNA and integrated provirus in the requirements for activation of transcription. The Tat-associated histone acetyltransferase activity is preferentially important for transactivation of the integrated, but not the unintegrated, HIV-1 LTR, supporting a Tat-independent trans-activation for non-integrated DNA and a Tat-dependent trans-activation for the provirus [26, 29].
The Rev protein is required for the synthesis of late structural proteins from partially spliced or unspliced transcripts. It has been demonstrated that a threshold amount of Rev is required for the nuclear export of partially spliced or unspliced viral RNA. Interestingly, in the absence of integration, Rev is present at a low level, and is not functional to support the synthesis of late, structural proteins. Only early products from multiply spliced transcripts are synthesized prior to integration. It is reasonable to hypothesize that the restriction imposed by the lack of Rev function would be an advantage for the virus. When cellular restriction is imposed on integration, it would be important to synthesize early regulatory proteins such as Nef and Tat to modulate the cellular environment so that viral integration and replication can occur. Interestingly, simple retroviruses do not encode these accessory proteins, and lack the ability to infect non-mitotic cells. This suggests that pre-integration transcription may be a function most important to complex retroviruses; it would be a process evolved to provide direct control over functions that, in simple retroviruses, are provided by the host cells. This additional control may be important for breaking barriers imposed by host immune systems. It should be noted that the above hypothesis is based on the presence of multiple copies of viral DNA in a single infected cell. It is unknown, however, whether a transcribing DNA is still able to integrate when only a single viral DNA molecule is present in an infected cell.
The role of non-integrated DNA in the pathogenesis of HIV infection has not been clearly resolved. In addition to our demonstration of modulation of resting T cell activity by non-integrated DNA, one recent paper demonstrated a direct role of non-integrating HIV in inducing aberrant methylation in infected cells. In other retroviruses, non-integrated DNA has long been implicated in viral pathogenesis. Keshet and Temin were the first to suggest a correlation between cell killing and accumulation of non-integrated DNA in spleen necrosis virus infection. A similar association was seen in avian leukosis virus induced osteoporosis, feline leukemia virus induced feline AIDS, and equine infectious anemia virus infection of horses [97–99]. In HIV infection, accumulation of non-integrated viral DNA correlates with the extent of syncytia formation, but not with the occurrence of single-cell killing. Unintegrated circular viral DNA, particularly 2-LTR circles, in the peripheral mononuclear cells of infected patients appears to be associated with high levels of plasma HIV-1 RNA, rapid decline in CD4 count, and clinical progression of AIDS. Circular forms of unintegrated HIV DNA have also been linked with dementia and multinucleated giant cells in the brains of AIDS patients [48, 49]; in particular, the presence of 1-LTR circles was associated with multinucleated giant cells and a clinical diagnosis of dementia and cerebral atrophy. It is not clear, however, whether the mere presence of specific forms of unintegrated DNA triggers cellular processes or whether products expressed from the DNA cause the pathogenic effects.
The ability of non-integrated viral DNA to express viral genes has numerous applications. For example, a non-integrating lentiviral vector would be safer to use for therapy. It would not disrupt normal cellular genes or induce mutagenesis, problems that have been demonstrated in previous examples [9, 102]. Recently, it has been demonstrated that a non-integrating lentivirus can be modified into an efficient expression system by incorporation of a functional origin of DNA replication from other viruses. Additionally, non-integrating HIV mutants, with their restricted gene expression capacity in immune-functional cells such as antigen-presenting cells (unpublished data), could be potential vaccines to stimulate CTL responses [104, 105].
I thank Jon W. Marsh for his helpful discussions on numerous issues in this review and Kathryn Crockett for her editorial assistance.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.