Biochemical and virological analysis of the 18-residue C-terminal tail of HIV-1 integrase

Background: The 18-residue tail abutting the SH3 fold that comprises the heart of the C-terminal domain is the only part of HIV-1 integrase yet to be visualized by structural biology. To ascertain the role of the tail region in integrase function and HIV-1 replication, a set of deletion mutants that successively lacked three amino acids was constructed and analyzed in a variety of biochemical and virus infection assays. HIV-1/2 chimeras, which harbored the analogous 23-mer HIV-2 tail in place of the HIV-1 sequence, were also studied. Because integrase mutations can affect steps in the replication cycle other than integration, defective mutant viruses were tested for integrase protein content and reverse transcription in addition to integration. The F185K core domain mutation, which increases integrase protein solubility, was furthermore analyzed in a subset of mutants.
Results: Purified proteins were assessed for in vitro levels of 3' processing and DNA strand transfer activities, whereas HIV-1 infectivity was measured using luciferase reporter viruses. Deletion mutants lacking up to 9 amino acids (1-285, 1-282, and 1-279) displayed near wild-type activities in vitro and during infection. Further deletion yielded two viruses, HIV-1 1-276 and HIV-1 1-273, that displayed approximately 2- and 5-fold infectivity defects, respectively, due to reduced integrase function. Deletion mutant HIV-1 1-270 and the HIV-1/2 chimera were non-infectious and displayed approximately 3- to 4-fold reverse transcription defects in addition to severe integration defects. Removal of four additional residues, which encompassed the C-terminal β strand of the SH3 fold, further compromised integrase incorporation into virions and reverse transcription.
Conclusion: HIV-1 1-270, HIV-1 1-266, and the HIV-1/2 chimera were typed as class II mutant viruses due to their pleiotropic replication defects. We speculate that residues 271-273 might play a role in mediating the known integrase-reverse transcriptase interaction, as their removal unveiled a reverse transcription defect. The F185K mutation reduced the in vitro activities of 1-279 and 1-276 integrases by about 25%. Mutant proteins 1-279/F185K and 1-276/F185K are therefore highlighted as potential structural biology candidates, whereas further deleted tail variants (1-273/F185K or 1-270/F185K) are less desirable due to marginal or undetectable levels of integrase function.


Background
Retrovirus replication proceeds through a series of steps that initiate upon virus entry into a cell, followed by particle uncoating and reverse transcription. To support productive replication, the resulting double-stranded cDNA must be integrated into a cell chromosome. The integrated DNA provides an efficient transcriptional template for viral gene expression and ensures segregation of viral genetic material to daughter cells during division. Due to its essential nature, the integrase (IN) encoded by HIV-1 is an intensely studied antiviral drug target [1].
Integration can be divided into three enzyme-based steps, the first two of which are catalyzed by IN. In the initial 3' processing reaction, IN removes the terminal pGT-OH dinucleotides from the 3' ends of the blunt-ended HIV-1 reverse transcript, yielding the precursor ends for integration [2][3][4]. In the second step, DNA strand transfer, IN uses the 3'-oxygens to cut the chromosomal target DNA in a staggered fashion and at the same time joins the viral 3' ends to the resulting 5' phosphates [3]. The final step, repair of single-stranded gaps and joining of viral DNA 5' ends, is accomplished by cellular enzymes [5,6]. HIV-1 IN activities can be measured in vitro using oligonucleotide DNA substrates that mimic the ends of the reverse transcript and either Mg2+ or Mn2+ cofactor [7][8][9][10].
Insight into the mechanism of HIV-1 integration is somewhat hampered by a lack of relevant 3-dimensional information, as structures for the enzyme bound to its DNA substrates, or the free holoenzyme, have yet to be reported. Two-domain x-ray crystal structures comprising the N-terminal domain and catalytic core domain (NTD-CCD) [29][30][31] or the CCD and C-terminal domain (CCD-CTD) [32][33][34] have nevertheless been informative. Three NTD-CCD structures, containing HIV-1, HIV-2, or maedi-visna virus domains, have revealed a dimer-of-dimers architecture for the active IN tetramer [29,30] and the high affinity binding mode of the common lentiviral integration cofactor LEDGFp75 [31]. An SH3 fold comprised of five β strands makes up the heart of the CTD [35,36], and a comparison of HIV-1 [32], SIV [33], and Rous sarcoma virus [34] CCD-CTD structures reveals considerable flexibility in CTD positioning with respect to the different CCDs. Nevertheless, extended viral DNA binding surfaces were ascribed to each CCD-CTD structure. Although residues 271-288, herein referred to as the tail, were present in the two-domain HIV-1 construct, they were disordered and therefore unseen in the resulting crystal structure [32].
The roles of the C-terminal tail in IN function and HIV-1 replication are largely unexplored. The IN 1-270 deletion mutant that lacked the tail supported 10-50% of wild-type (WT) Mn2+-dependent 3' processing and DNA strand transfer activities, whereas the activities of IN 1-279 were largely unimpaired (50-100% of WT) [25]. HIV-1 carrying the substitution of Ala for Lys-273 grew like the WT in Jurkat T cells, ruling out an obvious role for this highly conserved tail residue in virus replication [37]. To learn more about the role of this region in IN catalysis and HIV-1 replication, successive three amino acid deletion mutants were constructed and analyzed in various enzymatic and virus infection assays. The somewhat larger 23-residue HIV-2 tail was moreover swapped for the HIV-1 sequence to assess the activities of tail chimera enzyme and virus. C-terminal deletion mutants that lack all or part of the tail could be useful structural biology candidates because the tail failed to adopt an ordered fold in previous crystal structures. Thus, one goal of this study was to evaluate the solubility-enhancing F185K CCD mutation [38] for its potential effects on the in vitro activities of tail deletion mutant enzymes.

Protein expression and purification
Escherichia coli strain PC2 [43] transformed with IN expression constructs was grown for 16 h at 30°C. The next day, bacteria subcultured at 1:30 in 600 ml LB containing 100 µg/ml ampicillin were grown at 30°C until an A600 of 0.6, at Louis, MO) per mg of protein for 3 h at room temperature, which left the heterologous LVPR sequence at each C-terminus. After removal of thrombin by incubation with Benzamidine beads (Novagen, Madison, WI), IN was concentrated using Centricon-10 Concentrators (Millipore, Billerica, MA) and dialyzed against buffer D for 4 h. Protein concentration was determined by spectrophotometry, and aliquots flash frozen in liquid N2 were stored at -80°C. Quantitative image analysis (Alpha Innotech FluorChem FC2, San Leandro, CA) of Coomassie-stained gels revealed that each IN preparation was minimally 90% pure.
Recombinant LEDGFp75 expressed in bacteria was purified as previously described [44]. LEDGFp75 concentrations were determined using the Bio-Rad protein assay kit (Hercules, CA). Exonuclease III was from New England Biolabs (Beverly, MA).
Anti-IN monoclonal antibody 8G4 [45] was purified from hybridoma cell supernatant using protein G sepharose (GE Healthcare, Piscataway, NJ) following the manufacturer's recommendations. Cell supernatant (500 ml) was loaded onto 1 ml of protein G beads, which were subsequently washed with phosphate-buffered saline. Antibody eluted with 20 mM glycine-HCl, pH 2.8, was immediately neutralized by addition of 1 M Tris-HCl, pH 8.5. Pooled fractions were concentrated by ultrafiltration, and the resulting antibody concentration was determined by spectrophotometry.
LEDGFp75-dependent concerted integration activity was assayed essentially as previously described [31]. A preprocessed 32 bp U5 end was prepared by annealing
HeLa-T4 cells [49] were grown in DMEM-10% FBS containing 100 IU/ml penicillin and 100 µg/ml streptomycin. For infectivity measurements, cells plated at 75,000 cells/well of 24-well tissue culture plates 24 h prior to infection were incubated in duplicate with 10^6 RT-cpm of virus for 17 h, after which cells washed with phosphate-buffered saline were replenished with fresh media. At 46 h post-infection, cells were collected, washed, and lysed using 75 µl passive lysis buffer as recommended by the manufacturer (Promega Corp., Madison, WI). Luciferase activities (20 µl), determined in duplicate for each infection, were normalized to total levels of cellular protein as previously described [42]. For quantitative (Q)-PCR assays, 900,000 cells were plated per 10 cm dish the day before infection. Cells were infected with 2.3 × 10^7 RT-cpm of TURBO DNase-treated [42] native or heat-inactivated (65°C for 30 min) virus. 8G4 hybridoma cells were grown in DMEM containing 10% ultra low IgG FBS (Invitrogen Corporation) with penicillin and streptomycin.
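The normalization of luciferase activity to total cellular protein, and its expression relative to WT, can be illustrated with a minimal sketch. The function names and all numerical readings below are hypothetical and serve only to show the arithmetic; they are not taken from the study.

```python
def normalized_luciferase(rlu, protein_ug):
    """Luciferase activity (relative light units) per microgram of total protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return rlu / protein_ug


def percent_of_wt(mutant_values, wt_values):
    """Mean normalized mutant activity expressed as a percentage of the WT mean."""
    mutant_mean = sum(mutant_values) / len(mutant_values)
    wt_mean = sum(wt_values) / len(wt_values)
    return 100.0 * mutant_mean / wt_mean


# Hypothetical duplicate infections: (relative light units, micrograms of protein)
wt = [normalized_luciferase(200000, 10.0), normalized_luciferase(180000, 9.0)]
mutant = [normalized_luciferase(50000, 10.0), normalized_luciferase(55000, 11.0)]

print(f"mutant infectivity: {percent_of_wt(mutant, wt):.1f}% of WT")
```

Dividing by protein content before averaging keeps well-to-well differences in cell number from being read as differences in infectivity.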

Q-PCR assays for reverse transcription and integration
Total cellular DNA was isolated at 7 or 24 h post-infection using the QIAamp DNA mini kit (QIAGEN). Late reverse transcription (LRT) products were detected using primers and Taqman probe as previously described [50,51]. Two-long terminal repeat (2-LTR) containing circles were detected at 24 h post-infection using primers MH535/536 [50] and SYBR green (QIAGEN). Integration was measured at 24 h using a modified nested HIV-1 R-Alu format based on reference [52]. DNA (100 ng) was amplified using the phage lambda T-R chimera primer AE3014 [53] and Alu-specific AE1066 (5'-TCCCAGCTACTCGGGAGGCTGAGG) with rTth DNA polymerase XL as recommended by the manufacturer (Applied Biosystems Inc, Foster City, CA). Samples (1 µl) were then analyzed by Q-PCR using SYBR green with primers AE989 and AE990 [51]. DNA generated from WT-infected cells was endpoint diluted in DNA prepared from uninfected cells to generate the integration standard curve. LRT, 2-LTR, and Alu-integration Q-PCR values obtained from samples prepared using heat-inactivated virus were subtracted from those generated using native virus.
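A dilution-series standard curve and the heat-inactivated background subtraction can be sketched as follows. This assumes a conventional linear fit of threshold cycle (Ct) against log10 of relative template amount, which the text does not specify; the function names, the Ct values, and the choice to floor negative results at zero are all illustrative assumptions.

```python
import math


def fit_standard_curve(relative_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in relative_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a relative template amount."""
    return 10 ** ((ct - intercept) / slope)


def background_subtracted(native, heat_inactivated):
    """Subtract the heat-inactivated control; flooring at zero is an assumption."""
    return max(native - heat_inactivated, 0.0)


# Hypothetical 10-fold dilutions of WT-infected DNA in uninfected-cell DNA
dilutions = [1.0, 0.1, 0.01, 0.001]
cts = [20.0, 23.32, 26.64, 29.96]
slope, intercept = fit_standard_curve(dilutions, cts)

native = amount_from_ct(23.32, slope, intercept)   # sample from native virus
control = amount_from_ct(29.96, slope, intercept)  # heat-inactivated control
print(f"background-corrected signal: {background_subtracted(native, control):.3f}")
```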

Experimental strategy
Little is known about the role of the HIV-1 IN C-terminal tail (residues 271-288, Figure 1) in integration. This region of the protein, which overlaps the 5' end of the vif reading frame, is fairly well conserved among different HIV-1 isolates. Some clade C sequences harbor Ala in place of Asp-278 and numerous clades as well as SIVcpz carry Gly at position 283 (Figure 1); the remaining residues by contrast show little or no sequence variation [54]. To ascertain the role of the tail in IN function, six nested deletion mutants lacking 3, 6, 9, 12, 15, or 18 amino acids from the C-terminus were constructed in the pKBIN6Hthr bacterial expression construct [39] and luciferase-based pNLX.Luc(R-) viral vector [42] (Figure 1). The CCD F185K mutation, which dramatically increases the solubility of the HIV-1 protein [38], was tested in some constructs to assess its potential effects on IN activities in vitro. The 1-266 deletion mutant, which lacked the C-terminal 22 residues and hence the fifth β strand of the CTD SH3 fold in addition to the tail (Figure 1) [35,36], was used as a loss-of-function control [55]. Finally, the 23-residue HIV-2 tail (underlined in Figure 1) was swapped for the corresponding HIV-1 sequence to test the functionality of this marginally related sequence substitution. Because the viral changes necessarily altered the overlapping vif sequence, these constructs incorporated stop codons downstream of the IN region within the vif frame to negate synthesis of altered Vif proteins. Viruses were constructed in 293T cells, which lack APOBEC3G and thus do not require functional Vif to yield infectious particles [56].
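The nested deletion series follows directly from the tail boundaries given above (residues 271-288, truncated in steps of three). A minimal sketch that enumerates the construct names is shown below; the function and constant names are illustrative only.

```python
FULL_LENGTH = 288  # last residue of HIV-1 IN
TAIL_START = 271   # first residue of the C-terminal tail
STEP = 3           # residues removed per successive deletion


def nested_deletions(full_length=FULL_LENGTH, tail_start=TAIL_START, step=STEP):
    """Return (construct name, residues removed) for each tail truncation."""
    constructs = []
    last_residue = full_length - step
    # Stop once the entire tail (through residue tail_start) has been removed.
    while last_residue >= tail_start - 1:
        constructs.append((f"1-{last_residue}", full_length - last_residue))
        last_residue -= step
    return constructs


for name, removed in nested_deletions():
    print(f"IN {name}: lacks {removed} C-terminal residues")
```

The 1-266 loss-of-function control falls outside this series because it removes 22 residues, extending past the tail into the SH3 fold.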

The C-terminal tail and IN enzymatic activities
Recombinant proteins were engineered to contain C-terminal hexahistidine tags to facilitate purification. Though this might appear counterintuitive given the C-terminal focus of the study, it was necessary to obtain relatively pure preparations. The tail region is hypersensitive to proteolysis during expression in E. coli [57], and preliminary experiments with N-terminally tagged proteins yielded heterogeneous populations eluted from Ni-NTA beads whose purities were not substantially improved upon by subsequent ion exchange or size exclusion chromatography (data not shown). The C-terminal tag obviated this problem, as proteolyzed variants failed to bind Ni-NTA beads. Indeed, quantitative image analysis of purified WT and mutant proteins revealed near homogeneous preparations (Figure 2A).
IN activities were measured using three different assay designs, each of which incorporated an ~30 bp DNA mimic of the viral U5 end (Figure 2B-D). Overall levels of IN 3' processing and DNA strand transfer activities were determined in two separate assays using differentially labeled 30 bp substrates (Figure 2B and 2C). Under these conditions, the majority of DNA strand transfer reaction products result from the insertion of a single oligonucleotide end into one strand of a second target DNA molecule [8]. By contrast, integration in cells proceeds via the concerted insertion of viral U3 and U5 DNA ends into opposing strands of chromosomal DNA. Reactions that contain relatively low concentrations of IN protein [58], relatively long viral DNA substrates [59], or relatively high concentrations of oligonucleotide substrate in the presence of LEDGFp75 [31] support efficient concerted HIV-1 integration. Here, LEDGFp75 was used in a third assay format (Figure 2D) to monitor the concerted integration activities of IN mutant proteins. His6-tags were removed from purified IN proteins by thrombin cleavage prior to enzyme assays, yielding the remnant LVPR C-terminal sequence. Experiments conducted with a subset of proteins prior to cleavage (WT, 1-279, 1-273, 1-270, 1-266, and HIV-1/2) revealed similar levels of 3' processing activities relative to WT, indicating that the remnant sequence did not significantly influence mutant enzyme activities (data not shown).

To follow the course of the 3' processing reaction, oligonucleotide substrate DNA was labeled at the inter-nucleotide linkage of the 3'-terminal GT (Figure 2B); IN-mediated hydrolysis liberates pGT-OH, which is readily distinguished from the 30 bp substrate following electrophoresis on high percentage DNA sequencing gels [3,4] (Figure 3A, lanes 2 and 3; results quantified in panel B).
Exonuclease III-mediated hydrolysis by contrast yielded free pT-OH (Figure 3A, lanes 1 and 17). All IN preparations were essentially devoid of contaminating exonuclease activity (Figure 3A), reflecting the relatively high degrees of protein purity (Figure 2A). IN D64N and IN 1-266, which contained the substitution of Asn for active site residue Asp-64 [14] and lacked part of the CTD SH3 fold, respectively, were predictably inactive (Figure 3A, lanes 15 and 16). Figure 3A, lane 20; Figure 3B). The F185K solubility mutation marginally impacted activity, generally yielding 20-25% reductions when compared to the same protein lacking the CCD change (Figure 3B).
Figure 2 (legend, partial). In the presence of LEDGFp75, some donor DNA is integrated into only one strand of the target to yield a tagged, nicked circle half-site reaction product. Concerted integration across the major groove by contrast yields a linearized product whose length exceeds that of the starting circle by twice the length of the viral donor. For panels B-D, thin and bold lines represent viral donor and target DNAs, respectively. *, positions of 32P label (panels B and C).
The preprocessed DNA strand transfer substrate was labeled at the 5' end of the strand that becomes joined to target DNA (Figure 2C and 4A). Relative levels of IN mutant DNA strand transfer activities in large part mirrored 3' processing activities, with some subtle differences noted (compare Figure 4B to Figure 3B). Notably, IN 1-270 lacked detectable DNA strand transfer activity under these assay conditions (Figure 4B). Mn2+ can support more robust IN activity than Mg2+ [9,60], which may have contributed to the previously reported residual level of IN 1-270 DNA strand transfer activity [25]. In contrast to IN 1-270, IN HIV-1/2 DNA strand transfer activity was increased relative to its level of 3' processing activity (Figure 4B and 3B).
Supercoiled pGEM-3 plasmid DNA was incorporated into the reaction mixture to help identify concerted integration reaction products (Figure 2D and 5A). Integration of only one donor DNA end into one plasmid DNA strand yields a tagged circle whose mobility through agarose matches that of starting relaxed circular plasmid (Figure 5A). Pairwise integration of two oligonucleotides by contrast yields a linearized product whose size is slightly larger than linear plasmid (Figure 2D). IN DNA strand transfer activity was barely detectable in the absence of LEDGFp75, yielding slight increases in the nicked or open circular plasmid population (Figure 5A, compare lanes 3 and 27 to lanes 2 and 26, respectively) [31].
LEDGFp75 greatly stimulated IN activity such that the supercoiled target DNA was largely consumed, yielding a mixture of half-site and concerted integration products (Figure 5A, lanes 4 and 28). IN mutant product formation was quantified to reflect overall levels (half-site plus concerted, Figure 5B Figure 5A, lane 30 and Figure 5B) in the absence of detectable concerted integration activity (Figure 5C). Taken together, our data indicate that the C-terminal tail does not play a specific role in concerted DNA integration, though the substitution of a foreign sequence for the HIV-1 tail can uncouple pairwise from single end integration activity. Though others noted that the F185K substitution ablated Mg2+-dependent integration of preprocessed oligonucleotide donor DNA into heterologous target DNA [61], our reaction conditions failed to reveal an effect of the solubilizing mutation on full-length IN activity in the presence of LEDGFp75 (Figure 5A, lane 6; panels B and C). We furthermore conclude that the C-terminal 9 amino acids of HIV-1 IN can be removed without dramatically affecting Mg2+-based single end or concerted DNA integration activities (Figures 3, 4, 5). We highlight the 1-279/F185K and 1-276/F185K derivatives as potential candidates for structural biology studies despite the approximate 20-25% reductions in IN 1-279 and IN 1-276 activities brought on by the F185K change. We would by contrast advise against extensive analysis of tailless IN 1-270, due to its lack of detectable DNA strand transfer activity under these assay conditions (Figures 4 and 5).

Characterization of IN mutant viruses
To assess HIV-1 infectivity, HeLa-T4 cells were infected with normalized levels of single-round viruses that carry the luciferase reporter gene in place of nef. Two days post-infection, cells were harvested and resulting luciferase activities were normalized to the levels of total protein in the different cell extracts [42,47]. Deletion of up to 9 amino acids from the IN C-terminus failed to affect HIV-1 infectivity (Figure 6). IN mutants HIV-1 1-276 and HIV-1 1-273 supported about 50% and 20% of the level of WT infection, respectively, whereas HIV-1 1-270, HIV-1 1-266, and the HIV-1/2 tail chimera were non-infectious (Figure 6).
IN mutations can affect multiple steps in the HIV-1 replication cycle, including particle release from virus-producing cells and/or reverse transcription during the subsequent round of infection (reviewed in ref. [62]). Viruses specifically blocked at integration are distinguished as class I, whereas class II mutants display additional stage defects. To assess potential effects on virus particle release, RT content in HeLa cell supernatants at 2 days post-transfection was normalized to levels of cell-associated luciferase activity. Normalized levels of mutant virus release did not significantly differ from the WT under this assay condition (data not shown). Defective mutant viruses (HIV-1 1-266, HIV-1 1-270, HIV-1 1-273, HIV-1 1-276, and HIV-1/2; Figure 6) produced from transfected 293T cells were analyzed by western blotting to assess levels of virion-incorporated IN protein. Monoclonal antibody 8G4, which recognizes discontinuous epitopes in the NTD and CCD [45], was utilized to avoid potential complications from the CTD mutations. Accordingly, 8G4 effectively recognized the different forms of recombinant IN protein (Figure 7 [63], which occurs via the CTD [64,65]. An RT binding interface was recently mapped to β strands 2-4 of the SH3 fold [66] and though residues 271-273 abut β5 (Figure 1), it is not unreasonable to suspect the disordered tail could affect RT binding. Alternatively, a number of NTD and CCD mutations in addition to CTD changes can impair DNA synthesis (see [62] for review), indicating that the C-terminal tail changes might perturb reverse transcription via global effects on IN and/or the preintegration complex.
HIV-1 1-276 and HIV-1 1-273 supported about 40% and 20% of WT integration, respectively (Figure 8C), indicating that their partial infectivities (Figure 6) were due to specific integration defects attributable to the intrinsic activities of the deletion mutant enzymes (Figures 3, 4, 5). Consistent with their non-infectious phenotypes and the inability of the corresponding recombinant IN proteins to catalyze concerted integration, neither HIV-1 1-270 nor the HIV-1/2 chimera supported a detectable level of integration during infection (Figure 8C). As both of these viruses supported the formation of detectable 2-LTR circles (Figure 8B), we group them as class II defective IN mutants that display marginal (3- to 4-fold) reverse transcription defects in addition to prominent integration defects. HIV-1 1-266 was a more severe class II mutant virus, harboring a significant reverse transcription as well as integration defect.
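The class I versus class II distinction used here can be expressed as a simple decision rule. The sketch below is illustrative only: the 50% cutoff, the function name, and the reverse transcription values assigned to the example viruses are assumptions chosen for demonstration, not thresholds or measurements from this study.

```python
def classify_in_mutant(integration_pct, reverse_transcription_pct,
                       in_incorporation_pct=100.0, defect_cutoff_pct=50.0):
    """Classify an IN mutant virus from activities given as percent of WT.

    class I : defect confined to integration
    class II: defect at another stage (here, reverse transcription or IN
              incorporation into virions), with or without an integration defect
    The cutoff is a hypothetical value used only for illustration.
    """
    integration_defect = integration_pct < defect_cutoff_pct
    other_defect = (reverse_transcription_pct < defect_cutoff_pct
                    or in_incorporation_pct < defect_cutoff_pct)
    if other_defect:
        return "class II"
    if integration_defect:
        return "class I"
    return "WT-like"


# Integration levels follow the text; reverse transcription values are
# hypothetical placeholders (about 30% of WT approximates a 3- to 4-fold defect).
examples = {
    "HIV-1 1-273": (20.0, 100.0),
    "HIV-1 1-270": (0.0, 30.0),
}
for virus, (integration, reverse_transcription) in examples.items():
    print(virus, "->", classify_in_mutant(integration, reverse_transcription))
```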

Conclusion
The results of this study revealed that nine amino acids can be removed from the HIV-1 IN C-terminus without significantly affecting the activity of the enzyme or infectivity of the virus. Additional removal of up to six amino acids reduced infectivity by up to 80%, yielding viruses that were specifically defective for integration due to the compromised activities of the associated IN 1-276 and IN 1-273 enzymes. Heuer and Brown [67] reported that residues 271-288 crosslink to viral and target DNA sequences within junctional disintegration substrates. We would therefore surmise that tail residues 271-279 interact with substrate DNA during integration. HIV-1 1-270 was non-infectious and harbored an approximate fourfold reverse transcription defect. This suggests that IN residues 271, 272, and 273 might impact its physical association with RT. HIV-1 1-266, which lacked the fifth β strand of the SH3 fold, failed to incorporate significant levels of IN protein and was in large part defective for reverse transcription. Thus, an intact SH3 fold apparently contributes to Gag-Pol packaging and subsequent viral DNA synthesis. Our results moreover highlight partially tailed variants 1-279/F185K and 1-276/F185K as viable candidates for structural biology studies, as they retained >20% of IN enzymatic activities yet lacked at least half of the disordered region.