Genotypic prediction of HIV-1 subtype D tropism

Background: HIV-1 subtype D infections have been associated with rapid disease progression and phenotypic assays have shown that CXCR4-using viruses are very prevalent. Recent studies indicate that the genotypic algorithms used routinely to assess HIV-1 tropism may lack accuracy for non-B subtypes. Little is known about the genotypic determinants of HIV-1 subtype D tropism.
Results: We determined the HIV-1 coreceptor usage for 32 patients infected with subtype D by both a recombinant virus phenotypic entry assay and V3-loop sequencing to determine the correlation between them. The sensitivity of the Geno2pheno10 genotypic algorithm was 75% and that of the combined 11/25 and net charge rule was 100% for predicting subtype D CXCR4 usage, but their specificities were poor (54% and 68%). We have identified subtype D determinants in the V3 region associated with CXCR4 use and built a new simple genotypic rule for optimizing the genotypic prediction of HIV-1 subtype D tropism. We validated this algorithm using an independent GenBank data set of 67 subtype D V3 sequences of viruses of known phenotype. The subtype D genotypic algorithm was 68% sensitive and 95% specific for predicting X4 viruses in this data set, approaching the performance of genotypic prediction of HIV-1 subtype B tropism.
Conclusion: The genotypic determinants of HIV-1 subtype D coreceptor usage are slightly different from those for subtype B viruses. Genotypic predictions based on a subtype D-specific algorithm appear to be preferable for characterizing coreceptor usage in epidemiological and pathophysiological studies.


Background
Human immunodeficiency virus type 1 (HIV-1) enters CD4-expressing cells using one or both of the chemokine receptors CCR5 and CXCR4 [1]. The receptor(s) used by HIV-1 must be identified before a patient is treated with CCR5 antagonists, as these drugs can only be used against viruses that use CCR5 exclusively (R5 viruses) [2]. Recombinant virus phenotypic entry assays have been widely used to determine HIV-1 tropism [3][4][5], but genotypic methods based on the V3 sequence could be easier to implement. Several studies indicate that the V3 genotype, combined with bioinformatic algorithms, accurately predicts the phenotype of HIV-1 coreceptor usage for subtype B viruses [6][7][8][9][10]. However, the V3-based genotypic algorithms may be unsuitable for predicting the tropism of non-B viruses because they were built using genotype-phenotype correlation data for subtype B viruses [9]. These algorithms can also perform differently from one another, as was reported for HIV-1 subtype B [6,10].
This study evaluates the performance of the genotypic algorithms built for subtype B viruses for predicting HIV-1 subtype D tropism. We determined subtype D coreceptor usage with both genotypic and phenotypic assays. The poor concordance between them led us to look for genotypic criteria that could be used to predict the coreceptor usage of subtype D viruses and to define a new genotypic tool for this particular subtype. Lastly, we checked the subtype D genotypic tool against a GenBank data set of subtype D viruses for which both the V3 sequence and the entry phenotype were known.

Study subjects and samples
We studied 32 individuals infected with HIV-1 subtype D recruited at the Department of Infectious Diseases of Toulouse University Hospital, France and at the Department of Virology of Necker-Enfants Malades Hospital, Paris, France. The median age of the patients was 42 years and 46% were men. The median HIV-1 virus load was 4.91 log10 copies/ml (IQR [4.1-5.16]). The median CD4 cell count was 355 cells/mm3 (IQR [208. ) and the percentage of CD4 cells was 17% (IQR [10-19.5]). All viruses were identified as HIV-1 subtype D by analysis of the env sequence using the NCBI genotyping tool (http://www.ncbi.nlm.nih.gov/projects/genotyping/formpage.cgi). We confirmed that these viruses belonged to subtype D by neighbor-joining phylogenetic analysis of the sequences studied here, together with HIV-1 subtype reference sequences from the Los Alamos National Laboratory (http://www.hiv.lanl.gov/content/index).

GenBank data set of HIV-1 subtype D viruses
The V3 sequences of HIV-1 subtype D viruses whose entry phenotype was known were selected from the GenBank database. We selected sequences resulting from bulk sequencing or one sequence per patient in the case of clonal analysis. The entry phenotype of the 67 subtype D viruses had been determined with the MT2 assay or with the Trofile® assay (Monogram Biosciences).

Phenotypic characterization of HIV-1 coreceptor usage
We determined the HIV-1 tropism with the TTT phenotypic assay [3]. Briefly, a fragment encompassing the gp120 and the ectodomain of gp41 was amplified by RT-PCR using HIV-1 RNA isolated from the plasma or by PCR from HIV-1 DNA taken from PBMCs. The PCR products then underwent nested PCR. Two amplifications were performed in parallel on aliquots of each sample; the amplified products were then pooled to prevent sampling bias of the virus population.
The phenotype of HIV-1 coreceptor usage was determined using a recombinant virus entry assay with the pNL43-Δenv-Luc2 vector. 293T cells were co-transfected with NheI-linearized pNL43-Δenv-Luc2 vector DNA and the product of the nested PCR obtained from the challenged HIV-1-containing sample. The chimeric recombinant virus particles released into the supernatant were used to infect U87 indicator cells bearing CD4 and either CCR5 or CXCR4. Virus entry was assessed by measuring the luciferase activity in lysed cells (as relative light units; RLU). Minor X4 variants were detected when they accounted for 0.5% or more of the total population.

Genotypic prediction of HIV-1 coreceptor usage
The V3 region was directly sequenced from bulk env PCR products in both directions by the dideoxy chain-termination method (BigDye Terminator v3.1; Applied Biosystems, Courtaboeuf, France) on an ABI 3130 DNA sequencer. The two primer pairs used for sequencing have been described [10]. Results were analyzed with Sequencher (Genecodes, Ann Arbor, MI) by an operator blinded to the phenotype. Minority species were detected when the automated sequencer electropherogram showed a second base peak. Multiple alignments were performed with CLUSTALW 1.83, and sequence alignments were manually edited with BioEdit software. Phylogenetic analyses excluded any possibility of sample contamination (data not shown).
We used a combination of criteria from the 11/25 and net charge rules to predict HIV-1 tropism from the V3 genotype [10]. One of the following criteria is required for predicting CXCR4 coreceptor usage: (i) 11 R/K and/or 25 K in V3; (ii) 25 R in V3 and a net charge of ≥ +5; (iii) a net charge of ≥ +6. The V3 net charge was calculated by subtracting the number of negatively charged amino acids [D and E] from the number of positively charged ones [K and R]. All possible permutations were assessed when amino acid mixtures were found at some codons of V3. The combination resulting in the highest net charge was used to predict the tropism. We also used the geno2pheno tool (with a false positive rate of 10%) to predict HIV-1 coreceptor usage. Geno2pheno is available at http://coreceptor.bioinf.mpi-sb.mpg.de/cgibin/coreceptor.pl (September 2010).
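The combined rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes an ungapped 35-residue V3 loop so that positions 11 and 25 map directly to string indices, it represents each amino acid mixture as a string of alternative residues at that position, and it breaks ties between equally charged combinations by taking the first one enumerated. The function names and the example sequence (a subtype B consensus-like V3 loop with a hypothetical S/R mixture at position 11) are assumptions introduced for illustration.

```python
from itertools import product

POSITIVE = set("KR")  # positively charged residues counted by the rule
NEGATIVE = set("DE")  # negatively charged residues counted by the rule


def net_charge(v3: str) -> int:
    """Number of K/R residues minus number of D/E residues."""
    return sum(aa in POSITIVE for aa in v3) - sum(aa in NEGATIVE for aa in v3)


def expand_mixtures(sites):
    """Enumerate every sequence implied by per-position residue mixtures.

    `sites` holds one string per V3 position; a string longer than one
    character represents an amino acid mixture at that position.
    """
    return ["".join(combo) for combo in product(*sites)]


def combined_rule_x4(v3: str) -> bool:
    """Combined 11/25 and net charge rule for one unambiguous sequence."""
    aa11, aa25 = v3[10], v3[24]  # positions 11 and 25 (1-based numbering)
    charge = net_charge(v3)
    return (
        aa11 in POSITIVE                  # (i) R/K at position 11 ...
        or aa25 == "K"                    # ... and/or K at position 25
        or (aa25 == "R" and charge >= 5)  # (ii) R at 25 and net charge >= +5
        or charge >= 6                    # (iii) net charge >= +6
    )


def predict_bulk(sites) -> str:
    """Predict tropism from the combination with the highest net charge."""
    best = max(expand_mixtures(sites), key=net_charge)
    return "X4" if combined_rule_x4(best) else "R5"


# Illustrative input only (not a sequence from this study).
v3 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
sites = list(v3)
sites[10] = "SR"  # hypothetical S/R mixture at position 11
print(net_charge(v3), combined_rule_x4(v3), predict_bulk(sites))
```

Sequences with insertions or deletions in V3 would first need to be aligned to a reference so that positions 11 and 25 are located correctly; that step is outside the scope of this sketch.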

Cloning of env PCR products
The env PCR products from three patients were subjected to clonal analysis using a TOPO-TA cloning kit (Invitrogen). Plasmid DNA containing env inserts was sequenced in the V3 region using the primers previously described [10].

Statistical methods
The kappa coefficient was calculated using STATA SE 9.2 to assess agreement between the genotypic algorithms for HIV-1 tropism prediction and the phenotypic assay. Agreement between two tests is usually considered good when the kappa coefficient is greater than 0.60 with p < 0.05.
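For readers unfamiliar with the statistic, Cohen's kappa for two binary tests can be computed directly from a 2 x 2 table, as in the following sketch. It is not the STATA procedure used in the study and does not compute the associated p value. The counts in the example are illustrative values chosen for a 26-sample comparison; they are not copied from a table in this article.

```python
def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa for a 2 x 2 table of genotype versus phenotype calls.

    tp: X4 by both tests; fp: X4 by genotype only;
    fn: X4 by phenotype only; tn: R5 by both tests.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Illustrative counts (assumed): 3 true X4 calls, 1 false X4 call,
# 1 missed X4 virus and 21 true R5 calls.
kappa = cohen_kappa(tp=3, fp=1, fn=1, tn=21)
print(f"kappa = {kappa:.2f}")
```

With these counts the observed agreement is 24/26 and the chance-expected agreement is 500/676, which gives a kappa above the 0.60 threshold mentioned above.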

Nucleotide sequence accession numbers
The sequences reported here have been given GenBank accession numbers HQ906854-HQ906879.

Phenotypic characterization of HIV-1 subtype D viruses
The env products from the plasma sample of 27 of the 32 subtype D-infected patients were successfully amplified by PCR. The phenotype of receptor-mediated entry was then successfully determined for each of these 27 patients. We found 23 virus populations with an R5 phenotype, 2 virus populations with a dual/mixed R5X4 phenotype, and 2 virus populations with a pure X4 phenotype.

Genotypic prediction of subtype D coreceptor usage with algorithms built for subtype B viruses
The V3 region was sequenced from the bulk env PCR products of the viruses from 26 patients (Figure 1a). We thus obtained genotype-phenotype correlations for 26 patients. The combined 11/25 and net charge rule predicted 15 R5 viruses and 11 X4 viruses, but 7 of these 11 viruses were mispredicted as X4. Geno2pheno10 predicted 13 R5 viruses and 13 X4 viruses (10 were mispredicted as X4). As summarized in Table 1, geno2pheno10 was 75% sensitive and 54% specific and the combined rule was 100% sensitive and 68% specific for predicting CXCR4 usage by HIV-1 subtype D. The concordance between the genotypic and phenotypic approaches was 58% with geno2pheno10 and 73% with the combined rule (Table 1).
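The figures in this paragraph can be checked with simple arithmetic. The sketch below derives a 2 x 2 table for each algorithm from the counts stated in the text (26 paired results, 11 or 13 X4 predictions, 7 or 10 of them mispredicted) together with the reported sensitivities; the individual cell counts are therefore inferred rather than quoted from Table 1, and the function name is an assumption.

```python
def performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and concordance for predicting CXCR4 use."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / (tp + fp + fn + tn),
    }


# Combined 11/25 and net charge rule: 11 X4 calls, 7 of them wrong,
# and no X4 virus missed (100% sensitivity) -> 4 TP, 7 FP, 0 FN, 15 TN.
combined = performance(tp=4, fp=7, fn=0, tn=15)

# geno2pheno10: 13 X4 calls, 10 of them wrong, 75% sensitivity
# on 4 CXCR4-using viruses -> 3 TP, 10 FP, 1 FN, 12 TN.
g2p = performance(tp=3, fp=10, fn=1, tn=12)

for name, result in (("combined rule", combined), ("geno2pheno10", g2p)):
    print(name, {k: f"{100 * v:.1f}%" for k, v in result.items()})
```

Under these inferred counts, the specificity of geno2pheno10 is 12/22, about 54.5%, which is consistent with the 54% reported.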

Genotypic determinants predicting CXCR4 use by HIV-1 subtype D viruses
We looked for V3 genotypic determinants known to be associated with CXCR4 usage by subtype B viruses, as shown in Figure 1b [24,25]. One virus had an arginine (R) at position 25 with a net charge of +6 and had an R5X4 phenotype (Figure 1a and Table 2). Eleven viruses had no "R" or "K" at positions 11 or 25, net charges < +5, and R5 phenotypes. Five viruses each had a lysine (K) at position 25 with a net charge < +5, four of which had an R5 phenotype and only one of which had an R5X4 phenotype. Two viruses each had a K at position 25 with a net charge of +5 and were phenotyped as R5.
We studied different clones from the HIV-1 quasispecies of three patients harboring R5 virus populations on bulk phenotypic analysis but predicted to be X4 by the bulk genotypic analysis when using the algorithms built for subtype B viruses. All the clones successfully phenotyped were R5 and had a lysine (K) at position 25 with net charges between +1 and +4 (Figure 2). Thus, HIV-1 subtype D viruses frequently have a lysine at position 25, and these viruses use CCR5 exclusively for entry when this amino acid is associated with a V3 net charge ≤ +5.
We, therefore, designed a genotypic rule based on the 11/25 and net charge rules for determining the tropism of HIV-1 subtype D. One of the following criteria was required for predicting subtype D CXCR4 coreceptor usage: (i) R or K at position 11 of V3; (ii) R at position 25 of V3 and a net charge of ≥ +5; (iii) a net charge of ≥ +6. The genotypic and phenotypic approaches using this rule were 92% concordant (Table 3). This subtype D genotypic algorithm was 75% sensitive and 95% specific with our data set.
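The subtype D rule differs from the rule built for subtype B only in that a lysine at position 25 is no longer sufficient to predict CXCR4 use. A minimal Python sketch follows, assuming an ungapped 35-residue V3 sequence without amino acid mixtures. The function names and both example sequences are hypothetical and are not taken from the study data.

```python
def net_charge(v3: str) -> int:
    """Number of K/R residues minus number of D/E residues."""
    return sum(aa in "KR" for aa in v3) - sum(aa in "DE" for aa in v3)


def subtype_d_rule_x4(v3: str) -> bool:
    """Subtype D rule: True when CXCR4 use is predicted."""
    aa11, aa25 = v3[10], v3[24]  # positions 11 and 25 (1-based numbering)
    charge = net_charge(v3)
    return (
        aa11 in ("R", "K")                # (i) R or K at position 11
        or (aa25 == "R" and charge >= 5)  # (ii) R at 25 and net charge >= +5
        or charge >= 6                    # (iii) net charge >= +6
    )


# Hypothetical sequences: identical except for the residue at position 25.
k25 = "CTRPNNNTRKSIHIGPGRAFYTTGKIIGDIRQAHC"  # K at 25, net charge +5
r25 = "CTRPNNNTRKSIHIGPGRAFYTTGRIIGDIRQAHC"  # R at 25, net charge +5

for label, seq in (("K25", k25), ("R25", r25)):
    call = "X4" if subtype_d_rule_x4(seq) else "R5"
    print(label, net_charge(seq), call)
```

The first sequence would be called X4 by a rule that treats a lysine at position 25 as a CXCR4 determinant, but it is called R5 here, which mirrors the viruses with a K at position 25 and a net charge of +5 that were phenotyped as R5.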

Validation of the subtype D genotypic algorithm on an independent data set
The GenBank dataset of subtype D viruses included 25 R5X4/X4 viruses and 42 R5 viruses based on phenotypic assays. We analyzed phylogenetically the GenBank V3 sequences and the 26 V3 sequences from patients (Figure 3). We predicted the tropism of these viruses with the initially validated combined rule [10], with the geno2pheno tool and with the subtype D genotypic algorithm based on the simple 11/25 and net charge rules (Table 4). The concordance between genotypic and phenotypic determinations was 69% with the combined rule and 67% with geno2pheno10. The concordance increased to 85% using the subtype D genotypic algorithm. The subtype D tool was 68% sensitive and 95% specific for detecting CXCR4-using viruses. Geno2pheno10 was the most sensitive tool (96%) but its specificity was poor (50%).
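The reported concordances can be cross-checked against the sensitivities and specificities given in this paragraph. The sketch below back-calculates the number of correct calls from the rounded percentages and the group sizes (25 CXCR4-using and 42 R5 viruses); the resulting counts are inferred values, not figures quoted from Table 4, and the function name is an assumption.

```python
N_X4, N_R5 = 25, 42  # phenotypic composition of the GenBank data set


def concordance_from_rates(sensitivity: float, specificity: float) -> float:
    """Back-calculate correct calls from rounded rates, then concordance."""
    true_x4_calls = round(sensitivity * N_X4)  # correctly predicted X4
    true_r5_calls = round(specificity * N_R5)  # correctly predicted R5
    return (true_x4_calls + true_r5_calls) / (N_X4 + N_R5)


subtype_d = concordance_from_rates(0.68, 0.95)     # subtype D algorithm
geno2pheno = concordance_from_rates(0.96, 0.50)    # geno2pheno10

print(f"subtype D rule: {100 * subtype_d:.0f}%")
print(f"geno2pheno10:   {100 * geno2pheno:.0f}%")
```

The inferred counts (17 and 40 correct calls for the subtype D rule, 24 and 21 for geno2pheno10) reproduce the stated concordances of 85% and 67%.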

Discussion
HIV-1 subtype D infections have been associated with rapid disease progression [14][15][16][18] and a high prevalence of CXCR4-using viruses according to phenotypic assays [19][20][21]. A genotypic assay for determining subtype D tropism would be useful for investigating the pathogenesis of this subtype and for facilitating the clinical use of CCR5 antagonists. But recent studies indicate that the genotypic algorithms currently used are relatively insensitive for non-B subtypes, although their performance for particular subtypes was not specifically determined [9]. We have now analyzed the correlations between phenotypic and genotypic approaches for determining HIV-1 subtype D tropism. The prevalence of X4 viruses estimated in 26 patients with the TTT phenotypic assay was 15%. The TTT assay has previously been validated on B and non-B subtypes and correlated well with the enhanced sensitive Trofile assay [3,26]. The scarcity of CXCR4-using viruses in these patients could be because they were at a different stage of the disease compared to the patients recruited in Uganda and Sudan in previous studies [19][20][21]. The genotypic determination of HIV-1 tropism with algorithms built for subtype B was adequately sensitive (75% with geno2pheno10 and 100% with the combined 11/25 and net charge rules), but poorly specific (54% and 68%, respectively) for predicting CXCR4 use by subtype D viruses. One previous study of the performance of two genotypic algorithms for determining subtype D tropism reported poor specificity, 74% for the 11 RK/25 K rule and 53% for the PSSM algorithm [20].
[Table fragment recovered from extraction. Column headings: "R" or "K" at 11; No "R" or "K" at 11 or 25. Rows by net charge: < +6: 14, 0; ≥ +6: 0, 0. *The two viruses harboring an arginine at position 11 also have an arginine at position 25.]
We therefore analyzed the V3-loop sequences and the corresponding phenotype of these subtype D viruses. We found that the lysine at position 25 is a polymorphic amino acid in HIV-1 subtype D and should not be considered as a determinant of CXCR4 usage for this particular subtype, in contrast to subtype B viruses. We confirmed this polymorphism at a clonal level on three virus populations phenotyped R5. The V1-V2 env region may also influence the virus tropism [27][28][29]. A recent study found that analysis of the V2-V3 region slightly improved the sensitivity for predicting CXCR4 usage compared to analysis of V3 alone [30]. However, we previously analyzed the V1 and V2 regions of subtype B viruses and found no criteria that improved the genotypic prediction [8]. For subtype D viruses, no genotypic determinant has been identified in the V1-V2 region that improves the performance of the genotypic approaches (data not shown). The determinants identified for predicting CXCR4 usage by subtype D viruses were combined in a simple genotypic rule that differed slightly from the combined rule validated for subtype B, C and CRF02-AG [10][11][12]. The concordance between the subtype D genotypic rule and the TTT (kappa coefficient: 0.70) was better than that for the subtype B tools (kappa coefficients: 0.15-0.40). The sensitivity of the subtype D genotypic rule (75%) was similar to that of the subtype B genotypic algorithms applied to subtype B viruses to determine tropism (69 to 88%) [10]. One limitation of the study was the small number of X4 viruses in our patients (4/26). However, R5 viruses with an X4 genotype using current algorithms were very informative and analysis of our dataset enabled us to propose a new interpretation rule for HIV-1 subtype D tropism.
This new rule was subsequently validated by examination of a GenBank set of 67 subtype D V3 sequences belonging to viruses whose phenotype was known. The best concordance with the phenotype was obtained with the subtype D combined rule (sensitivity 68%, specificity 95%), giving a good agreement with the phenotypic assay (kappa coefficient of 0.63). This specific genotypic algorithm predicted HIV-1 tropism better than did the MT2 or Trofile phenotypic assays (data not shown). The specificity of the V3 genotype is important for not excluding patients eligible for antiretroviral treatment based on a CCR5 antagonist and for epidemiological and pathophysiological studies. The specificity of the V3 genotype is also crucial for accurate characterization of HIV-1 quasispecies by ultra-deep pyrosequencing, which improves the sensitivity for detecting CXCR4-using viruses.
Figure 3 Neighbour-joining phylogenetic tree of HIV-1 subtype D V3 sequences from 26 patients and 72 sequences from GenBank. Patients are identified with the same numbers as in Figure 1, and the GenBank sequences are identified by the country (two-letter code) and the accession number. The corresponding phenotype is indicated by symbols: open circles indicate sequences from R5 viruses, solid circles indicate sequences from R5X4 viruses and solid squares indicate sequences from X4 viruses. Percentage bootstrap values, calculated for 1000 replicates, are indicated on the branches. The genetic relatedness of two different sequences is represented by the horizontal distance that separates them, with the length of the bar at the bottom denoting a sequence divergence of 0.10.

Conclusion
The combined rule with criteria from the 11/25 and net charge rules modified for subtype D HIV-1 performed well for predicting the tropism of this particular subtype. Simple genotypic methods could make it easier to determine the impact of virus tropism on disease progression and to facilitate the clinical use of CCR5 antagonists. Further studies are now needed to optimize the various genotypic algorithms for predicting the tropism of other HIV-1 non-B subtypes.