Phylodynamics of HIV-1 Circulating Recombinant Forms 12_BF and 38_BF in Argentina and Uruguay

Background Although HIV-1 CRF12_BF and CRF38_BF are two epidemiologically important recombinant lineages circulating in Argentina and Uruguay, little is known about their population dynamics. Methods A total of 120 "CRF12_BF-like" and 20 "CRF38_BF-like" pol recombinant sequences collected in Argentina and Uruguay from 1997 to 2009 were subjected to phylogenetic and Bayesian coalescent-based analyses to estimate evolutionary and demographic parameters. Results Phylogenetic analyses revealed that CRF12_BF viruses from Argentina and Uruguay constitute a single epidemic with multiple genetic exchanges among countries; whereas circulation of the CRF38_BF seems to be confined to Uruguay. The mean estimated substitution rate of CRF12_BF at pol gene (2.5 × 10-3 substitutions/site/year) was similar to that previously described for subtype B. According to our estimates, CRF12_BF and CRF38_BF originated at 1983 (1978-1988) and 1986 (1981-1990), respectively. After their emergence, the CRF12_BF and CRF38_BF epidemics seem to have experienced a period of rapid expansion with initial growth rates of around 1.2 year-1 and 0.9 year-1, respectively. Later, the rate of spread of these CRFs_BF seems to have slowed down since the mid-1990s. Conclusions Our results suggest that CRF12_BF and CRF38_BF viruses were generated during the 1980s, shortly after the estimated introduction of subtype F1 in South America (~1975-1980). After an initial phase of fast exponential expansion, the rate of spread of both CRFs_BF epidemics seems to have slowed down, thereby following a demographic pattern that resembles those previously reported for the HIV-1 epidemics in Brazil, USA, and Western Europe.

Genetic characterization of BF1 recombinants in South America revealed some important differences across countries. Although four distinct BF1 circulating recombinant forms (CRFs) have been described in Brazil to date (CRF28_BF, CRF29_BF, CRF39_BF, and CRF40_BF) [15,16], the Brazilian BF1 epidemic is largely dominated by a variety of unique recombinants forms (URFs) that do not share a common recombinant ancestor [7,10,[17][18][19]. In contrast, the Argentine BF1 epidemic comprises the widespread CRF12_BF and several URFs with a CRF12-related structure [6,8,20,21]. The molecular epidemiology of HIV-1 in Uruguay is not so well characterized, but two previous studies suggested that BF1 recombinants circulating in this country are similar to those described in Argentina [3,20]. Very recently, a novel CRF38_BF1 was described among Uruguayan HIV-1 isolates, indicating that other BF1 recombinants besides CRF12_BF have gained epidemic importance in this country [14].
Previous studies performed by our group suggest that the HIV-1 subtype F1 and BF1 epidemics in South America were initiated after the introduction of a single F1 strain into Brazil between the middle and late 1970s [22][23][24]. After its introduction, this founder subtype F1 strain probably recombined with the local subtype B virus generating the large diversity of CRFs_BF1 and URFs_BF1 currently observed in the continent [24]. Based on monophyletic clustering and coincident recombination breakpoints, it was suggested that most BF1 recombinants circulating in Argentina and Uruguay derived from a common recombinant ancestor [21,24,25].
To date, however, very little is known about the evolutionary history and epidemic potential of the diverse BF1 recombinants that have expanded in the South American population. Only one previous study was conducted on a small number (n = 40) of CRF12_BF-like vpu sequences from a vertically infected population in Argentina [26]. This study estimated the age of the most recent common ancestor (MRCA) of those CRF12_BF-like viruses between 1981 and 1996, and further suggests an extremely rapid spread of the CRF12_BF-like recombinant viruses, compatible with the demographic pattern of explosive population growth observed in this pediatric population at the start of the epidemic.
The objective of the present study was to reconstruct the evolutionary and demographic history of the CRF12_BF circulating in Argentina and Uruguay through the analysis of a large data set (n = 120) of CRF12_BF-like pol sequences recovered from adults and children living in both countries. In addition, we also analyzed a small data set (n = 20) of CRF38_BF-like pol sequences to estimate the age and demographic history of the CRF38_BF epidemic spreading in Uruguay. This data represent an excellent opportunity to explore potential CRF-specific and regional-specific differences in the patterns of HIV-1 epidemic growth in South America.

Study population
A total of 66 and 17 samples with a CRF12_BF-like and CRF38_BF-like mosaic pattern at the pol gene, respectively, were selected from HIV-1-infected patients residing in Argentina and Uruguay who had previously been analyzed in two independent studies. The first study analyzed the genetic structure of BF1 pol recombinant sequences collected between 1997 and 2008 from HIV-1 infected children followed up at the "Hospital de Pediatria Garrahan" in Buenos Aires, Argentina, identifying 43 samples with a CRF12_BFlike mosaic pattern (Aulicino et al, publication in progress). The second study assessed the genetic diversity in a group of BF1 pol recombinant samples collected between 1997 and 2009 from HIV-1 infected adults and children residing in Uruguay, identifying 23 samples with a CRF12_BF-like mosaic structure and 17 samples with a CRF38_BF-like mosaic pattern (Ruchansky et al, publication in progress). These unpublished sequences were combined with CRF12_BF (Argentina n = 3; Uruguay n = 2) and CRF38_BF (Uruguay n = 3) reference sequences, and CRF12_BF-like pol sequences (Argentina n = 48; Uruguay n = 1) from adults patients with known sampling dates retrieved from the Los Alamos HIV Sequence Database http:// www.hiv.lanl.gov/content/index, as described in Table 1. Sequences were excluded if they originated from the same patient or from individuals known to be related by direct transmission. The sequences werẽ 1440 bp long and covered the protease (PR) and part of the reverse transcriptase (RT) genes (nucleotides 2266-3705 relative to the HXB2 clone), encompassing the recombinant fragments of the CRF12_BF and CRF38_BF at pol gene (Fig. 1a). Nucleotide sequences were aligned using CLUSTAL X program [27]. All positions with alignment gaps were excluded from analyses.

Characterization of "CRF-like" recombinant profiles
Two strategies were used to characterize the HIV-1 pol sequences used in the present study as CRF12_BF-like or CRF38_BF-like recombinants: 1) First, the recombination breakpoints of each sequence were identified by Bootscanning using Simplot version 3.5.1 [28]. Bootstrap values supporting branching with reference sequences were determined in Neighbor-Joining (NJ) trees constructed using the K2-P [29] nucleotide substitution model, based on 100 re-samplings, with a 200 bp sliding window moving in steps of 20 bases. Individual query sequences were compared to representative reference sequences of HIV-1 subtypes A1, B, C, and F1. Sequences were considered to have a "CRF-like" profile if recombination sites exactly match those identified in CRF12_BF and CRF38_BF reference sequences.
2) Second, Bayesian and Maximum Likelihood (ML) phylogenetic trees for the final pol alignment including all CRF-like sequences were built to confirm the overall topology and strong support of each CRF clade. Phylogenetic trees were constructed under the GTR [30] nucleotide substitution model, with a gamma-distribution model of among site rate heterogeneity and a proportion of invariable sites (GTR+I+Γ) selected using the Modeltest program [31]. A Bayesian phylogeny was estimated using MrBayes [32]. Two runs of four chains each were run for 50 × 10 6 generations, with a burn-in of 5 × 10 6 generations. Convergence of parameters was assessed by calculating the Effective Sample Size (ESS) using TRACER v1.4 [33], after excluding an initial 10% for each run. All parameter estimates for each run showed ESS values >100. ML trees were reconstructed with PhyML [34] using an online web server [35]. Heuristic tree searches were performed using the SPR branch-swapping algorithm, and the approximate likelihood-ratio test (aLRT) based on a Shimodaira-Hasegawa-like procedure was used as a statistical test to calculate branch support. Trees were visualized with the FigTree v1.1.2 program (available at http://tree.bio.ed.ac. uk/software/figtree/).

Estimation of evolutionary rates, dates, and demographic history
The evolutionary rate (μ, units are nucleotide substitutions per site per year), the age of the most recent common ancestor (T mrca , years), and the mode and rate (r, years -1 ) of population growth for the CRF12_BF and CRF38_BF strains were estimated using BEAST v1.4.7 [36,37]. Evolutionary and demographic parameters of CRF12_BF were estimated under a chronological timescale employing the dates of sample collection. The low number of CRF38_BF sequences analyzed (i.e., 20 sequences), however, was not sufficient to obtain an accurate estimate of the evolutionary rate of this lineage. Therefore, the rates of evolution at pol (PR/RT) gene previously estimated for other HIV-1 group M subtypes (1.5 × 10 -3 -2.5 × 10 -3 substitutions/site/year) [38][39][40][41] were incorporated as a prior probability distribution in the analysis of this CRF. Estimations of evolutionary and demographic parameters involved two steps. First, the Bayesian skyline plot method [42] was used to estimate μ, the T mrca , and the change in effective population size through time. Second, two different demographic models for each data set were compared: exponential and logistic growth; and estimates of the population growth rate were then obtained under the model that provided the best fit to the demographic signal in each data set. Model comparisons in a Bayesian framework were performed by calculating the Bayes Factor (BF) [43] with TRACER v1.4. Analyses were performed using the GTR+I+Γ nucleotide substitution model under either strict or uncorrelated Lognormal relaxed [44] molecular clock models. Two separate MCMC chains were run for 10-50 × 10 6 generations, with a burn-in of 1-5 × 10 6 . BEAST output was analysed using TRACER v1.4, with uncertainty in parameter estimates reflected in the 95% Highest Probability Density (HPD) intervals. Convergence of parameters was assessed through the ESS, with all parameter estimates for each run showing ESS values >100. A graphical representation of the effective number of infections through time was generated by using programs TRACER v1.4 and Prism 4 (GraphPad Software). Posterior trees samples from BEAST runs were summarized using TreeAnnotator v1.4.7 (available from http://beast.bio.ed.ac.uk) to generate time-scaled maximum clade credibility trees.

Results
A total of 115 pol sequences (Argentina = 91, Uruguay = 24) with a "CRF12_BF-like" recombination profile, and 17 pol sequences (Uruguay) with a "CRF38_BF-like" recombinant pattern were identified by Bootscanning analyses. These sequences were aligned with reference sequences of CRF12_BF, CRF38_BF and Brazilian CRFs_BF1, and analyzed using Bayesian and ML approaches. Both phylogenetic approaches showed that the CRF12_BF-like and CRF38_BF-like pol sequences segregated with their respective CRF reference sequences in two well supported monophyletic groups characterized by unique recombination profiles, confirming the common ancestry of each CRF (Fig. 1). Of note, Simplot analysis suggests that the CRF38_BF presents a more complex BF1 mosaic pattern at the PR/RT genomic region than that previously described [14], characterized by the presence of small subtype F1 fragments between positions 2640 and 3020 relative to HXB2 (Fig. 1). More detailed analysis of the pol genomic region should be performed in order to determine the precise mosaic structure of the CRF38_BF at that region.
Within the CRF12_BF clade, two strongly supported Uruguayan subclusters comprising four (cluster UY-1) and six (cluster UY-2) viruses were identified in both Bayesian (posterior probability [PP] > 0.80) (Fig. 1) and ML (aLTR > 0.70) phylogenetic trees (data not shown). Most (61%) CRF12_BF Uruguayan sequences, however, were randomly interspersed among Argentine sequences, which provides evidence against the existence of a specific Argentine or Uruguayan CRF12_BF lineage. This contrasts with the circulation of CRF38_BF which seems to be restricted to Uruguay, as no strains with a CRF38_BF-like structure were identified after analysis of more than 300 BF1 recombinant pol sequences from Argentina (74 unpublished sequences and 249 sequences retrieved from the Los Alamos HIV Sequence Database).
Bayesian MCMC analyses under a skyline tree prior were used to estimate the time-scale of the CRF12_BF and CRF38-BF epidemics. The mean estimated evolutionary rate for the CRF12_BF pol gene was 2.4 × 10 -3 subst./site/year, under both strict and relaxed molecular clock models ( Table 2). The median rate of evolution for the CRF38_BF pol gene was 1.8 × 10 -3 subst./site/ year (strict clock) and 1.9 × 10 -3 (relaxed clock), although the 95% HPD intervals of those estimates almost coincide with the informative prior interval ( Table 2), indicating that not much information was added by the data. Considering these substitution rates, the median T mrca of the CRFs was estimated at 1982 (strict clock) and 1983 (relaxed clock) for the CRF12_BF, and 1985 (strict clock) and 1986 (relaxed clock) for the CRF38_BF (Table 2).
Bayesian skyline plot analyses were also used to infer the demographic history of South American CRF_BF epidemics. According to this analysis the CRF12_BF epidemic experienced a fast exponential growth during the first 10-15 years followed by a more recent decline in growth rate since the mid-1990s (Fig. 2a). A very similar demographic pattern was observed for the CRF38_BF, showing that after an initial period of exponential growth of~10 years the growth rate of this CRF epidemic also slowed around the mid-1990s (Fig. 2b). These results suggest that a model of logistic population growth fits the demographic information contained in the CRF12_BF and CRF38_BF data sets better than the exponential growth model.
To test this hypothesis, approximate marginal log likelihoods for the logistic and exponential growth models were calculated. The analysis of BF clearly showed that, for both CRFs, the model of logistic population growth was strongly supported over the exponential growth model, under either a strict or a relaxed molecular clock (Table 3). On the other hand, models assuming a relaxed molecular clock fit the CRF12_BF and CRF38_BF data sets better than models enforcing a strict molecular clock (Table 3) indicating that substitution rate varies among branches consistent with other HIV-1 studies [39,[45][46][47]. Indeed, the coefficients of variation under the relaxed clock model were higher than zero for both CRF12_BF (mean = 0.21, 95% HPD: 0.16-0.26) and CRF38_BF (mean = 0.28, 95% HPD: 0.10-0.46).
A coalescent model of logistic growth was then used to estimate the initial growth rate of South American CRF_BF epidemics. Evolutionary parameters estimated  (1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990) under the logistic growth tree prior were almost identical to those estimated with Bayesian skyline ( Table 2). The mean estimated growth rate of CRF12_BF epidemic was 1.08 year -1 (strict clock), and 1.22 year -1 (relaxed clock). This rate corresponds to a mean epidemic doubling time of around six months ( Table 4). The mean growth rate of the CRF38_BF epidemic was estimated at 0.83 year -1 (strict clock) and 0.92 year -1 (relaxed clock), which corresponds to a mean epidemic doubling time of <1 year (Table 4).

Discussion
We have performed an extensive study of the evolutionary history and population dynamics of the two most prevalent HIV-1 BF1 recombinant strains (CRF12_BF and CRF38_BF) spreading in South America. The CRF12_BF was the first CRF to be described in South America among samples isolated from Argentina and Uruguay [20], and is responsible for a significant part of the HIV-epidemics in those countries. Our phylogenetic analyses of recombinant pol sequences with a CRF12-like  Table 3 Bayes Factors (BF) between exponential (Exp) and logistic (Log) growth demographic models for the HIV-1 CRF12_BF and CRF38_BF pol data sets. structure showed for the first time that CRF12_BF viruses spreading in Argentina and Uruguay constitute a single epidemic with evidences of multiple genetic exchanges among countries. In contrast, circulation of the recently described CRF38_BF [14] seems to be restricted to Uruguay as no strains with a CRF38_BF-like pol mosaic pattern were observed after screening of a large number (n > 300) of BF1 recombinant sequences from Argentina. According to our estimates, the mean T mrca of CRF12_BF and CRF38_BF was 1982-1983and 1985-1986, respectively; only a few years later than the mean estimated onset date of subtype F1 spread in Brazil (1976)(1977)(1978) [23,24]. These mean estimates are fully consistent with epidemiological data that revealed that BF1 recombinants related to the CRF12_BF have been in circulation in Argentine children since the mid 1980s [48]. In agreement with this study, we identified five Argentine children born between 1987 and 1989 infected by CRF12-like BF viruses, and one Uruguayan child born in 1987 infected by a CRF38-like BF strain.
Our study showed that the substitution rate of CRF12_BF viruses at pol gene (2.5 × 10 -3 subst./site/ year) is similar to rates previously described for HIV-1 subtypes B, C, and CRF31_BC [38,40,41], pointing to no major differences in evolution rate among HIV-1 strains circulating in South America. By contrast, some differences in epidemic growth rate may have existed across distinct HIV-1 variants circulating in South America. The estimated initial growth rate of the CRF12_BF epidemic in Argentina and Uruguay was higher than that described for Brazilian subtype B, C, and F1 epidemics [23,47], and similar to that recently described for the CRF31_BC viruses circulating in southern Brazil [47] ( Table 4); supporting the notion of a fast initial spread of CRF12_BF-like viruses [26]. The initial expansion rate of CRF38_BF also seems to be higher than those described for subtypes B and F1, although the 95% HPD interval of such an estimate was quite large ( Table 4).
The rapid initial growth rate of CRF12_BF, CRF31_BC, and CRF38_BF epidemics in South America may indicate that these CRFs displayed a higher fitness and/or transmissibility than parental HIV-1 subtypes. Alternatively, variations in the initial growth rate of HIV-1 variants may reflect differences in the susceptible populations that characterized the initial spread of each strain. Of note, the BF recombinants have been associated with injection drug use populations in Argentina and Uruguay [3,5], and introduction of a BF1 recombinant virus in such highly connected networks may explain the emergence and rapid initial dissemination of the CRF_BF1 viruses.
After the initial period of fast exponential growth, the expansion rate of the CRF12_BF and CRF38_BF epidemics slowed down since the mid 1990s. The same demographic pattern was described for HIV-1 epidemics in Brazil, USA, and some European countries [23,38,47,[49][50][51]. Such a recent decline in the growth rate of these HIV-1 epidemics may be the consequence of adequate prevention campaigns implemented after the official report of the first AIDS cases in the early 1980s, and/or the result of a saturation of high-risk transmission networks that are loosely connected with low-risk subgroups that exhibit modest levels of HIV-1 infection [52].
The overall demographic pattern of the CRF12_BF epidemic in Argentina and Uruguay observed in this study is similar with that previously described for the Estimates of the median growth rate (r, yr -1 ) and epidemic doubling time (λ, yr) for the HIV-1 CRF12_BF and CRF38_BF epidemics (95% HPD in parentheses). Growth rate estimates were used to calculate the time taken for the epidemic to double in size (λ) using the relation λ = ln(2)/r. a Data from Aulicino et al. [26]. b Data from Bello et al. [23]. c Data from Bello et al. [47].
CRF12_BF epidemic in Argentine children [26]; and our mean estimates of T mrca and population growth rate fell within the 95% HPD interval of the previous estimates ( Table 4). The confidence intervals of these new estimates, however, were much narrower than those previously obtained ( Table 4), indicating that the use of a larger data set have substantially improved the accuracy of parameter estimation. In agreement with this idea, the small CRF38 data set was associated with much larger confidence intervals of demographic parameter estimates than those obtained for the larger CRF12_BF data set. Thus, more precise estimations of demographic parameters of CRF38_BF will require the analysis of a larger number of sequences.

Conclusions
Our results suggest that CRF12_BF and CRF38_BF viruses were rapidly generated after the introduction of subtype F1 into South America and have been circulating in the continent over the last 25 years. Both CRFs seem to have spread exponentially at a fast rate during the 1980s and the early 1990s; but the rate of spread of the CRF12_BF and CRF38_BF viruses slowed down since the mid 1990s. Despite similar emergence dates and demographic histories, CRF12_BF have been widely disseminated in Argentina and Uruguay, whereas the CRF38_BF circulation was found limited to Uruguay. Determination of the factors that have shaped the pattern and rate of spread of distinct HIV-1 variants in South America is of paramount importance to understand the epidemic potential of these variants.