HIV-1 subtype distribution and its demographic determinants in newly diagnosed patients in Europe suggest highly compartmentalized epidemics

Background Understanding HIV-1 subtype distribution and epidemiology can assist preventive measures and clinical decisions. Sequence variation may affect antiviral drug resistance development, disease progression, evolutionary rates and transmission routes. Results We investigated the subtype distribution of HIV-1 in Europe and Israel in a representative sample of patients diagnosed between 2002 and 2005 and related it to the demographic data available. 2793 PRO-RT sequences were subtyped either with the REGA Subtyping tool or by a manual procedure that included phylogenetic tree and recombination analysis. The most prevalent subtypes/CRFs in our dataset were subtype B (66.1%), followed by sub-subtype A1 (6.9%), subtype C (6.8%) and CRF02_AG (4.7%). Substantial differences in the proportion of new diagnoses with distinct subtypes were found between European countries: the lowest proportion of subtype B was found in Israel (27.9%) and Portugal (39.2%), while the highest was observed in Poland (96.2%) and Slovenia (93.6%). Other subtypes were significantly more diagnosed in immigrant populations. Subtype B was significantly more diagnosed in men than in women and in MSM > IDUs > heterosexuals. Furthermore, the subtype distribution according to continent of origin of the patients suggests they acquired their infection there or in Europe from compatriots. Conclusions The association of subtype with demographic parameters suggests highly compartmentalized epidemics, determined by social and behavioural characteristics of the patients.


Background
Human immunodeficiency virus type 1 (HIV-1) is characterized by extensive genetic diversity. HIV-1 strains are divided into four groups (M, N, O and P), originating from four separate cross-species transmissions from chimpanzees and/or gorillas to humans. While HIV-1 groups O, N and P are mainly restricted to Central Africa, group M has caused the HIV pandemic [1][2][3][4]. HIV-1 group M has been further classified into 9 distinct subtypes, sub-subtypes and inter-subtype circulating recombinant forms (CRFs). Subtypes and sub-subtypes arose from founder effects at different time points in the past, and inter-subtype recombinants can arise in patients co-infected with strains from two different subtypes. If these newly recombined strains have a significant epidemic spread, they are called CRFs [5].
The spread of HIV-1 subtypes is important for epidemiological purposes but can also be of relevance in clinical settings. Some biological properties differ between subtypes. They have different rates of evolution and their sequence variation may affect antiviral drug resistance development [6][7][8][9][10][11][12], but overall limited differences are found in the genetic barrier to drug resistance development between subtypes [13]. Other studies suggested differences in disease progression: subtype D seems to have a faster disease progression than subtypes A or C [14,15]. In the absence of antiretroviral prophylaxis, subtype C is transmitted from mother-to-child more frequently compared to subtype D, which in turn is more frequently transmitted than subtype A [16,17]. Some studies suggest that sexual transmission of subtype C is also more likely than of subtypes A and D [18,19]. In addition, it is still not well understood how to cope with the genetic variability of HIV-1 for the development of an efficient HIV-1 vaccine [20][21][22].
Hemelaar et al. documented the molecular epidemiology of HIV-1 in the world in 2011 using convenience sampling and a literature review. Subtype C was described as the most prevalent globally, representing 48% of the infections, while subtypes A, B, CRF02_AG, CRF01_AE, subtype G and D accounted for 12, 11, 8, 5, 5 and 2% of the infections, respectively. In that study, subtype B accounted for 85% of HIV-1 infections in Western and Central Europe, while subtypes A, C and G followed, with 2-3% of infections [23]. Another manuscript, by the EuroSIDA study group, also based on analysis of HIV-1 genomic sequences from 939 HIV-1 patients from Europe, Israel and Argentina followed from May 1994 onwards, documented a subtype B prevalence of 86%, 2% of subtype A, 4% of subtype C and 7% of other subtypes [24].
We had access to sequences from the SPREAD (Strategy to Control Spread of HIV Drug Resistance) surveillance programme, which is coordinated by the European Society for Antiviral Resistance (ESAR). This programme was initiated with the objective of reliably determining the prevalence of transmission of drug resistance within the different patient risk groups and of identifying risk factors that increase the risk of transmission of drug resistance. A second objective was to characterize the epidemiological and sequence diversity of HIV-1 in Europe. Unlike in previous approaches, in this study the samples were collected in a representative way from newly diagnosed patients (http://www.esar-society.eu/). In this paper, we describe the subtype distribution of HIV-1 in Europe and Israel, based on the SPREAD sequences of three collection periods from patients newly diagnosed between 2002 and 2005 [25,26].

Subtype B accounts for 70% of HIV-1 infections in newly diagnosed patients living in Europe
Of the 2730 sequences included in the study, 2469 (90.4%) were successfully subtyped using the REGA Subtyping Tool version 2, while 261 (9.6%) were unclassified, of which 137 sequences (5.0%) remained untypable even after manual analysis (Table 1). When adjusting for oversampling in some countries (Additional file 1: Figure S1), the proportion of new diagnoses with subtype B increased to 70.2%; subtypes C and A decreased to 5.0% and 3.6% respectively; CRF02_AG and subtype G increased to 4.9% and 4.8% respectively; CRF01_AE decreased to 1.9%; and U/URFs increased to 5.8% (Additional file 2: Figure S2). Even though some of these differences are statistically significant, they are small enough to have no substantial impact on the remaining analyses, and are thus not further reported separately. All adjusted analyses can be found in supplementary materials.

HIV-1 molecular epidemiology is highly heterogeneous between European countries
The country of sampling corresponds to the country where the sample and questionnaire were collected. In most cases the "country of sampling" also corresponds to the area where the patient resides and is clinically followed. For all countries, the subtype with the highest proportion of new diagnoses was subtype B, except Israel, where subtype C was more prevalent than subtype B (58.1 vs 27.9%). The countries with the highest proportion of non-B subtypes were Israel and Portugal (72.1% and 60.8%, respectively). This is due to the parallel epidemics of subtypes B and C in Israel and of subtypes B and G in Portugal. Poland and Slovenia were at the opposite end, with the lowest proportion of non-B subtypes (5.0% and 6.5%) (Figure 1; Additional file 3: Figure S3 and Additional file 4: Table S1).

New diagnoses with subtype B occurred in 79% of patients originating from and living in Europe
The country of origin corresponds to the country where the patient was born. When re-analyzing the distribution of subtypes including only patients who originated from SPREAD countries (n = 2225), the proportion of newly diagnosed patients infected with subtype B increased significantly from 66.1% to 79.5%. Subtypes or CRFs A1, CRF01_AE, CRF02_AG and C decreased significantly from 6.9%, 4.0%, 4.7% and 6.8% to 5.2%, 2.6%, 1.7% and 2.5% respectively, while the proportion of subtype G remained approximately stable (3.8% to 3.3%) (Table 1). A significant rise in the proportion of new diagnoses with a subtype in this analysis means that the respective subtype is found less often among the immigrant population.

The proportion of new diagnoses with different subtypes among native populations is country-specific
Table 2 shows, per country, the 1st, 2nd and 3rd most prevalent subtypes sampled, as well as the 1st, 2nd and 3rd most prevalent subtypes among the native population. No consistent pattern exists for the proportion of new diagnoses with non-B subtypes among natives. In Belgium, the second most prevalent subtype was CRF02_AG in patients sampled in Belgium, while subtype C was more prevalent in patients originating from Belgium, implying that the high proportion of CRF02_AG is mostly caused by immigrants (in this case originating from Africa), while subtype C seems to have established itself among the Belgian population, alongside subtype B. In Portugal, subtype G is well established in the native population, while in Greece and Cyprus subtype A1 is well established in natives.
The most extreme case of discrepancy in infecting subtypes between natives and immigrants is Israel, where natives are exclusively infected with subtype B and the majority of infected immigrants, mainly from Ethiopia, are infected with subtype C.

The HIV-1 subtypes infecting immigrant patients living in Europe are mostly similar to the HIV-1 subtypes causing epidemics in their country/continent of origin
When analyzing the distribution of subtypes of the patients originating from countries other than SPREAD countries, we found results consistent with current knowledge of HIV-1 molecular epidemiology in those countries, suggesting that the country where the patient originates is also the country where the infection was acquired, or that they acquired their infection in Europe from compatriots (Additional file 5: Figure S4; Additional file 4: Table S2). The logistic regression results were consistent with these findings, indicating continent of origin and country of origin as significantly associated with the proportion of new diagnoses with different subtypes (see Additional file 4: Table S3 for details), while country of sampling was rarely associated with it.
The distribution of subtypes according to the continent of origin of the patient is represented in Figure 2. The proportion of new diagnoses with subtype B was higher in patients from Western Europe (76.59%), Latin America (78.18%) and Eastern Europe and Central Asia (86.59%). In patients from South and South-East Asia, the most prevalent subtype was CRF01_AE (63.93%). Finally, in patients from Sub-Saharan Africa, almost all subtypes were found, but subtype C seems to dominate the epidemic (31.21%). Although subtype B was the most prevalent subtype in patients originating from North Africa and the Middle East (58.33%), subtype C was also very prevalent in these patients (16.67%) (Figure 2).
Table footnote: P-values are for comparisons between the complete dataset and the dataset including only patients originating from SPREAD countries. Others: samples classified as other subtypes or CRFs.

HIV-1 molecular epidemiology in Europe is highly stratified according to gender and risk group
The distribution of different subtypes according to gender is represented in Figure 3. The proportion of subtype B is significantly higher in men than in women, while the proportion of subtypes A1, C, G, CRF01_AE, CRF02_AG and U/URFs is significantly higher in women.
The proportion of subtype B is significantly higher in MSM (men who have sex with men) than in IDUs and in heterosexuals, and is significantly higher in IDUs than in heterosexual patients. Subtype A1 was significantly more prevalent in heterosexuals (9.48%) than in MSM (4.9%), but it was the 2nd most prevalent subtype in MSM, with all other subtypes below 1.3%. The MSM patients infected with subtype A1 were mostly from Greece (n=53), but also from Cyprus (n=3), Portugal (n=2), Spain (n=1), the Netherlands (n=1) and Ireland (n=1). An increase in the proportion of subtypes A1 and C has also been described in the MSM population of the United Kingdom [27]. Subtype G was significantly less prevalent in MSM (0.32%) than in IDUs (10.05%) and heterosexuals (7.20%), while subtypes C, CRF02_AG and CRF01_AE were significantly more prevalent in heterosexuals (15.7%, 7.41% and 9.16% respectively) than in IDUs (1.44%, 3.35% and 0.48% respectively) and MSM (0.48%, 1.04% and 1.28% respectively). URFs were found significantly more often in heterosexuals than in MSM (Figure 4).
Logistic regression indicated the MSM risk group as a positive predictor of infection by subtype B, while the heterosexual risk group is linked to infection with sub-subtype A1 and subtype C. The MSM risk group is also associated with a lower risk of infection with CRF01_AE, CRF02_AG and subtypes C and F. See Additional file 4: Table S3 for odds ratios.

Continent and country of origin and risk factor are the main determinants of subtype distribution
Multinomial logistic regression indicated that gender, risk group and continent of origin were the main determinants of subtype distribution. Country of sampling was also indicated as a determinant of subtype distribution by the regression model, but this association was not significant (p>0.05) (Additional file 4: Table S4). Logistic regression, however, is not the best method to determine dependency between the variables. For that, we used Bayesian network analysis, which helps to determine which associations occur directly between the analysed variables and which are secondary to other direct primary associations. In our Bayesian network analysis, risk factor, continent of origin and country of origin were identified as unconditionally associated with the infecting subtype, as illustrated by a direct arc (Figure 5). However, only the arc connecting continent of origin and HIV clade was confirmed by a high bootstrap support (74% of the replicates). On the other hand, country of sampling and gender were found to be only secondarily associated with subtype, since there is no direct arc with subtype; rather, the connection is through the risk factor, continent of origin and country of origin variables. For example, the arc that connects continent of origin to HIV clade indicates that there is a direct and unconditional dependence between these two variables; therefore, the continent of origin of the patient is an important determinant of the HIV clade.
In contrast, gender is connected to HIV clade through the risk group, which means that the unequal distribution of subtypes according to gender can be explained by the difference in subtype epidemic between MSM (all men) and heterosexuals or IDUs (men and women). Similar inferences can be made for the other variables of the network (Figure 5).

The proportion of newly diagnosed with CRF02_AG and subtype F increased significantly between 2002 and 2005
When analysing the proportion of diagnoses with different HIV-1 subtypes stratified by year of first positive confirmatory HIV-1 test, we found that most subtypes did not show any significant trend. The exceptions were CRF02_AG and subtype F, which showed a significantly increasing trend (p=0.05 and p=0.03, respectively), and subtype C, which showed a significantly decreasing trend between 2002 and 2005 (p=0.004). No significant trend was found in the proportion of diagnoses with subtype B, which increased in 2004 but then decreased in 2005 (Figure 6).

Discussion
In this paper, we have described the HIV-1 subtype distribution and its associated socio-demographic factors in newly diagnosed patients in West-Central Europe using sequences from the SPREAD programme (http://www.esar-society.eu/). Samples were collected between 2002 and 2005 from drug-naïve patients diagnosed not earlier than 6 months before sampling, together with clinical and epidemiological information. The sampling strategy was based on representative sampling over countries and risk groups, allowing a more accurate picture than in previous studies that were either based on convenience sampling [23] or focused solely on one country [28,29]. In addition, our study is unique in that it made it possible to combine, for the first time, molecular data from a continent-scale sample with demographic and behavioural information on the patients. The primary objective of the SPREAD study was to measure the extent of transmission of drug-resistant HIV, and an additional objective was to characterize the genetic diversity of the epidemic in Europe. For this second objective, we subtyped all samples from three inclusion rounds of the SPREAD programme (Sept 2002-Dec 2005), and analyzed the geographical spread and the factors associated with this subtype distribution. Although the SPREAD sampling strategy was carefully designed to avoid sampling bias, we noticed that not all countries were sampled at the same density. Given the differences found in the sampling rate of newly diagnosed patients, especially in small countries, we performed a weighted analysis to account for such differences. Such a strategy has, however, its own limits, since diagnosis rates may differ among risk groups [30] or the proportion of infected patients that are undiagnosed may differ between countries.
Figure caption (subtype distribution by risk group): Asterisks indicate significant differences in the proportion of one subtype between at least two of the risk groups. For example, for URFs the significant difference was found only between homosexuals and heterosexuals (p = 1.2 × 10⁻⁵), while subtype B has a significantly different frequency in all risk groups. For more details, please refer to the methods and results sections.
Although we found some significant differences with the weighted analysis, this difference was not substantial and did not result in different conclusions. Because of its high specificity, even though at the cost of sensitivity [31,32], we only used the Rega subtyping tool in our study, and complemented it with manual analysis for unclassified sequences. Although our subtyping methodology was extremely meticulous, a limitation of any study using only one genetic region is the fact that no claims can be made with regard to the absence of recombination breakpoints outside of the sequenced genomic regions.
As in the UNAIDS report for Western Europe [33], we found subtype B to be the most prevalent subtype, but at a lower proportion than reported before: 66.1% compared to the 85% reported by UNAIDS. In the subset of patients not only diagnosed in but also originating from Europe, the proportion of subtype B among newly diagnosed patients was significantly higher (79.5%), resembling the UNAIDS data. These differences are thus likely explained by the very different sampling strategies between the studies: UNAIDS used country of origin of the patient, while we used country of sampling; and UNAIDS used a convenience sample of data collected from patients diagnosed at any time, while we used a representative sample of newly diagnosed patients. Subtypes A1 (6.9%), C (6.8%) and CRF02_AG (4.7%) were the most prevalent non-B forms in our dataset. Because only newly diagnosed patients were included in our sample, a major advantage of our sampling strategy is that it maps the more recent past compared to previous studies. Our findings may therefore be more relevant for the epidemic in the near future.
The information collected allowed us to make a detailed analysis of the relationship between the socio-demographic and other epidemiological indicators of the patient and the HIV subtype. Despite the increasing spread of different non-B subtypes, the continent of origin and the risk group of the patient are still good predictors of the subtype. While for now we are still able to attribute the proportion of new diagnoses with many non-B subtypes in Europe to certain demographic groups, this might change in the future. For example, subtype G already has a well-established epidemic in Portugal, even among natives and in both the heterosexual and IDU risk groups. From our results, it is clear that some non-B subtypes are still being imported into Europe and remain largely limited to migrant populations, such as CRF02_AG in Belgium, while some have established themselves in the native European population, like subtype G in Portugal and subtype A1 in Greece. Similar observations can be made for the risk group analysis, where subtypes are clearly compartmentalized in different transmission groups. This suggests highly stratified epidemics occurring in each country and risk group. HIV infection is still determined by social, behavioural and demographic characteristics of the patients, and we should thus target preventive measures to specific populations.
The subtype B epidemic was first described in the MSM population [34], but was found in IDUs soon afterwards [35]. Interestingly, the prevalence of subtype B is still higher in the MSM risk group. This could either indicate a higher transmission potential of subtype B in MSM, or could just be a reflection of the long-term establishment of subtype B infection in this risk group, which may be more compartmentalized than IDUs. Even though some reports are consistent with the hypothesis of a biological difference between subtypes with regard to transmission rates and routes [36,37], more data are needed to exclude simple epidemiological circumstances as an explanation for our findings. In this study, we find that heterosexuals are more frequently infected with non-B subtypes, and this is reflected in the higher proportion of women, and consequently probably children, infected with such strains. Finally, we find that the MSM risk group presents a recent rise in the proportion of sub-subtype A1 infections (4.9%), mostly caused by an epidemic among Greek MSM.
The proportion of new diagnoses with HIV-1 CRF02_AG and subtype F increased significantly between 2002 and 2005, while for subtype C we saw a significant decrease. No other significant time trends were found, indicating stable epidemics of most HIV-1 subtypes. However, given the short time period studied and the fact that most of the patients have an unknown date of infection, no firm conclusions should be made in this respect.
Although different subtypes of HIV-1 represent different epidemics, they have been dealt with as a single epidemic in UNAIDS and ECDC reports. Herein, we present the social, behavioural and demographic determinants of HIV-1 subtype distribution in Europe. Stratifying results by subtype allows a better understanding of the changing prevalence and mobility of the virus.

To guarantee representativeness in each country, individuals were selected according to the national distribution of transmission risk groups and the geographical distribution of patients with new diagnoses of HIV-1 infection. Two strategies were used to achieve this. In countries where more than 80% of all newly diagnosed individuals were expected to be covered by the participating centers, a random sample from all newly identified individuals was taken. In other countries, either stratified sampling weighted for the proportion of newly diagnosed patients among different risk groups and among different geographical areas was performed, or a consecutive series of patients up to a predefined number per geographic region was included. All recruited patients were antiretroviral drug naive at the time of sampling, and drug resistance genotyping was performed in the national reference laboratories as described before [25,26]. Details about the study design were reported previously and can be found on the website (http://www.esar-society.eu/) [25,26].

Sample collection
Although the sampling strategy was carefully designed to include different countries and risk groups representatively, some discrepancies in sampling numbers between countries were unavoidable, especially where a representative sample had to be attained in small countries. To assess the impact of such oversampling in some countries, the weighted proportion of new diagnoses was calculated. The HIV infection rate per country (number of yearly newly diagnosed HIV-1 cases per inhabitant) was obtained for each included country (ECDC Report 2004, derived from infection rate per million) [33], and we estimated which percentage of infected patients was sampled in each country. Since the collection period was 39 months, we adjusted the number of samples to a 12-month period. Mathematically, the proportion of the HIV infected that was sampled corresponds to the number of samples from a country, rescaled to a 12-month period, divided by the number of yearly newly diagnosed cases in that country. Since the percentage of infected inhabitants sampled varied between countries, the counts of subtypes of each country were weighted accordingly in the determination of the proportion of newly diagnosed patients with different subtypes in Europe. Although there were some significant differences, the overall conclusions of our analysis did not change. Therefore, the unweighted proportion is given unless specified otherwise, and all weighted analyses can be found in supplementary material.
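The weighting described above can be illustrated with a minimal sketch. It assumes that "weighted accordingly" means weighting each country's subtype counts by the inverse of its sampled fraction, which is one common choice and is not confirmed by the text; the function names, argument names and the use of a population figure to convert a rate per million into a yearly case count are likewise assumptions made for illustration.

```python
def sampled_fraction(n_samples, months_collected, yearly_rate_per_million, population):
    """Fraction of one year's newly diagnosed cases that ended up in the sample."""
    # Rescale the number of samples to a 12-month period.
    annual_samples = n_samples * 12.0 / months_collected
    # Convert the yearly diagnosis rate per million inhabitants into a case count.
    yearly_new_diagnoses = yearly_rate_per_million * population / 1_000_000
    return annual_samples / yearly_new_diagnoses


def weighted_proportions(countries):
    """Pool subtype counts over countries, weighting by 1 / sampled fraction.

    countries: list of dicts with keys 'fraction' (float) and
    'counts' (dict mapping subtype name -> number of sequences).
    Inverse-fraction weighting is an assumption, not the authors' stated method.
    """
    totals = {}
    for country in countries:
        weight = 1.0 / country["fraction"]
        for subtype, n in country["counts"].items():
            totals[subtype] = totals.get(subtype, 0.0) + weight * n
    grand_total = sum(totals.values())
    return {subtype: value / grand_total for subtype, value in totals.items()}
```

With this choice, a country in which a large share of new diagnoses was sampled contributes less per sequence than a country in which only a small share was sampled.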
Reports generated by the subtyping tool were individually viewed and a csv formatted file with the results was downloaded.

Manual subtyping
For sequences that were too complex for the REGA subtyping tool to assign to a subtype or CRF automatically, and that were therefore classified as 'Unassigned', a manual subtyping procedure was used. In this procedure, the sampled sequences were aligned against reference sequences of all pure subtypes and the reference sequences of the first 14 CRFs, using the reference set as described in the Los Alamos database. Although 51 CRFs have been described, CRFs 15 to 51 are not responsible for important epidemics and a BLAST search indicated none were present among our data. Therefore, we decided to leave them out of the phylogenetic analysis. The multiple alignment was generated using ClustalW [39] and manually edited with Se-Al v2.0 [40]. The sequences were then tested for evidence of recombination using the bootscan plot as implemented in Simplot v3.5.1. Phylogenetic analyses were performed with and without including CRF reference sequences in the datasets. The putative recombination pattern was confirmed by separate phylogenetic analysis of the fragments with different evolutionary history. A genomic region was assigned to a certain HIV subtype if it clustered with reference sequences of this subtype and this clustering was supported by bootstrap values higher than 70%.

Statistical analysis
Potential associations between demographic and other parameters (area of transmission, ethnicity, etc.) and the distribution of B and non-B subtypes were statistically analysed. The SPREAD questionnaire included information about gender, age, risk factor, continent and country of origin, country of sampling and country where infection was acquired. The univariate association between these factors and the proportion of B vs non-B, C vs non-C, A1 vs non-A1, G vs non-G, CRF01_AE vs non-CRF01_AE, CRF02_AG vs non-CRF02_AG subtypes and URFs vs non-URFs was tested using the chi-square test. The p-values of the chi-square test were calculated using R [41]. The Holm-Bonferroni method was used to check whether multiple testing could lead to a false rejection of the null hypothesis (type I error). The odds ratio (OR) and the 95% confidence interval (CI) of the OR were calculated using a small script in Microsoft Excel. A multivariate analysis was also done with stepwise logistic regression, using as start variables: a) all variables; b) only the variables that were statistically significant in the previously described univariate analysis (p ≤ 0.05). Both binary and multinomial logistic regression were performed in R.
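The univariate steps described above can be sketched for a single 2x2 table (for example, subtype B vs non-B in two groups). This is an illustrative sketch only: the analysis itself was run in R and Excel, the chi-square test here is the Pearson statistic without continuity correction, and the confidence interval uses the Woolf (log-scale) approximation, neither of which is specified in the text. All function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI computed on the log scale (Woolf)."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Step-down procedure: compare the k-th smallest p-value to alpha / (m - k).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are also not rejected
    return reject
```

The odds ratio function assumes that no cell of the table is zero; tables with empty cells would need a correction that is not shown here.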
Bayesian networks (BN) were run for variables that were significantly associated with the distribution of HIV-1 subtype/CRF in the univariate analysis. A BN is a probabilistic graphical model that illustrates the relationships among a set of variables. These relationships (dependencies) are defined by a set of nodes that represent the variables and a set of arcs that represent direct/unconditional dependencies between two variables in the dataset. The lack of an arc between two variables represents a conditional independency, meaning that these two variables are only dependent through another variable [42]. This analysis makes it possible to map the interdependence of the analysed parameters, unveiling direct and indirect associations with the HIV-1 subtype/CRF. The best BN that models the observed correlations is determined by a scoring metric (a trade-off between model complexity and accuracy), and we used a Bayesian metric that considers the most probable network as the best one (maximizing the posterior probability of the model given the data). Since an exact search is computationally infeasible, we used simulated annealing as the search heuristic. We then used non-parametric bootstrap resampling with 100 replicates to assess how strongly the data support each arc of the most probable network. A bootstrap support of 70% was used as the cut-off to assign reliable arcs. To remove the bias caused by variable instances that are present in very few patients (less than 1%), we combined those instances into a single instance called 'Others'. This procedure was done using the preprocessing filter available in the WEKA software.
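Two of the bookkeeping steps described above, merging rare variable instances into 'Others' and tallying bootstrap support per arc, can be sketched as follows. The sketch does not implement the network search itself (scoring and simulated annealing); it assumes that the arcs learned in each bootstrap replicate are already available as collections of (parent, child) tuples. The function names and this data layout are hypothetical and do not reflect the WEKA filter or the authors' software.

```python
from collections import Counter


def merge_rare_levels(values, min_fraction=0.01, label="Others"):
    """Replace instances seen in fewer than min_fraction of records by one label."""
    counts = Counter(values)
    n = len(values)
    return [v if counts[v] / n >= min_fraction else label for v in values]


def arc_support(replicate_arcs):
    """Fraction of bootstrap replicates in which each arc was recovered.

    replicate_arcs: one collection of (parent, child) tuples per replicate.
    """
    tally = Counter(arc for arcs in replicate_arcs for arc in set(arcs))
    return {arc: k / len(replicate_arcs) for arc, k in tally.items()}


def reliable_arcs(support, cutoff=0.70):
    """Arcs whose bootstrap support reaches the cut-off."""
    return {arc for arc, s in support.items() if s >= cutoff}
```

Treating arcs as directed tuples is a simplification; if arc direction is not of interest, each tuple could be sorted before tallying.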
Finally, to test for time trends in subtype distribution, we used the Cochran-Armitage test as implemented in the prop.trend.test function of the stats package in R.
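For readers without access to R, the chi-square form of the Cochran-Armitage trend test can be sketched in a few lines. This is an illustrative re-implementation under the usual textbook formula, not the code used in the study; the default scores 1, 2, 3, ... for consecutive years and the function name are assumptions.

```python
import math


def cochran_armitage(events, totals, scores=None):
    """Chi-square statistic (1 df) and p-value for a linear trend in proportions.

    events: number of diagnoses with the subtype of interest per year.
    totals: total number of diagnoses per year.
    scores: numeric score per year (defaults to 1, 2, 3, ...).
    """
    if scores is None:
        scores = list(range(1, len(events) + 1))
    n = sum(totals)
    p_bar = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted deviation of observed from expected event counts.
    t = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    sum_x = sum(m * s for s, m in zip(scores, totals))
    sum_xx = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (sum_xx - sum_x * sum_x / n)
    chi2 = t * t / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

The sign of the score-weighted deviation, which this sketch does not return, would indicate whether the trend is increasing or decreasing.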

Additional files
Additional file 1: Figure S1. Percentage of samples of the dataset sampled in each country (dark grey) (see methods for details on calculations involved) and percentage of infected inhabitants that was sampled in each country (white) as reported by the ECDC-UNAIDS in the 2004 report. AT - Austria, BE - Belgium, CY - Cyprus, DK - Denmark, FI - Finland, DE - Germany, GR - Greece, IE - Ireland, IT - Italy, LU - Luxembourg, NL - Netherlands, NO - Norway, PL - Poland, PT - Portugal, SI - Slovenia, ES - Spain, SE - Sweden, CS - Serbia, CZ - Czech Republic, SK - Slovakia, IL - Israel.
Additional file 2: Figure S2. Prevalence of subtypes for the complete dataset of patients not adjusted (left bars), for the complete dataset of patients adjusted according to size of the sample with respect to the epidemic (middle bars) and for the set of patients originating from SPREAD countries (right bars). Legend presents percentage values and 95% confidence intervals for each bar. See methods for details on the procedure to adjust for sampling bias. Asterisks indicate statistically significant differences in the prevalence of a certain subtype (p<0.05) when comparing the complete dataset and the dataset including only patients originating from SPREAD countries.
Table S3. Goodness of fit for the logistic model, odds ratio with confidence interval and p-values for the association between HIV-1 subtype prevalence and demographic parameters. Associations were calculated using binomial logistic regression (see methods for details). Sub-Saharan Africa - SSA, South and South-East Asia - SS EA, Eastern Europe and Central Asia - EE CA, Western Europe - WE.
Table S4. Goodness of fit for the logistic model, odds ratio with confidence interval and p-values for the association between HIV-1 subtype prevalence and demographic parameters. Associations were calculated using multinomial logistic regression (see methods for details).
Table S5. List of countries included in each continent region.