HTLV-I antisense transcripts initiating in the 3'LTR are alternatively spliced and polyadenylated

Background Antisense transcription in retroviruses has been suggested for both HIV-1 and HTLV-I, although the existence and coding potential of these transcripts remain controversial. Thorough characterization is required to demonstrate the existence of these transcripts and gain insight into their role in retrovirus biology. Results This report provides the first complete characterization of an antisense retroviral transcript that encodes the previously described HTLV-I HBZ protein. In this study, we show that HBZ-encoding transcripts initiate in the 3' long terminal repeat (LTR) at several positions and consist of two alternatively spliced variants (SP1 and SP2). Expression of the most abundant HBZ spliced variant (SP1) could be detected in different HTLV-I-infected cell lines and, importantly, in cellular clones isolated from HTLV-I-infected patients. Polyadenylation of HBZ RNA occurred 1450 nucleotides downstream of the HBZ stop codon, in close proximity to a typical polyA signal. We have also determined that translation mostly initiates from the first exon located in the 3' LTR and that the HBZ isoform produced from the SP1 spliced variant inhibited Tax- and c-Jun-dependent transcriptional activation. Conclusion These results conclusively demonstrate the existence of antisense transcription in retroviruses, which likely plays a role in HTLV-I-associated pathogenesis through HBZ protein synthesis.


Background
Natural antisense transcription has been described in several eukaryotic organisms and has been ascribed several functions [1][2][3]. Retroviruses have long been thought to lack antisense transcription and to rely on a single sense transcript for viral gene expression. Unspliced and spliced sense transcripts are thought to produce all viral proteins required for replication and survival in the infected host. Although a few studies have suggested that retroviruses might produce antisense transcripts with coding potential [4][5][6][7][8][9][10], the existence of such atypical RNAs has not been conclusively demonstrated. The recent identification of the HBZ (HTLV-I bZIP) protein, surprisingly encoded on the antisense strand of human T-cell leukemia virus type I (HTLV-I), revived the hypothesis that antisense transcription exists among retroviruses [11].
HTLV-I is the etiological agent of adult T cell leukemia/lymphoma (ATLL) and HTLV-I-associated myelopathy (also termed tropical spastic paraparesis) (HAM/TSP) [12][13][14][15][16][17]. In the sense strand, the HTLV-I genome encodes typical retroviral proteins as well as other more HTLV-I-specific proteins, such as Tax. The viral Tax protein has been suggested to play an important role in the diseases occurring in HTLV-I-infected patients. Tax is an important transactivator and acts upon HTLV-I gene expression by promoting the formation of protein complexes involving CREB and the CREB-binding protein (CBP) on the TRE1 regions present in the HTLV-I long terminal repeat (LTR) promoter region.
Upon its discovery, the HBZ-coding region was shown to be located between Tax exon 3 and Env exon 2 in the antisense strand (see Fig. 1A) [11]. The HBZ protein possesses peculiar functions, which suggest that this viral protein could have a potential impact on HTLV-I-associated pathogenesis. Specifically, the HBZ protein can inhibit Tax activation of both AP-1 function and HTLV-I LTR-mediated gene expression through various protein-protein interactions [11,18,19,20]. A recent study by Arnold et al. [21] has demonstrated that, although HBZ was dispensable for viral replication in cell culture, persistence of HTLV-I in inoculated rabbits was enhanced by HBZ.
Although several reports have characterized functions of the HBZ protein, the structure of its transcript and the mechanisms behind HBZ gene regulation remain poorly defined. Complete characterization of the HBZ transcript is critical to conclusively demonstrate that antisense transcription is a mechanism of retroviral gene expression.
In this report, we have focussed on the characterization of the HBZ-encoding antisense transcript produced from the HTLV-I genome. Our results show that HBZ-encoding transcripts initiate in the 3' LTR, are polyadenylated and are alternatively spliced. Furthermore, the HBZ isoform produced from the most abundant spliced form possesses functional properties similar to those previously attributed to the original HBZ isoform. These results will strongly impact the field of retrovirology, being the first clear demonstration of the existence of antisense transcription in retroviruses.

Detection of the antisense transcript in transfected 293T cells and HTLV-I-infected cell lines
The identification of the HBZ gene has raised several important issues regarding the various mechanisms governing retroviral gene expression. Its atypical positioning in the HTLV-I genome (Fig. 1A) warranted further investigation and a more thorough characterization of the HBZ-encoding RNA was thus conducted.
Our first objective was to specifically demonstrate by RT-PCR that HTLV-I indeed produces antisense transcripts. Negative controls were carefully selected to avoid previously reported autopriming artifacts that can occur during the reverse transcription step of RT-PCR analysis [7,22]. RT reactions were performed either without primer (control for autopriming) or with a primer complementary to the deduced HBZ ORF sequence (see Fig. 1A). Additional controls included RNA samples in which the RT step had been omitted prior to PCR amplification. Using these controls, RT-PCR analyses were first performed using two sets of PCR primers specific for the HBZ-coding sequence. As demonstrated in Fig. 1B lanes 5 and 6, antisense HBZ transcripts were observed in all HTLV-I-infected cell lines tested, while similar signals were not observed in the various controls. To confirm the above results, RT-PCR analyses were next conducted in 293T cells transfected with the HTLV-I K30 molecular proviral DNA clone (Fig. 2A-B). The expected signal (although weak) was observed in transfected 293T cells. As demonstrated in lane 3 (Fig. 2B), autopriming was however apparent in K30-transfected 293T cells, likely due to high levels of sense RNA that is reverse transcribed independently of the HBZ-specific primer. To eliminate this artefact, sense transcription from the K30 proviral DNA was knocked out by deletion of the 5' end of the proviral genome (Fig. 2A-C). The resulting K30-3'/5681 construct was then transfected into 293T cells. RT-PCR analyses showed a stronger antisense-derived signal and no autopriming signal was observed, suggesting that sense RNAs were the source of the contaminating autopriming signal.
These results clearly demonstrated the existence of an antisense transcript in HTLV-I, which included the HBZ sequence. The use of HTLV-I proviral DNA clones and of infected cell lines demonstrated that a wide range of HTLV-I clones is capable of producing this transcript. Furthermore, data from 293T cells transfected with the 5' LTR-deleted proviral DNA construct also argued that sense transcription could impede antisense transcription, which might be expected.
These results hence demonstrated that the HBZ transcript initiated in the 3' LTR at multiple positions. This multiplicity of initiation sites might be a consequence of the absence of TATA boxes at close distance. Our results parallel the data presented on the localisation of the transcription initiation sites specific for HIV-1 antisense transcripts, which were near or in the 3' LTR region [6,7]. As for HIV-1, the positioning of the transcription initiation sites suggests that the promoter region for HTLV-I antisense transcription is present in the 3' LTR region, as initially suggested by Larocca et al. [4]. Further investigations are required to determine the mechanism of regulation of this promoter region and to evaluate the possible involvement of adjacent cellular DNA in these regulatory mechanisms.

HBZ transcripts are alternatively spliced
The sequencing of the 5'RACE products provided more information regarding the HBZ transcript. Indeed, the sequence data allowed us to demonstrate that alternative splicing of the RNA encoding HBZ was occurring. The antisense transcript initiating within the 3' LTR is spliced at two different positions (367 and 227 of the antisense strand) and joined to an internal region of the HBZ ORF at position 1767 (Fig. 4A). These HBZ RNA variants, which are referred to as spliced RNA 1 (SP1) and spliced RNA 2 (SP2), differ in the size of their exon 1 leading to an intronic region of 1400 nt and 1540 nt, respectively. Results of 5'RACE further suggested that the SP1 variant occurs more frequently than SP2.
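The intron sizes quoted above follow directly from the reported splice coordinates. The short sketch below simply reproduces that arithmetic; the function and variable names are illustrative, and the calculation assumes, as the text appears to, that intron length is taken as the difference between the acceptor and donor positions on the antisense strand.

```python
# Splice coordinates on the antisense strand, as reported in the text.
SPLICE_ACCEPTOR = 1767
SPLICE_DONORS = {"SP1": 367, "SP2": 227}


def intron_length(donor: int, acceptor: int) -> int:
    """Intron size taken as acceptor position minus donor position."""
    return acceptor - donor


# Compute the intron size for each spliced variant.
intron_lengths = {
    variant: intron_length(donor, SPLICE_ACCEPTOR)
    for variant, donor in SPLICE_DONORS.items()
}

print(intron_lengths)  # {'SP1': 1400, 'SP2': 1540}
```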
Another important feature of the SP1 RNA was the presence of the splice acceptor downstream of the AUG initiation codon initially suggested by Gaudray et al. [11]. However, further analysis of the SP1 RNA sequence originating in the 3' LTR revealed a new in-frame AUG initiation codon that permits proper initiation of HBZ translation (Fig. 4B). In contrast, no in-frame AUG was [...] proteins [23]. Amino acid sequence changes introduced limited variation in overall amino acid composition between these two potentially new HBZ isoforms and the previously published HBZ amino acid sequence [11]. For example, seven amino acids from the amino terminus of the original HBZ isoform would be substituted by four amino acids in the SP1-encoded isoform.
[Figure label: SD consensus sequence MAGGTRAGT]
Sequence analysis of the HTLV-I K30 proviral DNA revealed typical splice donor (SD) and splice acceptor (SA) consensus sequences at each end of the presumed intronic sequence for the predicted splice junction of both HBZ SP1 and SP2 RNAs (Fig. 5). Comparison with other HTLV-I sequences demonstrated strong conservation of the splice acceptor (Fig. 5A). Comparison of the SP1 SD sequence further indicated that this sequence was highly conserved in all HTLV-I and simian STLV-I LTR sequences analysed (Fig. 5B). In these sequence comparisons, it was noted that certain HTLV-I isolates in fact had a better match to the consensus sequence than the corresponding SD or SA sequence from the K30 proviral DNA clone. The SP2 SD sequence was also highly conserved among the various HTLV-I isolates, although certain isolates did present non-consensus SD sequences in this region (Fig. 5C and data not shown). In addition, comparison of LTR sequences from other HTLV-I and STLV-I isolates demonstrated a high degree of conservation within the predicted amino terminal sequences for both new HBZ isoforms (Fig. 5B-C).
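The notion of one isolate being a "better match" to the splice donor consensus than another can be made concrete by counting compatible positions. The following is a minimal sketch, assuming the SD consensus MAGGTRAGT shown in the figure and the standard IUPAC ambiguity codes (M = A or C, R = A or G). The candidate sites are hypothetical strings chosen for illustration; they are not HTLV-I sequences, and this scoring is not the method used in the study.

```python
# IUPAC codes needed for the consensus: M = A or C, R = A or G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

# Splice donor consensus as given in the figure label.
SD_CONSENSUS = "MAGGTRAGT"


def consensus_score(site: str, consensus: str = SD_CONSENSUS) -> int:
    """Count the positions of `site` that are compatible with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site.upper(), consensus))


# Hypothetical candidate sites, for illustration only.
candidates = {
    "perfect": "CAGGTAAGT",
    "one_mismatch": "CAGGTACGT",
    "weak": "TTGGTCCGA",
}
scores = {name: consensus_score(seq) for name, seq in candidates.items()}

print(scores)  # {'perfect': 9, 'one_mismatch': 8, 'weak': 4}
```

A position-by-position count treats every mismatch equally; a position weight matrix would be a natural refinement if position-specific frequencies were available.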
To demonstrate that both HBZ splice variants existed in HTLV-I-infected and transfected cells, RT-PCR analysis was performed on isolated RNA with the forward primer 20-19 derived from the transcribed spliced 3' LTR and the reverse primer 21-5 located downstream of the identified splice acceptor (see Fig. 4A). This RT-PCR strategy was expected to generate a 684 bp signal for the HBZ SP1 RNA and a 544 bp signal for the HBZ SP2 RNA. Indeed, for both tested HTLV-I-infected cell lines, i.e. C8166-45 and MJ, an amplified signal of the expected size for SP1 was present (Fig. 4C). However, the SP2 variant was only weakly detected in these infected cell lines. Similar analyses conducted in 293T cells transfected with K30, K30-3'/5681 and a different proviral DNA clone, i.e. ACH, amplified the spliced HBZ SP1 and SP2 templates (very faint for SP2). Because of nucleotide sequence variation in the LTR region complementary to primer 20-19, the forward primer 20-27 (similar to the 20-19 primer, but with nucleotide sequence specificity for ACH) was used for RT-PCR analyses of ACH-transfected cells. To further demonstrate the existence of these spliced transcripts, the detection of HBZ spliced variants was evaluated in cell clones derived from HTLV-I-infected individuals (Fig. 4D). Taking into consideration the variability among HTLV-I isolates in the LTR region, primers from the HBZ-coding sequence that encompass the highly conserved splice junctions of SP1 and SP2 were used to detect antisense transcripts. Analysis of amplified products indeed demonstrated expression of the HBZ SP1 RNA variant in certain cell clones while other clones appeared negative. As a control, HTLV-I-infected MT4 cells were similarly analyzed and demonstrated amplification of the expected band. However, no signals were observed with primers overlapping the SP2 splice junction (data not shown).
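The two expected amplicon sizes are consistent with the reported splice donor positions: the 140 bp difference between the 684 bp and 544 bp products equals the difference between the SP1 and SP2 donor coordinates (367 and 227). The sketch below checks this, assuming that the same primer pair amplifies both variants and that the forward primer lies upstream of both donors; the function name is illustrative.

```python
# Values reported in the text.
SP1_AMPLICON_BP = 684
SP1_DONOR = 367
SP2_DONOR = 227


def predicted_sp2_amplicon(sp1_amplicon: int, sp1_donor: int, sp2_donor: int) -> int:
    """SP2 exon 1 ends earlier than SP1 exon 1, so the product is shorter
    by the distance between the two donor positions (same primer pair assumed)."""
    return sp1_amplicon - (sp1_donor - sp2_donor)


sp2_amplicon = predicted_sp2_amplicon(SP1_AMPLICON_BP, SP1_DONOR, SP2_DONOR)

print(sp2_amplicon)  # 544
```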
These data thereby provide evidence for the existence of splicing events occurring in the HTLV-I antisense transcripts. A recent study has also confirmed the spliced nature of the HBZ RNA, having demonstrated the existence of the SP1 HBZ transcript [24]. In our study, we further suggest that, although the SP1 RNA variant represents the most abundant transcript, other spliced variants could exist (such as SP2). We have also importantly demonstrated that the SP1 RNA variant is present in patient-derived cell clones and that, unlike in the study of Satou et al. [24], not all tested cell clones were positive for HBZ expression.
Although more data are needed to understand the significance of these findings, these data might be indicative of a possible relationship between lack of HBZ expression and disease outcome. Furthermore, it is possible that the various identified HBZ RNA variants might contribute differently to HBZ protein synthesis. However, our PCR analysis has not permitted us to detect unspliced HBZ RNA in HTLV-I-infected cells or transfected 293T cells. Obviously, the PCR protocol used above favours the shorter PCR fragments derived from spliced HBZ RNA. Nonetheless, the formerly described HBZ isoform [11] could be produced from unspliced HBZ RNA, although additional mechanisms might be needed for proper translation to occur from the resulting long 5' untranslated region of such a transcript. It cannot be excluded that other splice variants also exist and contribute to post-transcriptional regulation of HBZ expression. Further experiments are presently underway to clearly establish if these other transcripts are indeed produced in infected cells.

Positioning of the polyA addition site
We next sought to demonstrate that the HBZ transcript was polyadenylated. A potential polyA signal has previously been suggested to direct the addition of a polyA tail to the 3' end of the HTLV-I antisense transcript [4]. Therefore, a variant of the K30-3'/5681 construct that includes this potential polyA signal was generated (K30-3'/4089). This new construct and the ACH proviral DNA were transfected into 293T cells. An SP1-derived signal was observed in both transfected cells following analysis of total RNA or mRNA using the RT-PCR approach described above (Fig. 6A), thereby demonstrating that this transcript was polyadenylated. The SP2-specific band was generally too weak to be easily detected in these analyses. The polyA addition site was precisely mapped using 3'RLM-RACE to specifically amplify the 3' end of polyadenylated RNA. RNA extracted from 293T cells transfected with K30 or from HTLV-I-infected MJ cells was used for the 3'RACE analysis. Initial analysis using a primer positioned downstream of the HBZ stop codon amplified a 600 bp fragment from both RNA samples (Fig. 6B). Sequencing of this fragment demonstrated that the polyA tail was positioned 1450 nt from the HBZ stop codon. The polyA addition site was located in a UA dinucleotide positioned 22 nucleotides downstream of the previously suggested polyA signal and a few nucleotides from a GU-rich segment, another typical [...] (Fig. 6C). These consensus sequences were highly conserved among other HTLV-I proviral DNAs (Fig. 6D).
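The spatial relationship described above (a polyA signal followed by a UA cleavage dinucleotide about 22 nt downstream) can be illustrated with a simple sequence scan. This is a sketch under stated assumptions, not the mapping procedure used in the study, which relied on 3'RLM-RACE and sequencing. It assumes the canonical AAUAAA hexamer (the text only refers to a "typical" signal), searches a 10-30 nt window downstream of it, and counts the gap as the number of nucleotides between the end of the hexamer and the UA dinucleotide. The example sequence is synthetic.

```python
import re

# Canonical polyA signal in DNA notation (assumed; the text says "typical").
POLYA_SIGNAL = "AATAAA"


def polya_candidates(seq: str, min_gap: int = 10, max_gap: int = 30):
    """Return (signal_start, cleavage_pos, gap) for each TA dinucleotide found
    min_gap..max_gap nucleotides after the end of a polyA signal (0-based)."""
    seq = seq.upper().replace("U", "T")
    hits = []
    # Zero-width lookahead so that overlapping signals are also reported.
    for match in re.finditer(f"(?={POLYA_SIGNAL})", seq):
        signal_end = match.start() + len(POLYA_SIGNAL)
        for gap in range(min_gap, max_gap + 1):
            pos = signal_end + gap
            if seq[pos:pos + 2] == "TA":
                hits.append((match.start(), pos, gap))
    return hits


# Synthetic example: signal, 22-nt spacer, TA dinucleotide, GU-rich tail.
example = "GGC" + "AATAAA" + "C" * 22 + "TA" + "GTGTGT"
result = polya_candidates(example)

print(result)  # [(3, 31, 22)]
```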
These results hence allowed us to identify the 3' end of the spliced HBZ transcript. Taking into account the results of Fig. 4, we predict the size of the more abundant HBZ SP1 transcript to be 2.4 kb. This characterization of the HTLV-I antisense transcript hence agrees with previous findings of Larocca et al., who detected a 2.5 kb antisense transcript [4]. Our results also confirm the Northern blot data of this former study as to the possible existence of an intron at a similar position in the antisense transcript of HTLV-I. Furthermore, the presence of the 3' untranslated region might suggest a potential role for this region in post-transcriptional regulation of HBZ expression. Further experiments will be needed to assess this possibility.

Synthesis of the various HBZ isoforms
Based on our data demonstrating the existence of differently spliced HBZ RNA, different HBZ isoforms could be expressed in HTLV-I-infected cells. However, the HBZ SP2 RNA appeared as a weak signal and depended on a non-AUG initiation codon. To confirm the translation of both isoforms, complete cDNAs (including the 5' untranslated region determined from our 5'RLM-RACE data) were amplified for each splice variant and tagged with the Myc epitope by cloning into the pcDNA3.1-Myc-His A expression vector. These constructs, and a vector expressing the originally published HBZ isoform [20], were transfected into 293T cells and detected by Western blot with a mouse anti-Myc antibody. Both new HBZ isoforms were detected in transfected 293T cells and the HBZ isoform produced from the SP1 cDNA had a lower molecular weight than either the original or the SP2 HBZ isoforms (Fig. 7). Although the position of the initiation codon was not determined for the HBZ SP2 isoform, the estimated size of the protein suggested that translation initiation occurred within exon 1. Immunofluorescent analysis of the transfected cells demonstrated nuclear localization of the two new HBZ isoforms, as described for the original HBZ protein (data not shown) [26].
The importance of splicing events for HBZ protein synthesis was next determined by generating a K30-3'/5681 construct (termed K30-3'-asLUC) in which the sequence downstream of the splice acceptor was replaced with an SV40 polyA signal and the luciferase reporter gene positioned in frame with the rest of the HBZ amino acid sequence. This construct provided a reliable and sensitive tool for quantification of HBZ transcription. Using the wild-type or a SA-mutated version of K30-3'-asLUC, the importance of the SA consensus sequence was then assessed by co-transfection experiments. Results presented in Fig. 8A indicated that mutation of the splice acceptor significantly reduced luciferase activity below that of the wild type vector in transfected 293T cells. RT-PCR analyses using primers derived from the luciferase gene and the 3' LTR confirmed the production of a spliced RNA from the wild type construct while no specific signals were observed in RNA samples from cells transfected with the mutated K30-3'-asLUC vector (Fig. 8B).
To confirm these data and extend our analyses to other splice consensus sequences and to the two possible AUG initiation codons, mutations of the K30-3'/4089 construct specifically targeting SD/SA consensus sequences, as well as both putative AUG translation initiation codons, were generated (Fig. 8C). Following transfection of wild-type and mutated K30-3'/4089 constructs into 293T cells, the HBZ protein was detected by Western blot (Fig. 8D). Significantly less HBZ protein was detected when the proviral DNA was mutated in the SA or SP1 SD sequence, or the SP1-specific AUG, suggesting that SP1 mRNA is important for HBZ protein synthesis. On the other hand, mutation of the intronic AUG or the SP2 SD sequence had little impact on HBZ protein levels. Interestingly, transfection of 293T cells with a vector expressing the original HBZ isoform produced HBZ protein of a higher molecular weight than K30 HBZ protein, which may depend on the presence of the Myc tag and differences in the amino terminus. These data indeed suggested the possible existence of different HBZ isoforms. In agreement with our RT-PCR analysis, our results suggest that the SP1 RNA-translated HBZ isoform contributes importantly to overall HBZ protein synthesis. It should be noted that, in our Western blot analyses, a constant shift in migration of the SP1-derived isoform is observed when compared to the other HBZ isoforms. Although these results are unexpected given the small differences in amino acid composition between the various HBZ isoforms, we could speculate that the SP1 isoform is differently modified at a post-translational level, which would then account for these variations. Further experiments are needed to address this issue.

Functional properties of the SP1 RNA-derived HBZ isoform
Since these data suggested that the HBZ SP1 mRNA was the most abundant HBZ transcript and contributed significantly to HBZ protein synthesis, we next determined whether the SP1-encoded HBZ protein had similar effects on transcription as described for the original HBZ protein [11,18,19]. The effect of the HBZ SP1 isoform on HTLV-I LTR activity was tested in the context of a complete proviral DNA containing a luciferase reporter gene inserted in frame with the envelope amino acid sequence. Transfection of the SP1 expression vector into 293T cells significantly reduced luciferase activity (Fig. 9A). The effect of the HBZ SP1 isoform on c-Jun-dependent transcriptional activation was also evaluated by co-transfecting CEM cells with HBZ SP1 and c-Jun expression vectors along with a collagenase promoter driving luciferase gene expression. The HBZ SP1 expression vector strongly reduced c-Jun-mediated induction of luciferase activity (Fig. 9B), arguing strongly that the SP1-derived HBZ isoform possesses a transcriptional inhibitory function similar to the original HBZ isoform. These data again reinforce the notion that the major HBZ isoform acts similarly to the originally described HBZ isoform and might thus play an important role in HTLV-I latency.
In this study, we have thoroughly characterized the antisense transcripts produced from the HTLV-I retrovirus and responsible for the synthesis of the previously described HBZ protein. Using different RT-PCR approaches, our results first demonstrated that antisense transcripts could be detected in HTLV-I-infected cell lines and 293T cells transfected with proviral DNA and initiated in the R and U5 segments of the LTR. Transcripts were alternatively spliced at a varying frequency and produced two new isoforms with translation initiating in exon 1, at least for the most abundant variant. The polyA site was positioned at a distance of 1450 nt from the HBZ stop codon and occurred next to known polyA signals. Mutation experiments also showed the importance of the SP1 mRNA for HBZ protein synthesis. Transfection experiments also indicated that the isoform produced from HBZ SP1 mRNA suppressed AP-1- and Tax-dependent transcriptional activation.
Our results strongly argue that the major spliced antisense transcript is responsible for producing the HBZ protein. However, the minor spliced form and the unspliced HBZ transcript may be important sources of HBZ expression in other cellular contexts or states. More data are needed to confirm that the SP2 transcript is indeed produced in several other HTLV-I-infected cells and that both SP2- and unspliced-derived HBZ isoforms can be detected at the protein level in infected cells. In light of the possible existence of multiple HBZ RNA variants, it could then be postulated that transcriptional and post-transcriptional mechanisms might regulate HBZ mRNA and protein levels and drive the type of transcript (and isoform) being produced. These mechanisms might involve other HTLV-I viral proteins. Regulation of HBZ protein levels and functions will likely modulate HTLV-I latency and pathogenesis. Detection of varying levels of the major spliced form of HBZ RNA in several cellular clones isolated from infected patients (even in the same patient) is highly relevant in this regard. Future investigations will need to address the different mechanisms regulating HBZ protein synthesis.

Conclusion
Our study has an important impact on the field of retrovirology in general. These data provide the strongest evidence for the existence of retroviral antisense transcripts, which have previously been seen as potential artefacts. It is likely that antisense transcripts are also produced in other retroviruses (human and non-human) and could encode proteins, as previously proposed for HIV-1 and FIV [5,8,22,27]. Based on our data, further studies on antisense transcription are warranted, specifically in complex retroviruses. The presence of one or more potentially new genes in these transcripts would provide important new insights into retroviral regulation and function, resulting in a more complete understanding of these viruses. It will be of great interest to determine whether regulatory processes linked to antisense transcription are active in HTLV-I, such as the antisense effect previously suggested for these transcripts in HIV-1 [28,29].

Transfection and gene reporter assays
293T cells were transfected with 5-10 µg of DNA using the calcium phosphate protocol as previously described [33]. CEM cells were transfected according to a previously described protocol [34]. In transfection experiments with K30-LUC or collagenase promoter-luciferase vectors, the pcDNA3.1-Myc-His A empty vector was used to standardize DNA quantity between transfection samples. Transfected cells were lysed 48 hours post-transfection in a lysis buffer (25 mM Tris phosphate, pH 7.8, 2 mM DTT, 1% Triton X-100, 10% glycerol) and luciferase activity was read with the MLX microplate luminometer (Dynex Technologies) with a single injection of a luciferase buffer (20 mM tricine, 1.07 mM (MgCO3)4·Mg(OH)2·5H2O, 2.67 mM MgSO4, 0.1 mM EDTA, 220 µM Coenzyme A, 4.7 µM D-Luciferin potassium salt, 530 µM ATP, 33.3 mM DTT). Each sample was co-transfected with a β-gal-expressing vector for normalisation. The β-galactosidase activity was measured using the Galacto-Light™ kit (Applied Biosystems, Bedford, MA) according to the manufacturer's suggestions. Luciferase activities are presented in Relative Light Units (RLU) and represent the calculated mean ± SD of three transfected samples normalised by the measured β-galactosidase activity.
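The normalisation described above (luciferase signal divided by β-galactosidase activity, reported as mean ± SD of three transfected samples) can be sketched as follows. The readings are hypothetical numbers chosen for illustration, the function name is illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def normalised_rlu(luciferase, beta_gal):
    """Normalise each luciferase reading by its matched beta-gal reading and
    return (mean, sample SD) of the normalised values."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a matched beta-gal reading")
    ratios = [luc / gal for luc, gal in zip(luciferase, beta_gal)]
    return mean(ratios), stdev(ratios)


# Hypothetical triplicate readings (arbitrary units), for illustration only.
luciferase_readings = [1200.0, 1500.0, 1800.0]
beta_gal_readings = [100.0, 100.0, 100.0]

avg, sd = normalised_rlu(luciferase_readings, beta_gal_readings)
print(f"{avg:.1f} ± {sd:.1f}")  # 15.0 ± 3.0
```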

Western blot analysis
Transfected 293T cells were lysed and total protein or nuclear extracts were prepared as previously described [26,35]. Equal quantities of extracts were run on a 12% SDS-PAGE gel and transferred to PVDF membranes (Millipore). The blot was next blocked in 1X PBS/5% milk and incubated with a mouse anti-Myc 9E10 antibody (dilution 1/1000) or anti-HBZ antiserum (dilution 1/1000). After several washes, signals were revealed by the addition of peroxidase-conjugated goat anti-mouse IgG (dilution 1/2000) or goat anti-rabbit IgG (dilution 1/10000) antibodies and subsequent incubation with the ECL reagent (Amersham Pharmacia Biotech). Membranes were exposed on Hyperfilm ECL (Amersham Pharmacia Biotech).