Use of a multi-virus array for the study of human viral and retroviral pathogens: gene expression studies and ChIP-chip analysis

Background Since the discovery of human immunodeficiency virus (HIV-1) twenty years ago, AIDS has become one of the most studied diseases. A number of viruses have subsequently been identified to contribute to the pathogenesis of HIV and its opportunistic infections and cancers. Therefore, a multi-virus array containing eight human viruses implicated in AIDS pathogenesis was developed and its efficacy in various applications was characterized. Results The amplified open reading frames (ORFs) of human immunodeficiency virus type 1, human T cell leukemia virus types 1 and 2, hepatitis C virus, Epstein-Barr virus, human herpesvirus 6A and 6B, and Kaposi's sarcoma-associated herpesvirus were spotted on glass slides and hybridized to DNA and RNA samples. Using a random priming method for labeling genomic DNA or cDNA probes, we show specific detection of genomic viral DNA from cells infected with the human herpesviruses, and effectively demonstrate the inhibitory effects of a cellular cyclin dependent kinase inhibitor on viral gene expression in HIV-1 and KSHV latently infected cells. In addition, we coupled chromatin immunoprecipitation with the virus chip (ChIP-chip) to study cellular protein and DNA binding. Conclusions An amplicon based virus chip representing eight human viruses was successfully used to identify each virus with little cross hybridization. Furthermore, the identity of both viruses was correctly determined in co-infected cells. The utility of the virus chip was demonstrated by a variety of expression studies. Additionally, this is the first demonstrated use of ChIP-chip analysis to show specific binding of proteins to viral DNA, which, importantly, did not require further amplification for detection.


Background
Microarray technology, developed in the wake of various genome projects, has increasingly become one of the most widely used functional genomic tools. Global gene expression arrays, in a single co-hybridization assay, can query the differing expression patterns of thousands of genes [1,2]. Differential gene expression can be measured by treating identical cells with different stimuli, such as drug treatment or cellular stress, or by analyzing related but distinct cells/tissues, for example normal tissue versus malignant tumors [1,2]. More recently, microarrays have shown great potential for clinical applications as well, including diagnosis of disease states, viral or bacterial subtyping, and even virus discovery [3][4][5].
Moreover, increasing attention is also being paid to the identification of viral genes expressed during latent and/or lytic infection as targets for the development of antiviral treatments, an approach that has proven successful for the herpesviruses [6][7][8][9]. A viral microarray composed of spotted amplicons from the open reading frames (ORFs) of a number of different viruses would have distinct advantages over conventional technologies, including Enzyme Linked Immunosorbent Assay (ELISA) and Northern blot analysis, that do not allow for simultaneous multi-target analysis.
To determine whether a multi-virus microarray had potential in specific applications, for instance viral gene expression inhibition studies and protein-DNA binding experiments, we developed and characterized an array containing the amplified ORFs of eight human viruses. These include human immunodeficiency virus type 1 (HIV-1), human T cell leukemia virus types 1 and 2 (HTLV-I and -II), hepatitis C virus (HCV), Epstein-Barr virus/human herpesvirus 4 (EBV/HHV-4), human herpesvirus 6A and 6B (HHV-6A and -6B), and Kaposi's sarcoma-associated herpesvirus/human herpesvirus 8 (KSHV/HHV-8). These viruses were chosen for their importance in AIDS and AIDS-associated diseases. In addition, these viruses vary widely in their pathogenesis, including high (EBV, HHV-6A, HHV-6B, and KSHV) and low copy number (HIV-1, HTLV-I, and HTLV-II); differing viral expression during latent and lytic infection (herpesviruses); low (HCV) versus high (HTLV-I) replication rate; and representing both DNA (herpesviruses) and RNA (HTLV-I, HTLV-II, HIV-1 and HCV) genomes. Our hope was that by demonstrating the feasibility and applicability for a single microarray representing diverse, pathogenically important viruses, we would set the stage for a wide range of applications ranging from multiple virus detection to functional studies. In addition, larger arrays consisting of more viruses could easily be developed based on the data generated from this first-generation virus array.
Most DNA arrays are designed for the analysis of thousands of genes representing a single species. We chose to construct a virus chip for the detection of various human viruses and demonstrate that this technique can address complex problems in virology. We demonstrate that we can detect, with great specificity, genomic viral DNA from cells infected with the human herpesviruses, EBV, KSHV, HHV-6A, and HHV-6B. We further demonstrate that we can induce and observe viral gene expression in HIV-1 and KSHV latently infected cells, and that we can use this chip to detect inhibition of HIV and KSHV gene expression when subjecting infected cells to a cyclin dependent kinase inhibitor. In addition, we used the virus chip for new applications such as chromatin immunoprecipitation (ChIP) followed by hybridization to the virus chip (ChIP-chip) and observed specific hybridization of KSHV DNA. Importantly, we were able to detect immunoprecipitated DNA from latently infected as well as induced cells by direct labeling of immunoprecipitated DNA without further PCR amplification.

Design of the virus chip
Microarray technology allows the analysis of thousands of genes in a single assay. Although most DNA arrays are designed to represent a single species, we chose to construct a comprehensive array representing the annotated open reading frames of eight human viruses that are often associated with AIDS: HIV-1, HTLV-I, HTLV-II, HHV-6A, HHV-6B, EBV, KSHV, and HCV. In doing so, we hoped to be able to simultaneously detect the presence of these viruses while monitoring their patterns of gene expression. As a means of quality control, we also included 31 human sequences: 29 control cDNA clones obtained from a human EST library, glyceraldehyde-3-phosphate dehydrogenase exon 7, and β-actin exon 3.
Although there are advantages in using oligonucleotides (oligos) over PCR products (amplicons) for the construction of DNA microarrays [10,11], we chose to use amplicons for a number of reasons. First, viral genomes are compact enough that it is comparatively easy to cover the entire genome, including both coding and non-coding regions, with a small number of PCR amplification reactions. This has a significant cost advantage compared to oligo synthesis. Second, although oligos offer the possibility of greater specificity, the ability of longer products to hybridize effectively despite some number of mismatches can be an advantage for viruses that exhibit a relatively high mutation rate and for the detection of virus strains from related families. One potential drawback of PCR is that, in certain cases, due to either the size of the gene to be amplified, its nucleotide composition, or other factors, amplification may be unsuccessful. For example, in our virus chip, primers could not be designed for 15 of the 86 EBV ORFs due to their repetitive nature. However, these genes may also prove difficult for long oligo designs as the repeats may pose similar problems. Overall, we were able to successfully amplify and array probes representing 254 to 264 of the 329 (>75%) ORFs within our eight target viruses.
PCR primer sets were generated for all specific viral open reading frames (see Additional file 1). Primers were designed using the Primer 3.0 program [12] with default parameters. Primers were selected such that there was no overlap between amplified regions, which ranged in size from 100 bp to 3.5 kbp; larger ORFs were represented by up to four amplicons. Following amplification, PCR products were analyzed by gel electrophoresis. Amplifications were repeated for reactions that failed to give a single product of the appropriate size. Failed amplifications resulting in multiple bands, or in bands of the wrong size, were printed but were flagged and excluded from subsequent analysis. Figure 1 illustrates the location and direction of each open reading frame of the printed viral genomes as well as which amplicons were successfully amplified. Amplicons were purified, resuspended in DMSO, and spotted in twelve replicate copies on aminosilane-coated microscope slides. Following printing and cross-linking, slide quality was assayed by staining a representative slide for each printing with Syto-61 and scanning at 635 nm and 532 nm wavelengths.

Specificity of DNA hybridizations
To validate our approach and determine the hybridization specificity for each virus, DNA was isolated from host cells in culture infected with a single viral species (Figure 2A), plasmid DNA of infectious clones (Figure 2B), or genomic DNA of co-infected cells (Figure 2C), labeled, and hybridized to the array. With the exception of the plasmid clone hybridizations, uninfected host cell DNA was used as a common reference sample for each competitive hybridization assay. Purified DNA was labeled using random primers and the Klenow fragment of the E. coli DNA polymerase such that labeling occurred irrespective of the sequence. This method for labeling allowed us to label any viral strain without prior knowledge of the sequence and, more importantly, did not require PCR amplification to detect specific and strong hybridization. The results were calculated as a ratio of the intensity of hybridization from infected host cell DNA to that from uninfected host cell DNA. Results from these hybridization assays are shown in Figure 2, where the hybridization of each array probe is represented as a color-coded bar. A threshold ratio of five was used as a minimum for detection and those array elements failing to meet that criterion are indicated in blue. Array probes that resulted in a ratio between five and ten were considered weak hybridization intensities and are indicated in yellow, while strong hybridizations resulted in a ratio greater than ten and are indicated in red.
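The color-coding rule described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for Figure 2; the function name `classify_ratio` is hypothetical, and the handling of a ratio that falls exactly on a threshold is an assumption, since the text does not say how boundary values were treated.

```python
def classify_ratio(ratio, detect=5.0, strong=10.0):
    """Map an infected/uninfected intensity ratio to a display color.

    Thresholds follow the text: below `detect` is not detected (blue),
    from `detect` up to `strong` is weak (yellow), above `strong` is
    strong (red). Treating a ratio of exactly `strong` as weak is an
    assumption.
    """
    if ratio < detect:
        return "blue"
    if ratio <= strong:
        return "yellow"
    return "red"


# Hypothetical ratios for three array probes (not data from the study).
example = {"probe_a": 1.8, "probe_b": 7.2, "probe_c": 25.0}
colors = {name: classify_ratio(r) for name, r in example.items()}
print(colors)
```

Keeping the thresholds as parameters allows the same rule to be reused where a different cut-off applies.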
As can be seen in Figure 2A, we were able to accurately detect each virus with little cross hybridization to the others represented on the array. Hybridizations were performed with three different EBV infected cell lines, MM2, B-95A, and Jijoye. Representative data from the B-95A hybridization are shown in Figure 2A. The three cell lines exhibited similar hybridization patterns, although we did observe a slightly lower hybridization signal from the Jijoye cell line than from the others (data not shown).
HHV-6A and HHV-6B are two variants of HHV-6 that differ in epidemiology, in vitro growth properties, and nucleotide sequence [13,14]. Although the majority of the ORFs in HHV-6A and HHV-6B have high sequence identity, there is a cluster of genes that exhibit less than 80% identity at the right end of the unique region of the viral genome spanning ORFs 86 to 100 [15,16]. Therefore, all of the HHV-6A ORFs were amplified, whereas only those HHV-6B ORFs that exhibited less than 80% sequence identity with HHV-6A were printed. Genomic DNA from HHV-6A (U1102) infected cells hybridized to the HHV-6A targets with little to no cross-hybridization with the HHV-6B amplicons (Figure 2A). As was observed with the EBV infected cells, there was very little hybridization to other viral ORFs. Furthermore, as expected, genomic DNA from HHV-6B (Z29) infected cells hybridized to both HHV-6A and HHV-6B targets.
Hybridization using the KSHV infected cell line, BCBL-1, gave a good signal for the KSHV ORFs (Figure 2A). Although KSHV cross-hybridized to some viral probes, those that cross-hybridized were not consistent between the replicates (N = 3), with the exception of the HHV-6A U80 and HCV NS5 probes. These results indicate that we can accurately identify several closely related herpesviruses by their specific hybridization profile. The copy number for KSHV+ cell lines ranges from 50 to 2000 copies/cell. Therefore, the minimum copy number required to detect viral sequences from infected cells appears to be 70 copies per cell, as this is the lowest copy number determined from the KSHV infected cell line, BCBL-1 [17]. However, subsequent analyses using the co-infected cell lines, Cra-BCBL and BBG1 (Figure 2C), indicated that the minimum detectable viral copy number is 20.
We were not able to detect HIV-1, HTLV-I, HTLV-II, or HCV DNA sequences from infected cell lines (data not shown). This is likely due to the low copy number of HIV, HTLV-I, and HTLV-II in the infected cells used. The HIV-1 latently infected cell lines, ACH2 and U1, contain one and two integrated copies [18,19], respectively, while the HTLV-I and HTLV-II cell lines, MT-2 and C19, contain 5 to 8 integrated copies [20,21]. HCV is an RNA virus and hybridization with genomic DNA from infected cells was not expected. As we were unable to detect HIV-1, HTLV-I, or HCV using genomic DNA from infected cells, we labeled infectious plasmid clones with Cy3-dCTP or Cy5-dCTP by nick translation. As can be seen in Figure 2B, specific hybridization was observed for all three viruses. Since the experiments were performed using plasmid DNA rather than DNA from infected cell lines, the same DNA was labeled with both Cy-dyes and "self-self" hybridizations were performed. The data are represented for each gene as the ratio of the gene-specific hybridization intensity to the average intensity for all spots. The cut-off ratio for high hybridizations was set to five rather than ten. While no cross-hybridizations could be observed, only two of the 11 HIV probes represented on the array successfully hybridized with the labeled plasmid DNA.

Figure 1 Genomic organization of arrayed viruses.
Figure 2 (legend, partial) Hybridizations that resulted in a ratio of less than five for (A and C) or less than two for (B) were considered as non-hybridization and are indicated in blue. Hybridizations that resulted in a ratio of five to ten for (A and C) or two to five for (B) were considered low hybridization intensities and are indicated in yellow, while high hybridizations resulting in a ratio greater than ten for (A and C) or five for (B) are indicated in red.

We also wanted to determine if we could specifically detect multiple viruses in co-infected cells. Therefore, we used genomic DNA from two different KSHV/EBV co-infected cell lines, Cra-BCBL [22] and BBG1 [23] (Figure 2C). Hybridization to the KSHV and EBV targets was observed. However, there was much less hybridization to the EBV targets than was observed using DNA from a singly infected cell line (Figure 2A, EBV probe). BBG1 cells contain approximately 2000 copies/cell of the KSHV genome and 20 copies/cell of the EBV genome [23]. The Cra-BCBL cell line contains 130 copies/cell of the KSHV genome (unpublished data) but the copy number for EBV is not known. Results suggest that the effective limit of detection is approximately 20 genomic copies per cell, as we observed hybridization of some of the EBV amplicons despite the large difference in copy number relative to KSHV.
In principle, genomic DNA should hybridize uniformly to all of the genes of a specific array. There are several possible reasons why we observed differences in intensity levels for the various amplicons, including the size of PCR products and the G/C composition of the amplicons. Larger ORFs may exhibit a higher fluorescent intensity as there is more incorporated label than in smaller ORFs, and amplicons with a higher A/T content may not hybridize as strongly due to their lower melting temperature. The PCR products spotted on our virus array are not uniform in size as they vary from 100 bp to 3.4 kbp. The genes that failed to hybridize tended to be between 100 and 300 bp in size: 50% of all amplicons smaller than 300 bp did not hybridize, or hybridized weakly, while the failure rate was approximately 5% for the remaining amplicons. The G/C content of the viruses varies, as does the G/C content of some ORFs within viruses. However, there did not appear to be any correlation between weak hybridization and G/C composition of the amplicons.
In the case of hybridizations to plasmid DNA, as seen in Figure 2B, there is less cross-hybridization and better representation of all targeted genes for HTLV-I and HCV. For HIV, while there was no cross-hybridization to amplicons of other viruses, only two (proviral LTR and nef) of the 11 spotted regions of the HIV-1 genome hybridized to the DNA sample. The G/C content of the entire coding region of HIV is 42%. However, the G/C content for Nef and the LTR is 49% and 53%, respectively. As the HIV proviral DNA was labeled with Cy-dCTP, the higher G/C content of Nef and LTR may explain the better hybridization observed for these amplicons.

Sensitivity of DNA hybridizations
DNA hybridizations were typically performed using 3 µg of genomic DNA. To determine the detection limits in these assays, genomic DNA isolated from HHV-6A (U1102) infected cells was used in a titration study. The viral copy number was determined to be 400 copies per cell by quantitative real-time PCR (data not shown). The DNA was then serially diluted from 3 µg to 0.01 µg, labeled, and hybridized to the array. As expected, for most HHV-6 genes, there was a greater hybridization signal at higher concentrations (Figure 3). We determined that there were 2 × 10^7 viral copies in 0.3 µg of genomic DNA and we detected a significant level of specific hybridization at a 10-fold lower concentration than that used in the experiments shown in Figure 2.
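The figure of roughly 2 × 10^7 viral copies in 0.3 µg of DNA can be reproduced with a short calculation. The sketch below assumes about 6.6 pg of DNA per diploid human cell, a commonly used approximation that is not stated in the text, and it ignores the mass contributed by the viral genomes themselves. The function name is hypothetical.

```python
PG_PER_UG = 1e6          # picograms per microgram
PG_DNA_PER_CELL = 6.6    # assumed DNA mass of one diploid human cell (pg)


def viral_copies(dna_ug, copies_per_cell, pg_per_cell=PG_DNA_PER_CELL):
    """Estimate viral genome copies in a given mass of infected-cell DNA."""
    cells = dna_ug * PG_PER_UG / pg_per_cell
    return cells * copies_per_cell


# 400 copies per cell is the value reported for the HHV-6A (U1102) sample.
for mass in (3.0, 1.0, 0.3, 0.01):
    print(f"{mass:>5} ug -> {viral_copies(mass, 400):.1e} copies")
```

Under these assumptions, 0.3 µg corresponds to about 1.8 × 10^7 copies, in line with the rounded value given in the text.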

Expression studies
DNA hybridizations allowed us to validate hybridization for each target and determine the specificity and sensitivity for each virus in the array. However, many of the applications we envision will require determination of RNA expression levels. To demonstrate this capability, we focused on the expression of HHV-6B, HIV-1, and KSHV genes from virally infected cells.
To analyze HHV-6 gene expression, HHV-6B (Z29) infected cells were co-cultivated with uninfected T cells (SupT1) for seven days, after which time complete cytopathic effect was observed. Total RNA from infected and uninfected cells was isolated, reverse transcribed using random primers, labeled, and competitively hybridized to the virus chip. An example of HHV-6 gene expression is represented in Figure 4A. Differential expression was detected for 74% of the 108 printed ORFs. Of the 28 genes that were not detected, five did not hybridize in the DNA experiments. Many of the ORFs that did not hybridize were immediate early genes or genes with no known function. In addition, a number of late genes, including U33 and U94, did not exhibit any signal, although both U33 and U94 hybridized in the DNA experiments (Figure 2A). The fact that we did not detect U94 expression is reasonable considering its transcript is found in very low abundance [24,25]. Interestingly, U12 was identified as being differentially regulated in this experiment although the U12 transcript has not been detected by Northern analysis in previous studies [26]. Incidentally, four of the ten small amplicons (<300 bp) that did not hybridize in the DNA experiments were detected in these RNA expression experiments. Several studies have compared the sensitivities of DNA microarrays and Northern blots [27,28] and have found that the dynamic range and sensitivity between DNA microarrays and Northern blots was comparable, although differences were observed depending on the gene analyzed.
To further demonstrate the utility of the virus chip, we analyzed the effect of drug treatment on viral gene expression in HIV-1 and KSHV-infected cells. In the first experiment, we compared HIV gene expression before and after drug treatment. Cellular cyclins and cyclin dependent kinases (CDK) have been shown to be critical for the expression and replication of a number of viruses, including HIV and several herpesviruses [29][30][31][32][33][34][35]. We and others have previously shown that inhibiting CDKs with ATP analogs, such as CYC202 (r-CYC202; Cyclacel Ltd; http://www.cyclacel.com), can suppress HIV-1 expression and replication in vitro and in vivo [35,36]. However, we were previously unable to specifically determine which ORFs (from doubly spliced or singly spliced messages) were inhibited in these drug treated cells. To determine which viral transcripts were suppressed by CYC202, HIV-1 expression was induced in ACH2 cells, a latently infected HIV-1+ cell line, by incubation with TNF-α for two hours. Cells were subsequently washed and fresh media was added, with or without CYC202 (5 µM). Similar treatments were performed in uninfected CEM parental cells. Nine hours after induction, total RNA from uninfected, infected and CYC202 treated cells was isolated, labeled, and hybridized to the virus chip (Figure 4B). Without any induction, both the LTR and Gag genes showed detectable levels of expression, while low expression was observed for Rev (red bars). This is consistent with previously published reports showing that latent cells exhibit low basal transcription with a mostly non-processive RNA polymerase II [37]. When cells were treated with TNF-α, all of the expected HIV-1 RNAs were transcribed and detected on the virus chip (green bars).
This is also consistent with previously published reports, where the mRNA species producing Tat, Rev, and Nef are coordinately regulated by Rev and therefore these transcript levels accumulate in the absence of Rev protein and are down regulated in the presence of Rev [38].
When cells were treated with CYC202, we observed the down-regulation of a number of transcripts. Infected cells treated with CYC202 should have expression levels similar to infected cells not induced with TNF. We can in fact observe this in Figure 4B, where hybridizations from CYC202 treated cells, expressed as a Log2 (ratio) with untreated cells (yellow bars), were comparable to the Log2 ratios of uninduced cells. If CYC202 did not have an effect, the Log2 ratio values would have been closer to 0 (Log2 of a ratio of 1 is zero). Log2 intensity ratios less than one indicate that the drug had little or no effect on the expression of that gene. CYC202 mainly inhibited expression of the LTR and Pol and Nef genes. This is consistent with the mechanism of CYC202 inhibiting at the promoter (LTR) and the first ORF (Gag). The down-regulation of Nef, which is the most abundant doubly spliced transcript in infected cells, also implies that these transcripts are susceptible to regulation by CDK inhibitors. This novel finding was unexpected; however, it further characterizes which of the doubly or singly spliced messages are regulated by cyclin/cdk complexes in HIV-1 infected cells. Although these experiments indicate that CYC202 inhibits activated transcription at the HIV-1 LTR by inhibiting cdk2 and cdk9 [35], they do not address the changes in the half-lives of the other HIV-1 ORFs. However, in the absence of Gag or Nef, no HIV-1 particles were made, as demonstrated by the lack of the p24/gag antigen in the supernatants [35,36].

Figure 3 Determination of DNA hybridization sensitivity by titration. Genomic DNA of infected cells was isolated and HHV-6 viral copy number was determined by a TaqMan assay [68], as described in Materials and Methods. The DNA was then serially diluted (3 µg, 1 µg, 0.3 µg, 0.01 µg), labeled, and hybridized to the array. Hybridizations that resulted in a ratio of less than two were considered as non-hybridization and are indicated in blue. Hybridizations that resulted in a ratio of two to ten were considered low hybridization intensities and are indicated in yellow, while high hybridizations resulting in a ratio greater than ten are indicated in red.
Figure 4 Analysis of viral gene expression. [...] Both amplicons contain some overlap, with 84% of the Rev amplicon specific to Rev and 65% of the Tat amplicon specific to Tat. Tat and Rev indicated in the graph correspond to the amplicons that contain the majority of that particular sequence. Values were calculated and expressed as Log2 ratios. The red bars indicate hybridization in latent, uninduced cells; the green bars indicate genes that were expressed following TNF induction (20 ng/ml for 2 hrs). Both are compared to expression from uninfected cells. The blue bars indicate genes that were expressed following TNF induction compared to no induction; yellow bars indicate genes whose expression was affected by TNF and CYC202 (5 µM), as compared to untreated cells. (C) KSHV gene expression and inhibition by CYC202. Only the KSHV open reading frames are illustrated and indicated at the bottom of the panel. Values were calculated and expressed as mean Log2 ratios from four experiments. The yellow bars indicate genes that are expressed following TPA induction (20 ng/ml) while the red bars indicate genes whose expression was affected by TPA and CYC202 (5 µM).

In a second experiment, we evaluated viral gene expression in the KSHV infected cell line, BCBL-1. BCBL-1 is a primary effusion lymphoma cell line established from an HIV+ patient [39]. The KSHV genome remains latent with very little viral gene expression in infected cells cultured in vitro. After treatment with TPA, viral gene expression is activated [39]. As can be seen in Figure 4C, following nine hours of TPA treatment, low but significant expression (p < 0.05) was detected for a number of KSHV ORFs, including K5, K7, ORF72, and ORF73. The expression of these genes was previously shown to peak later in the viral life cycle [8]. We also detected hybridization to ORF50/Rta, which was previously shown to exhibit significant expression by 10 hours post-induction [8]. CYC202 has been shown to inhibit the gene expression and replication of a number of herpesviruses including herpes simplex virus (HSV) and cytomegalovirus (CMV) [29,32,36]. Therefore, we attempted to determine if CYC202 could inhibit KSHV gene expression. At the time of induction with TPA, CYC202 (5 µM) was added and nine hours after induction total RNA was isolated, labeled, and hybridized to the virus chip (Figure 4C). Values were determined as the mean log2 ratios of four experiments. Evidence of suppression by CYC202 was demonstrated by the fact that the sample treated with CYC202 behaved like the sample with no TPA induction. The data indicate that CYC202 suppressed the expression of the lytically induced gene, K7. These results are similar to what has been observed with HSV [32].
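Because the expression values are reported as log2 ratios, a value of 0 means no change and each unit corresponds to a two-fold difference. The following sketch shows how a mean log2 ratio over replicate hybridizations might be computed. The intensities are invented for illustration and the helper name `mean_log2_ratio` is hypothetical; it is not the authors' analysis pipeline.

```python
import math
from statistics import mean


def mean_log2_ratio(sample, reference):
    """Mean of per-replicate log2(sample / reference) intensity ratios."""
    if len(sample) != len(reference):
        raise ValueError("sample and reference need one value per replicate")
    return mean(math.log2(s / r) for s, r in zip(sample, reference))


# Hypothetical intensities from four replicate experiments for one ORF.
induced = [800.0, 1600.0, 400.0, 800.0]
uninduced = [200.0, 200.0, 200.0, 200.0]
print(mean_log2_ratio(induced, uninduced))  # per-replicate values: 2, 3, 1, 2
```

Averaging in log space rather than averaging the raw ratios treats a two-fold increase and a two-fold decrease symmetrically.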
These results demonstrate the utility of using a virus chip for gene expression studies of virally infected cells. The advantage of using a chip containing all ORFs of a virus is that it allows for analysis of global changes in gene expression. Furthermore, the ability of longer products to hybridize effectively despite some number of mismatches can be an advantage for viruses that exhibit a relatively high mutation rate and for the detection of virus strains from related families.

Use of ChIP-chip analysis to identify proteins bound to a viral genome in vivo
Lastly, we demonstrated a novel use for the virus chip to determine if proteins that associate with the natural chromatin structure associate with the KSHV genome. Chromatin immunoprecipitation (ChIP) has been used to determine if specific proteins bind to regions of a genome in vivo [40], to identify transcription factor binding to promoters [41,42], and to identify the binding of modified proteins to DNA in vivo [43,44]. Recently, ChIP has been paired with microarray analyses (ChIP-chip) [45] to identify the binding sites of transcription factors in the yeast genome [46]. Nucleosomes are important in regulation of chromatin structure as well as transcription [47][48][49] and therefore, we tested whether phosphorylated histone H3 (P-H3), which is associated with activated gene transcription [50], was associated with the KSHV genome during latent gene expression and whether binding of P-H3 increased following activation of viral gene expression. Following immunoprecipitation of bound DNA with anti-P-H3, DNA was labeled directly, without further amplification (Figure 5). This is in contrast to previously reported ChIP-chip studies where a three step amplification procedure has been used [45,46]. Intensity ratios greater than two were considered significant. As can be seen in Figures 5A and 4C, DNA from uninduced BCBL-1 cells that was precipitated with anti-P-H3 hybridized to the K14 amplicon in KSHV latently infected cells. During latent KSHV infection, viral gene expression is limited to K12/kaposin, ORF73/LANA, ORF72/v-cyclin, K13/ORF71/v-FLIP [51][52][53], and K10.5/LANA2 [54]. The K14 amplicon spans the promoter for the major latency transcript [51,55], which encodes ORF73, ORF72, and K13, indicating that an activated form of histone H3 is associated with an actively transcribed latent viral gene.
In uninduced cells, we did not detect hybridization to other viral amplicons or the majority of the human amplicons, demonstrating the specificity of the immunoprecipitation and hybridization. Following induction of viral gene expression and anti-P-H3 immunoprecipitation, additional regions of the KSHV genome were observed to hybridize to the virus chip (Figures 5B and 5C). These regions included ORFs 22, 24, 34, 36, 45, 46, 48, 50, and K5. ORF50/Rta, the first gene turned on following induction [56], is required for activation of the KSHV lytic cycle [57,58] and subsequently turns on K8, an immediate early gene with homology to the EBV transcription factor bZIP [59]. The promoter for K8 is found within the sequences coding for ORF50 [60] while the promoter for ORF50 is found within the sequences coding for ORF48 [56,60]. Both of these regions exhibited detectable hybridization to the immunoprecipitated DNA from induced cells. The protein kinase B (PKB) amplicon also hybridized to DNA precipitated from uninduced and induced BCBL-1 cells. PKB/AKT is involved in a number of cellular processes including positive regulation of the cell cycle [61] and activation of gene expression [62,63]. That an activated form of histone H3 was associated with protein kinase B suggests that PKB was expressed in BCBL-1 cells and that PKB may be involved in viral gene expression in KSHV infected cells. These results demonstrate that we are able to detect DNA immunoprecipitated from virally infected cells without the use of tagged proteins and, more importantly, without further amplification of precipitated DNA. These results are in contrast with previously published results that required a three-step amplification/labeling procedure to identify precipitated DNA [45,46]. Furthermore, these results are the first use of ChIP-chip in detecting binding of cellular proteins to viral DNA.

Figure 5 ChIP-chip of KSHV DNA. Chromatin immunoprecipitation (ChIP) was performed as described [43] using an antibody to the phosphorylated form of histone H3 and DNA was directly labeled with Cy3-dCTP or Cy5-dCTP.

We have developed an amplicon based virus chip representing the genomic sequences from eight human viruses. Using labeled DNA from infected cells or infectious plasmids we were able to specifically identify each virus, with minimal cross-hybridization between the various viral species. Furthermore, we were able to accurately identify both viruses from co-infected cells. We further demonstrated the utility of the virus chip with a variety of expression studies as well as ChIP-chip analysis. Significantly, we were able to detect specific viral sequences immunoprecipitated from infected cells without further amplification. Given the above positive results, we plan to further develop this chip by including the open reading frames of other viruses that are known human pathogens. The potential applications of a viral microarray representing pathogenic viruses extend beyond profiling expression of viral genes to the discovery of novel viruses, drug target identification and drug development, and widespread screening of blood or organs for viral contamination prior to transplantation [11,64,65].

Design
PCR primer sets (see Additional file 1) were generated for all specific viral open reading frames from the eight target viral species. Coding sequence coordinates of the ORFs were adjusted to prevent overlaps between amplified regions and to limit the amplicon size to less than 3.5 kb. PCR primer sets were designed using Primer 3.0 [12] with optimized design parameters. Larger ORFs were represented by up to four amplicons. The largest PCR product obtained was 3.2 kb in length and the shortest 100 bp. We successfully designed primer sets for 302 of the 329 ORFs identified in the eight viruses (see details and primer sequences in Additional file 1). Coding regions for which primers could not be designed corresponded to very small target sequences or highly repetitive regions. Genomic DNA (1 ng) from infected cells was used as the template and was amplified with 800 nM primers and Taq DNA polymerase using the supplied buffer (Applied Biosystems). Conditions for PCR were 94°C for 30 sec, 60°C for 45 sec, and 72°C for 2 min 35 sec, for 30 cycles, with a final extension at 72°C for 10 min. Following amplification, PCR products were purified with the Millipore 96-well filtration system according to the manufacturer's directions. Five microliters of purified product were separated on a 1% agarose gel and analyzed for the presence and correct size of each amplicon. Each product was graded as strong, acceptable, questionable, smear, misprimed, or failed, and the scores were uploaded into the TIGR database. A score indicating an unsuccessful amplification automatically gives a null value for the spot corresponding to that amplicon. We obtained amplicons for 254 genes in the first generation of the chip and 264 in the second generation, with failed amplifications giving either no products or multiple bands. Figure 1 provides an overview of the genomic location of the ORFs that were successfully amplified and spotted for each virus.
Purified amplicons were spotted in twelve replicate copies on aminosilane-coated microscope slides (Corning) using the Molecular Dynamics generation III arrayer (Sunnyvale, California). Printing was performed at room temperature and at a humidity level of 40-52%. Printed slides were cross-linked by drying for 2 hours at 80°C.

Sample labeling, hybridization, and scanning
Total RNA was isolated from control and infected cell lines with RNA-BEE (Tel-Test, Inc.) according to the manufacturer's directions. Residual genomic DNA was removed by DNase I digestion (RNase-free, Amersham Biosciences) and phenol-chloroform extraction. Total genomic DNA was isolated as described [66]. Genomic DNA (3 µg) was labeled using 15 µg random hexamers (Invitrogen), 3 mM Cy3-dCTP or 3 mM Cy5-dCTP (Amersham Pharmacia), and 15 units exo-Klenow large fragment (New England Biolabs), at room temperature for 3 hr. Labeled DNA probes were purified using a GFX column (Amersham Pharmacia).
Infectious plasmid clones were labeled by nick translation as described [67]. Cy3-dCTP or Cy5-dCTP (1 mM) was included with 1 mM each dTTP, dATP, and dGTP. Following incubation at 15°C, the labeled products were purified using a GFX column (Amersham Pharmacia).
For expression profiling experiments, complementary DNA was synthesized by reverse transcription in the presence of aminoallyl-tagged dUTP (aa-dUTP, Ambion). Total RNA (2 µg) and 6 µg of random hexamers (Life Technologies) were incubated at 70°C for 10 min and snap frozen. cDNA was synthesized overnight at 42°C by reverse transcription (SuperscriptII, Invitrogen) in the presence of 25 mM each dATP, dCTP, dGTP, 10 mM dTTP, and 15 mM aa-dUTP (Ambion). The RNA template was hydrolyzed with 1 M NaOH at 65°C. The resulting cDNA was filtered and concentrated with a Microcon-30 spin column (Millipore) and dried under vacuum. The sample was resuspended in 0.1 M carbonate buffer (Na2CO3, pH 9.0), and NHS-Cy3 or NHS-Cy5 (Amersham Pharmacia) was added to fluorescently label the cDNA probe (at room temperature for 3 hr). The coupling reaction was purified using the QIAquick PCR purification kit (Qiagen). The probe was eluted in 100 µl DNase/RNase-free water and analyzed using a Beckman spectrophotometer to measure dye incorporation and nucleotides per dye.
Slides were rinsed in 0.1% SDS and denatured in boiling water. Labeled probes (Cy3/Cy5) were resuspended in water with salmon sperm DNA and denatured at 95°C. Hybridization solution (50% formamide, 5X SSC, 0.1% SDS, and 0.2 mg/mL bovine serum albumin) was added and the probes were incubated for 20 min at the appropriate hybridization temperature. Probes were allowed to hybridize overnight at 42°C for RNA experiments and at 48°C for DNA experiments. Slides for DNA experiments were washed sequentially in (i) 1X SSC, 0.2% SDS at 48°C; (ii) 0.1X SSC, 0.1% SDS at room temperature; and (iii) 0.1X SSC at room temperature. Each wash was for 4 min. Slides for RNA experiments were washed sequentially in (i) 2X SSC, 0.1% SDS at 42°C; (ii) 0.1X SSC, 0.1% SDS at room temperature; (iii) 0.1X SSC at room temperature; and (iv) 0.01X SSC at room temperature. Each wash was for 4 min. The slides were scanned with an Axon-4000B scanner and images were saved as paired single-color TIFF images.
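The spectrophotometric check of dye incorporation can be sketched as follows. The text does not give the formula or constants that were used, so everything here is an assumption: the conversion factors are commonly quoted values for single-stranded cDNA and for Cy3/Cy5 (extinction coefficients of 150,000 and 250,000 M^-1 cm^-1 at 550 and 650 nm), a 1 cm path length is assumed, and the function names are hypothetical.

```python
# Minimal sketch of a nucleotides-per-dye calculation from absorbance readings.
# All constants are ASSUMED textbook values, not taken from the paper.

NG_PER_UL_PER_A260 = 37.0   # ng/ul of single-stranded DNA per A260 unit (assumed)
PG_PER_PMOL_NT = 324.5      # average mass of one nucleotide in pg/pmol (assumed)
# Absorbance per (pmol/ul) for each dye, i.e. extinction coefficient / 1e6 (assumed)
DYE_ABS_PER_PMOL_UL = {"Cy3": 0.15, "Cy5": 0.25}


def nucleotide_pmol(a260: float, volume_ul: float) -> float:
    """Total pmol of nucleotides in the probe, from absorbance at 260 nm."""
    ng_total = a260 * NG_PER_UL_PER_A260 * volume_ul
    return ng_total * 1000.0 / PG_PER_PMOL_NT


def dye_pmol(absorbance: float, volume_ul: float, dye: str) -> float:
    """Total pmol of incorporated dye, from absorbance at the dye maximum."""
    return absorbance * volume_ul / DYE_ABS_PER_PMOL_UL[dye]


def nucleotides_per_dye(a260: float, a_dye: float, volume_ul: float, dye: str) -> float:
    """Ratio of nucleotides to dye molecules; lower means denser labeling."""
    incorporated = dye_pmol(a_dye, volume_ul, dye)
    if incorporated <= 0:
        raise ValueError("no dye signal detected")
    return nucleotide_pmol(a260, volume_ul) / incorporated


if __name__ == "__main__":
    # Hypothetical readings for a Cy3-labeled probe eluted in 100 ul.
    print(nucleotides_per_dye(a260=0.2, a_dye=0.05, volume_ul=100.0, dye="Cy3"))
```

The readings in the usage line are illustrative only and are not measurements from this study.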

Determination of HHV-6 copy number
Genomic DNA from infected cells was isolated according to the manufacturer's directions (Gentra System) and the concentration was adjusted to 10 ng/µl. Immediate early gene sequences were amplified using a TaqMan assay [68]. A standard curve was generated using a known concentration of variant-specific HHV-6 plasmids. Results were plotted and sorted using the Sequence Detector System (Perkin Elmer). Results were normalized using a human genomic β-actin calibration curve. Absolute viral and β-actin DNA copy numbers were assessed, and the final viral DNA load per 10^6 cells was calculated by the following formula: [HHV-6 DNA copy number/(β-actin DNA copy number/2)] × 10^6.
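The viral load formula can be written as a short function. This is a minimal sketch: the function name and the example copy numbers are hypothetical, and dividing the β-actin copy number by 2 is read here as assuming two β-actin copies per diploid cell.

```python
def hhv6_load_per_million_cells(hhv6_copies: float, actin_copies: float) -> float:
    """Viral DNA load per 10^6 cells.

    Implements [HHV-6 copies / (beta-actin copies / 2)] x 10^6, where
    beta-actin copies / 2 estimates the number of cells in the reaction
    (assumption: two beta-actin copies per diploid cell).
    """
    if actin_copies <= 0:
        raise ValueError("beta-actin copy number must be positive")
    cell_equivalents = actin_copies / 2.0
    return hhv6_copies / cell_equivalents * 1e6


if __name__ == "__main__":
    # Hypothetical example: 500 viral copies and 2000 beta-actin copies,
    # i.e. 1000 cell equivalents.
    print(hhv6_load_per_million_cells(500, 2000))  # 500000.0
```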

ChIP-chip analysis
Chromatin immunoprecipitation (ChIP) was performed as described previously [43] with slight modifications. Briefly, cells were cross-linked with 1% formaldehyde. Nuclei, prepared by hypotonic lysis, were resuspended in lysis buffer (1% SDS, 10 mM EDTA, 50 mM Tris-HCl, pH 8.1) and sonicated to reduce DNA length to 200-1000 bp, and debris was removed by centrifugation. The chromatin solution was precleared on protein A/G beads pre-adsorbed with sonicated salmon sperm DNA. The chromatin solution was then incubated overnight at 4°C with an antibody to histone H3 phosphorylated on serine 10 (Upstate Cell Signaling Solutions) or with no antibody. Immune complexes were collected with protein A/G beads pre-adsorbed with sonicated salmon sperm DNA. Following washes and elution, cross-linking was reversed by heating at 65°C for 4 to 5 hours, and DNA was recovered by digestion of proteins with proteinase K followed by phenol-chloroform extraction and ethanol precipitation. DNA was labeled directly as described above for genomic DNA, hybridized to the chip, and washed accordingly.

Normalization
As a first step, in-slide replicate analyses were performed by calculating the geometric mean of the spots corresponding to each gene. For the DNA experiments in which infected (test) versus uninfected (control) host cell DNA was used in the hybridizations, a linear normalization was performed based on the assumption that the ratio of the host cell genes should be equal to 1. In these genomic DNA hybridizations, the geometric mean of the measured fluorescence intensities of the host cell genes represented on the array was calculated for both the experimental and control samples, and the ratio of these two values was used as a scaling factor to adjust the fluorescence measurements for all viral genes represented on the array. For the expression studies (cDNA hybridizations), data were generated by dye-swap replication experiments. Total normalization was performed, followed by flip-dye consistency checking using the TIGR Microarray Data Analysis System [69] or by one-class t-test analysis. For the plasmid experiments, in which the same DNA was labeled with both dyes, an iterative log mean centering normalization was performed using MIDAS v2.17 with the following parameters: global mode, +/-3 S.D. outlier range, Cy3 [69]. We treated the Cy3 and adjusted Cy5 intensities as technical replicates and calculated the mean of these values. The ratio of this mean to the average intensity across the array set was then obtained. A ratio greater than 2 indicates hybridization to a specific gene at more than 2-fold above the background intensity across the whole array.
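The replicate averaging, host-gene scaling, and 2-fold call described above can be sketched in a few lines. This is an illustration under stated assumptions, not the TIGR/MIDAS implementation: the function names are hypothetical, the scaling factor is assumed to be applied as control/test so that the host-gene ratio becomes 1, and the array-wide average is taken as a simple arithmetic mean of the per-gene values.

```python
import math
from typing import Dict, Iterable


def geometric_mean(values: Iterable[float]) -> float:
    """Geometric mean of positive intensities (e.g. replicate spots of one gene)."""
    vals = list(values)
    if not vals or any(v <= 0 for v in vals):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in vals) / len(vals))


def host_scaling_factor(host_test: Iterable[float], host_control: Iterable[float]) -> float:
    """Factor that brings the host-gene test/control ratio to 1.

    Assumption: test-channel intensities are multiplied by this factor.
    """
    return geometric_mean(host_control) / geometric_mean(host_test)


def call_hybridized(gene_means: Dict[str, float], threshold: float = 2.0) -> Dict[str, bool]:
    """Flag genes whose mean intensity exceeds `threshold` x the array-wide average."""
    array_average = sum(gene_means.values()) / len(gene_means)
    return {gene: mean / array_average > threshold for gene, mean in gene_means.items()}


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    spots = [950.0, 1000.0, 1050.0]
    print(geometric_mean(spots))
    print(host_scaling_factor(host_test=[200.0, 800.0], host_control=[100.0, 400.0]))
    print(call_hybridized({"orfA": 100.0, "orfB": 100.0, "orfC": 100.0, "orfD": 1700.0}))
```

The geometric mean is used for replicate spots because intensity ratios are multiplicative, so averaging on the log scale keeps a single bright outlier from dominating the per-gene value.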