Abstract
Human T-cell leukemia virus type I (HTLV-I) is the causative agent of a neoplastic disease, adult T-cell leukemia (ATL). Although the encoded viral proteins play an important role in oncogenesis, the role of the HTLV-I proviral integration site remains unresolved. We determined the integration sites of HTLV-I proviruses in ATL cells and in HTLV-I–infected cells from asymptomatic carriers. In carrier and ATL cells, HTLV-I provirus was integrated into transcriptional units at frequencies of 26.8% (15/56) and 33.9% (20/59), respectively, which were equivalent to the frequency calculated based on random integration (33.2%). In addition, HTLV-I provirus was prone to integration near the transcriptional start sites in leukemic cells (P = .006), and the transcriptional direction of the provirus was in accordance with that of the host cellular gene in 70% of cases. More importantly, the integration sites in the carrier cells favored the alphoid repetitive sequences (11/56; 20%), whereas in leukemic cells they disfavored these sequences (2/59; 3.4%). Taken together, these findings suggest that, during the natural course from the carrier state to the onset of ATL, HTLV-I–infected cells with integration sites favorable for viral gene transcription are susceptible to malignant transformation due to increased viral gene expression.
Introduction
After retroviral infection, reverse transcriptase synthesizes proviral DNA, which is then integrated into the host genome by the action of an integrase. In some retrovirus-associated neoplasms, provirus insertion enhances the transcription of oncogenes, such as the myc gene, resulting in transformation of infected cells.1,2 Although the integration sites of proviruses in the host genome have been considered random, recent studies of various retroviruses revealed that human immunodeficiency virus type 1 (HIV-1) prefers transcriptional units,3 whereas murine leukemia virus (MLV) tends to integrate near transcriptional start sites.4 These findings suggest that the integration of proviruses depends on mechanisms unique to each retrovirus, which interact with host factors associated with nuclear transport, DNA repair, and chromatin structure.5
Human T-cell leukemia virus type I (HTLV-I) is the causative virus of adult T-cell leukemia (ATL) and of an inflammatory disease, HTLV-I–associated myelopathy (HAM)/tropical spastic paraparesis (TSP).6 HTLV-I infection induces ATL in a portion of infected individuals after a long latent period. A characteristic feature of HTLV-I is the presence of accessory genes, which are encoded by the pX region between env and the 3′-long terminal repeat (LTR).7,8 Among the accessory genes, tax is considered to play a central role in the proliferation of infected cells and leukemogenesis because of its pleiotropic actions. Tax activates transcriptional pathways such as nuclear factor κB (NF-κB), serum response factor (SRF), and cyclic AMP response element binding protein (CREB), leading to activated transcription of growth factor genes and their receptor genes, and to inhibition of apoptosis. In addition, Tax can transrepress the transcription of cellular genes and functionally inhibit p53,9 p16,10 and MAD1.11 Such pleiotropic actions induce the proliferation of HTLV-I–infected cells and inhibit apoptosis, resulting in clonal expansion in vivo.
Although the integration sites of HTLV-I provirus have been reported to be random, preferential integration into transcriptional units has been reported in ATL cells.12,13 In this study, we compared HTLV-I integration sites between carrier and leukemic cells and found that the provirus was frequently integrated into alphoid repetitive sequences in the carrier state but not in the leukemic state.
Materials and methods
Patient samples
Peripheral blood mononuclear cells (PBMCs) were separated from heparinized peripheral blood by Ficoll-Hypaque density gradient centrifugation. They were then digested with proteinase K and treated with RNase A to eliminate RNA. Genomic DNAs were extracted from the PBMCs of 16 HTLV-I carriers and 59 patients with ATL. Approval for this study was obtained from the institutional review board of Kyoto University. Informed consent was obtained from blood donors and patients in accordance with the Declaration of Helsinki.
Inverse long polymerase chain reaction
Inverse long polymerase chain reaction (IL PCR) was used to amplify the genomic DNA adjacent to the integration sites of the HTLV-I provirus. First, genomic DNA (1.5 μg) was digested with a restriction enzyme (HindIII, PstI, or EcoRI), then ligated by T4 DNA ligase. When DNA was digested with EcoRI, it was digested with MluI after ligation so as not to amplify the HTLV-I provirus itself. The resulting DNA was used as a substrate for IL PCR, which was performed using TaKaRa LA PCR (Takara, Shiga, Japan). Briefly, primers (final concentration, 0.2 μM), MgCl2 (2.5 mM), and deoxynucleoside triphosphates (dNTPs, 0.4 mM) were mixed (total 20 μL) then AmpliWax (Applied Biosystems, Norwalk, CT) was added to each tube. After wax layer formation by incubation at 80°C for 10 minutes and cooling at room temperature for 15 minutes, substrate DNA (0.5 μg), 10 × LA buffer (5 μL), and LA Taq (0.4 μL) were added (total 30 μL). Cycles for long PCR were as follows: one cycle of 98°C for 2 minutes, 5 cycles of 98°C for 30 seconds and 64°C for 10 minutes, and 35 cycles of 94°C for 30 seconds, 64°C for 10 minutes, and 72°C for 15 minutes. The primers used in this experiment were as follows: primers 1 and 2 were used for PstI-digested samples, primer 1: 5′-TAGCAGGAGTCTATAAAAGCGTGGAGACAG-3′; primer 2: 5′-TGGAATGTTGGGGGTTGTATGAGTGATTGG-3′. Primers 1 and 3 were used for HindIII-digested samples, primer 3: 5′-TGGGCAGGATTGCAGGGTTTAGAGTGG-3′. Primers 4 and 5 were used for EcoRI-digested samples, primer 4: 5′-TGCCTGACCCTGCTTGCTCAACTCTACGTCTTTG-3′, primer 5: 5′-AGTCTGGGCCCTGACCTTTTCAGACTTCTGTTTC-3′.
Cloning and sequencing
IL PCR products from ATL samples were used as a template for direct sequencing to determine the integration site. The primer used for sequencing was as follows: 5′-TCATTCACGACTGACTGCCGG-3′. To determine integration sites from carrier samples, IL PCR products were gel-isolated using a QIAquick Gel Extraction Kit (Qiagen, Valencia, CA), and ligated into a pPCR-Script Amp SK(+) cloning vector (Stratagene, La Jolla, CA) or pCR-TOPO-XL (Invitrogen, Carlsbad, CA). The plasmids were then used as a template for sequencing. Sequencing of IL PCR amplicon was performed using an ABI PRISM Genetic Analyzer 310 or 3100 (Applied Biosystems, Norwalk, CT) according to the manufacturer's instructions.
Mapping integration sites
The BLAST-like Alignment Tool program was used to map sequences to the human genome (University of California–Santa Cruz [UCSC] Human Genome Project Working Draft, July 2003 freeze).14 Sequence matches were judged to be authentic only if they (1) contained the LTR sequence, (2) showed 95% or greater identity to the genomic sequence over the high-quality sequence region, and (3) matched only one genomic locus with 95% or greater identity. Genomic features such as coding regions and repetitive sequences were investigated using the UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgBlat?command=start). Random integration sites (10 000 ×) in nongap regions of the human genome (UCSC Human Genome Project Working Draft, July 2003 freeze) were generated with a computer program using a uniform distribution algorithm, and used for comparison with the observed HTLV-I integration sites.
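The three authenticity criteria and the uniform random control described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the function names (`is_authentic`, `sample_random_sites`), the representation of alignment hits as a list of identity fractions, and the `(chrom, start, end)` tuples for nongap regions are all hypothetical choices made for the example.

```python
import bisect
import random
from itertools import accumulate


def is_authentic(contains_ltr, hit_identities, min_identity=0.95):
    """Apply the three mapping criteria to one sequenced junction.

    contains_ltr   -- True if the read includes the LTR sequence (criterion 1)
    hit_identities -- identity (0..1) of each genomic locus hit (hypothetical input)
    """
    good_hits = [i for i in hit_identities if i >= min_identity]
    # Criteria 2 and 3: exactly one locus reaches the identity threshold.
    return bool(contains_ltr) and len(good_hits) == 1


def sample_random_sites(nongap_regions, n_sites, seed=0):
    """Draw n_sites positions uniformly over all bases in the nongap regions.

    nongap_regions -- list of (chrom, start, end), half-open (assumed convention)
    """
    lengths = [end - start for _, start, end in nongap_regions]
    cumulative = list(accumulate(lengths))
    total = cumulative[-1]
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        offset = rng.randrange(total)  # every nongap base is equally likely
        idx = bisect.bisect_right(cumulative, offset)
        chrom, start, _ = nongap_regions[idx]
        bases_before = cumulative[idx - 1] if idx else 0
        sites.append((chrom, start + offset - bases_before))
    return sites
```

Sampling by base offset rather than by region keeps the draw uniform over the genome even when the nongap regions differ greatly in length; a call such as `sample_random_sites(regions, 10000)` would correspond to the 10 000 control sites.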
Statistical analyses
The χ2 or one-sided Fisher exact test was used to determine statistical significance.
Results
HTLV-I provirus integration sites
We determined the genomic sequences adjacent to the HTLV-I provirus integration sites in ATL cells (59 cases) and HTLV-I–infected cells (56 sites) in the 16 carriers by inverse PCR, and then analyzed (1) chromosomal locations, (2) the genes containing integration sites or neighboring genes, (3) the relation of transcriptional direction between provirus and cellular genes when the HTLV-I provirus was integrated within the gene, (4) distance from the transcriptional start site, and (5) repetitive sequences. The results are summarized in Table 1.
As reported previously,15,16 the integration sites of HTLV-I provirus were random in both leukemic cells and carriers (Figure 1) and were not associated with the guanine and cytosine (GC) content of the surrounding genome (data not shown). In addition, no chromosome was specifically targeted by HTLV-I integration. To investigate whether HTLV-I integration sites favor transcription units in the human genome, we used the UCSC Genome Browser RefSeq Genes track, which represents annotated genes based on National Center for Biotechnology Information mRNA reference sequences. An integration site was considered to be in a transcription unit if it was located between the transcription start site and stop site of a RefSeq gene. Among the 56 HTLV-I integration sites in the carrier cells, 15 (26.8%) were identified within transcriptional units (RefSeq). In the 59 ATL cases, 20 integration sites (33.9%) lay within transcriptional units (Table 2). As a control, we simulated random HTLV-I integration by placing 10 000 integration sites randomly into the same human genome assembly, of which 33.2% were found within transcription units (Table 2), a percentage in agreement with the estimated transcribed fraction of the human genome.17,18 Although the frequency of integration into transcriptional units was lower in carriers, the difference between the carriers and ATL cells was not statistically significant.
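The transcription-unit test defined in this paragraph amounts to an interval lookup. The sketch below shows one way to express it; the function names and the `(chrom, tx_start, tx_end)` gene tuples are assumptions for illustration, and a real analysis would read the coordinates from the RefSeq Genes track rather than from a hand-written list.

```python
def in_transcription_unit(site, genes):
    """Return True if site = (chrom, pos) lies within any gene's transcribed span.

    genes -- iterable of (chrom, tx_start, tx_end), half-open coordinates
             (an assumed convention for this sketch)
    """
    chrom, pos = site
    return any(c == chrom and start <= pos < end for c, start, end in genes)


def fraction_in_units(sites, genes):
    """Fraction of integration sites that fall inside a transcription unit."""
    hits = sum(in_transcription_unit(s, genes) for s in sites)
    return hits / len(sites)


# Percentages recomputed from the counts reported in the text.
carrier_pct = round(100 * 15 / 56, 1)  # carriers: 15 of 56 sites
atl_pct = round(100 * 20 / 59, 1)      # ATL: 20 of 59 cases
```

Applying `fraction_in_units` to the observed sites and to the simulated random sites gives the two proportions that are compared in Table 2.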
Since annotation of the human genome has been changing rapidly in the last few years, the integration sites in previous reports on HTLV-I needed to be reexamined.12,13 On reexamination, the frequency of integration into transcriptional units in those reports decreased from 56.4% to 46.6%. When these data are combined with those presented here, the provirus was integrated within transcriptional units in 48 of 119 cases (40.3%), which does not differ significantly from random integration (33.2%; P = .17 by a χ2 test). The frequency of integration into transcriptional units did not differ significantly between ATL and carriers (P = .08 by a χ2 test).
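As an illustration of the χ2 comparison, the following sketch computes a Pearson χ2 statistic for a 2 × 2 table. The text does not state how the contingency tables were built or whether a continuity correction was applied, so this is an assumption-laden example: it uses no correction and reads the ATL versus carrier comparison as 48 of 119 combined ATL cases against 15 of 56 carrier sites. Under that reading the P value comes out near the reported .08. The function name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, P value for 1 degree of freedom).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed table: combined ATL cases versus carrier sites,
# inside versus outside transcriptional units.
stat, p = chi2_2x2(48, 119 - 48, 15, 56 - 15)
```

The same function can be applied to observed versus simulated random sites, although a different table layout or a continuity correction would shift the resulting P value.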
When HTLV-I provirus was integrated into transcriptional units, the transcriptional direction of the provirus was in accordance with that of the host cellular gene in 70% of cases among the leukemic cells. Since deletion of the 5′-LTR is frequently observed in the provirus of ATL cells, which is designated a type 2 defective provirus,19 the cellular promoter might act as a promoter for viral genes in such proviruses, as reported previously.20 Therefore, we investigated the relationship of transcriptional direction between cellular genes and type 2 defective provirus. In 2 of 4 cases with type 2 defective provirus, the provirus was integrated into transcriptional units, namely the mutated in colorectal cancers (MCC) and protocadherin 9 genes (Table 1). HTLV-I provirus was inserted in both genes in the sense direction, suggesting that the promoter of cellular genes might transcribe the viral genes in such cases.
HTLV-I provirus tends to be integrated near the transcriptional start sites in leukemic cells
In MLV, the provirus tends to be integrated near transcriptional start sites.4 HIV-1, in contrast, shows no such tendency despite its preference for transcriptional units. Since HTLV-I is a human retrovirus, it is of interest to determine whether its integration pattern resembles that of MLV or HIV-1. In ATL cells, 9 of the 59 proviruses (15.3%) were integrated near transcriptional start sites (± 5 kb), as shown in Table 3 and Figure 2. In contrast, using the same genome assembly, only 5.6% of the random integration sites landed near transcriptional start sites. Hence, integration near transcriptional start sites was significantly more frequent in ATL cells than expected from random integration (P = .006), which is similar to MLV. In carrier cells, only 3 integration sites lay within ± 5 kb of transcription start sites, indicating no preference for integration near transcriptional start sites.
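The ± 5 kb criterion can be expressed as a distance check against the 5′ end of each gene, which depends on strand. The sketch below is illustrative only: the function names, the `(chrom, tx_start, tx_end, strand)` tuples, and the treatment of coordinates as plain integer positions are assumptions, not the authors' implementation.

```python
def tss_position(tx_start, tx_end, strand):
    """The TSS is the 5' end of the gene: tx_start on '+', tx_end on '-'."""
    return tx_start if strand == "+" else tx_end


def near_tss(site, genes, window=5000):
    """Return True if site = (chrom, pos) lies within +/- window bp of any TSS.

    genes -- iterable of (chrom, tx_start, tx_end, strand) (hypothetical layout)
    """
    chrom, pos = site
    return any(
        c == chrom and abs(pos - tss_position(start, end, strand)) <= window
        for c, start, end, strand in genes
    )


# Percentage recomputed from the counts reported for ATL cells.
atl_near_tss_pct = round(100 * 9 / 59, 1)  # 9 of 59 proviruses
```

Handling strand explicitly matters here: for a gene on the minus strand, a site close to `tx_start` is near the 3′ end of the gene, not its start site.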
Integration sites in repetitive sequences
Next, we studied the relationship between repetitive sequences and integration sites. Percentages of repetitive sequences in the human genome were based on the previous report by Venter et al.18 Among the carriers, 11 sites (11/56; 20%) resided in alphoid repetitive sequences, which are a component of centromeric heterochromatin and have a monomeric repeating unit of 171 bp,21 whereas only 2 integration sites were identified within alphoid sequences in ATL (2/59; 3.4%). Since alphoid sequences were estimated to compose 3% to 5% of the human genome, we used 5% for the statistical analyses (Table 4). Cells with the provirus integrated in alphoid sequences were enriched among HTLV-I–infected cells during the carrier state compared with ATL (P = .0153; one-sided Fisher exact test; Table 4). The difference between ATL and carrier cells was statistically significant (P = .0059; one-sided Fisher exact test), indicating that integration within alphoid sequences is disfavored in leukemic cells. Statistical analyses revealed no preference of HTLV-I integration for other repetitive sequences.
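The carrier versus ATL comparison can be checked with a one-sided Fisher exact test computed directly from the hypergeometric distribution. The sketch below uses only the counts given in the text (11 of 56 carrier sites and 2 of 59 ATL cases in alphoid sequences); the function name `fisher_one_sided` is hypothetical, and the choice of tail (carrier count at least as large as observed) is an assumption about how the test was set up. With these counts the result is roughly 0.006, in line with the reported P = .0059. The comparison underlying P = .0153 is not reconstructed here.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact test for the table [[a, b], [c, d]].

    Returns P(top-left cell >= a) with all margins fixed
    (upper tail of the hypergeometric distribution).
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of "positive" observations
    n = a + b + c + d
    total = comb(n, col1)
    k_max = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, k_max + 1)
    )
    return tail / total


# Carriers: 11 alphoid, 45 other; ATL: 2 alphoid, 57 other.
p_alphoid = fisher_one_sided(11, 56 - 11, 2, 59 - 2)
```

Using exact integer binomial coefficients avoids the floating-point underflow that can affect factorial-based formulas at larger sample sizes.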
Discussion
Recent studies on the integration sites of proviruses have provided new insights into the mechanism of integration and pathogenesis of retroviral infections.3,4,22 In MLV,4 provirus integration tends to occur near transcriptional start sites, although there was no preference for transcriptional units (34.2%), as was also observed for HTLV-I. On the other hand, HIV-1 tends to be integrated within transcriptional units (57.8%4 and 69%3) in vitro. In vivo data on the integration of HIV-1 provirus demonstrated that most proviruses (91%) were integrated within transcriptional units, the genes of which were transcribed in T lymphocytes,22 indicating that HIV-1 integration targets transcriptionally active regions more than expected from in vitro data. In contrast, the present study showed that integration of the HTLV-I provirus into transcriptional units was not more frequent than expected from random integration, which is not consistent with the previous studies.12,13 This discrepancy arises from changes in the genome annotation database; we analyzed the integration sites and genes with the UCSC Genome Browser (July 2003). In HTLV-I carriers, integration into transcriptional units was somewhat less frequent than random integration, although the difference was not statistically significant. Combined with the finding that HTLV-I tends to be integrated near transcriptional start sites, the pattern of HTLV-I integration is similar to that of MLV rather than that of another human retrovirus, HIV-1.
In the carrier state, HTLV-I provirus tends to be integrated into alphoid repetitive sequences, whereas the frequency of integration into alphoid sequences was significantly lower in leukemic cells. When HIV-1 provirus is integrated into alphoid sequences, it establishes a latent infection in vitro through the influence of the surrounding heterochromatin.23 Heterochromatin decreases the basal transcription of viral genes, resulting in a latent state. Taken together, it is possible that infected cells in which HTLV-I provirus is integrated into alphoid sequences are enriched in the carrier state. Such cells are considered to produce smaller amounts of viral proteins, which helps them escape the host immune surveillance system.24 However, integration into alphoid sequences was infrequent in the leukemic cells, indicating that, among surviving HTLV-I–infected cells, those producing higher amounts of viral proteins are more likely to transform into malignant cells. Since ATL occurs among HTLV-I carriers after a long latent period, these findings suggest the following scenario. After transmission of HTLV-I, the provirus is randomly integrated into the host genome. The host immune system, including cytotoxic T lymphocytes (CTLs), eliminates HTLV-I–infected cells, with Tax protein being the major target of CTLs.25 Under such circumstances, HTLV-I–infected cells expressing smaller amounts of viral proteins are selected in vivo. However, among these surviving cells, those expressing viral proteins such as Tax tend to proliferate in vivo and should have a greater chance of transforming into malignant cells. A higher provirus load has been reported to be a risk factor for development of ATL, which is consistent with this hypothesis. Therefore, integration into alphoid repetitive sequences is less frequent among leukemic cells, since viral transcription in cells with such integration tends to be silenced.
Although MLV can infect only dividing cells,26 lentiviruses such as HIV-1 can infect nondividing cells by transfer of a preintegration complex through the nuclear pore.27 Since transcriptionally active sites are associated with the nuclear transport machinery,28 preintegration complexes that pass through the nuclear pore might be integrated into transcriptionally active sites because of the open structure of chromatin. This might be the reason for the high frequency at which HIV-1 provirus is integrated within transcriptionally active genes. The data in this study reveal that the characteristics of HTLV-I integration in ATL cells resemble those of MLV, suggesting that HTLV-I cannot infect nondividing cells, although this requires clarification. A preference for transcriptional start sites was observed only in leukemic cells, suggesting that such integration sites confer an advantage in leukemogenesis. Since the integration sites of HTLV-I provirus in leukemic cells are concentrated within 5 kb of transcriptional start sites, it is possible that such sites are suitable for transcription of viral genes.
Type 2 defective HTLV-I provirus lacks 5′-LTR and internal viral sequences such as gag and pol.19 It is possible that this provirus traps the cellular promoter, thus ensuring transcription. In this study, the provirus of 2 of the 4 cases with type 2 defective provirus was integrated in the transcriptional unit in a sense orientation, indicating that the cellular promoter might transcribe the viral gene. In addition, HTLV-I provirus contains an internal promoter sequence in the pol region,29 which is considered to transcribe the viral gene. This is thought to have occurred especially in the remaining 2 cases, which showed integration of the provirus outside the transcriptional units.
In HTLV-I–induced oncogenesis, viral proteins such as Tax promote the proliferation of HTLV-I–infected cells and induce ATL in about 2% to 6% of carriers after a long latent period.7,8 Since HTLV-I provirus integration is random,15 integration itself does not directly influence leukemogenesis. Viral products such as Tax promote the proliferation of HTLV-I–infected cells and induce transformation of infected T lymphocytes.30,31 However, expression of Tax protein is impaired by several mechanisms in ATL cells, including deletion of the 5′-LTR,19 DNA methylation of the 5′-LTR,32 and genetic changes (deletion, insertion, and nonsense mutations) of the tax gene itself.33,34 In the carrier state, the presence of Tax protein is advantageous for the proliferation and survival of infected cells; however, since Tax is the major target of CTLs in vivo,25 the growth of Tax-expressing cells is suppressed by CTLs.35 When HTLV-I provirus is integrated into transcriptionally active sites, viral gene transcription is thought to be active. During the carrier state, these producer cells are possibly eliminated by CTLs. Therefore, cells infected with HTLV-I provirus integrated into alphoid sequences become enriched. On the other hand, higher production of viral proteins in HTLV-I–infected cells is thought to be advantageous during malignant transformation. Therefore, the frequency of integration into alphoid sequences was lower in leukemic cells than in the carrier state.
In this study, analyses of HTLV-I integration sites in both leukemic and HTLV-I–infected cells of carriers have emphasized the structural significance of the host genome, which influences viral gene transcription, during leukemogenesis.
Prepublished online as Blood First Edition Paper, April 19, 2005; DOI 10.1182/blood-2004-11-4350.
Supported by a grant-in-aid for Scientific Research from the Ministry of Education, Science, Sports, and Culture of Japan.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 U.S.C. section 1734.
We thank Suzuko Ohsako for technical help.