The NUP98 gene is the target of recurrent translocations in leukemia that fuse the 5′ portion of NUP98 with coding sequence from the partner gene.1-8 Three of the known fusion partners, HOXA9, HOXD13, and PMX1, are homeobox genes. The other known fusion partners, DDX10, RAP1GDS1, TOP1, and LEDGF, have been considered to share no common features. Recently, NSD1, another nonhomeobox NUP98 fusion partner, was reported in this journal.8 We now report that the proteins encoded by the nonhomeobox genes all have regions with a significant probability of adopting a coiled-coil conformation.
Oligomerization via coiled-coil domains has recently been shown to activate the oncogenic potential of RARα and AML1 following fusion to partners with coiled-coil domains.9 It was shown that the PML-RARα, PLZF-RARα, NPM-RARα, and AML1-ETO fusion proteins each exist in oligomeric complexes in vivo and that oligomerization causes abnormal recruitment of the transcriptional corepressor N-CoR. Moreover, fusion of RARα to the oligomerization domain of p53 showed that oligomerization alone is sufficient for transformation. Other leukemia fusion genes also involve the fusion of transcription factors with genes coding for coiled-coil domains. For example, the inv(16)(p13q22) fuses the N-terminus of CBFβ with the C-terminus of the smooth muscle myosin heavy chain.10 The coiled-coils of the myosin heavy chain promote dimerization and are essential for the transforming properties of the fusion gene.11
Coiled-coils are characterized by sequence patterns known as heptad repeats, which result in the formation of amphipathic alpha helices whose hydrophobic faces undergo “knobs-into-holes” packing, as first proposed by Crick.12 Potential coiled-coil-forming sequences were sought using both algorithms in COILS 2.1 (http://www.ch.embnet.org/software/COILS_form.html). The original algorithm of Lupas et al gives equal weighting to each of the 7 heptad positions in its scoring of coiled-coil potential.13 This weighting system is biased toward hydrophilic, charge-rich sequences and can occasionally give rise to false-positive coiled-coil predictions where there is no heptad periodicity. The revised algorithm increases the weighting of positions a and d, which are typically occupied by hydrophobic amino acids, thereby decreasing scores for segments with a high number of charged residues and reducing false-positive predictions.14
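The effect of up-weighting positions a and d can be illustrated with a toy example. The sketch below is not the COILS 2.1 algorithm: it simply scores how well a window of residues fits an idealized heptad pattern (hydrophobic at a and d, non-hydrophobic elsewhere). The hydrophobic residue set, the 28-residue window, the weight value, and the function names `heptad_score` and `best_window` are all assumptions made for illustration.

```python
# Toy heptad-pattern scorer (illustrative only; NOT the COILS 2.1 algorithm).
HYDROPHOBIC = set("AILMFVWY")  # assumed residue class for core positions
CORE_POSITIONS = {0, 3}        # heptad positions a and d (a=0 ... g=6)


def heptad_score(window, register, core_weight=1.0):
    """Weighted fraction of residues that fit an idealized heptad pattern.

    `register` is the heptad position (0-6) assigned to the first residue.
    `core_weight` is the weight given to positions a and d; all other
    positions have weight 1.0.
    """
    total = 0.0
    matched = 0.0
    for i, residue in enumerate(window):
        position = (register + i) % 7
        is_core = position in CORE_POSITIONS
        weight = core_weight if is_core else 1.0
        total += weight
        # A residue "fits" if it is hydrophobic at a/d or non-hydrophobic elsewhere.
        if is_core == (residue in HYDROPHOBIC):
            matched += weight
    return matched / total


def best_window(sequence, window=28, core_weight=1.0):
    """Return (score, start, register) of the best-scoring window."""
    best = (0.0, None, None)
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        for register in range(7):
            score = heptad_score(segment, register, core_weight)
            if score > best[0]:
                best = (score, start, register)
    return best


ideal = "LKELEEK" * 4   # hydrophobic at a and d, charged elsewhere
charged = "EK" * 14     # charge-rich, no heptad periodicity

print(best_window(ideal))
print(best_window(charged)[0], best_window(charged, core_weight=2.5)[0])
```

In this toy model the charge-only sequence scores 20/28 (about 0.71) with equal weighting but only 0.5 when positions a and d are weighted 2.5-fold, while the idealized heptad sequence scores 1.0 under both schemes. This mirrors, in a simplified way, why increasing the weight of a and d penalizes charge-rich segments that lack a hydrophobic core.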
The protein sequences analyzed were DDX10 (PID g13514831), smgGDS, the product of RAP1GDS1 (PID g7239381), TOP1 (PID g13653668), LEDGF (PID g11360305), and NSD1 (PID g15213542). All 5 proteins were predicted to form coiled-coils even when the weighting of positions a and d was increased. This contrasts with proteins in general, of which an estimated 3% to 5% are thought to have potential coiled-coil domains.15 The potential coiled-coil domains were identified in DDX10 at Asn579-Lys600 (P = .59); in smgGDS at Thr425-Glu452 (P = .66) and Ile505-Leu533 (P = .86); in TOP1 at Lys310-Tyr338 (P = .59), Leu577-Leu605 (P = .54), and Lys638-Thr718 (P = 1.0); in LEDGF at Lys309-Glu331 (P = .71) and Val370-Glu395 (P = .98); and in NSD1 at Gly1729-Asn1760 (P = .96). The SOPM (self-optimized prediction method) secondary structure prediction program (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopm.html) predicted that all the putative coiled-coil sequences form alpha helices. Furthermore, the predicted coiled-coil domain at Lys638-Thr718 of TOP1 has been verified by x-ray crystallography.16
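For readers who wish to work with these predictions, the regions reported above can be tabulated as simple records. The sketch below picks the highest-probability region for each protein and computes region lengths. It also includes a deliberately naive calculation of how often 5 of 5 proteins would be predicted positive by chance at a 3% to 5% background rate. That calculation assumes the proteins are independent, randomly drawn samples, which is an assumption made for illustration and not a claim or analysis from the text. All function names are hypothetical.

```python
# Predicted coiled-coil regions as reported in the text:
# (protein, first residue number, last residue number, COILS probability)
PREDICTED = [
    ("DDX10", 579, 600, 0.59),
    ("smgGDS", 425, 452, 0.66),
    ("smgGDS", 505, 533, 0.86),
    ("TOP1", 310, 338, 0.59),
    ("TOP1", 577, 605, 0.54),
    ("TOP1", 638, 718, 1.0),
    ("LEDGF", 309, 331, 0.71),
    ("LEDGF", 370, 395, 0.98),
    ("NSD1", 1729, 1760, 0.96),
]


def region_length(start, end):
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def best_region_per_protein(regions):
    """Map each protein to its highest-probability (start, end, P) region."""
    best = {}
    for protein, start, end, p in regions:
        if protein not in best or p > best[protein][2]:
            best[protein] = (start, end, p)
    return best


def chance_all_positive(n_proteins, background_rate):
    """Naive probability that all n proteins are positive by chance.

    Assumes independent, randomly sampled proteins (illustrative only).
    """
    return background_rate ** n_proteins


best = best_region_per_protein(PREDICTED)
for protein, (start, end, p) in best.items():
    print(protein, start, end, region_length(start, end), p)

for rate in (0.03, 0.05):
    print(rate, chance_all_positive(len(best), rate))
```

Under the stated independence assumption, even the upper background rate of 5% gives a chance of roughly 3 in 10 million that all 5 proteins would be positive, although the fusion partners are of course not a random sample of the proteome.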
In all cases of fusion with NUP98, the predicted coiled-coil domains are retained in the fusion protein. Thus, translocations not involving homeobox genes result in the fusion of an amino acid sequence with coiled-coil-forming potential to the FG-repeat–rich amino terminus of NUP98. It has been shown that this FG-repeat region of NUP98 possesses strong transcriptional transactivation potential through direct interaction with CBP/p300.17 Another FG-repeat–containing nucleoporin gene, NUP214, is also involved in recurrent leukemia translocations. These involve fusion of NUP214 FG repeats to the SET protein or the DEK protein.18,19 Significantly, COILS 2.1 analysis shows that the portions of SET (PID g14745487) and DEK (PID g544150) retained in NUP214 fusions each have a region with high coiled-coil-forming potential (SET Lys35-Gln78 [P = .99] and DEK Glu323-Val350 [P = .92]). Interestingly, none of the 3 homeobox proteins fused to NUP98 in AML is predicted to form coiled-coils when analyzed with COILS 2.1. This probably reflects a different mode of action of the homeobox transcription factors and suggests that NUP98-homeobox fusions have a different mechanism of leukemogenesis.
It is a matter of speculation whether these coiled-coil regions promote self-dimerization or oligomerization, or instead have a role in the formation of multimeric complexes that facilitate interaction with other transcription factors or cofactors. The latter may be more likely because 2 of the nonhomeobox fusion partners, LEDGF and NSD1, are known transcription factors. The exact significance of the coiled-coil structure in all reported nonhomeobox NUP98 fusion partners requires further research and may give a clue to the pathogenesis of NUP98 fusion proteins.
Acknowledgments. We thank Andrei Lupas from the Max Planck Institute for Developmental Biology, Tübingen, Germany, for advice on the use of COILS 2.1. We thank John Walshaw from the School of Biological Sciences, University of Sussex, United Kingdom, for commenting on COILS 2.1 predictions presented here and for helpful discussion. We thank Sally Stephenson and Chris Slape for reading the manuscript.