Abstract
Research indicates that myeloablative hematopoietic cell transplantation (HCT) impairs neurocognitive function. However, prospective studies on long-term effects are lacking. This longitudinal study examined neurocognitive changes over the first year in 142 adult recipients of allogeneic HC transplants who received neuropsychologic testing before transplantation and again after 80 days and 1 year. Age-, sex-, and education-adjusted population-based standardized scores were used for normative comparisons. Performance on all tests declined from before transplantation to 80 days (P < .05) and improved by 1 year (P < .05), returning to pretransplantation levels on all tests except for grip strength and motor dexterity. Although verbal fluency and memory recovered by 1 year, both were below norms at all 3 testing times (P < .01). Logistic regressions indicated that patients without chemotherapy, other than hydroxyurea, previous to HCT and patients not receiving chronic graft-versus-host disease (GVHD) medication at 1 year had lower risk of impaired function (P < .05). In conclusion, HCT was associated with significant generalized decline in neurocognitive performance at 80 days, with subsequent recovery to pretransplantation levels by 1 year for most survivors, except on motor tasks. Results indicate that long-term cognitive decrements, as distinct from motor disabilities, infrequently derive directly from HCT. (Blood. 2004;104:3386-3392)
Introduction
Adult patient complaints about cognitive difficulties following myeloablative hematopoietic cell transplantation (HCT) have been verified by objective studies of the acute neuropsychologic effect of HCT. Memory, attention, and information-processing deficits have been documented.1-6 Some recovery of cognitive function after transplantation has been suggested by studies that assessed patients 6 months or longer following HCT.3,5,7 In parallel with neuropsychologic test results, numerous imaging studies have determined that neurologic changes such as cortical atrophy and ventricular enlargement occur in some patients after HCT conditioning chemotherapy or total body irradiation and that cyclosporine or tacrolimus treatments for graft-versus-host disease (GVHD) after HCT have neurotoxicities.8-11 However, we are aware of no prospective, longitudinal studies that have documented both decrements in cognitive function and recovery following HCT and that have adequate power to permit conclusions about long-term neurocognitive outcomes after HCT.
The lack of longitudinal neurocognitive studies of patients receiving HC transplants, despite evidence that problems exist, is due to a number of practical problems. Highly trained neuropsychology examiners must do the testing. This requirement, high rates of mortality, and the need to perform testing in face-to-face sessions in persons dispersed over a wide geographic area make it difficult to implement these studies in HCT survivors.
This study investigates neurocognitive changes over time in a cohort of patients tested before transplantation, 80 days after transplantation, and again after 1 year. On the basis of prior research and patient complaints, we hypothesized that we would observe a significant decline in neurobehavioral functioning at 80 days after transplantation, and some recovery, but remaining deficits, at 1 year after transplantation relative to both pretransplantation performance and population norms. Also on the basis of previous research, we hypothesized that patients with more neurocognitive deficits before transplantation would have higher mortality rates in the first year after transplantation, patients without chemotherapy preceding HCT conditioning would have lower rates of pretransplantation neurocognitive impairment, and patients receiving systemic treatment for chronic GVHD at 1 year would have higher rates of neurocognitive impairment.
Patients, materials, and methods
Patients
A total of 161 adults who were receiving a myeloablative allogeneic transplant at the Fred Hutchinson Cancer Research Center consented before transplantation to participate in a quality-of-life study that included neuropsychologic testing. Eligible patients were at least 22 years of age; able to speak, read, and write English adequate to complete testing; diagnosed with a malignancy; and receiving their first allogeneic transplant. Patients with a major psychiatric disorder not in remission and disrupting treatment, as defined by clinical staff, were excluded, as were patients who were actively receiving treatments that had documented central nervous system (CNS) toxicities (eg, opioid medication or intrathecal methotrexate). Because testing had to be done in person, only patients who were present at the transplantation center at the designated time points could be tested.
Procedure
The transplantation center Institutional Review Board and Scientific Review Committee approved all procedures and tests. Psychometrists were trained in standard administration procedures by one of the authors (S.R.-R.). After pretransplantation consent, patients completed self-report measures that included demographic, medical, and head injury history. Neuropsychologic testing was scheduled as a second appointment and occurred 2 weeks to 1 day prior to the start of chemotherapy or radiation therapy conditioning for HCT, 80 ± 10 days after transplantation, and 1 year ± 1 month after transplantation. All testing was done in the ambulatory clinic at the transplantation center.
Measures
Demographic data were collected using a self-report background form. Medical records provided additional data, including diagnosis, type of transplantation, conditioning regimen, relapse, and survival status. To examine a range of function from motor strength and psychomotor performance to cognitive flexibility and verbal fluency, 6 tests were administered at each time point. At the 1-year examination, a novel test was added to permit us to evaluate the construct of “executive function” and to consider the possible influence of practice effects on the testing results. All tests had age-, sex-, and education-adjusted T-score norms. T scores have a mean of 50 and SD of 10 and permit comparison across tests as well as comparison with what is expected from persons with similar age, education, and sex. These demographics are known to affect performance on these measures.
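As a minimal illustration of the T-score scale described above, the sketch below converts a raw score to a T score from a normative mean and SD. The function name, the `higher_is_better` flag, and the example norm values are hypothetical; the published norms used in the study additionally adjust for age, sex, and education.

```python
def t_score(raw, norm_mean, norm_sd, higher_is_better=True):
    """Convert a raw score to a T score (mean 50, SD 10).

    norm_mean and norm_sd are hypothetical normative values for the
    patient's demographic group, not values taken from the study.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    if not higher_is_better:
        z = -z  # timed tests: a longer completion time is a worse score
    return 50 + 10 * z


# Made-up example: a raw score exactly 1 SD below the normative mean
print(t_score(raw=35, norm_mean=45, norm_sd=10))  # 40.0
```

Reversing the sign for timed measures keeps "higher T score means better performance" consistent across tests, which is what makes cross-test comparison possible.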
IQ estimation. The Information subtest of the Wechsler Adult Intelligence Scale-Revised12 was administered at the first testing session to provide a brief measure from which to estimate IQ. According to the developers, this scale correlates with verbal IQ (r = .79) and full scale IQ (r = .76).12 We provide this result with caution, given the following 2 factors. (1) To confidently measure IQ requires much broader measurement of verbal and performance skill. (2) Although fund of information is presumed to be less influenced by acquired brain impairments after normal development, no research we are aware of has documented the influence of cancer diagnosis and treatment on this test. We believe the Information subtest can provide an indication of patients' intellectual function at the time tested, but it cannot be considered a “true baseline” indicator of the patient's premorbid function before diagnosis.
Motor strength, speed, and dexterity. The Hand Dynamometer13 measures grip strength in kilograms with a subject's dominant and nondominant hand, averaged across 3 trials. Because the dominant and nondominant hand outcomes are highly correlated in our cohort, we focus on the dominant hand strength. This test was introduced after the study began; consequently, 31 patients before transplantation and 4 patients at 80 days did not receive the test. The Grooved Pegboard14 measures motor speed and dexterity. Subjects insert a peg a bit larger than a toothpick into a board containing a 5 × 5 set of slotted holes angled in different directions. Each peg has a ridge along one side, requiring rotation for correct insertion into the hole. The score reported is time to completion in seconds with a subject's dominant hand.
Attention and processing speed. Two measures were used to capture these cognitive skills, although the tests measure additional functions as described for each test in this section. For the Trail Making Test Part B (Trails B)13 subjects are given a worksheet with 25 circles randomly distributed on a page, half numbered 1 to 13, the other half lettered A to L. The task is to connect the circles by drawing a line as quickly as possible between them, alternating sequentially between numbers and letters. The score reported is number of seconds required to correctly complete the task. This test requires visual-motor integration, divided attention, ability to inhibit competing stimuli, and psychomotor speed. The Digit Symbol Substitution Test (Digit Symbol)12 requires matching randomly ordered numbers with nonsense symbols that are designated on a number-symbol key at the top of the test page. Subjects write the appropriate symbol in a blank square below each number. The score is the number of blanks filled in correctly in 90 seconds. This measure requires sustained attention, visual-motor integration, learning, and psychomotor speed.
Verbal fluency and memory. The Controlled Oral Word Association Test (COWAT) is a measure of word-finding or verbal fluency.15-17 Subjects were asked to say all the words they could think of beginning with a given letter, excluding proper nouns, numbers, and duplicate words with different suffixes. Scores reflected the count of all acceptable words produced in three 1-minute trials, each with a different letter. Three different letter sets were used, with the order randomly counterbalanced across patients. The letters selected had equivalent frequency of appearance in the English language.18 This verbal fluency measure required memory for words, organizational skills, and ability to initiate cognitive activity. The Hopkins Verbal Learning Test-Revised (HVLT-R)19 required recall of a list of 12 words read out loud by the examiner. The same list was read and recalled 3 times. After 25 minutes, the patient was asked again to repeat all the words he or she could remember. The score reported was the sum of words recalled over the 3 trials. Three equivalent word lists were used to permit a new list at each time point, with the order of administration randomly counterbalanced across patients. This test was introduced to the testing battery after the start of the study when a different verbal learning test proved too time consuming within study constraints; thus, 32 patients before transplantation and 4 patients at 80 days did not receive the test.
Executive function. Higher level cognitive processing skills, which allow for organization, planning, and problem solving, are considered “executive function.” This was measured using The Wisconsin Card Sorting Test, which requires the ability to develop a problem-solving strategy across changing stimulus conditions.20 A patient is shown 4 stimulus cards [(1) red triangle, (2) green stars, (3) yellow crosses, (4) blue circles] and is instructed to sort 64 response cards, each to 1 of the stimulus cards in whichever way he or she thinks it matches. The patient is told whether each response is right or wrong. After a specified number of consecutive “correct” matches, the sorting principle is changed without notification. The patient continues sorting and must discern the new matching principle. Score reported here is the number of perseverative responses, which are scored when errors are made that match the just completed category that is no longer correct. This test was given only at 1 year.
Statistical analysis
Descriptive and inferential analyses were performed using Statistical Package for the Social Sciences 10.0 (SPSS, Chicago, IL), Stata 6.0 (StataCorp, College Station, TX), and SAS 8.0 (SAS Institute, Cary, NC). Kaplan-Meier survival analysis with log-rank comparison was used to test for sample bias by comparing the cohort of patients tested with patients receiving HC transplants during the enrollment time frame of the study but who were not tested. Independent t tests and chi-square tests were used to compare demographic, clinical, and pretransplantation neuropsychologic characteristics of 2 cohorts of tested (n = 142) and not tested (n = 181) HC transplant recipients. Raw scores for each neuropsychologic measure were transformed to T scores based on age-, sex-, and education-adjusted normative data, available either via computerized program20 or text reference.17,19 Changes over time were examined with graphic depiction and generalized estimating equations.21 This analysis method accounted for the statistical dependence of repeated observations over time and permitted inclusion of results for participants who did not complete all assessments. To determine the statistical difference of each test from the norm (a T score of 50), single sample t tests were performed for each test and time point.
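The single-sample comparison with the norm described above can be sketched as follows. This is an illustrative calculation only, not the authors' SPSS, Stata, or SAS code; the function name and the example scores are hypothetical, and only the t statistic and degrees of freedom are computed (a P value would require a t distribution, for example from a statistics package).

```python
import math
import statistics


def one_sample_t(t_scores, norm_mean=50.0):
    """Return (t statistic, degrees of freedom) for a test against norm_mean."""
    n = len(t_scores)
    if n < 2:
        raise ValueError("need at least 2 scores")
    mean = statistics.mean(t_scores)
    sd = statistics.stdev(t_scores)      # sample SD (n - 1 denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - norm_mean) / standard_error, n - 1


# Made-up T scores for one test at one time point
t_stat, df = one_sample_t([44, 46, 48, 50, 52])
print(round(t_stat, 3), df)
```

A negative statistic indicates a group mean below the normative T score of 50.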
To analyze rates of clinically meaningful impairment, we selected a cutoff score of 1 SD below the normative mean of 50; in other words, test scores at or below 40 were considered impaired. This was a conservative estimate of impairment because 0.5 SD has been indicated as a “clinically meaningful” level in quality-of-life research.22 An aggregate impairment score was created based on the number of impaired tests (0 to 6). Logistic regression was used to determine risk factors for impairment and whether the same patients were impaired before transplantation and at 1 year. Impairment score was also used to examine the potential relationship between cognitive function and mortality. A time-dependent Cox regression model was used to estimate the mortality hazard ratio for patients with an impairment score of 1 or higher relative to those with no impairments. Up to day 80, the classification was based on the pretransplantation impairment score; from day 80 to 365, the day 80 impairment score determined impairment classification.
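The impairment rule and aggregate score described above can be sketched as follows. The test labels and example values are hypothetical, and skipping tests that were not administered (stored as `None`) is an assumption of this sketch, not a rule stated in the study.

```python
IMPAIRMENT_CUTOFF = 40  # 1 SD below the normative mean of 50


def is_impaired(t_score):
    """A test is impaired when its T score is at or below the cutoff."""
    return t_score <= IMPAIRMENT_CUTOFF


def impairment_score(t_scores):
    """Count impaired tests for one patient at one time point.

    t_scores maps a test label to a T score, or to None when the test
    was not given (assumption: such tests are simply not counted).
    """
    return sum(
        1 for value in t_scores.values()
        if value is not None and is_impaired(value)
    )


# Made-up patient: 2 of the 6 tests fall at or below 40
patient = {
    "grip_strength": 47.0,
    "grooved_pegboard": 38.5,
    "trails_b": 52.0,
    "digit_symbol": 45.0,
    "cowat": 40.0,
    "hvlt_r": None,
}
print(impairment_score(patient))  # 2
```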
Results
Study participation
Figure 1 provides a flow chart of patient status on the study relative to all patients screened for eligibility during the enrollment time frame. Of the 142 patients who consented to the neuropsychologic testing and were tested at least once, 120 patients (85%) were tested before transplantation. Fifty-four patients (38%) were tested at all 3 time points, 39 patients (27%) were tested twice, and 49 patients (35%) were tested only once. Forty-one (29%) of the 142 patients tested died or relapsed before completing 1-year testing. Kaplan-Meier survival analysis of the 142 patients tested, compared with the 181 HC transplant recipients who were screened but not tested, indicated that the patients not tested were more likely to die or relapse in the first year after HCT (log rank = 21.93, P < .001).
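For readers unfamiliar with the method, the Kaplan-Meier estimate underlying the comparison above can be sketched in a few lines. This is a generic product-limit estimator with made-up data, not the study's analysis; the log-rank comparison between cohorts is not shown, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or relapse) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Made-up follow-up times in days; the second patient is censored
print(kaplan_meier([30, 90, 200, 365], [1, 0, 1, 1]))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate can be computed even when follow-up is incomplete.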
Patient characteristics
Table 1 lists pretransplantation demographic and medical characteristics for the cohort tested versus those screened for eligibility but not tested. The nontested patients were more likely to have acute leukemia in relapse, whereas the tested patients more commonly had CML in chronic phase. Tested patients were on average 41 years old and predominantly white, with a higher percentage of nonwhite or Hispanic HC transplant recipients in the nontested cohort (P = .003). Sex composition was 51% women, a higher percentage of women than in the nontested sample (P = .01). Half of the tested patients had completed college. The source of stem cells for the majority (85%) of the transplants was bone marrow. Twelve percent previously had received either cranial irradiation or intrathecal chemotherapy; and 63% received total body irradiation (TBI) as part of their conditioning regimen. About one third of the patients received TBI doses of 13.2 to 14.4 Gy (51 of 142, 36%), another 27% (38 of 142) received TBI doses of 9 to 12.0 Gy. Among the 53 patients who did not receive TBI, all but 3 received busulfan and cyclophosphamide as conditioning chemotherapy (50 of 142, 35%).
Patients completing the Information subtest of the Wechsler Adult Intelligence Scale-Revised had a mean scaled score of 10.97 (SD = 2.78) and a mean age-, sex-, and education-adjusted T score of 48.17 (SD = 8.63). According to the test developers,12 standardized scaled scores on this test have a set mean of 10 and SD of 3. These results indicate that the IQ of these patients was in the normal range when tested.
Change in neuropsychologic functioning
Figures 2, 3, and 4 display mean T scores over time for the 6 tests. Inspection of these figures indicates a consistent pattern of change over time, confirmed by the generalized estimating equation (GEE) analyses. Results for the GEE models are indicated in Table 2. For each test, there was a significant overall effect of time (P < .001 for all, except verbal memory, P = .03), with a decline in functioning between before transplantation and 80 days after transplantation (P < .01 for all, except verbal memory, P = .04). Functioning at 1 year returned to pretransplantation levels (P > .05) with the 2 exceptions of grip strength (P = .006) and motor dexterity (P = .001), which did not recover to pretransplantation levels.
Neuropsychologic comparison with norms
Asterisks in Figures 2, 3, and 4 denote statistical differences from norms. Before transplantation, patients performed comparably to norms in all areas except motor dexterity (P = .002) and verbal skills involving fluency and memory (P < .001). At 80 days patients were impaired in all areas tested (P < .01). Motor skills were again significantly below norms at 1 year (P = .025 for grip strength and P < .001 for motor dexterity). Verbal fluency and verbal memory were notable as both were significantly lower than norms at all time points (P < .01) even though the means returned to pretransplantation levels by 1 year. Executive function as measured by the Wisconsin Card Sorting Test at 1 year was not impaired (T-score mean = 53.9, SD = 13.1) and in fact was somewhat better than population norms (one sample t = 2.46, P = .02).
Incidence of impairment
Table 3 shows the percentage of patients impaired on each test at each time point. Before transplantation, 15% to 32% of the patients were impaired. (The expected population rate of T scores lower than 40 is 15.7%.) Pretransplantation impairments in motor dexterity, verbal memory, and verbal fluency were about twice the expected rate. At 80 days, the percentage of patients impaired increased substantially. However, by 1 year, results were similar to before transplantation except for motor dexterity, which remained impaired for 46%. Logistic regression indicated that patients receiving cyclosporine, tacrolimus, or mycophenolate mofetil were more likely to be impaired in motor dexterity (odds ratio = 2.76; 95% confidence interval (CI), 1.01-7.55; P = .05).
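The expected population rate can be approximated from the T-score scale itself. The sketch below assumes T scores are exactly normally distributed with mean 50 and SD 10, which is an idealization; under that assumption the proportion at or below 40 is about 15.9%, close to the 15.7% cited above.

```python
from statistics import NormalDist

# Idealized T-score distribution (assumption: exactly normal)
t_norms = NormalDist(mu=50, sigma=10)

# Proportion of the normative population expected at or below the cutoff
expected_rate = t_norms.cdf(40)
print(f"{expected_rate:.1%}")  # 15.9%
```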
Table 4 displays the number of tests on which patients were impaired at each time point. Prior to transplantation, 71% of patients were impaired on 1 or more tests, with 45% impaired on 2 or more. Logistic regression indicated that patients with no previous chemotherapy, or only hydroxyurea before beginning HCT conditioning, were less likely to be impaired on any tests before transplantation (odds ratio = 2.99; 95% CI, 1.08-8.30; P = .04). At 1 year, 74% of patients were impaired on 1 or more tests, with 55% impaired on 2 or more. Eighty-eight percent of patients with impairments before transplantation also had impairments at 1 year, whereas 54% of patients not impaired before transplantation had impairments at 1 year. Logistic regression indicated that patients impaired before transplantation had a greater likelihood of being impaired at 1 year when compared with those not impaired before transplantation (odds ratio = 6.29; 95% CI, 1.24-31.96; P = .03). Contrary to our hypothesis, patients receiving cyclosporine, tacrolimus, or mycophenolate mofetil were not more likely to have 1 or more impaired tests at 1 year (P = .74).
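As a rough arithmetic check on the last odds ratio, a crude odds ratio can be computed from the two rounded percentages given above (88% and 54%). This sketch is not the authors' logistic regression, which used patient-level data; the function names are hypothetical, and a small discrepancy from the reported 6.29 is expected because the inputs are rounded.

```python
def odds(proportion):
    """Convert a proportion strictly between 0 and 1 to odds."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1 - proportion)


def odds_ratio(p_group_a, p_group_b):
    """Odds of the outcome in group A relative to group B."""
    return odds(p_group_a) / odds(p_group_b)


# Impaired at 1 year: 88% of those impaired before transplantation
# versus 54% of those not impaired before transplantation
crude_or = odds_ratio(0.88, 0.54)
print(round(crude_or, 2))
```

The result is roughly 6.2, in line with the reported odds ratio of 6.29.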
Cognitive function and survival
Figure 5 displays results of time-dependent survival curves for patients who had no test versus 1 or more tests impaired either before HCT or at 80 days after HCT. Time-dependent Cox regression analysis indicated that, in comparing those with 1 or more impairments versus no impairments, the mortality hazard ratio was 2.84 (95% CI, 0.8-9.7; P = .06).
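The time-dependent classification used in this analysis (pretransplantation impairment score up to day 80, day 80 score from day 80 to 365) can be sketched as a split of each patient's follow-up into intervals, the usual input layout for a time-dependent Cox model. The function and argument names are hypothetical, and the sketch assumes both impairment scores are available; how patients with a missing assessment were handled is not specified in the text.

```python
def split_follow_up(follow_up_days, died, pre_score, day80_score,
                    switch_day=80, end_day=365):
    """Return (start, stop, impaired, event) rows for one patient.

    impaired is True when the applicable impairment score is 1 or higher.
    Follow-up is truncated at end_day; deaths after end_day are not counted.
    """
    stop = min(follow_up_days, end_day)
    event = bool(died) and follow_up_days <= end_day
    rows = []
    # Interval 1: classification from the pretransplantation score
    first_stop = min(stop, switch_day)
    rows.append((0, first_stop, pre_score >= 1, event and stop <= switch_day))
    # Interval 2: classification from the day 80 score
    if stop > switch_day:
        rows.append((switch_day, stop, day80_score >= 1, event))
    return rows


# Made-up patient: unimpaired before HCT, impaired at day 80, died on day 200
for row in split_follow_up(200, True, pre_score=0, day80_score=2):
    print(row)
```

Each row contributes to the risk set only during its own interval, so a patient can count as unimpaired early and impaired later.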
Discussion
This prospective, longitudinal cohort study examined neurocognitive function in 142 adult allogeneic transplant recipients tested before HCT and in nonrelapsed survivors at 80 days and 1 year after HCT with a goal of clarifying changes over time related to transplantation. Patients declined on all cognitive and motor skills tested between before transplantation and 80 days after transplantation, confirming our first hypothesis that HCT would cause significant declines in function. By 1 year, these patients had returned to their pretransplantation levels of attention, speed of information processing, learning, visual-motor integration, verbal fluency, and verbal memory. In contrast, grip strength and motor dexterity remained below pretransplantation levels at 1 year. Thus, our second hypothesis, that deficits seen at 80 days would not fully recover by 1 year, was only partially supported. Although verbal skills returned to pretransplantation levels, they continued to be below population norms at 1 year. As predicted, patients impaired before transplantation had 6.29 times the odds of impairment at 1 year, although risk of mortality was only marginally higher for patients impaired before transplantation or at 80 days. On the basis of these results, deficits remaining at 1 year could not be attributed to HCT-related treatments for most patients, with the exception that chronic GVHD treatment increased risk of motor dexterity impairment at 1 year.
Nearly all patients were impaired on 1 or more tests at 80 days after transplantation, despite the 3 months since administration of high-dose cytotoxic treatment. Significant decline from before transplantation was observed in all areas assessed but was particularly notable in motor skills and verbal fluency. This decline in neurocognitive function suggests that all of the HCT-related regimens used with these patients had some neurotoxicity. Attention and speed of information processing also declined but not as much as the other areas, contradicting a theory that deficits in information processing speed explain the long-term neuropsychologic impairments seen after HCT.2
At 1 year, 70% of the patients remained on active treatment for chronic GVHD, and many were receiving other drugs with known CNS activity. Results confirmed that patients receiving the immune suppressants cyclosporine, tacrolimus, or mycophenolate mofetil were more likely to be impaired on the Grooved Pegboard test of motor dexterity. Glucocorticoids did not confer the same risk of impaired testing. Recent research documents that patient recovery from HCT is a lengthy process that takes longer than 1 year.23,24 Consequently, further improvement could be expected after 1 year as patients continue to recover both medically and functionally and are able to discontinue medications.
By 1 year most patients had recovered verbal skills to the levels they were at before transplantation. Return to pretransplantation levels, however, did not mean return to normal. In relation to population norms, before transplantation the percentage of patients impaired in verbal fluency and memory (both 32%) was twice that expected of individuals of similar age, sex, and education. Results confirmed that patients without previous chemotherapy, other than hydroxyurea, before beginning HCT treatment had lower risk of impairment in neurocognitive function before transplantation. Our rates of impairment were somewhat lower than those reported by Andrykowski et al25 who found that 40% of allogeneic transplantation candidates were impaired on memory tasks. This difference between studies may be explained by a higher percentage of patients with minimal prior treatment in our study (42% of patients with chronic phase CML in our study versus 35% in the Andrykowski et al25 study, and a larger percentage of patients with myelodysplasia and myelofibrosis in our study).
Several limitations to this study and its conclusions must be considered. Foremost is that we expect the results are likely an underestimate of the adverse cognitive effects of HCT on those patients who are more ill before transplantation or have a more difficult transplantation course. The examination of change over time excludes from this report patients who started transplantation conditioning immediately after arrival at the transplantation center, had observable CNS impairment before their transplantation, or refused the study. Our data indicate that patients not represented in this cohort were more likely to have had acute leukemia relapses and hence more prior treatment, further supporting a potential for higher rates of impairment. The finding that patients who did not complete pretransplantation testing were more likely to die or relapse by 1 year, and the finding that patients with impairments had marginally higher mortality rates (P = .06) support the likelihood that these patients were more ill and potentially more vulnerable to long-term neurocognitive effects of treatment. It is also possible that conditioning regimens underrepresented in this cohort could have greater long-term effect on neurocognitive function. Other groups not represented by these results included autologous and nonmyeloablative transplant recipients.
Another limitation of the study is the lack of control for practice effects resulting from repeated exposure to the tests. Practice effects could artificially inflate test scores or mask impairments. This is another reason that results may underestimate the adverse effect of HCT on neuropsychologic functioning. However, the HVLT-R and COWAT used alternate forms, designed to control for recall from previous sessions, for each testing session. Nonetheless, the test methodology was familiar. In addition, although not an optimal test of practice effects, the Wisconsin Card Sorting Test was administered only at 1 year. Survivors had a T-score mean of 54, a score a little higher than average. Thus, within this cognitive domain, average level of performance could not be attributed to practice effect. These test results support the view that practice effects may not have excessively inflated the functional levels represented by the repeated neuropsychologic measures. Although these elements do not fully control for practice effects, they do suggest that the return to pretransplantation levels of function was not merely an artifact of practice.
Another factor that may mask the effects of HCT is the use of pretransplantation neurocognitive function as an indication of baseline capability. We have interpreted the return to pretransplantation level of function as indicative of a lack of adverse effects of HCT. This is correct if the pretransplantation baseline represents the “true” capabilities of the individual and practice effects are not inflating the 1-year scores. It is possible, however, that the pretransplantation baseline was temporarily lowered because of psychological or medical illness. In this case, pretransplantation function would be suppressed relative to a patient's true capabilities. Again in that case, a true return to baseline would have resulted in a net improvement in neurocognitive performance relative to before transplantation. Therefore, in either the case of practice effects or suppression of pretransplantation function relative to true capabilities, the return to pretransplantation level of performance could mean that HCT had neurotoxic effects that endured past 1 year but were not detected in this study.
Risk factors for neuropsychologic deficits at various times before and after transplantation need to be more fully identified. This is beyond the scope of this paper and will be addressed elsewhere.
Among other next steps, we need to determine whether patients with impairments at 1 year continue to recover, and whether remaining neuropsychologic deficits affect return to work or quality of life. If neurocognitive function is similar to other patterns of quality-of-life recovery after HCT, those patients who have not recovered or are not off all CNS active medications by 1 year will continue to improve between 1 and 3 years after transplantation,23 although Lee et al24 reported no improvement in subjective cognitive complaints after 1 year. Also important, 2 longitudinal studies with small samples have reported that no deterioration in function occurred after 1 year; however, this finding needs to be confirmed with larger cohorts.3,5
The results of this study have major implications for advising health care professionals, patients, and their families about expectations after transplantation. Overall, patients and their families can be advised that, although short-term neurocognitive effects of HCT are severe, most patients return to their pretransplantation levels of cognitive function by 1 year after transplantation. Expectations during the recovery period must consider the cognitive and psychomotor limitations of most patients. In particular, patients who return to work, or who resume other major responsibilities early after transplantation, may find that complex cognitive or motor tasks are challenging. The nature and the degree of difficulties observed 3 months after transplantation are sufficient to hinder the successful resumption of many roles and responsibilities. It would be advisable to encourage patients to resume responsibilities gradually as they test their own capabilities. Neuropsychologic function should be evaluated in patients who continue to have cognitive complaints after 1 year. Almost no research has tested cognitive rehabilitation methods for cancer survivors.26,27 Rehabilitation strategies that have assisted other populations may prove to be effective for assisting survivors in adapting to residual deficits, whether or not they predate HCT.
Prepublished online as Blood First Edition Paper, July 13, 2004; DOI 10.1182/blood-2004-03-1155.
Supported by grants from the National Cancer Institute (CA63030, CA78990, and CA96468).
An Inside Blood analysis of this article appears in the front of this issue.