Abstract
Recently developed immunotherapeutic drugs have shown promising results for patients with hematologic malignancies; however, an unmet need for accurate and specific biomarkers persists. To address this need, we developed a novel integrative procedure for the automated analysis of multidimensional flow cytometry data obtained from the peripheral blood of patients with chronic lymphocytic leukemia (CLL). State-of-the-art flow cytometry analysis is accomplished by manual sequential segmentation, or gating, of cell populations based on similarities in fluorescence and light-scatter characteristics, visualized in one- or two-dimensional plots. This approach has a number of limitations, including the subjective nature of the gating and the inability to fully utilize the high-dimensional data. Recent efforts have produced sophisticated computational methods that overcome many of these limitations; however, these methods have not been rigorously tested in a clinical context and have focused on the automated analysis of samples from individual patients, with substantially less effort directed towards the analysis of patient populations. The ultimate goal of our analysis is to develop computational approaches that enable the identification of subsets of patients with distinct immunological markers.
We developed a novel analysis framework that facilitates automated identification of both common cell types and patient population subgroups by post-processing the results of individual-sample analyses performed with the FLOCK program. FLOCK identifies clusters of putatively similar cells in an individual sample by multidimensional clustering of the fluorescence marker and light-scattering measurements. We developed a rigorous hierarchical clustering approach to identify common “cell signatures” across multiple patients. The cell signatures were then mapped back onto the individual patient samples and used in a second clustering step that identified patient subgroups based on similar abundances of specific cell types.
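The two-stage procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the input format (one centroid and one cell fraction per per-sample cluster, standing in for FLOCK output), the use of average linkage with Euclidean distance, the distance thresholds, and all function names and toy values are assumptions made purely for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, threshold):
    """Average-linkage agglomerative clustering (illustrative choice).

    Repeatedly merges the two closest groups until the closest pair is
    farther apart than `threshold`. Returns lists of point indices.
    """
    groups = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two groups.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > 1:
        pairs = combinations(range(len(groups)), 2)
        x, y = min(pairs, key=lambda p: linkage(groups[p[0]], groups[p[1]]))
        if linkage(groups[x], groups[y]) > threshold:
            break
        groups[x] = groups[x] + groups[y]  # x < y, so deleting y is safe
        del groups[y]
    return groups


def cell_signatures(sample_clusters, threshold):
    """Stage 1: pool per-sample cluster centroids and group them into
    cross-patient signatures, then map the signatures back onto each
    sample as an abundance vector (fraction of cells per signature)."""
    flat = [(sid, centroid, frac)
            for sid, clusters in sample_clusters.items()
            for centroid, frac in clusters]
    groups = agglomerate([centroid for _, centroid, _ in flat], threshold)
    abundance = {sid: [0.0] * len(groups) for sid in sample_clusters}
    for k, members in enumerate(groups):
        for i in members:
            sid, _, frac = flat[i]
            abundance[sid][k] += frac
    return groups, abundance


def patient_subgroups(abundance, threshold):
    """Stage 2: cluster patients by their signature abundance vectors."""
    ids = sorted(abundance)
    groups = agglomerate([abundance[s] for s in ids], threshold)
    return [sorted(ids[i] for i in g) for g in groups]


# Toy input (hypothetical): two markers, two clusters per sample, given as
# (centroid, fraction of the sample's cells).
sample_clusters = {
    "ctrl1": [((0.90, 0.10), 0.7), ((0.10, 0.90), 0.3)],
    "ctrl2": [((0.85, 0.15), 0.6), ((0.12, 0.88), 0.4)],
    "cll1": [((0.88, 0.12), 0.1), ((0.10, 0.92), 0.9)],
    "cll2": [((0.92, 0.08), 0.2), ((0.08, 0.90), 0.8)],
}

signatures, abundance = cell_signatures(sample_clusters, threshold=0.3)
subgroups = patient_subgroups(abundance, threshold=0.3)
print(len(signatures), subgroups)
```

In this toy input the pooled centroids fall into two signatures, and the second clustering separates the two samples dominated by the second signature from the other two. Real data would need marker normalization across samples and a principled choice of cut-off, neither of which is addressed here.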
We used our analytic framework to analyze multidimensional flow cytometry data (26 cell surface markers in 4 different antibody cocktails) from peripheral blood specimens of a heterogeneous group of 55 CLL patients and 13 healthy controls. Our analysis revealed distinct differences between controls and CLL patients. By analyzing the non-malignant peripheral blood cell types, we were furthermore able to differentiate between distinct clinical subpopulations of patients (e.g., to distinguish treatment-naïve patients from those who had previously undergone chemotherapy).
Using a novel integrative procedure to analyze complex flow cytometry data from the peripheral blood of CLL patients, we were able to identify distinct cell type distributions. We propose that this information is a marker for the overall health/disease status of the corresponding patient and could ultimately be used for diagnosis, prognosis, and selection of optimal treatment. Given the multiple novel treatment options for CLL patients, such a tool will be crucial for determining individual patient prognosis and defining an accurately matched treatment plan.
No relevant conflicts of interest to declare.