Background: Recent advances in computer vision have given rise to image foundation models (FMs), large-scale neural networks pre-trained on millions of unlabeled histopathology patches, which can learn generalizable representations of tissue morphology. In many cancers, these FMs have predicted genetic mutations and prognoses directly from histology, suggesting that morphology encodes “hidden” molecular information. In lymphoma, this field is nascent but growing. General-purpose histopathology FMs have limited representation of lymphoid morphology. We developed and evaluated LymphoVision, a lymphoma-specialized image foundation model pretrained with self-supervised learning, to extract clinically meaningful features from routine hematoxylin and eosin (H&E) histology.

Methods: LymphoVision was pre-trained using knowledge distillation and consistency (DINOv2) on over 37 million multi-resolution patches (5x, 10x, 20x, 40x) sampled using a density-aware strategy from 31,211 H&E whole slide images (WSIs) from 9,155 archival lymphoma and reactive lymphoproliferative cases. To ensure diverse, representative samples, our patch sampling method uses an efficient graph-based clustering algorithm and prioritizes morphologically informative regions by clustering patch embeddings and performing density sampling within each cluster. The pretrained encoder was evaluated using a clustering-constrained attention multiple instance learning (MIL) framework on three diagnostic tasks: (1) cell-of-origin (COO) classification in diffuse large B-cell lymphoma (DLBCL), distinguishing germinal center B-cell (GCB) from non-GCB subtypes based on the Hans immunohistochemistry algorithm; (2) follicular lymphoma (FL) grading, differentiating low-grade (grades 1–2) from high-grade (grade 3); and (3) multiclass lymphoma subtype classification across six categories: DLBCL, FL, marginal zone lymphoma (MZL), mantle cell lymphoma (MCL), classical Hodgkin lymphoma (CHL), and benign/reactive lymphoid conditions. WSI-level labels were derived from pathology reports, and none of the evaluation cases were included in pretraining. Models used a ViT-Giant LymphoVision backbone to generate 20x patch embeddings, followed by MIL using 50% training data and 2-fold cross-validation (25% validation, 25% test). Performance was assessed using area under the receiver operating characteristic curve (AUC), accuracy, recall (sensitivity) and F1 score.
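
The aggregation step in attention-based MIL can be illustrated with a minimal sketch. It shows generic attention pooling (a softmax over per-patch scores, then a weighted sum of patch embeddings), not the authors' implementation: the clustering constraint is omitted, and the function name `attention_pool`, the parameters `V` and `w`, the toy dimensions, and the random inputs are all assumptions made for illustration.

```python
import math
import random


def attention_pool(patch_embeddings, V, w):
    """Pool patch embeddings into one slide-level embedding.

    Hypothetical helper: score_k = w . tanh(V h_k), weights = softmax(scores),
    slide embedding = sum_k weights_k * h_k.
    """
    scores = []
    for h in patch_embeddings:
        # Project the patch embedding into the attention hidden space.
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the patches of one slide.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the patch embeddings.
    dim = len(patch_embeddings[0])
    slide = [sum(a * h[j] for a, h in zip(weights, patch_embeddings))
             for j in range(dim)]
    return slide, weights


# Toy inputs (assumed sizes; real patch embeddings would be much larger).
rng = random.Random(0)
n_patches, dim, hidden_dim = 5, 4, 3
patches = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_patches)]
V = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden_dim)]
w = [rng.gauss(0, 1) for _ in range(hidden_dim)]

slide_embedding, attn = attention_pool(patches, V, w)
```

In a trained model, `V` and `w` would be learned jointly with a slide-level classifier, so that patches carrying diagnostic morphology receive larger attention weights.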

Results: For DLBCL COO classification (total n=358 cases), LymphoVision achieved a mean test AUC of 0.93 (range 0.92–0.94), accuracy of 84.5% and F1 of 0.85 across two independent test folds (n=91 per fold, 50% GCB). These results outperformed general-purpose FMs on the same task and cases, including Virchow-v2 (AUC 0.85) and UNI-v2 (AUC 0.88), despite those models being pretrained on 10–100 times more images. For FL grading (total n=660 cases), the test set in each fold included 166 cases (18% grade 3), yielding a mean test AUC of 0.88 (range 0.87–0.90), accuracy of 89.2% and F1 of 0.80. In multiclass lymphoma subtype classification (total n=1,176 cases), the test set included 296 cases per fold. LymphoVision achieved a mean test AUC of 0.98 (range 0.97–0.99), mean test accuracy of 87.1% and mean F1 of 0.84. Class-wise recall was 92.0% for DLBCL (n=112), 88.2% for FL (n=72), 77.2% for MZL (n=29), 70.8% for MCL (n=35), 87.5% for CHL (n=24), and 95.8% for benign/reactive lymphoid conditions (n=24).
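
As a rough consistency check on the multiclass figures, overall accuracy in a single-label task equals the support-weighted mean of class-wise recall. The sketch below applies that identity to the reported values, assuming the class counts are per-fold test-set sizes and the recalls are fold averages. The variable names are illustrative, and the unweighted (macro) recall it also computes is not a figure reported in the abstract.

```python
# Reported class-wise recall (as a fraction) and per-class test counts.
recall = {
    "DLBCL": 0.920,
    "FL": 0.882,
    "MZL": 0.772,
    "MCL": 0.708,
    "CHL": 0.875,
    "Benign/reactive": 0.958,
}
support = {
    "DLBCL": 112,
    "FL": 72,
    "MZL": 29,
    "MCL": 35,
    "CHL": 24,
    "Benign/reactive": 24,
}

total_cases = sum(support.values())  # 296, matching the stated fold size

# Support-weighted mean recall, which equals overall accuracy.
weighted_recall = sum(recall[c] * support[c] for c in recall) / total_cases

# Unweighted mean recall, which treats every class equally.
macro_recall = sum(recall.values()) / len(recall)

print(f"cases per fold:  {total_cases}")
print(f"weighted recall: {weighted_recall:.3f}")  # about 0.871
print(f"macro recall:    {macro_recall:.3f}")
```

The weighted value comes out at about 87.1%, in line with the reported mean test accuracy. The macro average is lower because the two lowest-recall classes (MZL and MCL) are also among the smaller ones.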

Conclusion: LymphoVision is a lymphoma-specialized FM trained using self-supervision that achieves high diagnostic accuracy across multiple clinically meaningful tasks using weak supervision and standard H&E images, suggesting the encoder has learned fine-grained lymphoma morphology. Based on historical benchmarks from the 1997 International Lymphoma Study Group classification project, expert hematopathologists achieved diagnostic agreement rates of 55% to 84% using H&E morphology alone for the non-Hodgkin lymphoma (NHL) entities included in our multiclass task. LymphoVision approached or exceeded these benchmarks. These findings underscore the value of our approach using disease-specialized pretraining and intelligent patch sampling for data-efficient histopathology learning. LymphoVision has the potential to augment diagnostics, enable scalable AI-powered classification in real-world workflows, and serve as a foundation for prognostic and theragnostic applications. External validation and multimodal integration are ongoing.
