Abstract
Introduction
Despite notable advances in deep learning, the clinical integration of artificial intelligence (AI) in hematopathology remains limited by a lack of interpretability. Many AI models are complex and opaque, making it difficult for physicians to trust or use them in routine practice. Physicians are trained to evaluate cell morphology within the context of a full peripheral smear, not isolated images. Models that replicate this diagnostic approach while offering visual insight into the decision-making process are far more likely to gain clinicians' trust. To bridge this gap, we introduce a model that performs cell classification and explains its predictions using field-level visual cues. This design aims to bring AI decision-making closer to the way human experts reason.
Methods
We developed a convolutional neural network based on the MobileNetV2 backbone to classify full-field, unsegmented images of four major leukocyte classes: eosinophils, lymphocytes, monocytes, and neutrophils. The model was trained on peripheral smear images curated to reflect real-world heterogeneity, including overlapping cells and staining artifacts. Diagnostic performance was evaluated on 1,989 test images. To support visual validation of the predictions, we implemented three complementary explainability methods. Grad-CAM was used to localize attention on key morphological features, such as nuclear lobulation and granule distribution. LIME helped identify diagnostic subregions that influenced classification. Occlusion Sensitivity was applied to assess prediction stability when critical regions were masked, mimicking diagnostic blind spots in microscopy.
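To illustrate the third of these methods, the sketch below shows the core loop of occlusion sensitivity: a patch is slid over the image, and the drop in the target-class score marks regions the model depends on. This is a minimal, framework-agnostic illustration, not the study's implementation; `toy_predict`, the patch size, and the fill value are illustrative assumptions.

```python
import numpy as np

def occlusion_sensitivity(image, predict_fn, target_class, patch=8, stride=8, fill=0.0):
    """Slide a blank patch over the image and record the drop in the
    target-class score at each position; large drops flag regions the
    model relies on for its prediction."""
    h, w = image.shape[:2]
    base = predict_fn(image)[target_class]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heatmap = np.zeros((rows, cols))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = fill  # mask this region
            heatmap[i, j] = base - predict_fn(occluded)[target_class]
    return heatmap

# Hypothetical stand-in for a classifier: scores class 0 by the mean
# intensity of the top-left quadrant only.
def toy_predict(img):
    s = img[:16, :16].mean()
    return np.array([s, 1.0 - s])

img = np.ones((32, 32))
hm = occlusion_sensitivity(img, toy_predict, target_class=0, patch=16, stride=16)
# Only occluding the top-left quadrant erases the class-0 evidence,
# so that cell of the heatmap shows the full score drop.
```

In the actual pipeline, `predict_fn` would wrap the trained MobileNetV2 model's softmax output, and the heatmap would be upsampled and overlaid on the smear image for review.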
Results
The model achieved a mean accuracy of 97%, with precision and recall exceeding 92% for all classes. Lymphocytes and monocytes reached precision/recall values above 98%, whereas neutrophils showed a slightly lower F1-score (94%), likely due to their morphological diversity in reactive conditions. All three explainability methods consistently focused on clinically relevant regions, such as dense chromatin, cytoplasmic granules, and nuclear architecture, the same features clinicians weigh during morphological diagnosis in clinical hematology. Importantly, the model avoided non-informative background elements (e.g., smudge cells and erythrocyte clusters), which strengthens its credibility as a diagnostic support tool. From a clinical perspective, the explainability overlays serve as a visual audit trail. Physicians reviewing the AI's predictions can quickly assess whether the rationale aligns with established morphological cues. For instance, LIME highlighted atypical mononuclear regions in cases flagged as monocytes, whereas occlusion mapping flagged potential misclassifications when key nuclear segments were hidden, mirroring the diagnostic hesitation that may occur during manual review.
Conclusion
This study presents a high-performance, visually explainable framework for white cell classification that mirrors the human visual logic of peripheral smear analysis. By revealing where and how predictions are made, the framework allows clinicians to confidently interpret, validate, or override AI decisions. Such transparency is not optional; it is essential for the safe use of AI in practice. Future work will focus on extending this framework to pathological forms such as myelodysplastic cells and blast populations, paving the way for responsible AI with the goal of supporting early recognition of hematologic malignancies.