Abstract
Background: A comprehensive understanding of multiple myeloma (MM) in real-world settings is essential to improve patient care and guide clinical decisions. However, current real-world data on MM are often outdated, inconsistent, and lacking in detail, limiting insights into its epidemiology, management, and outcomes. Much of the relevant clinical information is buried in unstructured free text within electronic health records (EHRs), making analysis challenging. Natural language processing (NLP) offers a scalable solution to extract and structure this information, enabling more accurate insights into MM care in routine practice.
Aims and methods: This study aimed to generate real-world evidence on the epidemiology, management, and outcomes of MM in Spain, primarily using unstructured clinical information from EHRs.
We retrospectively analyzed EHR data (2015–2021) from 9 Spanish hospitals. Clinical information was extracted and structured using EHRead®, an NLP- and machine learning-powered tool that detects clinical terms, semantically maps them to standardized terminologies (SNOMED CT, ATC, LOINC), and captures their clinical context. Crude and age-adjusted prevalence and incidence rates were calculated by year, sex, and age group. In patients with newly diagnosed multiple myeloma (NDMM), we assessed transplant eligibility (TE) and first-line (1L) treatment patterns. Five-year overall survival (OS), progression-free survival (PFS), and overall response rate (ORR) were measured from diagnosis. In those initiating second-line (2L) treatment (patients on 1L for whom a subsequent treatment line was detected), we analyzed treatment regimens, along with ORR, OS, and PFS rates from 2L start. A two-year cutoff was used to reduce follow-up bias.
Results:
A total of 270,882,911 EHRs from 6,309,596 patients were processed, identifying 2451 MM patients. Among them, 766 (31.3%) were classified as NDMM. The overall crude prevalence (per 100,000 inhabitants) was 43.44 (95% CI: 38.18–48.69), 45.91 (95% CI: 40.71–51.11) in males, and 41.14 (95% CI: 35.82–46.47) in females. The crude incidence rates were 7.09 (95% CI: 4.91–9.27), 7.49 (95% CI: 5.14–9.84) in males, and 6.72 (95% CI: 4.61–8.84) in females. Both prevalence and incidence increased from the 55-59 to the 75-79 years age group, rising from 52.52 (95% CI: 52.52–66.11) and 11.41 (95% CI: 7.82–15.00) to 204.98 (95% CI: 166.07–243.88) and 27.93 (95% CI: 18.19–37.66), respectively.
Adjusted prevalence and incidence remained stable over the study period, ranging from 23.37 (95% CI: 15.95–30.80) to 27.84 (95% CI: 19.06–36.62) and from 2.52 (95% CI: 1.56–3.49) to 4.35 (95% CI: 2.99–5.71), respectively.
Among incident NDMM patients, 36.3% (mean [SD] age 60.3 [9.0]) were classified as TE while 63.7% (mean [SD] age 71.9 [11.8]) were non-TE. The most frequent 1L treatment regimens in both groups were based on proteasome inhibitors (PIs) (61.2% in TE and 50.6% in non-TE patients).
The 5-year PFS and OS rates were 43.6% (95% CI: 36.3%-52.4%) and 51.3% (95% CI: 46.4%-56.8%) in non-TE patients and 41.4% (95% CI: 32.9%-52.1%) and 57.4% (95% CI: 51.0%-64.7%) in TE patients, respectively.
Following 1L treatment, 36.3% of patients transitioned to 2L treatment (defined as the period starting from the detection of disease progression after 1L initiation or the identification of a 2L regimen, whichever occurred first within the study period). Notably, 81.1% of 2L patients were <75 years old, and daratumumab-based regimens were the most frequently used treatments (44.4%). The 2-year PFS and OS rates were 46.4% (95% CI: 37.5%-57.3%) and 64.1% (95% CI: 58.0%-70.9%), respectively.
Conclusions: This multicenter, NLP-based analysis of 270 million EHRs reveals a stable MM burden in Spain, with increasing prevalence and incidence from age 60 onward. Real-world treatment patterns in our study aligned with evolving clinical guidelines: PI-based regimens dominated 1L, while daratumumab-containing regimens prevailed in 2L. Nonetheless, real-world OS and PFS remained suboptimal. These findings underscore the need for more effective, personalized therapies and demonstrate the potential of large-scale NLP applied to EHRs to identify therapeutic gaps and guide MM care optimization.