We would like to thank the authors of the article “Sepsis mortality prediction with Machine Learning Techniques”1 for their valuable contribution to mortality prediction in septic patients using machine learning, with the aim of shedding new light on the heterogeneity of sepsis.
The authors aimed to evaluate machine learning models based on a local database and a public database (MIMIC-III).2 After their analysis, they found that lactate levels, urine output, and acid–base balance variables were the most relevant for predicting mortality.
The authors identified similar predictive variables across both databases, with strong results at the local level but mediocre outcomes when compared with the other dataset. They argue that this discrepancy is due to the differences in variables used in each model or the reduction in the number of variables, which is a mathematically sound explanation. However, we believe that a more in-depth analysis of the article's findings could provide greater insight, as medicine cannot always be fully explained through mathematical models.
1) COMPARED POPULATIONS: The data presented by the authors suggest that the two septic patient populations are markedly different. The local database shows an approximate mortality rate of 44%, while in MIMIC-III, the mortality rate is 16.25%. This difference in mortality likely reflects variations in patient populations, underlying pathologies, and healthcare systems. Therefore, we may be addressing different healthcare challenges under the same label. Mathematically, population heterogeneity affects the model's generalizability.
2) METRICS USED: The metric employed to evaluate the models is the area under the curve (AUC), which may be appropriate for the local population, where the event rate is close to 50%. However, AUC becomes a “less informative” metric in MIMIC-III, where the event of interest is far less frequent and the classes are therefore imbalanced.3
3) VARIABLE IMPORTANCE: Furthermore, the metrics used to quantify variable importance, namely the “Mean Decrease Accuracy (MDA)/Gini (MDG)” complex, are a “joint metric” that provides a more robust view of variable importance, as each captures different aspects4: MDA measures how the variable affects the overall predictive accuracy, while MDG measures how it impacts the quality of splits in the model. A closer examination of the values reveals that, although the highlighted variables may be similar, the degree to which they improve prediction or reduce impurity differs quantitatively, implying different classification capacities.
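The concern raised in point 2 can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a hypothetical risk score with identical discriminative ability in two synthetic cohorts whose event rates match the reported mortality figures (44% and 16.25%). It computes the AUC alongside average precision, a precision-recall summary that we introduce here only as a point of comparison. Under these assumptions the AUC should remain roughly constant across cohorts while average precision falls with the event rate, which is the sense in which AUC alone says little about performance on the rarer event.

```python
import random

def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.
    Assumes continuous scores, so ties are not handled specially."""
    negatives_seen = 0
    correctly_ranked = 0
    for _, label in sorted(zip(scores, labels)):
        if label == 1:
            correctly_ranked += negatives_seen
        else:
            negatives_seen += 1
    positives = len(labels) - negatives_seen
    return correctly_ranked / (positives * negatives_seen)

def average_precision(labels, scores):
    """Mean of the precision values obtained at the rank of each positive."""
    true_positives = 0
    total = 0.0
    ranked = sorted(zip(scores, labels), reverse=True)
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank
    return total / true_positives

def simulate(event_rate, n=4000, seed=0):
    """Synthetic cohort: the hypothetical score is N(1, 1) for deaths
    and N(0, 1) for survivors, whatever the event rate."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < event_rate else 0 for _ in range(n)]
    scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    return labels, scores

# Event rates taken from the letter; everything else is simulated.
results = {}
for name, rate in [("local", 0.44), ("MIMIC-III", 0.1625)]:
    y, s = simulate(rate)
    results[name] = (roc_auc(y, s), average_precision(y, s))
    print(f"{name}: AUC = {results[name][0]:.3f}, "
          f"average precision = {results[name][1]:.3f}")
```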
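The distinction drawn in point 3 between MDA and MDG can likewise be sketched in code. The example below is a deliberate simplification and not the authors' method: it grows a single depth-2 decision tree rather than a random forest, uses two hypothetical synthetic features, and measures the accuracy drop on the training data instead of on out-of-bag samples. It is meant only to show that the two quantities are computed differently (summed impurity reduction at the splits versus loss of accuracy after shuffling a variable) and can therefore differ in magnitude for the same variable.

```python
import random

def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Best (feature, threshold, impurity decrease) over all single splits."""
    parent, n = gini(labels), len(rows)
    best = (None, None, 0.0)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            decrease = parent - (len(left) * gini(left)
                                 + len(right) * gini(right)) / n
            if decrease > best[2]:
                best = (f, t, decrease)
    return best

def build(rows, labels, depth, n_total, mdg):
    """Grow a small tree, adding each split's weighted Gini decrease to mdg."""
    f, t, decrease = best_split(rows, labels) if depth > 0 else (None, None, 0.0)
    if f is None:
        return int(2 * sum(labels) >= len(labels))  # leaf: majority class
    mdg[f] += decrease * len(rows) / n_total
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left],
                  depth - 1, n_total, mdg),
            build([r for r, _ in right], [y for _, y in right],
                  depth - 1, n_total, mdg))

def predict(node, row):
    """Follow the splits until a leaf (an int class label) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

def accuracy_drop(tree, rows, labels, f, rng):
    """MDA-like score: accuracy lost when feature f is randomly shuffled."""
    column = [r[f] for r in rows]
    rng.shuffle(column)
    shuffled = [r[:f] + [v] + r[f + 1:] for r, v in zip(rows, column)]
    return accuracy(tree, rows, labels) - accuracy(tree, shuffled, labels)

# Hypothetical data: feature 0 is strongly related to the outcome,
# feature 1 only weakly. Neither corresponds to a variable in the article.
rng = random.Random(0)
labels = [1 if rng.random() < 0.4 else 0 for _ in range(300)]
rows = [[rng.gauss(2.0 * y, 1.0), rng.gauss(0.3 * y, 1.0)] for y in labels]

mdg = [0.0, 0.0]
tree = build(rows, labels, 2, len(rows), mdg)
mda = [accuracy_drop(tree, rows, labels, f, rng) for f in range(2)]
for f in range(2):
    print(f"feature {f}: MDG-like = {mdg[f]:.3f}, MDA-like = {mda[f]:.3f}")
```

Both scores should rank the strongly related feature first, yet they are expressed on different scales and respond to different properties of the model, so similar rankings across databases do not by themselves imply similar predictive contributions.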
In summary, we would like to add to the discussion that the variables used are not the only explanation for the models’ reduced accuracy: socio-health factors, event distributions, and the quantification of variable importance should also be considered.
Funding
None.
Conflicts of interest
The authors declare that none of them have any conflicts of interest.