In the article by H. Lozano Gómez et al. entitled “Design of a new mortality indicator in acute coronary syndrome on admission to the Intensive Care Unit”,1 the authors aim to develop an algorithm to predict mortality in a highly relevant condition, acute coronary syndrome. First, I would like to acknowledge the value of their work, given the inherent complexity of the database they worked with and the innovative methodology employed.
However, I would like to emphasize an aspect that I believe has not been adequately addressed in the study limitations section. The database is highly complex because it is imbalanced: the event of interest (mortality) has a very low prevalence (<5%). This gives rise to several problems that I discuss below.
The evaluation metrics used in modeling are important because they guide the selection of the best possible model. If that guidance is inaccurate, the selected model will not deliver the results we are looking for. The metric used in the study was the area under the receiver operating characteristic (ROC) curve, which tends to be particularly optimistic in imbalanced datasets.2 This may explain why, in the validation dataset, a sensitivity of 12% and a positive predictive value of 48% were reported, values that are likely far from what the algorithm was intended to achieve. Admittedly, it is often difficult to adopt an alternative metric, because retaining the ROC curve, even if suboptimal, preserves comparability with other studies.
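To illustrate this point, the following minimal sketch (Python with scikit-learn, on a synthetic dataset with roughly 5% positives; it is not the authors’ data or code) contrasts the ROC area with the average precision of the precision-recall curve, which is usually a less forgiving summary when the event of interest is rare.

```python
# Minimal sketch (illustrative only, not the authors' pipeline): on synthetic
# data with ~5% positives, the ROC AUC can look reassuring while the
# precision-recall view exposes weaker performance on the minority class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, average_precision_score

# ~5% prevalence of the event of interest, mirroring the mortality rate
X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

print("ROC AUC:          ", round(roc_auc_score(y_te, scores), 3))
print("Average precision:", round(average_precision_score(y_te, scores), 3))
# The ROC AUC typically comes out substantially higher than the average
# precision, illustrating how the ROC view can flatter a model on imbalanced data.
```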
The problems posed by the imbalance and by the metric may have been compounded by the algorithm chosen (multilayer perceptron).3 Such algorithms optimize an overall objective, and that internal optimization is driven largely by the dominant class (in this problem, “alive”). Technically, the dominant class contributes most of the gradient computed during back-propagation, so gradient descent is pulled toward it. As a result, in imbalanced databases the algorithm tends to optimize correct classification of the dominant class (“alive”) at the expense of the true event of interest (death).
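As a rough numerical illustration of this point (an assumed toy calculation, not the authors’ model), consider the gradient of an unweighted binary cross-entropy when only about 5% of the labels correspond to death: almost all of the gradient magnitude comes from the “alive” class.

```python
# Illustrative sketch: with ~5% positives, the majority-class terms dominate
# the gradient of an unweighted binary cross-entropy, so back-propagation
# pulls the network toward predicting "alive" for everyone.
import numpy as np

rng = np.random.default_rng(0)
n = 10000
y = (rng.random(n) < 0.05).astype(float)   # ~5% deaths (class 1)
p = np.full(n, 0.5)                        # an untrained model's predicted probability

# Gradient of the mean binary cross-entropy w.r.t. the logit is (p - y) / n
grad = (p - y) / n
share = np.abs(grad[y == 0]).sum() / np.abs(grad).sum()
print("majority-class share of gradient magnitude:", round(share, 3))  # ~0.95
```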
To address these problems, different methods have been proposed that act on both the database and the algorithms themselves:
• Regarding the dataset: resampling techniques have been suggested to better balance the sample, whether by increasing the number of events of interest (deaths), decreasing the number of non-events (alive), or both.
• Regarding the algorithm: using boosting-based algorithms, adopting algorithms with cost-sensitive loss functions, or applying threshold-adjustment methods (which were used in the article but apparently did not achieve the desired effect); a simplified sketch of these remedies is given after this list.
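The sketch below shows, in very simplified form and on assumed synthetic data (not the study’s database), how both families of remedies can be applied: random oversampling of deaths, a cost-weighted loss, and a lowered decision threshold.

```python
# Minimal illustrative sketch of the remedies listed above, on a synthetic
# ~5%-event dataset (assumed for illustration; not the authors' pipeline).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, precision_score

X, y = make_classification(n_samples=20000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# 1) Resampling: randomly oversample the minority class (deaths) until the
#    training set is balanced, then fit any classifier on the balanced data.
rng = np.random.default_rng(0)
idx_min = np.where(y_tr == 1)[0]
extra = rng.choice(idx_min, size=len(y_tr) - 2 * len(idx_min), replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])
clf_resampled = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
print("resampled model sensitivity:",
      round(recall_score(y_te, clf_resampled.predict(X_te)), 2))

# 2) Cost-sensitive learning: weight errors on the minority class more heavily.
clf_weighted = LogisticRegression(max_iter=1000,
                                  class_weight="balanced").fit(X_tr, y_tr)

# 3) Threshold adjustment: lower the decision cut-off to trade positive
#    predictive value for sensitivity instead of using the default 0.5.
probs = clf_weighted.predict_proba(X_te)[:, 1]
for thr in (0.5, 0.3, 0.1):
    pred = (probs >= thr).astype(int)
    print(f"threshold {thr}: sensitivity={recall_score(y_te, pred):.2f}, "
          f"PPV={precision_score(y_te, pred):.2f}")
```

None of these options is a cure-all; each trades one source of error for another, which is precisely why the preprocessing and modeling choices deserve explicit discussion.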
In conclusion, extreme imbalance poses complex statistical problems that are difficult to solve. The commonly used ROC curve can yield “misleading” results, and in this context careful consideration is required regarding data preprocessing and the choice of algorithm.
Funding
None whatsoever.

Conflicts of interest
None reported.