PREDICTIVE MODELING OF DIABETES, BREAST CANCER, CIRRHOSIS, AND THYROID DISORDERS IN MEXICAN WOMEN: A METHODOLOGICALLY RIGOROUS MACHINE LEARNING APPROACH WITH INTERSECTIONAL FAIRNESS EVALUATION
DOI:
https://doi.org/10.4238/0r6rws32
Abstract
Background. Diabetes, breast cancer, cirrhosis, and thyroid disorders impose a disproportionate burden on Mexican women, with indigenous women at elevated risk of diabetes [1]. Published machine-learning models often report near-perfect discrimination that is implausible for real-world data, owing to leakage-prone resampling and the absence of uncertainty reporting [2,3].
Objective. We test whether Random Forest and a Deep Neural Network can deliver reliable early identification of these four conditions under TRIPOD+AI and PROBAST standards, and whether performance holds for indigenous women under FAIR-MED [4,5].
Methods. We analyze ENSANUT 2022 (N = 115,307; indigenous n = 9,275; 38 variables). A Random Forest is fit for diabetes, cirrhosis, and thyroid disorders, and a five-layer Deep Neural Network for breast cancer. Imputation, normalization, and SMOTE-ENN are confined to the training folds of a stratified 5-fold cross-validation. We report G4, AUC-ROC, F1, Brier score, and calibration slope and intercept, each with bootstrap 95% CIs [6], and compare SHAP feature importance.
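The fold-confined preprocessing described in the Methods can be illustrated with a minimal sketch. It assumes scikit-learn and synthetic stand-in data (not ENSANUT), and the function name fold_confined_cv and all hyperparameters are hypothetical, not the authors' settings. The SMOTE-ENN step is omitted here; it would need a resampling-aware pipeline (for example the one provided by imbalanced-learn) so that resampling happens only when fitting on training folds.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def fold_confined_cv(X, y, n_splits=5, seed=0):
    """Return one AUC per fold; all preprocessing is refit on each training fold."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    aucs = []
    for train_idx, test_idx in cv.split(X, y):
        # A fresh pipeline per fold: imputer medians and scaler statistics
        # are learned from the training rows only, so no test-fold
        # information leaks into the fitted model.
        model = Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ])
        model.fit(X[train_idx], y[train_idx])
        prob = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], prob))
    return aucs


# Synthetic stand-in data (assumption): 400 rows, 6 features, about 5% missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=400) > 1.0).astype(int)
X[rng.random(X.shape) < 0.05] = np.nan  # inject missing values after labeling

fold_aucs = fold_confined_cv(X, y)
print([round(a, 3) for a in fold_aucs])
```

Fitting the imputer and scaler on the full dataset before splitting would let test-fold statistics shape the training data, which is one of the leakage routes the Background refers to.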
Results. Discrimination is high but non-perfect (G4 0.87–0.93; AUC 0.91–0.95). Calibration is good (Brier 0.058–0.092; slopes 0.89–0.97); learning curves converge within five percent. Indigenous women show lower performance (FAIR-MED 0.17–0.23; G4 deficits of 3–4 points), with dominant predictors shifting from clinical to sociostructural variables.
Conclusions. Both architectures can support early identification of these conditions when methodological discipline replaces inflated metrics with calibrated estimates. Equitable deployment for indigenous women requires subgroup-aware modeling and external validation.
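As an illustration of the uncertainty reporting named in the Methods, the sketch below computes a Brier score with a bootstrap 95% interval. It is a minimal example under stated assumptions: the data are synthetic, the helper names brier_score and bootstrap_ci are hypothetical, and the percentile bootstrap is assumed because the abstract does not say which bootstrap variant was used.

```python
import numpy as np


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def bootstrap_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric (assumed variant)."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample rows with replacement, keeping outcome and prediction paired.
        idx = rng.integers(0, n, size=n)
        stats[b] = metric(y_true[idx], y_prob[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Synthetic predictions (assumption): outcomes are drawn from the predicted
# probabilities, so the simulated model is well calibrated by construction.
rng = np.random.default_rng(1)
p = rng.uniform(0.05, 0.95, size=500)
y = (rng.random(500) < p).astype(int)

point = brier_score(y, p)
lo, hi = bootstrap_ci(y, p, brier_score)
print(f"Brier = {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

The same bootstrap_ci helper accepts any metric that takes outcomes and predicted probabilities, so it could wrap an AUC or F1 function in the same way.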
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

