ResPredAI

Antimicrobial Resistance Prediction via AI

ResPredAI is a machine learning pipeline for predicting antimicrobial resistance. It implements the methodology described in:

Bonazzetti, C., Rocchi, E., Toschi, A. et al. Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections. npj Digit. Med. 8, 319 (2025). https://doi.org/10.1038/s41746-025-01696-x

Features

  • Nested Cross-Validation: Rigorous evaluation with inner CV for hyperparameter tuning and outer CV for performance estimation

  • 8 ML Models: Support for Logistic Regression, Random Forest, XGBoost, CatBoost, MLP, TabPFN, and SVM variants

  • Threshold Optimization: Optional threshold tuning using Youden’s J statistic, F1, F2, or cost-sensitive objectives

  • Probability Calibration: Post-hoc calibration with sigmoid (Platt) or isotonic methods

  • Calibration Diagnostics: Brier Score, ECE, MCE metrics with reliability curves

  • Group-Aware CV: Prevent data leakage with stratified group k-fold

  • Feature Importance: Native importance extraction with SHAP fallback

  • Model Persistence: Save and resume training, cross-dataset validation
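
The threshold optimization and calibration diagnostics listed above can be sketched in a few lines. The helper names `youden_threshold` and `calibration_errors`, the equal-width binning with 10 bins, and the toy labels and probabilities are assumptions for illustration; ResPredAI may compute these quantities differently.

```python
import numpy as np
from sklearn.metrics import brier_score_loss, roc_curve


def youden_threshold(y_true, y_prob):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1 (= TPR - FPR)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    j = tpr - fpr
    # Skip the first ROC point, whose threshold is a sentinel above every score.
    best = int(np.argmax(j[1:])) + 1
    return float(thresholds[best])


def calibration_errors(y_true, y_prob, n_bins=10):
    """Expected (ECE) and maximum (MCE) calibration error over equal-width bins."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece, mce = 0.0, 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(y_true[mask].mean() - y_prob[mask].mean())
        ece += mask.mean() * gap  # gap weighted by the share of samples in the bin
        mce = max(mce, gap)
    return ece, mce


# Toy predictions: 5 negatives and 4 positives.
y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.2, 0.3, 0.4, 0.65, 0.6, 0.7, 0.8, 0.9])

threshold = youden_threshold(y_true, y_prob)
brier = brier_score_loss(y_true, y_prob)
ece, mce = calibration_errors(y_true, y_prob)
print(f"threshold={threshold:.2f} brier={brier:.3f} ece={ece:.3f} mce={mce:.3f}")
```

In a cross-validated setting, the threshold would normally be chosen on training or validation predictions only and then applied unchanged to the held-out fold.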

Output Structure

The pipeline generates:

  • HTML report: Comprehensive self-contained report with metrics, confusion matrices, and configuration summary

  • Confusion matrices: PNG files with heatmaps showing model performance

  • Detailed metrics: CSV files with precision, recall, F1, MCC, balanced accuracy, and AUROC, each reported with a 95% confidence interval

  • Trained models: Saved models for resumption and feature importance extraction

  • Feature importance: Plots and CSV files showing feature importance/coefficients
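
As a rough illustration of what the detailed metrics contain, the sketch below computes the same six metrics with scikit-learn and attaches a 95% interval to each. The percentile bootstrap used here is an assumption (the list above does not say how ResPredAI derives its intervals), and the helper `bootstrap_ci` and the synthetic predictions are hypothetical.

```python
import numpy as np
from sklearn.metrics import (
    balanced_accuracy_score,
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
    roc_auc_score,
)


def bootstrap_ci(metric, y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Point estimate plus a percentile bootstrap (1 - alpha) interval."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        if len(np.unique(y_true[idx])) < 2:
            continue  # AUROC and MCC need both classes present
        stats.append(metric(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(metric(y_true, y_score)), float(lo), float(hi)


# Synthetic predictions: positives tend to receive higher probabilities.
rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=200)
y_prob = 0.3 * y_true + rng.uniform(0.0, 0.7, size=200)
y_pred = (y_prob >= 0.5).astype(int)

# Threshold-based metrics use hard labels; AUROC uses the probabilities.
metrics = {
    "precision": (precision_score, y_pred),
    "recall": (recall_score, y_pred),
    "f1": (f1_score, y_pred),
    "mcc": (matthews_corrcoef, y_pred),
    "balanced_accuracy": (balanced_accuracy_score, y_pred),
    "auroc": (roc_auc_score, y_prob),
}

results = {
    name: bootstrap_ci(fn, y_true, scores, n_boot=200)
    for name, (fn, scores) in metrics.items()
}
for name, (point, lo, hi) in results.items():
    print(f"{name:>18}: {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```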

output_folder/
├── models/                                       # Trained models (if enabled)
│   └── {Model}_{Target}_models.joblib
├── metrics/                                      # Detailed metrics
│   ├── {target}/
│   │   ├── {model}_metrics_detailed.csv          # Metrics with 95% CI
│   │   └── summary.csv                           # Summary for this target
│   └── summary_all.csv                           # Global summary
├── confusion_matrices/                           # Confusion matrix heatmaps
│   └── Confusion_matrix_{model}_{target}.png
├── calibration/                                  # Calibration diagnostics
│   └── {target}/{model}_reliability_curve.png
├── feature_importance/                           # Feature importance (optional)
│   └── {target}/{model}_feature_importance.csv
├── report.html                                   # Comprehensive HTML report
├── reproducibility.json                          # Reproducibility manifest
└── respredai.log                                 # Execution log

Citation

If you use ResPredAI in your research, please cite:

@article{Bonazzetti2025,
  author = {Bonazzetti, Cecilia and Rocchi, Ettore and Toschi, Alice and others},
  title = {Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections},
  journal = {npj Digital Medicine},
  volume = {8},
  pages = {319},
  year = {2025},
  doi = {10.1038/s41746-025-01696-x}
}

Funding

This research was supported by EU funding within the NextGenerationEU-MUR PNRR Extended Partnership initiative on Emerging Infectious Diseases (Project no. PE00000007, INF-ACT).

License

This project is licensed under the MIT License.