# Changelog

All changes to ResPredAI are documented in this file.
## [1.9.1] - 2026-04-13

### Fixed

- Group leakage in `perform_training` conformal CV: now uses `StratifiedGroupKFold` when groups are available, matching `perform_pipeline` and `perform_temporal_validation`
## [1.9.0] - 2026-04-10

### Added

- FOR (False Omission Rate) metric: reported in all metrics CSVs and HTML reports
- CV+ Conformal Prediction (Mondrian):
  - Per-class prediction sets `{S}`, `{R}`, or `{S, R}` with distribution-free coverage guarantees
  - Computed per-fold inside nested CV
  - CV+ formal guarantee: 1 - 2α; config `[Uncertainty] alpha` replaces `margin`
  - `q_hat` per class saved in model bundles
  - HTML report with dedicated section
- Model category constants in `constants.py`:
  - `SHAP_FALLBACK_MODELS = ("MLP", "RBF_SVC", "KNN", "TabPFN")`
  - `NO_CLASS_WEIGHT_MODELS = ("MLP", "KNN", "TabPFN")`
- Subgroup evaluation documentation: clear guide on configuring the `[Metadata]` section for subgroup analysis
- Conformal Prediction section in HTML report with per-model coverage tables
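The per-class (Mondrian) prediction sets described above can be sketched as follows. This is a split-conformal-style illustration, not the package's actual API: the helper names, calibration scores, and α are made up, and the CV+ variant additionally aggregates scores across folds, which is what yields the 1 − 2α bound.

```python
import math

def conformal_qhat(scores, alpha):
    """Finite-sample (1 - alpha) quantile of nonconformity scores."""
    n = len(scores)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)
    return sorted(scores)[rank - 1]

def prediction_set(probs, qhats):
    """Keep every class whose nonconformity (1 - prob) is within its q_hat."""
    return {c for c, p in probs.items() if 1.0 - p <= qhats[c]}

# Per-class (Mondrian) calibration: nonconformity = 1 - p(true class),
# pooled separately for S and R so coverage holds within each class.
alpha = 0.2
qhats = {
    "S": conformal_qhat([0.1, 0.2, 0.3, 0.5, 0.6], alpha),
    "R": conformal_qhat([0.08, 0.12, 0.25, 0.40, 0.55], alpha),
}
confident = prediction_set({"S": 0.9, "R": 0.1}, qhats)  # singleton set
ambiguous = prediction_set({"S": 0.5, "R": 0.5}, qhats)  # both classes kept
```

A singleton set is an unambiguous call; a two-element `{S, R}` set flags a sample the model cannot separate at the requested coverage level.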
### Changed

- Feature importance functions (`has_feature_importance`, `get_feature_importance`, `compute_shap_importance`) now require a `model_name` parameter and dispatch on category constants instead of `hasattr` duck-typing

### Removed

- Legacy config fallback for `group_column` in `[Data]` and `temporal_split_column` in `[Validation]` - use the `[Metadata]` section (deprecated since v1.8.0)
- `calculate_uncertainty()` function - replaced by `compute_conformal_qhat()` + `conformal_prediction_sets()`
- `[Uncertainty] margin` config key - replaced by `[Uncertainty] alpha`
- `uncertainty_margin` field in model bundles - replaced by `conformal_q_hat` and `conformal_alpha`
## [1.8.0] - 2026-03-25

### Added

- Unified `[Metadata]` config section with metadata column definitions:
  - `group_column` (moved from `[Data]`)
  - `temporal_column` (moved from `[Validation]`, where it was `temporal_split_column`)
  - `subgroup_columns` (new, comma-separated) for subgroup performance analysis
- Subgroup Performance Evaluation: compute full metric set (AUROC, F1, MCC, Precision, Recall, ECE, MCE, Brier Score) per subgroup value
  - Multiple subgroup columns supported simultaneously
  - Sample size and class prevalence reported per subgroup
  - Integrated with both CV and temporal validation pipelines
  - Per-subgroup metrics saved as CSV in the `subgroup_analysis/` directory
  - "Subgroup Analysis" section added to HTML report with tables per subgroup column
  - Warns when subgroups have fewer than 10 samples
- Signed Feature Importance Direction: determine whether features are risk factors or protective
  - New config flag: `compute_feature_direction` (default: `false`)
  - Linear models (`coef_`): uses sign of coefficients directly (no extra computation)
  - Tree-based models (RF, XGB, CatBoost): uses `shap.TreeExplainer`
  - Other models: falls back to `shap.KernelExplainer` for signed SHAP values
  - `Direction` column added to feature importance CSV (`Risk (+)` / `Protective (-)`)
  - Feature importance plot color-coded by direction (firebrick = risk, seagreen = protective) with legend
  - CLI flag: `respredai feature-importance --direction`
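For linear models the direction logic reduces to the sign of the coefficient; a minimal sketch (the feature names and values are made up, and for non-linear models the mean signed SHAP value would play the same role as `coef_`):

```python
def feature_directions(signed_importance):
    """Map signed importance values to the Direction labels used in the CSV."""
    return {
        name: "Risk (+)" if value > 0 else "Protective (-)"
        for name, value in signed_importance.items()
    }

# For a linear model, signed importance is simply coef_ per feature.
coefs = {
    "prior_carbapenem_use": 1.3,       # positive coefficient -> risk factor
    "icu_stay_days": 0.4,
    "recent_susceptible_culture": -0.6, # negative coefficient -> protective
}
directions = feature_directions(coefs)
```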
## [1.7.1] - 2026-03-23

### Added

- BCa Bootstrap Confidence Intervals: replaced percentile bootstrap with the bias-corrected and accelerated (BCa) method via `scipy.stats.bootstrap` for improved coverage on small samples and bounded/skewed metrics
- Nadeau-Bengio Corrected Standard Error: new `SE` column in metrics CSV using the corrected variance formula `(1/k + n_test/n_train) * s**2`, which accounts for training-set overlap in k-fold CV. The summary report's `±` notation now uses SE instead of raw Std
- Configurable bootstrap CI parameters: `confidence_level` and `n_bootstrap` in the `[Pipeline]` config section
- Constants module (`respredai/core/constants.py`): centralized validation lists, directory names, and defaults
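The corrected variance formula can be read as inflating the naive variance of the fold-score mean by `n_test/n_train`, because k-fold training sets overlap and fold scores are not independent. A minimal sketch with made-up fold scores (not tied to the package's internals):

```python
import math

def nadeau_bengio_se(fold_scores, n_train, n_test):
    """Nadeau-Bengio corrected SE: sqrt((1/k + n_test/n_train) * s**2),
    where s**2 is the ddof=1 sample variance of the k fold scores."""
    k = len(fold_scores)
    mean = sum(fold_scores) / k
    s2 = sum((x - mean) ** 2 for x in fold_scores) / (k - 1)
    return math.sqrt((1.0 / k + n_test / n_train) * s2)

fold_scores = [0.78, 0.82, 0.75, 0.80, 0.85]  # e.g. per-fold AUROC, k = 5
corrected_se = nadeau_bengio_se(fold_scores, n_train=400, n_test=100)
# The naive SE of the mean uses only s**2 / k; the correction adds
# n_test/n_train * s**2, so the corrected SE is always larger.
```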
### Fixed

- Name sanitization unified across 8 files - scattered `.replace(" ", "_")` and `re.sub()` patterns consolidated into `sanitize_name()` / `sanitize_metric_name()` (44 replacements)
- `assert` in temporal split replaced with a proper `ValueError`
- Empty CV fold validation: warns when train/test sets are empty after splitting
- Explicit warning when all bootstrap samples fail (previously returned NaN silently)

### Changed

- `ConfigHandler` split into 7 domain-specific dataclasses
- Config validation lists now reference centralized constants from `constants.py`
- README quick-start config example now shows the `[Validation]` section (added in v1.7.0)
- HTML report confidence-intervals row now dynamically reflects the configured `confidence_level` and `n_bootstrap`
## [1.7.0] - 2026-03-18

### Added

- Temporal (Prospective-Style) Validation:
  - `validation_strategy` config option: `cv` (default), `temporal`, or `both`
  - `temporal_split_column`, `temporal_split_date`, and `temporal_split_ratio` config options
  - Group-aware temporal splitting to prevent data leakage
  - `--validation-strategy` CLI override flag for the `run` command
  - Temporal validation results section in HTML report
- Per-fold One-Hot Encoding: OHE is now fitted inside each CV fold (train-only) instead of on the full dataset, preventing category leakage
- NaN-safe scaling for KNN imputation: pre-scales features with NaN-tolerant statistics before distance-based imputation
- `scale_pos_weight` parameter in the XGBoost hyperparameter grid
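Group-aware temporal splitting means all samples from one group (e.g. one patient) must land on the same side of the date boundary. A minimal sketch assuming one possible policy — assigning each group by its earliest sample date; the column names, policy, and data are illustrative, not ResPredAI's actual implementation:

```python
from datetime import date

def group_temporal_split(rows, group_key, date_key, split_date):
    """Assign each group to train or test by its earliest date, so no
    group straddles the temporal boundary."""
    earliest = {}
    for r in rows:
        g = r[group_key]
        if g not in earliest or r[date_key] < earliest[g]:
            earliest[g] = r[date_key]
    train = [r for r in rows if earliest[r[group_key]] < split_date]
    test = [r for r in rows if earliest[r[group_key]] >= split_date]
    return train, test

rows = [
    {"patient": "A", "drawn": date(2024, 1, 5)},
    {"patient": "A", "drawn": date(2024, 7, 1)},   # same patient, later sample
    {"patient": "B", "drawn": date(2024, 6, 20)},
]
train, test = group_temporal_split(rows, "patient", "drawn", date(2024, 6, 1))
# Patient A's later sample stays in train with the rest of the group,
# preventing the same patient from appearing on both sides of the split.
```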
### Fixed

- SVC models now set `probability=False` when external calibration is enabled, avoiding double Platt scaling
- `TunedThresholdClassifierCV` (CV method) now uses `StratifiedGroupKFold` and passes group labels when grouped CV is configured
- Repeated CV deduplication now uses the threshold-aware decision boundary instead of a fixed 0.5 cutoff
- All `nanstd` calls now use `ddof=1` for unbiased sample standard deviation
- Reliability curve binning replaced with a self-contained implementation to ensure `bin_counts` alignment
- Logger initialization deferred until after CLI overrides so the log file uses the correct output folder
- Feature importance name resolution improved with multi-source fallback; features missing across folds default to zero
- `evaluate` command now reuses the saved OHE transformer from training instead of ad-hoc `pd.get_dummies`

### Changed

- `validate-config` summary table now displays validation strategy and temporal split parameters
- `create-config` template now includes a commented `[Validation]` section
- `uncertainty_margin` now stored in training metadata and model bundles
## [1.6.2] - 2026-03-05

### Fixed

- Bootstrap confidence intervals now deduplicate samples when using repeated outer CV
- Threshold optimization (CV method) now correctly uses the calibrated estimator when probability calibration is enabled
- Metrics aggregation now respects repeat structure
- Reliability curve fold labels now indicate repeat number when using repeated CV
- Reliability curves now use quantile binning for smoother calibration plots on imbalanced data

### Added

- Makefile for development workflows
## [1.6.1] - 2026-02-06

### Fixed

- `train` subcommand now applies probability calibration (`CalibratedClassifierCV`) when `calibrate_probabilities = true`
- `train` subcommand now supports the CV threshold method (`TunedThresholdClassifierCV`) in addition to OOF
- Reproducibility manifest now includes probability calibration parameters

### Changed

- `create-config` template now includes `threshold_objective`, `vme_cost`, `me_cost` parameters
- `validate-config` summary table now displays probability calibration and threshold objective details
## [1.6.0] - 2026-02-05

### Added

- Probability Calibration:
  - Optional post-hoc probability calibration on the best estimator per outer CV fold
  - Supports `sigmoid` (Platt scaling) and `isotonic` calibration methods
  - Applied after hyperparameter tuning and before threshold tuning
- Calibration Diagnostics:
  - Brier Score: mean squared error of probability predictions (lower is better)
  - ECE (Expected Calibration Error): weighted average of calibration error across bins
  - MCE (Maximum Calibration Error): maximum calibration error across any bin
  - Reliability curves (calibration plots) per outer CV fold and aggregate
- Repeated Stratified Cross-Validation:
  - `outer_cv_repeats` config option (default: `1`)
  - Set `> 1` for repeated CV with different shuffles for more robust performance estimates
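All three diagnostics can be computed from out-of-fold probabilities alone. A minimal sketch using equal-width bins (ResPredAI's own binning may differ — e.g. later versions use quantile binning for reliability plots — and the data here is illustrative):

```python
def calibration_metrics(probs, labels, n_bins=5):
    """Brier score, ECE, and MCE from positive-class probabilities,
    using equal-width probability bins."""
    n = len(probs)
    brier = sum((p - y) ** 2 for p, y in zip(probs, labels)) / n
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[idx].append((p, y))
    ece, mce = 0.0, 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)  # mean predicted probability
        acc = sum(y for _, y in b) / len(b)   # empirical positive frequency
        gap = abs(conf - acc)
        ece += (len(b) / n) * gap             # bin-size-weighted average gap
        mce = max(mce, gap)                   # worst single-bin gap
    return brier, ece, mce

brier, ece, mce = calibration_metrics(
    probs=[0.1, 0.2, 0.8, 0.9], labels=[0, 0, 1, 1]
)
```

A perfectly calibrated model would have ECE and MCE near zero; the Brier score additionally penalizes low discrimination, not just miscalibration.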
### Changed

- `metric_dict()` now includes Brier Score, ECE, and MCE by default
- HTML report includes a new calibration diagnostics section
- Output folder now includes a `calibration/` directory with reliability curve images
## [1.5.1] - 2026-01-29

### Added

- OneHotEncoder `min_frequency` parameter to reduce noise from rare categorical values

### Changed

- Updated `requirements.txt` with explicit version constraints for all dependencies
- `scikit-learn>=1.5.0` required for `TunedThresholdClassifierCV`
## [1.5.0] - 2026-01-20

### Added

- VME/ME report:
  - VME (Very Major Error): predicted susceptible when actually resistant
  - ME (Major Error): predicted resistant when actually susceptible
- Flexible threshold objectives:
  - `threshold_objective` config option: `youden` (default), `f1`, `f2`, `cost_sensitive`
  - Cost-sensitive optimization with configurable `vme_cost` and `me_cost` weights
- Per-prediction uncertainty quantification to flag uncertain predictions near the decision threshold:
  - `uncertainty_margin` config option (default: `0.1`) defines a margin around the threshold
  - Predictions within the margin are flagged as uncertain in evaluation output
  - Uncertainty scores (0-1) provided for each prediction
- Reproducibility manifest (`reproducibility.json`) generated with `run` and `train` commands, including environment info, data fingerprint, and full configuration settings
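The `cost_sensitive` objective trades the two error types against each other: because a very major error (missed resistance) is clinically worse, it gets a higher weight, which pushes the chosen threshold down. A minimal sketch over a 0.01-step grid — the grid, cost values, and data are illustrative, with label 1 taken to mean resistant:

```python
def cost_sensitive_threshold(probs, labels, vme_cost=5.0, me_cost=1.0):
    """Pick the threshold minimising vme_cost * VME + me_cost * ME, where
    VME = resistant (1) called susceptible and ME = susceptible (0)
    called resistant."""
    best_t, best_cost = 0.5, float("inf")
    for t in (i / 100 for i in range(1, 100)):
        vme = sum(1 for p, y in zip(probs, labels) if y == 1 and p < t)
        me = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= t)
        cost = vme_cost * vme + me_cost * me
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

probs = [0.15, 0.5, 0.45, 0.65, 0.9]
labels = [0, 0, 1, 1, 1]
t = cost_sensitive_threshold(probs, labels)
# With VME weighted 5x, the chosen threshold stays low enough that no
# resistant isolate is called susceptible, accepting one major error.
```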
### Changed

- HTML report framework summary now displays the threshold objective and cost weights (when applicable)
- Evaluation output now includes `uncertainty` and `is_uncertain` columns
## [1.4.1] - 2026-01-15

### Changed

- Migrated documentation from MkDocs to Sphinx
- Documentation dependencies now loaded dynamically from `docs-requirements.txt`
- Development dependencies now loaded dynamically from `dev-requirements.txt`
## [1.4.0] - 2026-01-14

### Added

- K-Nearest Neighbors (KNN) classifier support
- Missing data imputation with configurable methods:
  - `SimpleImputer` (`mean`, `median`, `most_frequent` strategies)
  - `KNNImputer` for k-nearest neighbors imputation
  - `IterativeImputer` with `BayesianRidge` or `RandomForest` estimator
- Comprehensive HTML report generation with run metadata and framework summary tables, results tables with 95% confidence intervals, and confusion matrices
- Ruff linter integration in CI workflow for code quality

### Changed

- Bootstrap confidence intervals now use sample-level predictions instead of fold-level metrics for more reliable statistical inference
- Updated CI workflow to include lint checks before tests
- Added Python 3.13 to CI test matrix
## [1.3.1] - 2026-01-08

### Changed

- Reorganized package structure into sub-packages for clarity:
  - `respredai/core/` - pipeline, metrics, models, and ML utilities
  - `respredai/io/` - configuration and data handling
  - `respredai/visualization/` - plotting and visualization

### Documentation

- Created `docs/` structure with MkDocs
## [1.3.0] - 2025-12-12

### Added

- `train` command for model training on the entire dataset (cross-dataset validation):
  - Uses GridSearchCV for hyperparameter tuning (inner CV only)
  - Saves one model file per model-target combination
  - Exports `training_metadata.json` for evaluation compatibility
- `evaluate` command to apply trained models to new data:
  - Validates new data columns against training metadata
  - Outputs per-sample predictions with probabilities
  - Calculates metrics against ground truth
- Automatic summary report after the `run` command:
  - Generates `summary.csv` per target and `summary_all.csv` globally
  - Aggregates Mean±Std for all metrics across models
- SHAP-based feature importance as fallback for models without native importance:
  - Supports MLP, RBF_SVC, and TabPFN via KernelExplainer
  - Computes mean absolute SHAP values across CV test folds
  - Output files have a `_shap` suffix when SHAP is used
  - `--seed` flag for reproducible SHAP computations

### Documentation

- Added `docs/cli-reference/train-command.rst`
- Added `docs/cli-reference/evaluate-command.rst`
- Updated `docs/cli-reference/feature-importance-command.rst` with SHAP fallback details
## [1.2.0] - 2025-12-10

### Added

- `validate-config` command to validate configuration files without running the pipeline
  - Optional `--check-data` flag to also verify data file existence and column validity
- CLI override options for the `run` command: `--models`, `--targets`, `--output`, `--seed`
- CONTRIBUTING.md with development setup guide and contribution workflow

### Changed

- Bootstrap confidence intervals (10,000 resamples) replace t-distribution CIs in metrics output
- User-friendly error messages for missing config files or data paths

### Documentation

- Added `docs/cli-reference/validate-config-command.rst`
- Updated `docs/cli-reference/run-command.rst` with CLI overrides section
## [1.1.0] - 2025-12-04

### Added

- Threshold optimization with dual methods (OOF and CV) using Youden's J statistic:
  - OOF method: global optimization on concatenated out-of-fold predictions
  - CV method: per-fold optimization with threshold averaging
  - Auto selection based on dataset size (n < 1000: OOF; otherwise: CV)
- Grouped cross-validation (`StratifiedGroupKFold`) to prevent data leakage in clinical datasets
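The OOF variant can be sketched directly from the definition J = sensitivity + specificity − 1, scanning candidate thresholds over pooled out-of-fold probabilities. The data and the scan-over-observed-probabilities strategy here are illustrative, not the package's exact implementation:

```python
def youden_threshold(probs, labels):
    """Threshold maximising Youden's J = sensitivity + specificity - 1,
    scanned over the observed out-of-fold probabilities."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        j = tp / pos + tn / neg - 1.0  # sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

probs = [0.1, 0.3, 0.35, 0.6, 0.8, 0.95]
labels = [0, 0, 1, 0, 1, 1]
t, j = youden_threshold(probs, labels)
```

Unlike a fixed 0.5 cutoff, the J-maximising threshold adapts to class imbalance and miscalibrated probabilities.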
### Changed

- Expanded hyperparameter grids for XGBoost, Random Forest, CatBoost, and MLP
- Enhanced CLI information display

### Fixed

- XGBoost feature naming issue with special characters
- Color scheme in feature importance plots

### Documentation

- Added comprehensive command documentation (`docs/cli-reference/run-command.rst`, `docs/cli-reference/create-config-command.rst`, `docs/cli-reference/feature-importance-command.rst`)
- Updated README with logo, quick start guide, and output structure
- Added CHANGELOG.md
## [1.0.0] - Initial Release

### Core Features

- Nested cross-validation framework (outer: evaluation, inner: hyperparameter tuning)
- Eight machine learning models: LR, Linear SVC, RBF SVC, MLP, RF, XGBoost, CatBoost, TabPFN
- Comprehensive metrics: Precision, Recall, F1, MCC, Balanced Accuracy, AUROC
- Data preprocessing: StandardScaler, one-hot encoding, multi-target support
- INI-based configuration system
- Structured output: CSV metrics, confusion matrix plots, logs
- Feature importance extraction command with visualization and CSV export
## Citation

If you use ResPredAI in your research, please cite:

Bonazzetti, C., Rocchi, E., Toschi, A. et al. Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections. npj Digit. Med. 8, 319 (2025). https://doi.org/10.1038/s41746-025-01696-x

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Funding

This research was supported by EU funding within the NextGenerationEU-MUR PNRR Extended Partnership initiative on Emerging Infectious Diseases (Project no. PE00000007, INF-ACT).