Run Command#

The run command executes the full machine learning pipeline with nested cross-validation for antimicrobial resistance prediction.

Usage#

respredai run --config <path_to_config.ini> [options]

Options#

Required#

  • --config, -c - Path to the configuration file (INI format)

Optional#

  • --quiet, -q - Suppress banner and progress output

    • Does not suppress error messages or logs

CLI Overrides#

Override configuration file parameters without editing the file:

  • --models, -m - Override models (comma-separated)

    • Example: --models LR,RF,XGB

  • --targets, -t - Override targets (comma-separated)

    • Example: --targets Target1,Target2

  • --output, -o - Override output folder

    • Example: --output ./new_results/

  • --seed, -s - Override random seed

    • Example: --seed 123

  • --validation-strategy - Override validation strategy

    • Values: cv, temporal, or both

    • Example: --validation-strategy temporal

Examples with overrides:

# Run with different models
respredai run --config my_config.ini --models LR,RF

# Run only specific targets with a different output folder
respredai run --config my_config.ini --targets Target1 --output ./experiment1/

# Quick experiment with different seed
respredai run --config my_config.ini --seed 42 --quiet

# Run with temporal validation
respredai run --config my_config.ini --validation-strategy temporal

Configuration File#

The configuration file uses INI format with the following sections.

Note

Optional parameters can be disabled by commenting out the line with #. Empty values (e.g., group_column =) are treated as absent.

[Data] Section#

Defines the input data and features.

[Data]
data_path = ./data/my_data.csv
targets = Target1,Target2,Target3
continuous_features = Age,Weight,Temperature

Parameters:

  • data_path - Path to CSV file containing the dataset

    • Must include all features and target columns

    • First column is assumed to be the sample ID

  • targets - Comma-separated list of target column names

    • Each target will be trained separately

    • Must exist in the CSV file

  • continuous_features - Comma-separated list of continuous feature names

    • These features will be scaled using StandardScaler

    • All other features are treated as categorical and one-hot encoded
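As an illustration of this split, here is a minimal scikit-learn sketch (column names are examples from the config above; the pipeline's actual transformer wiring may differ):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "Age": [34, 51, 67, 45],
    "Temperature": [37.2, 38.5, 36.9, 39.1],
    "ward": ["ICU", "ER", "ICU", "ER"],
})

continuous = ["Age", "Temperature"]                            # scaled
categorical = [c for c in df.columns if c not in continuous]   # one-hot encoded

preprocess = ColumnTransformer([
    ("num", StandardScaler(), continuous),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
X = preprocess.fit_transform(df)  # 2 scaled columns + 2 one-hot columns
```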

[Metadata] Section#

Defines metadata columns used for grouping, temporal splitting, and subgroup analysis.

[Metadata]
group_column = patient_id
temporal_column = collection_date
# subgroup_columns = ward, sex

Parameters:

  • group_column (optional) - Column name for grouping related samples

    • Use when you have multiple samples from the same patient/subject

    • Prevents data leakage by keeping all samples from the same group in the same fold

    • Enables StratifiedGroupKFold for both outer and inner cross-validation (if not specified, standard StratifiedKFold is used)

    • See details in Create Config Command

  • temporal_column (optional) - Name of the date/time column for temporal splitting

    • Required when validation_strategy is temporal or both

    • Values are parsed as dates

  • subgroup_columns (optional) - Comma-separated list of columns for subgroup analysis

    • Performance metrics are computed separately for each subgroup

    • Useful for evaluating model fairness across demographic or clinical categories

[Pipeline] Section#

Controls the machine learning pipeline configuration.

[Pipeline]
models = LR,RF,XGB,CatBoost
outer_folds = 5
inner_folds = 3
outer_cv_repeats = 1
calibrate_threshold = false
threshold_method = auto
calibrate_probabilities = false
probability_calibration_method = sigmoid
probability_calibration_cv = 5
confidence_level = 0.95
n_bootstrap = 1000
compute_feature_direction = false

Parameters:

  • models - Comma-separated list of models to train

    • Available models: LR, MLP, XGB, RF, CatBoost, TabPFN, RBF_SVC, Linear_SVC, KNN

    • Use respredai list-models to see all available models with descriptions

  • outer_folds - Number of folds for outer cross-validation

    • Used for model evaluation

  • inner_folds - Number of folds for inner cross-validation

    • Used for hyperparameter tuning with GridSearchCV

  • calibrate_threshold - Enable decision threshold optimization (optional, default: false)

    • true: Optimize threshold using Youden’s J statistic (Sensitivity + Specificity - 1)

    • false: Use default threshold of 0.5

    • Threshold optimization uses inner_folds for cross-validation

    • Hyperparameters are tuned first (optimizing ROC-AUC), then threshold is optimized

  • threshold_method - Method for threshold optimization (optional, default: auto)

    • auto: Automatically choose based on sample size (OOF if n < 1000, CV otherwise)

    • oof: Out-of-fold predictions method - aggregates predictions from all CV folds into a single set, then finds one global threshold maximizing Youden’s J across all concatenated samples

    • cv: TunedThresholdClassifierCV method - calculates optimal threshold separately for each CV fold, then aggregates (averages) the fold-specific thresholds

    • Key difference: oof finds one threshold on all concatenated OOF predictions (global optimization), while cv finds per-fold thresholds then averages them (fold-wise optimization then aggregation)

    • Only used when calibrate_threshold = true

  • outer_cv_repeats - Number of repetitions for outer cross-validation (optional, default: 1)

    • 1: Standard (non-repeated) cross-validation

    • >1: Repeated stratified cross-validation with different random shuffles

    • Provides more robust performance estimates by averaging over multiple CV runs

  • calibrate_probabilities - Enable post-hoc probability calibration (optional, default: false)

    • true: Apply CalibratedClassifierCV to the best estimator from GridSearchCV

    • false: Use uncalibrated probability predictions

    • Applied after hyperparameter tuning and before threshold tuning

  • probability_calibration_method - Method for probability calibration (optional, default: sigmoid)

    • sigmoid: Platt scaling - fits a logistic regression on the classifier outputs

    • isotonic: Isotonic regression - non-parametric, monotonic transformation

    • Only used when calibrate_probabilities = true

  • probability_calibration_cv - Number of folds for probability calibration (optional, default: 5)

    • CV folds used internally by CalibratedClassifierCV

    • Must be at least 2

    • Only used when calibrate_probabilities = true

  • confidence_level - Confidence level for bootstrap confidence intervals (optional, default: 0.95)

    • Must be between 0.5 and 1.0

    • Controls the width of the reported CI bounds (e.g., 0.95 for 95% CI)

  • n_bootstrap - Number of bootstrap resamples for confidence intervals (optional, default: 1000)

    • Must be at least 100

    • Higher values give more stable CI estimates at the cost of computation time

  • compute_feature_direction - Compute the direction of feature effects (optional, default: false)

    • true: Determine whether each feature increases or decreases the predicted probability

    • false: Skip feature direction computation

    • Useful for interpretability alongside feature importance scores
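To make the oof threshold method concrete, here is a sketch of the idea in scikit-learn terms (an illustration, not the pipeline's exact code): collect out-of-fold probabilities, then scan candidate thresholds for the single one maximizing Youden's J.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=300, random_state=0)

# Out-of-fold probabilities: every sample predicted by a model not trained on it
oof_proba = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=3, method="predict_proba"
)[:, 1]

def youden_j(y_true, proba, threshold):
    preds = (proba >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, preds).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# One global threshold over all concatenated OOF predictions
candidates = np.unique(oof_proba)
best = max(candidates, key=lambda t: youden_j(y, oof_proba, t))
print(f"optimal threshold: {best:.3f}")
```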

[Reproducibility] Section#

Ensures reproducible results.

[Reproducibility]
seed = 42

Parameters:

  • seed - Random seed for reproducibility

    • Same seed ensures identical results across runs

    • Affects data splitting and model initialization

[Log] Section#

Controls logging behavior.

[Log]
verbosity = 1
log_basename = respredai.log

Parameters:

  • verbosity - Logging level

    • 0: No logging to file

    • 1: Log major events (model start/end, target completion)

    • 2: Verbose logging (includes fold-level details)

  • log_basename - Name of the log file

    • Created in the output folder

    • Contains detailed execution information

[Resources] Section#

Controls computational resources.

[Resources]
n_jobs = -1

Parameters:

  • n_jobs - Number of parallel jobs

    • -1: Use all available CPU cores

    • 1: No parallelization

    • N: Use N cores

[ModelSaving] Section#

Enables saving trained models for resumption.

[ModelSaving]
enable = true
compression = 3

Parameters:

  • enable - Enable saving trained models

    • true: Save models after each fold (enables resumption)

    • false: No model saving (faster but no resumption)

  • compression - Compression level for saved model files

    • Range: 1-9

    • 1: Minimal compression (fastest, largest files)

    • 3: Balanced compression (recommended)

    • 9: Maximum compression (slowest, smallest files)

[Imputation] Section#

Controls missing data imputation (optional).

[Imputation]
method = none
strategy = mean
n_neighbors = 5
estimator = bayesian_ridge

Parameters:

  • method - Imputation method

    • none: No imputation (default, requires complete data)

    • simple: SimpleImputer from scikit-learn

    • knn: KNNImputer for k-nearest neighbors imputation

    • iterative: IterativeImputer (MissForest-style)

  • strategy - Strategy for SimpleImputer (only used when method = simple)

    • mean: Replace missing values with column mean (default)

    • median: Replace with column median

    • most_frequent: Replace with most frequent value

  • n_neighbors - Number of neighbors for KNNImputer (only used when method = knn)

    • Default: 5

  • estimator - Estimator for IterativeImputer (only used when method = iterative)

    • bayesian_ridge: BayesianRidge estimator (default)

    • random_forest: RandomForestRegressor (MissForest-style)
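The three methods correspond to scikit-learn imputers roughly as sketched below (constructor arguments mirror the config keys above; this is an illustration, not the pipeline's code):

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer
from sklearn.linear_model import BayesianRidge

X = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, np.nan], [7.0, 8.0]])

imputers = {
    "simple": SimpleImputer(strategy="mean"),
    "knn": KNNImputer(n_neighbors=2),  # small toy dataset; config default is 5
    "iterative": IterativeImputer(estimator=BayesianRidge(), random_state=0),
}
for name, imp in imputers.items():
    filled = imp.fit_transform(X)
    assert not np.isnan(filled).any()  # every method removes all missing values
```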

[Output] Section#

Specifies output location.

[Output]
out_folder = ./output/

Parameters:

  • out_folder - Path to output directory

    • Will be created if it doesn’t exist

    • Contains all results, metrics, and saved models

[Preprocessing] Section#

Controls categorical feature encoding.

[Preprocessing]
ohe_min_frequency = 0.05

Parameters:

  • ohe_min_frequency (optional) - Minimum frequency for categorical values in OneHotEncoder

    • Categories appearing below this threshold are grouped into an “infrequent” category

    • Values in (0, 1): proportion of samples (e.g., 0.05 = at least 5% of samples)

    • Values >= 1: absolute count (e.g., 10 = at least 10 occurrences)

    • Set to 0 or omit to disable (keep all categories)

[Uncertainty] Section#

Controls conformal prediction for uncertainty quantification with distribution-free coverage guarantees.

[Uncertainty]
alpha = 0.1

Parameters:

  • alpha - Miscoverage rate for Mondrian conformal prediction (default: 0.1)

    • Range: 0 to 0.5 (exclusive)

    • Default 0.1 gives 90% target coverage per class

    • Prediction sets: {S}, {R}, or {S, R} with finite-sample coverage guarantees

    • Conformal diagnostics (coverage, fraction uncertain, avg set size) are appended to metrics CSV
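The set construction can be sketched as follows. In Mondrian (class-conditional) conformal prediction each label gets its own calibration pool, and a label enters the prediction set when its p-value exceeds alpha. This is a self-contained illustration with synthetic calibration scores, not the pipeline's implementation:

```python
import numpy as np

def mondrian_prediction_set(proba_s, cal_scores_s, cal_scores_r, alpha=0.1):
    """Binary S/R prediction set from class-conditional calibration scores."""
    proba = {"S": proba_s, "R": 1.0 - proba_s}
    cal = {"S": cal_scores_s, "R": cal_scores_r}
    pred_set = set()
    for label in ("S", "R"):
        score = 1.0 - proba[label]  # nonconformity: low probability = high score
        # p-value: fraction of same-class calibration scores at least as extreme
        p_value = (np.sum(cal[label] >= score) + 1) / (len(cal[label]) + 1)
        if p_value > alpha:
            pred_set.add(label)
    return pred_set

# Synthetic calibration scores spread over [0, 1] for each class
cal_s = np.linspace(0.0, 1.0, 200)
cal_r = np.linspace(0.0, 1.0, 200)

confident = mondrian_prediction_set(0.95, cal_s, cal_r)  # a singleton set
uncertain = mondrian_prediction_set(0.55, cal_s, cal_r)  # includes both labels
```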

[Validation] Section#

Controls validation strategy (optional, defaults to standard cross-validation).

[Validation]
validation_strategy = cv
# temporal_split_date = 2023-01-01
# temporal_split_ratio = 0.8

[Metadata]
temporal_column = collection_date
# subgroup_columns = ward, sex

Parameters:

  • validation_strategy - Validation approach (default: cv)

    • cv: Standard nested cross-validation only

    • temporal: Temporal (prospective-style) validation only

    • both: Run both CV and temporal validation

  • temporal_column - Name of the date/time column for temporal splitting (configured in [Metadata] section)

    • Required when validation_strategy is temporal or both

    • Values are parsed as dates

  • temporal_split_date - Cutoff date in ISO format (e.g., 2023-01-01)

    • Train set: dates before cutoff; test set: dates on or after cutoff

    • Mutually exclusive with temporal_split_ratio

  • temporal_split_ratio - Fraction of samples, in sorted date order, assigned to the training set

    • Must be between 0 and 1 (exclusive)

    • Mutually exclusive with temporal_split_date

Note: When group_column is configured in the [Metadata] section, temporal splitting assigns entire groups based on the group’s latest date to prevent data leakage.
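The group-aware temporal assignment described in the note can be sketched with pandas (toy data, not the pipeline's code):

```python
import pandas as pd

df = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 4],
    "collection_date": pd.to_datetime([
        "2022-03-01", "2023-02-01",  # patient 1 straddles the cutoff
        "2022-06-15", "2022-11-20", "2022-12-05", "2023-04-10",
    ]),
})

cutoff = pd.Timestamp("2023-01-01")

# Assign each whole group by its LATEST sample date to avoid leakage
latest = df.groupby("patient_id")["collection_date"].max()
test_groups = set(latest[latest >= cutoff].index)

train = df[~df["patient_id"].isin(test_groups)]
test = df[df["patient_id"].isin(test_groups)]
# Patient 1 goes entirely to test, even though one sample predates the cutoff
```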

Pipeline Workflow#

The run command executes the following steps:

  1. Configuration Loading - Parse and validate the configuration file

  2. Data Loading - Read CSV and validate features/targets

  3. Preprocessing - One-hot encode categorical features, prepare data

  4. Nested Cross-Validation - For each model and target:

    • Outer CV Loop: Split data for evaluation

    • Inner CV Loop: Hyperparameter tuning with GridSearchCV

    • Training: Train best model on outer training fold

    • Evaluation: Test on outer test fold

    • Save Models: Save trained models and metrics (if enabled)

  5. Results Aggregation - Calculate mean and std across folds

  6. Output Generation - Save confusion matrices, metrics, and plots
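Steps 4-5 can be sketched in scikit-learn terms (placeholder model and grid; the pipeline's actual estimators and scorers may differ):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=200, random_state=42)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

fold_scores = []
for train_idx, test_idx in outer.split(X, y):
    # Inner loop: hyperparameter tuning on the outer training fold only
    search = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={"C": [0.1, 1.0, 10.0]},
        cv=inner, scoring="roc_auc",
    )
    search.fit(X[train_idx], y[train_idx])
    # Outer loop: evaluate the tuned model on the held-out fold
    proba = search.predict_proba(X[test_idx])[:, 1]
    fold_scores.append(roc_auc_score(y[test_idx], proba))

print(f"AUROC: {np.mean(fold_scores):.3f} +/- {np.std(fold_scores):.3f}")
```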

Output Files#

The pipeline generates the following output structure:

output_folder/
├── models/                                       # Trained models (if model saving enabled)
│   ├── {Model}_{Target}_models.joblib            # Saved models for resumption
│   └── ...
├── metrics/                                      # Detailed metrics
│   ├── {target}/
│   │   ├── {model}_metrics_detailed.csv          # Comprehensive metrics with CI
│   │   └── summary.csv                           # Summary across all models for this target
│   └── summary_all.csv                           # Global summary across all models and targets
├── confusion_matrices/                           # Confusion matrix heatmaps
│   └── Confusion_matrix_{model}_{target}.png     # One PNG per model-target combination
├── calibration/                                  # Calibration diagnostics
│   └── reliability_curve_{model}_{target}.png    # Reliability curves per fold + aggregate
├── subgroup_analysis/                             # Subgroup metrics (if subgroup_columns configured)
│   └── {target}/
│       └── {model}_{subgroup_col}_subgroup.csv   # Per-subgroup metrics
├── report.html                                   # Comprehensive HTML report
├── reproducibility.json                          # Reproducibility manifest
└── respredai.log                                 # Execution log (if verbosity > 0)

Metrics Files#

Each {model}_metrics_detailed.csv contains:

  • Metric: Name of the metric (Precision, Recall, F1, MCC, Balanced Acc, AUROC, VME, ME, Brier Score, ECE, MCE)

  • Mean: Mean value across folds

  • Std: Standard deviation across folds

  • SE: Nadeau-Bengio corrected standard error, accounting for training set overlap in k-fold CV

  • CI{n}_lower: Lower bound of confidence interval (BCa bootstrap)

  • CI{n}_upper: Upper bound of confidence interval (BCa bootstrap)

The CI percentage and number of bootstrap resamples are controlled by confidence_level and n_bootstrap in the [Pipeline] section (defaults: 95%, 1,000 resamples).
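For reference, the Nadeau-Bengio correction replaces the naive variance 1/k · σ² with (1/k + n_test/n_train) · σ², accounting for overlapping training sets. A sketch, assuming this standard form is the variant used:

```python
import numpy as np

def nadeau_bengio_se(fold_scores, test_fraction):
    """Corrected SE for k-fold CV scores (Nadeau & Bengio, 2003).

    The naive SE (std / sqrt(k)) underestimates variance because CV
    training sets overlap; the correction adds the test/train size ratio.
    """
    k = len(fold_scores)
    variance = np.var(fold_scores, ddof=1)
    rho = test_fraction / (1.0 - test_fraction)  # n_test / n_train
    return np.sqrt((1.0 / k + rho) * variance)

scores = [0.82, 0.79, 0.85, 0.80, 0.83]
naive = np.std(scores, ddof=1) / np.sqrt(len(scores))
corrected = nadeau_bengio_se(scores, test_fraction=1 / 5)  # 5-fold: 20% test
print(f"naive SE: {naive:.4f}, corrected SE: {corrected:.4f}")
```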

Calibration Metrics (always computed, independent of probability calibration setting):

  • Brier Score: Mean squared error of probability predictions (lower is better, range 0-1)

  • ECE (Expected Calibration Error): Weighted average of calibration error across probability bins

  • MCE (Maximum Calibration Error): Maximum calibration error across any probability bin
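A sketch of how ECE and MCE are typically computed from binned positive-class probabilities (an illustration of the definitions; the pipeline's binning details may differ):

```python
import numpy as np

def ece_mce(y_true, proba, n_bins=10):
    """Expected and Maximum Calibration Error over equal-width bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece, mce = 0.0, 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (proba >= lo) & (proba <= hi) if lo == 0 else (proba > lo) & (proba <= hi)
        if not mask.any():
            continue
        gap = abs(proba[mask].mean() - y_true[mask].mean())  # confidence vs accuracy
        ece += mask.mean() * gap  # weighted by bin occupancy
        mce = max(mce, gap)
    return ece, mce

y = np.array([0, 0, 1, 1, 1, 0, 1, 1])
p = np.array([0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.6, 0.95])
ece, mce = ece_mce(y, p)
```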

Confusion Matrix Plots#

Each Confusion_matrix_{model}_{target}.png shows:

  • Normalized confusion matrix for a single model-target combination

  • Mean F1, MCC, and AUROC scores with standard deviations

  • Color-coded heatmap (0.0 = poor, 1.0 = perfect)

HTML Report#

The report.html file provides a comprehensive, self-contained summary:

  • Metadata: Configuration settings, data path, timestamp

  • Framework Summary: Pipeline parameters, models, targets, and calibration settings

  • Results Tables: Per-target metrics with 95% confidence intervals for each model

  • Confusion Matrices: Embedded visualizations in a responsive grid layout

  • Calibration Diagnostics: Brier Score, ECE, MCE metrics with 95% CIs, plus reliability curve plots

The report can be opened in any web browser and shared without additional dependencies.

Model Saving System#

Each {Model}_{Target}_models.joblib file contains all data from the outer cross-validation in a single file:

  • fold_models: A list containing one trained model per outer CV fold

  • fold_transformers: A list containing one fitted transformer (scaler) per fold

  • fold_ohe_transformers: A list containing one fitted OneHotEncoder per fold

  • fold_thresholds: A list containing one calibrated threshold per fold

  • fold_hyperparams: A list containing the best hyperparameters per fold

  • fold_test_data: Optional list of (X_test_scaled, feature_names) tuples for SHAP computation

  • metrics: All metrics (precision, recall, F1, MCC, AUROC, confusion matrices) for every fold

  • completed_folds: Number of completed folds

  • timestamp: When the file was saved

For example, with outer_folds=5, each joblib file will contain 5 trained models and their corresponding transformers and metrics.
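Reading one of these files back is plain joblib. The sketch below builds a stand-in payload with some of the keys listed above (real files hold fitted estimators and transformers):

```python
import os
import tempfile

import joblib

# Stand-in payload mirroring the documented structure
payload = {
    "fold_models": ["model_fold_0", "model_fold_1"],
    "fold_thresholds": [0.47, 0.52],
    "completed_folds": 2,
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "LR_Target1_models.joblib")
    joblib.dump(payload, path, compress=3)  # compression level from [ModelSaving]
    restored = joblib.load(path)

print(restored["completed_folds"], restored["fold_thresholds"])
```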

Examples#

Basic Usage#

respredai run --config my_config.ini

Quiet Mode (for scripts)#

respredai run --config my_config.ini --quiet

See Also#