Quick Start#

This guide will help you get started with ResPredAI in just a few minutes.

Step 1: Create a Configuration File#

Generate a template configuration file:

respredai create-config my_config.ini

Step 2: Edit the Configuration#

Open my_config.ini and customize it for your data:

[Data]
data_path = ./data/my_data.csv
targets = Target1,Target2
continuous_features = Feature1,Feature2,Feature3

[Metadata]
# group_column = PatientID  # Optional: prevents data leakage in CV
# temporal_column = collection_date  # Date column for temporal validation
# subgroup_columns = ward, sex  # Columns for subgroup performance analysis

[Pipeline]
models = LR,RF,XGB,CatBoost
outer_folds = 5
inner_folds = 3
calibrate_threshold = false
threshold_method = auto
calibrate_probabilities = false
probability_calibration_method = sigmoid

[Reproducibility]
seed = 42

[Log]
verbosity = 1
log_basename = respredai.log

[Resources]
n_jobs = -1

[ModelSaving]
enable = true
compression = 3

[Preprocessing]
ohe_min_frequency = 0.05

[Imputation]
method = none
strategy = mean
n_neighbors = 5
estimator = bayesian_ridge

[Output]
out_folder = ./output/

# [Validation]
# validation_strategy = cv
# temporal_split_date = 2023-01-01

Configuration Sections#

Section

Description

[Data]

Input data path, target columns, feature types

[Metadata]

Metadata columns for grouping, temporal splitting, and subgroup analysis

[Pipeline]

Models to train, CV folds, threshold optimization and probability calibration

[Reproducibility]

Random seed for reproducibility

[Log]

Logging verbosity and file name

[Resources]

Parallel processing settings

[ModelSaving]

Model persistence options

[Preprocessing]

Categorical encoding settings (e.g., ohe_min_frequency)

[Imputation]

Missing data imputation settings

[Uncertainty]

Conformal prediction miscoverage rate (alpha) for uncertainty quantification

[Output]

Output directory path

[Validation]

Validation strategy (CV, temporal, or both)

Step 3: Validate Configuration (Optional)#

Check that your configuration is valid before running:

respredai validate-config my_config.ini --check-data

Step 4: Run the Pipeline#

Execute the nested cross-validation pipeline:

respredai run --config my_config.ini

Step 5: Explore Results#

After the pipeline completes, check your output folder:

  • report.html - Comprehensive HTML report with all results

  • metrics/ - Performance metrics with 95% confidence intervals

  • confusion_matrices/ - Visualization of model performance

  • models/ - Saved models for feature importance extraction

Step 6: Extract Feature Importance (Optional)#

Analyze which features are most important:

respredai feature-importance --output ./output --model LR --target Target1

Available Models#

Code

Model

LR

Logistic Regression

RF

Random Forest

XGB

XGBoost

CatBoost

CatBoost

MLP

Neural Network

TabPFN

TabPFN

RBF_SVC

RBF SVM

Linear_SVC

Linear SVM

KNN

K-Nearest Neighbors

Next Steps#