Feature Importance Command#
The feature-importance command extracts and visualizes feature importance or coefficients from trained models across all outer cross-validation iterations.
Usage#
respredai feature-importance --output <output_folder> --model <model_name> --target <target_name> [options]
Options#
Required#
- `--output`, `-o` - Path to the output folder containing trained models
  - Must be the same folder used in the `run` command
  - Must contain a `models/` subdirectory with saved model files
  - Example: `./output/` or `./out_run_example/`
- `--model`, `-m` - Model name to extract importance from
  - Must match one of the models trained in the pipeline
  - Examples: `LR`, `RF`, `XGB`, `CatBoost`, `Linear_SVC`
  - Case-sensitive
- `--target`, `-t` - Target name to extract importance for
  - Must match one of the targets from the training pipeline
  - Examples: `Target1`, `Ciprofloxacin_R`
  - Case-sensitive
Optional#
- `--top-n`, `-n` - Number of top features to display (default: 20)
  - Features are ranked by absolute importance
  - Range: 1 to total number of features
  - Example: `--top-n 30` for top 30 features
- `--no-plot` - Skip generating the barplot
  - Only the CSV file will be created
  - Useful for batch processing or server environments
- `--no-csv` - Skip generating the CSV file
  - Only the plot will be created
  - Useful if you only need visualizations
- `--seed`, `-s` - Random seed for SHAP reproducibility (see the example below)
  - Ensures reproducible SHAP values across runs
  - Only affects models using SHAP fallback
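For example, pinning the SHAP seed when extracting importance for a SHAP-fallback model (the folder, model, and target names are placeholders that appear elsewhere on this page):

respredai feature-importance -o ./output -m MLP -t Target1 --seed 42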
Supported Models#
The command uses native importance when available, with SHAP as fallback:
Native Importance (Primary)#
**Linear Models (Coefficients)**

- `LR` (Logistic Regression) - Uses coefficient values
- `Linear_SVC` (Linear SVM) - Uses coefficient values

**Tree-Based Models (Feature Importances)**

- `RF` (Random Forest) - Uses Gini importance
- `XGB` (XGBoost) - Uses gain-based importance
- `CatBoost` - Uses feature importance scores

For tree-based models, importance values are always positive.
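As a rough illustration of where these values come from, here is a minimal sketch using standard scikit-learn attribute names; it is not the pipeline's actual code:

```python
import numpy as np

def native_importance(model, feature_names):
    """Sketch: pull per-feature scores from a fitted estimator.

    Linear models (logistic regression, linear SVM) expose signed coefficients
    via `coef_`; tree ensembles expose non-negative scores via `feature_importances_`.
    """
    if hasattr(model, "coef_"):
        scores = np.ravel(model.coef_)           # signed coefficients
    elif hasattr(model, "feature_importances_"):
        scores = model.feature_importances_      # always >= 0
    else:
        raise ValueError("No native importance; a SHAP fallback would be needed.")
    return dict(zip(feature_names, scores))
```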
SHAP Fallback#
For models without native importance/coefficients, SHAP (SHapley Additive exPlanations) values are computed as a fallback:
- `MLP` (Multi-Layer Perceptron) - Uses KernelExplainer
- `RBF_SVC` (RBF SVM) - Uses KernelExplainer
- `TabPFN` - Uses KernelExplainer
SHAP values are computed on the test fold of each outer CV iteration and aggregated across folds. The mean absolute SHAP value represents feature importance.
Note: SHAP computation with KernelExplainer can be slow for large datasets.
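A minimal sketch of the kind of computation described above, written directly against the shap package for a binary classifier; the pipeline's actual implementation may differ in details such as background sampling:

```python
import numpy as np
import shap

def mean_abs_shap(model, X_background, X_test, seed=0):
    """Sketch: mean |SHAP| per feature on one outer-CV test fold."""
    rng = np.random.default_rng(seed)
    # KernelExplainer is model-agnostic but slow; a small background sample keeps it tractable.
    idx = rng.choice(len(X_background), size=min(50, len(X_background)), replace=False)
    f = lambda X: model.predict_proba(X)[:, 1]        # positive-class probability
    explainer = shap.KernelExplainer(f, X_background[idx])
    shap_values = explainer.shap_values(X_test)       # shape: (n_samples, n_features)
    return np.abs(shap_values).mean(axis=0)           # one importance value per feature

# Aggregation across outer folds then averages these per-fold vectors.
```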
Output Files#
The command generates files in the following structure:
output_folder/
└── feature_importance/
└── {target}/
├── {model}_feature_importance.csv # Native importance (if available)
├── {model}_feature_importance.png
├── {model}_feature_importance_shap.csv # SHAP importance (fallback)
└── {model}_feature_importance_shap.png
Files have the `_shap` suffix when SHAP is used instead of native importance.
CSV File Format (Native)#
For models with native importance, the CSV contains one row per feature, with columns for:

- the feature name
- the mean importance across folds (signed for linear models)
- the standard deviation across folds
- the absolute mean importance (used for ranking)
- a formatted string with mean ± std
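If you only need the numbers (for example together with --no-plot), the CSV can be consumed directly. A minimal sketch, assuming the output layout shown above and example model/target names; no column names are assumed, since they are not reproduced here:

```python
from pathlib import Path
import pandas as pd

# Path follows the layout shown above; "Target1" and "LR" are example names.
csv_path = Path("output") / "feature_importance" / "Target1" / "LR_feature_importance.csv"

df = pd.read_csv(csv_path)
print(df.head(20))   # rows are sorted by importance (absolute mean value)
```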
CSV File Format (SHAP)#
For models using the SHAP fallback, the CSV contains one row per feature, with columns for:

- the feature name
- the mean absolute SHAP value across folds
- the standard deviation across folds
- a formatted string with mean ± std
Features are sorted by importance (absolute mean value). Across all folds, the command:

- calculates the mean importance for each feature
- calculates the standard deviation (an uncertainty measure)
- ranks features by importance (sketched below)
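A minimal sketch of this per-fold aggregation; the function and array names are assumptions, not the pipeline's code:

```python
import numpy as np

def aggregate_folds(per_fold, feature_names, top_n=20):
    """Sketch: combine per-fold importance vectors into ranked mean ± std."""
    per_fold = np.asarray(per_fold)             # shape: (n_folds, n_features)
    mean = per_fold.mean(axis=0)                # mean importance per feature
    std = per_fold.std(axis=0)                  # uncertainty across folds
    order = np.argsort(np.abs(mean))[::-1]      # rank by absolute mean importance
    return [(feature_names[i], mean[i], std[i]) for i in order[:top_n]]
```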
Plot Color Coding#
The barplot uses different colors to indicate importance type:
| Method | Color | Meaning |
|---|---|---|
| SHAP | Orange | Mean absolute SHAP value |
| Native (tree-based) | Blue | Feature importance (always positive) |
| Native (linear, positive) | Red | Positive coefficient |
| Native (linear, negative) | Green | Negative coefficient |
Error bars show standard deviation across CV folds.
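A rough matplotlib sketch of such a color-coded barplot, purely to illustrate the table above; it is not the tool's plotting code, and the `method` argument is a made-up name:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_importance(names, means, stds, method="linear"):
    """Sketch: horizontal bars colored per the table above; error bars = std across folds."""
    if method == "shap":
        colors = ["orange"] * len(means)
    elif method == "tree":
        colors = ["tab:blue"] * len(means)
    else:  # linear models: color by coefficient sign
        colors = ["red" if m > 0 else "green" for m in means]
    y = np.arange(len(names))
    plt.barh(y, means, xerr=stds, color=colors)
    plt.yticks(y, names)
    plt.xlabel("Importance")
    plt.tight_layout()
    plt.show()
```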
Examples#
Basic Usage#
Extract top 20 features for Logistic Regression on Target1:
respredai feature-importance --output ./output --model LR --target Target1
Custom Number of Features#
Show top 5 features:
respredai feature-importance -o ./output -m RF -t Target2 --top-n 5
Multiple Models#
Extract importance for multiple models (run separately):
respredai feature-importance -o ./output -m LR -t Target1
respredai feature-importance -o ./output -m RF -t Target1
respredai feature-importance -o ./output -m XGB -t Target1
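If you prefer to script such batches, a small wrapper around the same CLI calls could look like this; the model list, folder, and target are just examples:

```python
import subprocess

# Mirror the three commands above in a loop.
for model in ["LR", "RF", "XGB"]:
    subprocess.run(
        ["respredai", "feature-importance", "-o", "./output", "-m", model, "-t", "Target1"],
        check=True,
    )
```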
CSV Only (No Plot)#
Generate only the CSV file for automated analysis:
respredai feature-importance -o ./output -m LR -t Target1 --no-plot
Plot Only (No CSV)#
Generate only the visualization:
respredai feature-importance -o ./output -m RF -t Target1 --no-csv
See Also#
Run Command - Train models with nested CV and save model files
Train Command - Train models on entire dataset for cross-dataset validation
Create Config Command - How to create the configuration file