Output Files
PhenoCluster writes all results to the configured output directory
(default: results/).
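To check what a run produced, the output directory can be inspected directly. The following is a minimal sketch, assuming the default `results/` directory; the helper name `list_outputs` is hypothetical and is not part of PhenoCluster.

```python
from collections import defaultdict
from pathlib import Path


def list_outputs(output_dir="results"):
    """Group files under output_dir by extension (hypothetical helper)."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing has been written yet, or the directory is configured elsewhere.
        return {}
    grouped = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Store paths relative to the output directory, POSIX-style.
            key = path.suffix or "(none)"
            grouped[key].append(path.relative_to(root).as_posix())
    return dict(grouped)


if __name__ == "__main__":
    for suffix, files in sorted(list_outputs().items()):
        print(suffix, len(files))
        for name in files:
            print("   ", name)
```

If the output directory has been changed in the configuration, pass that path instead of relying on the default.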
Main outputs
| File | Description |
|---|---|
| | Comprehensive interactive HTML report with all results and visualisations. Skipped when |
| | Phenotype sizes, feature distributions, classification quality metrics |
| | Odds ratios with confidence intervals and FDR-corrected p-values |
| | Kaplan-Meier estimates, Nelson-Aalen curves, and Cox PH hazard ratios with confidence intervals |
| | Transition hazard ratios, state occupation probabilities, and pathway analysis |
| | Information criteria (BIC, AIC, etc.), entropy, and average posterior probabilities |
| | Original dataset augmented with phenotype assignments |
| | Posterior class membership probabilities per patient |
| | Model selection comparison table and best model info |
| | Feature characterisation per phenotype (effect sizes, dominant categories) |
| | Internal validation metrics (train/test log-likelihood, cluster proportions) |
| | Consensus clustering stability metrics |
| | Train/test split details (sample counts, stratification) |
| | External validation results (when |
| | Temporal generalizability cohorts (v0.3.0, when |
| | Multi-site (LOGO / holdout) cohorts (v0.3.0, when |
| | External-CSV cohorts (v0.3.0, one entry per file listed under |
| | Aggregate ARI / PSI per kind plus the resolved |
| | Per-cohort |
| | Pipeline execution log (when |
| | Cached intermediate results for incremental re-runs |
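The entropy and average posterior probability metrics listed above can be recomputed from the per-patient posterior probabilities. The sketch below uses the definitions common in latent class analysis (relative entropy scaled to [0, 1], and the mean posterior of each modal class); whether PhenoCluster uses exactly these formulas is an assumption, and the function name `classification_quality` is hypothetical.

```python
import math


def classification_quality(posteriors):
    """Return (relative_entropy, avepp) for per-patient posterior rows.

    posteriors: list of lists, one row per patient, each row summing to 1
    across the K classes. Assumed input layout, not PhenoCluster's API.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Total entropy: sum of -p * ln(p) over patients and classes (0 * ln 0 := 0).
    total = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    # Relative entropy: 1 means perfectly separated classes, 0 means uniform.
    relative_entropy = 1.0 - total / (n * math.log(k)) if k > 1 else 1.0
    # AvePP: mean posterior of the assigned (modal) class, per class.
    sums, counts = [0.0] * k, [0] * k
    for row in posteriors:
        best = max(range(k), key=row.__getitem__)
        sums[best] += row[best]
        counts[best] += 1
    avepp = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return relative_entropy, avepp
```

A class with no assigned patients gets `nan` for its average posterior probability rather than a misleading zero.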
Visualisations
All plots are saved in the configured format (default: interactive HTML via Plotly). PhenoCluster uses the colorblind-safe Wong (2011) palette.
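For custom figures that should match the report, the eight colours of the Wong (2011) palette can be written out as hex codes. This is a sketch only: the order in which PhenoCluster assigns colours to phenotypes is not stated here, so the `phenotype_colours` helper and its choice to skip black are assumptions.

```python
# Wong (2011) colourblind-safe palette as hex codes.
WONG_PALETTE = {
    "black": "#000000",
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
}


def phenotype_colours(n):
    """Return n colours, cycling through the palette without black.

    Hypothetical helper: the assignment order is an illustrative choice.
    """
    colours = [c for name, c in WONG_PALETTE.items() if name != "black"]
    return [colours[i % len(colours)] for i in range(n)]
```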
| Plot | Description |
|---|---|
| Model selection | Information criterion (e.g. BIC) vs number of clusters with best-k annotation |
| Phenotype size distribution | Bar chart of patient counts per phenotype |
| Classification quality | Posterior probability distributions per phenotype |
| Continuous heatmap | Z-score standardised continuous features by phenotype |
| Categorical heatmap | Within-phenotype proportions for categorical features |
| Forest plots (OR) | Odds ratios with confidence interval bars |
| Forest plots (HR) | Hazard ratios with confidence interval bars |
| Kaplan-Meier curves | Survival curves per phenotype with step-function interpolation |
| Nelson-Aalen curves | Cumulative hazard estimates per phenotype |
| Cumulative incidence functions | Transition-specific hazard curves per transition |
| State occupation probabilities | Time-varying probability of being in each state |
| Pathway frequency | Most common clinical pathways from Monte Carlo simulation |
| Cohort prevalence heatmap | Phenotype prevalence (%) across temporal or multi-site cohorts (v0.3.0) |
| Drift bar chart | Top-K features by absolute PSI per cohort (v0.3.0) |
| OR concordance scatter | Derivation vs validation log(OR) per phenotype with identity line (v0.3.0) |
| LOGO / window forest | Per-cohort refit-and-match ARI dot plot (v0.3.0) |
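The drift bar chart ranks features by PSI (population stability index). As a rough illustration, the sketch below computes PSI from binned counts using the standard definition, the sum over bins of (actual - expected) × ln(actual / expected), and then picks the top-K features. The binning, the `eps` floor for empty bins, and both function names are assumptions, not PhenoCluster's implementation.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two binned distributions."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("bin counts must have the same length")
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Convert counts to proportions; floor at eps so empty bins
        # do not cause a division by zero or log(0).
        pe = max(e / e_total, eps)
        pa = max(a / a_total, eps)
        value += (pa - pe) * math.log(pa / pe)
    return value


def top_k_drift(psi_by_feature, k=3):
    """Rank features by absolute PSI, largest first (hypothetical helper)."""
    ranked = sorted(psi_by_feature.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]
```

Here `expected_counts` would come from the derivation cohort and `actual_counts` from the temporal, site, or external cohort being compared, using the same bin edges for both.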