CLI Reference

PhenoCluster provides a command-line interface for running the pipeline, generating configuration files, and validating configurations.

phenocluster

PhenoCluster - Clinical Phenotype Discovery Pipeline

Usage

phenocluster [OPTIONS] COMMAND [ARGS]...

create-config

Generate a configuration YAML file from a profile template.

Profiles set sensible defaults for common use-cases. Data-specific parameters (column names, survival targets) are left as placeholders that you fill in.

Profiles:

descriptive - Phenotype discovery only, no statistical inference
complete - All analyses enabled (inference + survival + multistate)
quick - Fast iteration for development (reduced runs)

Example:

phenocluster create-config -p complete -o config.yaml
phenocluster create-config -p quick -o quick_config.yaml

Usage

phenocluster create-config [OPTIONS]

Options

-o, --output <output>

Output YAML path

Default:

'config.yaml'

-p, --profile <profile>

Profile: descriptive, complete, quick

Default:

'complete'

run

Run the phenotype discovery pipeline.

All parameters are controlled via the configuration YAML file. Use 'create-config' to generate a config from a profile.

Example:

phenocluster run -d data.csv -c config.yaml
phenocluster run -d data.csv -c config.yaml --force-rerun

Usage

phenocluster run [OPTIONS]

Options

-d, --data <data>

Required. Path to input CSV file

-c, --config <config>

Required. Path to configuration YAML file

--force-rerun

Ignore cached artifacts and re-run all steps

Default:

False

validate-config

Validate a configuration YAML file.

Checks YAML structure, required sections, value ranges, and internal consistency (e.g. survival targets reference valid columns, multistate transitions reference valid states).

When --data is supplied, the command also cross-references every column name in the config against the actual CSV header, catching typos and missing columns before a long pipeline run.

Examples:

phenocluster validate-config -c config.yaml
phenocluster validate-config -c config.yaml -d data.csv
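
To make the --data cross-check concrete, the following Python sketch shows the general idea of comparing a list of column names against a CSV header. It is an illustration only, not PhenoCluster's implementation: the function name missing_columns and the column names "age", "sex", "bmi" and "bmi_kg_m2" are hypothetical, and the real command reads the names from the configuration YAML rather than from a hard-coded list.

```python
import csv
import io


def missing_columns(config_columns, csv_file):
    """Return the names in config_columns that are absent from the CSV header row."""
    # The first row of the file is treated as the header; an empty file has none.
    header = next(csv.reader(csv_file), [])
    present = set(header)
    # Keep the config order so the report is easy to match against the YAML.
    return [name for name in config_columns if name not in present]


# Hypothetical data: these column names are illustrative only.
sample_csv = io.StringIO("age,sex,bmi\n63,F,27.1\n")
print(missing_columns(["age", "sex", "bmi_kg_m2"], sample_csv))
```

Here "bmi_kg_m2" is reported because the header spells the column "bmi", which is the kind of mismatch the cross-check is meant to surface before a long run.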

Usage

phenocluster validate-config [OPTIONS]

Options

-c, --config <config>

Required. Path to configuration YAML file

-d, --data <data>

Path to CSV file. When supplied, column names in the config are cross-checked against the actual data
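
The commands documented above can be chained so that a run starts only after the configuration has been validated. The Python sketch below is one possible way to script this and uses only the options listed in this reference. It assumes that validate-config signals a failed validation through a non-zero exit status, which this page does not state; build_commands and validate_then_run are hypothetical helper names.

```python
import shlex
import subprocess


def build_commands(data_path, config_path):
    """Return the validate-config and run command lines as argument lists."""
    validate = ["phenocluster", "validate-config", "-c", config_path, "-d", data_path]
    run = ["phenocluster", "run", "-d", data_path, "-c", config_path]
    return [validate, run]


def validate_then_run(data_path, config_path):
    """Validate the config first, then run the pipeline if validation passed."""
    for command in build_commands(data_path, config_path):
        # check=True raises CalledProcessError on a non-zero exit status.
        # ASSUMPTION: validate-config exits non-zero when validation fails.
        subprocess.run(command, check=True)


# Print the command lines without executing them.
for command in build_commands("data.csv", "config.yaml"):
    print(shlex.join(command))
```

Calling validate_then_run("data.csv", "config.yaml") would execute both steps in order and stop at the first command that fails.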

version

Show version information.

Usage

phenocluster version [OPTIONS]