Focused tools that each solve one problem well. Some plug into the frameworks above;
others stand on their own and are used by collaborators across labs.
Scikit-learn-compatible ComBat batch-effect correction.
Integrates Johnson, Fortin (neuroComBat), and Chen (CovBat) harmonization methods
into scikit-learn pipelines, with leakage-safe cross-validation handling.
Plugs into existing workflows without breaking the scikit-learn API contract.
Python
scikit-learn
PyPI
ComBat
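To illustrate what "leakage-safe" means for the ComBat tool above, here is a minimal sketch of the fit-on-train, apply-to-test pattern. It is not ComBat and not this package's API: plain per-batch mean centering stands in for the empirical Bayes model, and the helper names `fit_batch_shift` and `apply_batch_shift` are hypothetical.

```python
import numpy as np

def fit_batch_shift(X_train, batch_train):
    """Estimate per-batch mean offsets from the training rows only."""
    grand_mean = X_train.mean(axis=0)
    return {
        b: X_train[batch_train == b].mean(axis=0) - grand_mean
        for b in np.unique(batch_train)
    }

def apply_batch_shift(X, batch, offsets):
    """Subtract the stored offset of each row's batch (batch must be seen in training)."""
    X_adj = X.astype(float).copy()
    for b, offset in offsets.items():
        X_adj[batch == b] -= offset
    return X_adj

# Synthetic data: two batches, the second shifted by +2 on every feature.
rng = np.random.default_rng(0)
batch = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 3)) + batch[:, None] * 2.0

idx = rng.permutation(100)
train, test = idx[:80], idx[80:]

# Offsets are learned without ever touching the held-out rows.
offsets = fit_batch_shift(X[train], batch[train])
X_train_adj = apply_batch_shift(X[train], batch[train], offsets)
X_test_adj = apply_batch_shift(X[test], batch[test], offsets)
```

The point of the pattern is that harmonization parameters are re-estimated inside every training fold, so held-out samples never influence the correction applied to them.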
Nested cross-validation with calibration, threshold optimization, and statistical tests.
A toolkit for the kind of model evaluation that survives peer review: nested CV
with proper hyperparameter selection, probability calibration, decision-threshold
tuning, and built-in statistical comparisons between models.
Python
scikit-learn
nested CV
calibration
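The nested CV idea behind the evaluation toolkit above can be sketched with plain scikit-learn. This is an illustration of the general technique, not the toolkit's own interface; the synthetic dataset, the logistic regression model, and the `C` grid are assumptions chosen for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic data stands in for a real dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Inner loop: hyperparameters are selected on each outer training fold only.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=inner_cv,
    scoring="roc_auc",
)

# Outer loop: every score comes from data the search never saw.
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
print(outer_scores.mean(), outer_scores.std())
```

Because tuning and scoring use disjoint data, the outer scores estimate how the whole procedure (model plus hyperparameter search) generalizes, rather than how one tuned model did on folds it was tuned against.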
Automated Cas9 PAM-compatibility comparison with ClinVar integration.
A bioinformatic tool for comparing Cas9 nucleases across clinically relevant genomic
contexts. Detects overlapping PAM sites between variants and identifies allele-specific
targets arising from pathogenic mutations. Published in Frontiers in Genome Editing (2025).
Python
CRISPR
genomics
bioinformatics
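As a rough illustration of the allele-specific targeting idea in the Cas9 tool above, the sketch below finds PAM sites that exist only in a mutant allele. It is not the published tool's code: the function names are hypothetical, only the forward strand is scanned, and alleles are assumed to be equal-length substitutions so that positions line up. The `NGG` PAM is the canonical SpCas9 motif.

```python
import re

# IUPAC codes needed for common PAM definitions, mapped to regex classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
}

def pam_positions(seq, pam):
    """Return 0-based forward-strand start positions of every PAM match, overlaps included."""
    pattern = "".join(IUPAC[base] for base in pam)
    # A lookahead matches without consuming, so overlapping sites are all reported.
    return {m.start() for m in re.finditer(f"(?={pattern})", seq)}

def allele_specific_pams(ref, alt, pam):
    """PAM sites present in the alternate allele but absent from the reference."""
    return sorted(pam_positions(alt, pam) - pam_positions(ref, pam))

ref = "ACGTAGATCA"
alt = "ACGTAGGTCA"  # hypothetical A>G substitution at index 6

new_sites = allele_specific_pams(ref, alt, "NGG")
print(new_sites)  # [4]
```

A fuller comparison would also scan the reverse complement and handle insertions and deletions, where coordinates shift between alleles.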
APOBEC-style mutation identification from multiple sequence alignment.
Detects mutational patterns consistent with APOBEC enzyme activity from multiple
sequence alignments, supporting downstream statistical description of mutational signatures.
Python
mutational signatures
APOBEC
MSA
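One simple way to picture the APOBEC detection step above is to scan each aligned sequence against a reference for C-to-T changes in a 5'-TC context. The sketch below assumes that motif, treats the first sequence as the reference, and ignores gaps; the tool's actual motif definitions and reference handling may differ, and `tc_to_tt_sites` is a hypothetical name.

```python
def tc_to_tt_sites(reference, query):
    """Alignment columns where the reference has C preceded by T and the query has T."""
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i in range(1, len(reference))
        if reference[i - 1] == "T" and reference[i] == "C" and query[i] == "T"
    ]

# Toy alignment; sequences are invented for illustration.
alignment = {
    "ref":  "ATCGATCAGC",
    "seq1": "ATTGATCAGC",
    "seq2": "ATCGATTAGT",
}

hits = {
    name: tc_to_tt_sites(alignment["ref"], seq)
    for name, seq in alignment.items()
    if name != "ref"
}
print(hits)  # {'seq1': [2], 'seq2': [6]}
```

In `seq2` the C-to-T change at the last column is not reported because the preceding reference base is G, not T, which is the kind of context filter that separates motif-consistent mutations from background substitutions.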
Broken-stick model extension for metagenomic simulation.
Extends the CAMISIM metagenomic simulator with a broken-stick abundance model and
a configurable number of strains, producing synthetic communities with controlled
relative-abundance distributions for benchmarking metagenomic pipelines.
Python
metagenomics
simulation
benchmarking
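For the broken-stick model above, a minimal sketch of the classic formulation is to cut a unit-length stick at n - 1 uniformly random points and read the segment lengths as relative abundances. Whether the CAMISIM extension samples cut points this way or uses expected segment lengths is not stated here, so treat the function name and `seed` parameter as illustrative assumptions.

```python
import random

def broken_stick(n_strains, seed=None):
    """Relative abundances for n_strains from random cuts of a unit-length stick."""
    if n_strains < 1:
        raise ValueError("n_strains must be at least 1")
    rng = random.Random(seed)
    # n - 1 cut points split the interval [0, 1] into n segments.
    cuts = sorted(rng.random() for _ in range(n_strains - 1))
    edges = [0.0] + cuts + [1.0]
    return [right - left for left, right in zip(edges, edges[1:])]

abundances = broken_stick(5, seed=42)
print(sorted(abundances, reverse=True))
```

The segments always sum to 1, so the output can be used directly as a relative-abundance vector, and fixing the seed makes a simulated community reproducible across benchmark runs.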