PhD Researcher · University of Bologna

Ettore Rocchi

Physics background, biomedical mission.

Building interpretable machine learning pipelines for clinical microbiology and multi-omics data.

I develop computational methods to predict antimicrobial resistance, discover patient phenotypes, and make sense of high-dimensional omics data. My work spans MALDI-TOF mass spectrometry, multi-omics integration, genomics, and metagenomics, with a focus on interpretability and clinical impact.

About

I trained as a physicist and moved into biomedical data science to apply quantitative methods to problems with direct clinical impact. My current work focuses on antimicrobial resistance prediction from MALDI-TOF mass spectrometry, multi-centre data harmonisation, computational patient phenotyping, and computational genomics.

I work within the Physics4MedicineLab, part of the Multi-Omics and Health-Care Data Analytics Unit at Sant'Orsola Hospital in Bologna. My research combines machine learning with clinical microbiology, pharmacokinetics, and genomics. I care about making models that are not only accurate but interpretable, because a prediction without an explanation is just a well-dressed guess.

I write Python, build pipelines, publish open-source tools, and contribute to interdisciplinary projects. The through-line is a commitment to reproducibility and a mild obsession with doing things properly.

What I Actually Do All Day

In practice, my days involve writing Python until something either converges or breaks, reading papers about problems I didn't know existed until last week, and preprocessing mass spectra that have strong opinions about baseline correction. I also discuss model outputs with clinical collaborators, working out together when the data is telling a story we hadn't expected.

There's also a fair amount of staring at loss curves, debugging pipelines that worked yesterday, and writing documentation future-me will thank present-me for. The ratio of thinking to typing is higher than most people expect. The ratio of coffee to output is best left unquantified.

Research

Four threads I am currently pulling, at the intersection of applied physics, machine learning, and clinical data.

🧬

Antimicrobial resistance prediction

Machine learning on mass spectra and clinical data to anticipate resistance phenotypes prior to culture-based diagnostics.

MALDI-TOF · supervised learning · deep neural networks

🔗

Multi-centre data harmonisation

Batch-effect correction methods for ML models trained on high-throughput data collected across instruments and clinical sites.

ComBat · batch-mixing diagnostics · cross-site validation

👥

Computational patient phenotyping

Discovery of clinically meaningful subgroups from heterogeneous patient cohorts, with prognostic and trajectory modelling.

unsupervised clustering · survival analysis · multi-state models

🧪

Genomics & metagenomics analyses

Network-based modelling of microbial communities, structural and somatic variants, mutational signatures, and CRISPR-Cas9 editing tools.

microbial networks · variant calling · mutational signatures

Projects

Open-source tools and research pipelines. A selection below; the full catalog lives on the projects page.

MaldiSuite GitHub ↗

A Python ecosystem for end-to-end clinical AMR pipelines on MALDI-TOF spectra.

Three sklearn-compatible packages that chain into a single workflow: preprocess with MaldiAMRKit, harmonise across batches and sites with MaldiBatchKit, classify with MaldiDeepKit. Designed to be modular, reproducible, and clinically deployable.

Python · scikit-learn · PyTorch · PyPI · MALDI-TOF
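
To give a rough idea of what "sklearn-compatible packages that chain into a single workflow" means in practice, here is a minimal sketch using only scikit-learn and NumPy. It does not use the MaldiSuite packages or their API: `TotalIntensityNormalizer` is a hypothetical stand-in for a preprocessing step, `StandardScaler` is only a placeholder occupying the harmonisation slot (it is not a batch-effect correction method), `LogisticRegression` stands in for the classifier, and the data are random numbers rather than spectra.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class TotalIntensityNormalizer(BaseEstimator, TransformerMixin):
    """Hypothetical preprocessing step: scale each row to unit total intensity."""

    def fit(self, X, y=None):
        # Stateless transformer: nothing is learned from the training data.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        totals = X.sum(axis=1, keepdims=True)
        totals[totals == 0.0] = 1.0  # avoid division by zero for empty rows
        return X / totals


# Synthetic stand-in for binned spectra: 40 samples, 20 intensity bins.
rng = np.random.default_rng(0)
X = rng.random((40, 20))
y = np.array([0, 1] * 20)  # alternating toy labels (susceptible / resistant)

# Each step exposes fit/transform (or fit/predict), so the steps chain
# into one estimator that can be cross-validated or serialised as a unit.
pipeline = Pipeline(
    steps=[
        ("preprocess", TotalIntensityNormalizer()),
        ("harmonise", StandardScaler()),  # placeholder, NOT batch correction
        ("classify", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(X, y)
predictions = pipeline.predict(X)
```

Because every step follows the same estimator interface, swapping a placeholder for a real component only changes one entry in `steps`; the rest of the workflow stays untouched.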

ResPredAI GitHub ↗

AI model to predict resistances in Gram-negative bloodstream infections.

CLI-based ML framework with nested cross-validation, nine model architectures, probability calibration, and threshold optimisation for clinical decision-making. Configured via a single .ini file. Published in npj Digital Medicine.

Python · XGBoost · PyTorch · CLI · PyPI
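
As a rough illustration of what choosing a decision threshold can look like, the sketch below picks the probability cut-off that maximises Youden's J (sensitivity + specificity - 1). This is not ResPredAI's code, and the page does not say which objective the tool optimises: the function name `youden_threshold`, the choice of Youden's J, and the toy labels and probabilities are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def youden_threshold(
    y_true: Sequence[int], y_prob: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, J) maximising Youden's J = sensitivity + specificity - 1.

    A sample is called resistant (label 1) when its probability is >= threshold.
    Ties are resolved in favour of the lowest threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one sample of each class")

    best_threshold, best_j = 1.0, -1.0
    # Candidate cut-offs are the distinct predicted probabilities.
    for threshold in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy example: 1 = resistant, 0 = susceptible (illustrative values only).
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.05, 0.20, 0.30, 0.35, 0.60, 0.70, 0.80, 0.95]
threshold, j = youden_threshold(y_true, y_prob)
print(threshold, j)  # 0.35 0.75
```

In a clinical setting the objective would usually weight missed resistance differently from over-calling it, and the threshold would be selected on held-out folds rather than on the data used to fit the model.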

phenocluster GitHub ↗

Unsupervised clinical phenotype discovery with survival and multi-state modelling.

Tools for identifying clinically meaningful patient subgroups from heterogeneous cohorts, combining unsupervised clustering with survival analysis and multi-state trajectory modelling for prognostic interpretation.

Python · clustering · survival · phenotyping

Recent Publications

A selection of recent work. The full list, with BibTeX, is on the publications page.

  1. 2026 BMC Microbiology Open Access

    Combining mass spectrometry and machine learning models for predicting Klebsiella pneumoniae antimicrobial resistance: a multicenter experience from clinical isolates in Italy

    Rocchi E, Nicitra E, Calvo M, Cento V, Peiretti L, Asif Z, Menchinelli G, Posteraro B, Sala C, Colosimo C, Cricca M, Sambri V, Sanguinetti M, Castellani G, Stefani S

  2. 2025 npj Digital Medicine Open Access

    Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections

    Bonazzetti C, Rocchi E, Toschi A, Derus NR, Sala C, Pascale R, Rinaldi M, Campoli C, Pasquini ZAI, Tazza B, Amicucci A, Gatti M, Ambretti S, Viale P, Castellani G, Giannella M

  3. 2025 Frontiers in Genome Editing Open Access

    CATS: a bioinformatic tool for automated Cas9 nucleases activity comparison in clinically relevant contexts

    Rocchi E, Magnani F, Castellani G, Carusillo A, Tarozzi M

Tools I Use

The day-to-day toolbox: languages, libraries, and workflow managers I rely on.

Python
PyTorch
scikit-learn
R
Nextflow
Snakemake

Contact

I am always happy to talk about research collaborations and open-source development, but also about ideas, methods, and topics I have not explored yet. I am as interested in learning something new from someone as in starting something new together. If you have an idea, a question, or a dataset that misbehaves, I would be glad to hear about it.

ettore.rocchi3@unibo.it

Based in Bologna, Italy