PhD Researcher · University of Bologna

Ettore Rocchi

Physics background, biomedical mission.

Building interpretable machine learning pipelines for clinical microbiology and multi-omics data.

I develop computational methods to predict antimicrobial resistance, discover patient phenotypes, and make sense of high-dimensional omics data. My work spans MALDI-TOF mass spectrometry, multi-omics integration, genomics, and metagenomics, with a focus on interpretability and clinical impact.

About

I trained as a physicist and moved into biomedical data science to apply quantitative methods to problems with direct clinical impact. My current work focuses on antimicrobial resistance prediction from MALDI-TOF mass spectrometry, multi-centre data harmonisation, computational patient phenotyping, and computational genomics.

I work within the Physics4MedicineLab, part of the Multi-Omics and Health-Care Data Analytics Unit at Sant'Orsola Hospital in Bologna. My research combines machine learning with clinical microbiology, pharmacokinetics, and genomics. I care about making models that are not only accurate but interpretable, because a prediction without an explanation is just a well-dressed guess.

I write Python, build pipelines, publish open-source tools, and contribute to interdisciplinary projects. The through-line is a commitment to reproducibility and a mild obsession with doing things properly.

What I Actually Do All Day

In practice, my days involve writing Python scripts until something either converges or breaks, reading papers about problems I didn't know existed until last week, and preprocessing mass spectra that have strong opinions about baseline correction. I also discuss model outputs with clinical collaborators, working out together when the data is telling a story we hadn't expected.

There's also a fair amount of staring at loss curves, debugging pipelines that worked yesterday, and writing documentation future-me will thank present-me for. The ratio of thinking to typing is higher than most people expect. The ratio of coffee to output is best left unquantified.

News

Recent updates - papers, releases, conferences, and visits.

Ongoing Collaborations

Institutions and groups I currently work with on research projects.

Research

Four threads I am currently pulling, at the intersection of applied physics, machine learning, and clinical data.

💊

AMR & clinical machine learning

Machine learning on mass spectra and clinical data to anticipate antimicrobial resistance ahead of culture-based diagnostics, with cross-site harmonisation and generative modelling extending the pipeline beyond single-instrument settings.

MALDI-TOF · supervised & generative learning · cross-site harmonisation

🦠

Infectious risk & pathogen surveillance

Stratification of infectious risk in fragile populations such as transplant recipients, and surveillance of circulating pathogens through metagenomic monitoring and computational phenotyping.

patient phenotyping · survival & multi-state models · metagenomic surveillance

🧬

Computational genomics

Discovery and interpretation of structural and somatic variants from short- and long-read sequencing, in clinically relevant genomic contexts.

structural variants · somatic calling · long-read sequencing

⚙️

Computational methodologies

Open-source tools and reusable methods built around specific biomedical questions, designed to be reproducible, well-documented, and useful beyond their original project.

bioinformatic tools · open-source software · reproducible pipelines

Projects

Open-source tools and research pipelines. A selection below; the full catalog lives on the projects page.

MaldiSuite GitHub ↗

A Python ecosystem for end-to-end clinical AMR pipelines on MALDI-TOF spectra.

Three sklearn-compatible packages that chain into a single workflow: preprocess with MaldiAMRKit, harmonise across batches and sites with MaldiBatchKit, classify with MaldiDeepKit. Designed to be modular, reproducible, and clinically deployable.

Python · scikit-learn · PyTorch · PyPI · MALDI-TOF

ResPredAI GitHub ↗

AI model to predict resistances in Gram-negative bloodstream infections.

CLI-based ML framework with nested cross-validation, nine model architectures, probability calibration, and threshold optimization for clinical decision-making. Configured via a single .ini file. Published in npj Digital Medicine.
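
To make the threshold-optimization step concrete, here is a minimal sketch of how a decision threshold could be chosen from calibrated probabilities. It is not ResPredAI's code: the function name `youden_threshold`, the choice of Youden's J as the criterion, and the toy labels and probabilities are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def youden_threshold(
    y_true: Sequence[int], y_prob: Sequence[float]
) -> Tuple[float, float]:
    """Return (threshold, J) maximising Youden's J = sensitivity + specificity - 1.

    Hypothetical helper for illustration; not part of any published package.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")

    best_threshold, best_j = 0.5, -1.0
    # Candidate thresholds: every distinct predicted probability, in ascending order.
    for t in sorted(set(y_prob)):
        # A sample is predicted resistant (1) when its probability is >= t.
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        j = tp / positives + tn / negatives - 1.0
        # Strict comparison keeps the lowest threshold when several tie.
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy data, made up for illustration only (1 = resistant, 0 = susceptible).
labels = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.05, 0.10, 0.30, 0.35, 0.40, 0.60, 0.80, 0.90]
threshold, j = youden_threshold(labels, probs)
print(f"threshold={threshold:.2f}, J={j:.2f}")
```

In a nested cross-validation setting, a threshold like this would normally be selected on the inner folds only and then applied unchanged to the held-out outer fold, so that the reported performance is not tuned on the test data. A clinical deployment might also prefer a criterion that weights missed resistances more heavily than false alarms.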

Python · XGBoost · PyTorch · CLI · PyPI

phenocluster GitHub ↗

Unsupervised clinical phenotype discovery with survival and multi-state modelling.

Tools for identifying clinically meaningful patient subgroups from heterogeneous cohorts, combining unsupervised clustering with survival analysis and multi-state trajectory modelling for prognostic interpretation.

Python · clustering · survival · phenotyping

Recent Publications

A few recent highlights. A curated list with BibTeX lives on the publications page; the complete record is on Google Scholar and Scopus.

  1. 2026 BMC Microbiology Open Access

    Combining mass spectrometry and machine learning models for predicting Klebsiella pneumoniae antimicrobial resistance: a multicenter experience from clinical isolates in Italy

    Rocchi E, Nicitra E, Calvo M, Cento V, Peiretti L, Asif Z, Menchinelli G, Posteraro B, Sala C, Colosimo C, Cricca M, Sambri V, Sanguinetti M, Castellani G, Stefani S

  2. 2025 npj Digital Medicine Open Access

    Artificial Intelligence model to predict resistances in Gram-negative bloodstream infections

    Bonazzetti C, Rocchi E, Toschi A, Derus NR, Sala C, Pascale R, Rinaldi M, Campoli C, Pasquini ZAI, Tazza B, Amicucci A, Gatti M, Ambretti S, Viale P, Castellani G, Giannella M

  3. 2025 Frontiers in Genome Editing Open Access

    CATS: a bioinformatic tool for automated Cas9 nucleases activity comparison in clinically relevant contexts

    Rocchi E, Magnani F, Castellani G, Carusillo A, Tarozzi M

Education

A physics training that gradually re-oriented itself toward biomedical questions.

  1. PhD in Health and Technologies

    University of Bologna

    Supervisor: Prof. Gastone Castellani

    Visiting researcher · 6 months

    Max Planck Institute of Biochemistry, Munich, Germany

    Machine Learning & Systems Biology group - Prof. Karsten Borgwardt

  2. MSc in Applied Physics

    University of Bologna

  3. BSc in Physics

    University of Bologna

Tools I Use

The day-to-day toolbox - languages, libraries, and workflow managers I rely on.

Python
PyTorch
scikit-learn
R
Nextflow
Snakemake

Contact

I am always happy to talk about research collaborations and open-source development, but also about ideas, methods, and topics I have not explored yet. I am as interested in learning something new from someone as in starting something new together. If you have an idea, a question, or a dataset that misbehaves, I would be glad to hear about it.

ettore.rocchi3@unibo.it

Based in Bologna, Italy