ProteoPy

An AnnData-based framework for integrated proteomics analysis

ProteoPy is a Python library that brings quantitative proteomics analysis into the AnnData ecosystem. It provides a unified framework for protein- and peptide-level analysis — from data import through quality control, preprocessing, and differential abundance testing — while storing all data and metadata in a single portable object.

ProteoPy provides simple yet extensible functions for common proteomics workflows and integrates with scanpy, MUON, and the broader single-cell Python ecosystem for reproducible and scalable multi-omics analysis.

Key features

Flexible data import from DIA-NN, MaxQuant, and generic tabular formats
Quality control & filtering with completeness metrics, CV analysis, and contaminant removal
Preprocessing including normalization, batch correction (via scanpy), and missing-value imputation
Peptide-level analysis with overlapping peptide grouping, peptide-to-protein quantification, and per-protein peptide intensity visualization
Differential abundance analysis with t-test, Welch’s test, and multiple testing correction
Proteoform inference via a reimplementation of the COPF algorithm for detecting functional proteoform groups from peptide-level data
Exploratory analysis via direct access to scanpy routines for PCA, UMAP, and clustering
Publication-ready visualizations for QC, exploratory analysis, and statistical results
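As a rough illustration of the statistics behind the QC and differential abundance features listed above, the sketch below computes per-protein completeness and CV, then runs Welch's test with Benjamini-Hochberg correction. It uses only NumPy and SciPy and is not ProteoPy's API. The helper names, the toy data, and the 70% completeness cutoff are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

# Toy log2 intensities: 6 samples (rows) x 4 proteins (columns), NaN = missing.
X = np.array([
    [20.0, 15.1, 18.2, np.nan],
    [20.3, 15.4, 18.0, 12.1],
    [19.8, 15.0, 18.3, np.nan],
    [22.1, 15.2, 18.1, 12.4],
    [22.4, 15.3, 18.4, np.nan],
    [21.9, 14.9, 17.9, np.nan],
])
groups = np.array(["control"] * 3 + ["treated"] * 3)


def completeness(X):
    """Fraction of non-missing values per protein."""
    return np.mean(~np.isnan(X), axis=0)


def coefficient_of_variation(X_log2):
    """Per-protein CV computed on linear-scale intensities, ignoring NaN."""
    linear = 2.0 ** X_log2
    return np.nanstd(linear, axis=0, ddof=1) / np.nanmean(linear, axis=0)


def welch_pvalues(X, groups, group_a, group_b):
    """Welch's t-test per protein, dropping missing values within each group."""
    pvals = []
    for j in range(X.shape[1]):
        a = X[groups == group_a, j]
        b = X[groups == group_b, j]
        a, b = a[~np.isnan(a)], b[~np.isnan(b)]
        pvals.append(stats.ttest_ind(a, b, equal_var=False).pvalue)
    return np.array(pvals)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


keep = completeness(X) >= 0.7          # assumed cutoff: quantified in >= 70% of samples
cv = coefficient_of_variation(X)
pvals = welch_pvalues(X[:, keep], groups, "control", "treated")
padj = benjamini_hochberg(pvals)
print(keep, np.round(cv, 3), padj)
```

Filtering on completeness before testing reduces the number of hypotheses, which in turn makes the multiple testing correction less severe for the proteins that remain.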

Installation

How to install ProteoPy and its dependencies.

Tutorials

Step-by-step notebooks for protein-level and peptide-level workflows.

API Reference

Complete reference for all modules and functions.

proteopy

News

What’s new in ProteoPy.

Changelog

Full release history.

Community

Get help, report issues, and connect with other users.

Source code

ProteoPy is open source and available on GitHub under an Apache 2.0 license: https://github.com/UKHD-NP/proteopy.

Citation

ProteoPy is developed and maintained by the BludauLab in the Department of Computational Neuropathology, University Hospital Heidelberg.

If you use ProteoPy in your research, please cite:

[FSGB26]

Fichtner ID, Sahm F, Gerstung M, and Bludau I. ProteoPy: an AnnData-based framework for integrated proteomics analysis. UNPUBLISHED, 2026.

@article{fichtner2026proteopy,
    title={ProteoPy: an AnnData-based framework for integrated proteomics analysis},
    author={Fichtner, Ian Dirk and Sahm, Felix and Gerstung, Moritz and Bludau, Isabell},
    journal={UNPUBLISHED},
    year={2026}
}

If you use the COPF proteoform inference functionality, please also cite:

[BFD+21]

Bludau I, Frank M, Dörig C, Cai Y, Heusel M, Rosenberger G, Picotti P, Collins BC, Röst H, and Aebersold R. Systematic detection of functional proteoform groups from bottom-up proteomic datasets. Nature Communications, 12:3810, 2021. URL: https://www.nature.com/articles/s41467-021-24030-x, doi:10.1038/s41467-021-24030-x.

@article{bludau2021systematic,
    title={Systematic detection of functional proteoform groups from bottom-up proteomic datasets},
    author={Bludau, Isabell and Frank, Max and D{\"o}rig, Christian and Cai, Yujia and Heusel, Moritz and Rosenberger, George and Picotti, Paola and Collins, Ben C. and R{\"o}st, Hannes and Aebersold, Ruedi},
    journal={Nature Communications},
    volume={12},
    pages={3810},
    year={2021},
    doi={10.1038/s41467-021-24030-x}
}