ProteoPy
An AnnData-based framework for integrated proteomics analysis
ProteoPy is a Python library that brings quantitative proteomics analysis into the AnnData ecosystem. It provides a unified framework for protein- and peptide-level analysis — from data import through quality control, preprocessing, and differential abundance testing — while storing all data and metadata in a single portable object.
ProteoPy provides simplified yet extensible functions for common proteomics workflows and integrates with scanpy, MUON, and the broader single-cell Python ecosystem for reproducible and scalable multi-omics analysis.
Key features
Flexible data import from DIA-NN, MaxQuant, and generic tabular formats
Quality control & filtering with completeness metrics, CV analysis, and contaminant removal
Preprocessing including normalization, batch correction (via scanpy), and missing-value imputation
Peptide-level analysis with overlapping peptide grouping, peptide-to-protein quantification, and per-protein peptide intensity visualization
Differential abundance analysis with t-test, Welch’s test, and multiple-testing correction
Proteoform inference via a reimplementation of the COPF algorithm for detecting functional proteoform groups from peptide-level data
Exploratory analysis via direct access to scanpy routines for PCA, UMAP, and clustering
Publication-ready visualizations for QC, exploratory analysis, and statistical results
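Several of the features above reduce to short array computations. The following sketch uses plain NumPy and SciPy rather than ProteoPy's API, so it shows the underlying statistics, not the library's implementation. The toy intensities, the 0.7 completeness threshold, the 0.05 significance cutoff, and the helper name `benjamini_hochberg` are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Toy linear-scale intensities: 6 samples (rows) x 4 proteins (columns).
# Rows 0-2 are controls, rows 3-5 are treated; NaN marks a missing value.
X = np.array(
    [
        [100.0, 200.0, np.nan, 1000.0],
        [110.0, 210.0, np.nan, 900.0],
        [90.0, 190.0, 50.0, 1100.0],
        [400.0, 205.0, np.nan, 1050.0],
        [420.0, 195.0, 60.0, 950.0],
        [380.0, 200.0, np.nan, 1000.0],
    ]
)
is_treated = np.array([False, False, False, True, True, True])

# Completeness: fraction of samples in which each protein was quantified.
completeness = np.mean(~np.isnan(X), axis=0)
keep = completeness >= 0.7  # assumed threshold
X_kept = X[:, keep]

# Coefficient of variation per protein within the control group,
# computed on linear-scale intensities (sample standard deviation / mean).
controls = X_kept[~is_treated]
cv_control = controls.std(axis=0, ddof=1) / controls.mean(axis=0)

# Welch's t-test per protein on log2 intensities (unequal variances).
log_x = np.log2(X_kept)
result = stats.ttest_ind(
    log_x[is_treated], log_x[~is_treated], axis=0, equal_var=False
)
pvals = np.asarray(result.pvalue)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


padj = benjamini_hochberg(pvals)
significant = padj < 0.05

print("completeness:", completeness)
print("CV (controls):", cv_control)
print("adjusted p-values:", padj)
print("significant:", significant)
```

In this toy data the third protein is dropped for low completeness, so the retained proteins contain no missing values. Real datasets usually still contain gaps after filtering, which is why imputation or NaN-aware testing is needed before this step.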
Source code
ProteoPy is open source and available on GitHub under the Apache 2.0 license: https://github.com/UKHD-NP/proteopy.
Citation
ProteoPy is developed and maintained by the BludauLab in the Department of Computational Neuropathology, University Hospital Heidelberg.
If you use ProteoPy in your research, please cite:
Fichtner ID, Sahm F, Gerstung M, and Bludau I. ProteoPy: an AnnData-based framework for integrated proteomics analysis. UNPUBLISHED, 2026.
@article{fichtner2026proteopy,
  title={ProteoPy: an AnnData-based framework for integrated proteomics analysis},
  author={Fichtner, Ian Dirk and Sahm, Felix and Gerstung, Moritz and Bludau, Isabell},
  journal={UNPUBLISHED},
  year={2026}
}
If you use the COPF proteoform inference functionality, please also cite:
Bludau I, Frank M, Dörig C, Cai Y, Heusel M, Rosenberger G, Picotti P, Collins BC, Röst H, and Aebersold R. Systematic detection of functional proteoform groups from bottom-up proteomic datasets. Nature Communications, 12:3810, 2021. URL: https://www.nature.com/articles/s41467-021-24030-x, doi:10.1038/s41467-021-24030-x.
@article{bludau2021systematic,
  title={Systematic detection of functional proteoform groups from bottom-up proteomic datasets},
  author={Bludau, Isabell and Frank, Max and D{\"o}rig, Christian and Cai, Yujia and Heusel, Moritz and Rosenberger, George and Picotti, Paola and Collins, Ben C. and R{\"o}st, Hannes and Aebersold, Ruedi},
  journal={Nature Communications},
  volume={12},
  pages={3810},
  year={2021},
  doi={10.1038/s41467-021-24030-x}
}