ProteoPy

An AnnData-based framework for integrated proteomics analysis


ProteoPy is a Python library that brings quantitative proteomics analysis into the AnnData ecosystem. It provides a unified framework for protein- and peptide-level analysis — from data import through quality control, preprocessing, and differential abundance testing — while storing all data and metadata in a single portable object.

ProteoPy provides simplified yet extensible functions for common proteomics workflows and integrates with scanpy, MUON, and the broader single-cell Python ecosystem for reproducible, scalable multi-omics analysis.

Key features

  • Flexible data import from DIA-NN, MaxQuant, and generic tabular formats

  • Quality control & filtering with completeness metrics, CV analysis, and contaminant removal

  • Preprocessing including normalization, batch correction (via scanpy), and missing-value imputation

  • Peptide-level analysis with overlapping peptide grouping, peptide-to-protein quantification, and per-protein peptide intensity visualization

  • Differential abundance analysis with t-test, Welch’s test, and multiple-testing correction

  • Proteoform inference via a reimplementation of the COPF algorithm for detecting functional proteoform groups from peptide-level data

  • Exploratory analysis via direct access to scanpy routines for PCA, UMAP, and clustering

  • Publication-ready visualizations for QC, exploratory analysis, and statistical results
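As an illustration of the quality-control metrics named in the list above, the following sketch computes per-protein completeness and coefficient of variation (CV) with the Python standard library only. It is not ProteoPy's implementation or API. The function names, the toy intensity table, and the `MIN_COMPLETENESS` threshold are hypothetical.

```python
import math
import statistics

# Toy table: protein -> linear-scale intensities across four samples.
# None marks a missing value. All names and numbers are made up.
intensities = {
    "PROT_A": [100.0, 110.0, 90.0, 100.0],
    "PROT_B": [50.0, None, 150.0, None],
    "PROT_C": [None, None, None, 80.0],
}


def completeness(values):
    """Fraction of samples with a measured (non-missing) value."""
    return sum(v is not None for v in values) / len(values)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of measured values.

    Returns NaN when fewer than two values are measured.
    """
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        return math.nan
    return statistics.stdev(observed) / statistics.mean(observed)


# Hypothetical filter threshold: keep proteins measured in at least half
# of the samples.
MIN_COMPLETENESS = 0.5

qc = {
    name: (completeness(vals), coefficient_of_variation(vals))
    for name, vals in intensities.items()
}
kept = [name for name, (comp, _) in qc.items() if comp >= MIN_COMPLETENESS]

for name, (comp, cv) in qc.items():
    print(f"{name}: completeness={comp:.2f}, CV={cv:.3f}")
print("kept:", kept)
```

The CV here is computed on linear-scale intensities; applying the same ratio to log-transformed values would give a different quantity.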

Installation

How to install ProteoPy and its dependencies.

Tutorials

Step-by-step notebooks for protein-level and peptide-level workflows.

API Reference

Complete reference for all modules and functions of the proteopy package.

News

What’s new in ProteoPy.

Changelog

Full release history.

Community

Get help, report issues, and connect with other users.

Source code

ProteoPy is open source and available on GitHub under the Apache 2.0 license: https://github.com/UKHD-NP/proteopy.

Citation

ProteoPy is developed and maintained by the BludauLab in the Department of Computational Neuropathology, University Hospital Heidelberg.

If you use ProteoPy in your research, please cite:

[FSGB26]

Fichtner ID, Sahm F, Gerstung M, and Bludau I. ProteoPy: an AnnData-based framework for integrated proteomics analysis. UNPUBLISHED, 2026.

@article{fichtner2026proteopy,
    title={ProteoPy: an AnnData-based framework for integrated proteomics analysis},
    author={Fichtner, Ian Dirk and Sahm, Felix and Gerstung, Moritz and Bludau, Isabell},
    journal={UNPUBLISHED},
    year={2026}
}

If you use the COPF proteoform inference functionality, please also cite:

[BFD+21]

Bludau I, Frank M, Dörig C, Cai Y, Heusel M, Rosenberger G, Picotti P, Collins BC, Röst H, and Aebersold R. Systematic detection of functional proteoform groups from bottom-up proteomic datasets. Nature Communications, 12:3810, 2021. URL: https://www.nature.com/articles/s41467-021-24030-x, doi:10.1038/s41467-021-24030-x.

@article{bludau2021systematic,
    title={Systematic detection of functional proteoform groups from bottom-up proteomic datasets},
    author={Bludau, Isabell and Frank, Max and D{\"o}rig, Christian and Cai, Yujia and Heusel, Moritz and Rosenberger, George and Picotti, Paola and Collins, Ben C. and R{\"o}st, Hannes and Aebersold, Ruedi},
    journal={Nature Communications},
    volume={12},
    pages={3810},
    year={2021},
    doi={10.1038/s41467-021-24030-x}
}