.differential_abundance_df

proteopy.get.differential_abundance_df(adata, keys=None, key_group=None, min_logfc=None, max_logfc=None, max_pval=None, sort_by=None)

Retrieve differential abundance results from .varm as a long-format DataFrame.

Merges one or more test result DataFrames stored in adata.varm into a single tidy DataFrame with an added column identifying the source test.
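
The merge described above can be pictured with plain pandas. This is only a sketch of the general technique (stack per-test tables and tag each row with its origin), not the library's implementation; the dictionary `varm`, its values, and the column name `source_key` are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-ins for two per-test result tables as they might sit in
# adata.varm; keys and numbers are invented for illustration only.
varm = {
    "welch_A-vs-rest": pd.DataFrame(
        {"logfc": [1.2, -0.3], "pval": [0.01, 0.60]},
        index=["P1", "P2"],
    ),
    "welch_B-vs-rest": pd.DataFrame(
        {"logfc": [-0.8, 0.1], "pval": [0.04, 0.90]},
        index=["P1", "P2"],
    ),
}

frames = []
for key, res in varm.items():
    # Move the variable index into a regular "var_id" column.
    long = res.rename_axis("var_id").reset_index()
    # Tag every row with the test it came from (hypothetical column name).
    long.insert(1, "source_key", key)
    frames.append(long)

# One tidy table: one row per (variable, test) pair.
merged = pd.concat(frames, ignore_index=True)
print(merged)
```

Each variable appears once per test, which is what makes the result long-format rather than wide.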

Parameters:
  • adata (AnnData) – Annotated data object containing differential abundance results in .varm.

  • keys (str | Sequence[str] | None) – One or more keys in adata.varm corresponding to differential abundance test results (e.g., "ttest_two_sample_treated-control" or "welch_A-vs-rest"). Mutually exclusive with key_group.

  • key_group (str | None) – Alternative to keys. A key group identifier (e.g., "welch_one_vs_rest") that selects all .varm keys belonging to that group. Use tests() to see available key groups. Mutually exclusive with keys.

  • min_logfc (float | None) – If provided, filter to rows where logfc >= min_logfc.

  • max_logfc (float | None) – If provided, filter to rows where logfc <= max_logfc.

  • max_pval (float | None) – If provided, filter to rows where the p-value <= max_pval. The comparison uses the pval_adj column if present; otherwise it falls back to pval.

  • sort_by (str | None) – Column name to sort by in descending order (e.g., "logfc").
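
The filtering and sorting semantics documented above can be reproduced on a toy table with plain pandas. This is a sketch of the documented behavior under the assumption that the filters combine with logical AND; it is not the function's actual code, and the data values are invented.

```python
import pandas as pd

# Toy long-format table with invented values.
df = pd.DataFrame(
    {
        "var_id": ["P1", "P2", "P3", "P4", "P5"],
        "logfc": [2.1, 0.4, -1.5, 1.0, 3.0],
        "pval": [0.001, 0.20, 0.003, 0.04, 0.002],
        "pval_adj": [0.004, 0.40, 0.009, 0.08, 0.006],
    }
)

min_logfc, max_logfc, max_pval, sort_by = 1.0, None, 0.05, "logfc"

# Start with every row selected and narrow down per provided filter.
mask = pd.Series(True, index=df.index)
if min_logfc is not None:
    mask &= df["logfc"] >= min_logfc
if max_logfc is not None:
    mask &= df["logfc"] <= max_logfc
if max_pval is not None:
    # Prefer the adjusted p-value; fall back to the raw one if it is absent.
    pcol = "pval_adj" if "pval_adj" in df.columns else "pval"
    mask &= df[pcol] <= max_pval

filtered = df[mask]
if sort_by is not None:
    # Documented sort order is descending.
    filtered = filtered.sort_values(sort_by, ascending=False)

print(filtered["var_id"].tolist())
```

In this toy data P4 has a raw p-value of 0.04 but an adjusted p-value of 0.08, so it is dropped at max_pval=0.05 because pval_adj takes precedence.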

Returns:

Long-format DataFrame with columns:

  • var_id: Variable identifier (from adata.var_names).

  • test_type: The statistical test method (e.g., "welch").

  • group_by: The .obs column used for grouping.

  • design: Underscore-separated design identifier (e.g., "A_vs_rest").

  • design_label: Human-readable description of what the test compares.

  • mean1: Mean expression in group 1.

  • mean2: Mean expression in group 2.

  • logfc: Log fold change.

  • tstat: t-statistic.

  • pval: Raw p-value.

  • pval_adj: Adjusted p-value.

  • is_diff_abundant: Boolean indicating significance.

Return type:

pandas.DataFrame

Raises:
  • ValueError – If both keys and key_group are provided, or if neither is provided.

  • TypeError – If keys is neither a string nor a sequence of strings.

  • KeyError – If any specified key is not found in adata.varm, or if key_group does not match any test group.
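
The error conditions above amount to a small argument-resolution step. The helper below, `resolve_keys`, is hypothetical and only sketches one way such checks could be ordered; the parameters `available` and `groups` are assumptions standing in for the keys of adata.varm and a mapping from key group to member keys.

```python
from collections.abc import Sequence


def resolve_keys(keys, key_group, available, groups):
    """Return the list of .varm keys to read (hypothetical helper)."""
    # Exactly one of keys / key_group must be given.
    if (keys is None) == (key_group is None):
        raise ValueError("Provide exactly one of 'keys' or 'key_group'.")

    if key_group is not None:
        if key_group not in groups:
            raise KeyError(f"Unknown key group: {key_group!r}")
        return list(groups[key_group])

    # A str is itself a Sequence, so handle it before the general check.
    if isinstance(keys, str):
        keys = [keys]
    elif not (
        isinstance(keys, Sequence) and all(isinstance(k, str) for k in keys)
    ):
        raise TypeError("'keys' must be a string or a sequence of strings.")

    missing = [k for k in keys if k not in available]
    if missing:
        raise KeyError(f"Keys not found in .varm: {missing}")
    return list(keys)
```

Checking for `str` first matters because a bare string would otherwise be iterated character by character.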

Examples

>>> import proteopy as pp
>>> # Using explicit keys
>>> df = pp.get.differential_abundance_df(
...     adata,
...     keys=["welch_treated-control", "welch_A-vs-rest"],
... )
>>> sig_proteins = df[df["is_diff_abundant"]]
>>>
>>> # Using key_group to select all tests in a group
>>> df = pp.get.differential_abundance_df(
...     adata,
...     key_group="welch_one_vs_rest",
... )