.hclustv_profiles

proteopy.tl.hclustv_profiles(adata, cluster_key='auto', layer=None, group_by=None, method='median', zero_to_na=False, fill_na=None, skip_na=True, inplace=True, key_added=None, verbose=True)

Compute cluster profiles from cluster annotations.

Summarizes the variables in each cluster with the mean or median, yielding one profile intensity per cluster for every observation (or for every group of observations when group_by is set).

Parameters:
  • adata (AnnData) – AnnData with cluster annotations in .var (from hclustv_cluster_ann()).

  • cluster_key (str) – Column in adata.var containing cluster assignments. When 'auto', the column is auto-detected from the available 'hclustv_cluster;...' columns. If more than one such column exists, cluster_key must be specified explicitly.

  • layer (str | None) – Layer to use for computing profiles. When None, uses adata.X.

  • group_by (str | None) – Column in adata.obs to group observations by before computing cluster profiles. When specified, observations are first summarized by this column using method, then cluster profiles are computed on the grouped data.

  • method (str) – Summarization method for computing cluster profiles. One of "mean" or "median". Also used for grouping observations when group_by is specified.

  • zero_to_na (bool) – If True, convert zeros in the data matrix to np.nan before any computation.

  • fill_na (float | int | None) – If specified, replace np.nan values with this constant before computing profiles. Applied after zero_to_na.

  • skip_na (bool) – If True, exclude np.nan values when computing summaries. If False, a summary is np.nan whenever any value in the group is np.nan.

  • inplace (bool) – If True, store results in adata.uns and return None. If False, return a modified copy of adata.

  • key_added (str | None) – Custom key for storing results in adata.uns. When None, uses the default format 'hclustv_profiles;<group_by>;<var_hash>;<layer>' derived from the cluster key components.

  • verbose (bool) – If True, print the storage key after computation.
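
The interplay of zero_to_na, fill_na, and skip_na described above can be sketched with plain NumPy. This is an illustration of the documented semantics only, not proteopy's implementation; the helper name summarize is hypothetical.

```python
import numpy as np


def summarize(values, method="median", zero_to_na=False, fill_na=None, skip_na=True):
    """Hypothetical helper mirroring the documented NaN handling (not proteopy code)."""
    if method not in ("mean", "median"):
        raise ValueError('method must be "mean" or "median"')
    x = np.asarray(values, dtype=float).copy()
    if zero_to_na:
        # Zeros become missing values before any other step.
        x[x == 0] = np.nan
    if fill_na is not None:
        # Applied after zero_to_na, so converted zeros are filled as well.
        x[np.isnan(x)] = fill_na
    if skip_na:
        func = np.nanmean if method == "mean" else np.nanmedian
    else:
        # np.mean and np.median propagate NaN if any value is NaN.
        func = np.mean if method == "mean" else np.median
    return float(func(x))


vals = [0.0, 2.0, 4.0, np.nan]
default_median = summarize(vals)                               # NaN skipped
zeros_dropped = summarize(vals, zero_to_na=True)               # zero treated as missing
nan_propagated = summarize(vals, skip_na=False)                # NaN propagates
zeros_filled = summarize(vals, zero_to_na=True, fill_na=1.0)   # zero and NaN both become 1.0
```

Note that combining zero_to_na=True with fill_na replaces the original zeros with the fill value too, because filling happens after the zero conversion.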

Returns:

If inplace=True, returns None. If inplace=False, returns a copy of adata with cluster profiles stored in .uns.

Return type:

AnnData | None

Raises:
  • ValueError – If no cluster annotations are found in adata.var; if multiple cluster columns exist and cluster_key is not specified; if method is not "mean" or "median"; or if the auto-generated storage key cannot be derived.

  • KeyError – If the specified cluster_key is not found in adata.var; if the specified layer is not found in adata.layers; or if the specified group_by column is not found in adata.obs.

Notes

By default (key_added=None), the cluster profiles DataFrame is stored at adata.uns['hclustv_profiles;<group_by>;<var_hash>;<layer>']. If key_added is given, it is stored at adata.uns[key_added] instead.
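
Because the default key embeds a hash, it may be easier to look the result up by its prefix than to type the key out. The sketch below works on any dict-like object such as adata.uns; the helper name find_profile_keys and the example keys are made up for illustration and only follow the key format stated above.

```python
def find_profile_keys(uns, prefix="hclustv_profiles;"):
    """Return all string keys in a dict-like `uns` that start with the profile prefix."""
    return [key for key in uns if isinstance(key, str) and key.startswith(prefix)]


# Stand-in for adata.uns; the keys and values here are invented placeholders.
example_uns = {
    "hclustv_profiles;condition;abc123;X": "profiles DataFrame would live here",
    "some_other_result": 42,
}
profile_keys = find_profile_keys(example_uns)
```

With a real AnnData object, the same call would be find_profile_keys(adata.uns), followed by adata.uns[profile_keys[0]] to retrieve the DataFrame.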

Examples

>>> import proteopy as pr
>>> adata = pr.datasets.karayel_2020()
>>> pr.tl.hclustv_tree(adata, group_by="condition")
>>> pr.tl.hclustv_cluster_ann(adata, 5)
>>> pr.tl.hclustv_profiles(adata)
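
To make the two-stage summarization concrete, the following pandas sketch reproduces the idea on a toy matrix: observations are first summarized per group (as with group_by), then variables are summarized per cluster. It does not call proteopy, and the data, names, and output orientation (groups as rows, clusters as columns) are assumptions for illustration rather than the library's actual layout.

```python
import pandas as pd

# Toy intensity matrix: rows are observations, columns are variables.
X = pd.DataFrame(
    [[1.0, 3.0, 10.0], [3.0, 5.0, 20.0], [5.0, 7.0, 30.0], [7.0, 9.0, 50.0]],
    index=["s1", "s2", "s3", "s4"],
    columns=["protA", "protB", "protC"],
)
# Observation grouping (plays the role of a column in adata.obs).
condition = pd.Series(["ctrl", "ctrl", "treat", "treat"], index=X.index)
# Cluster assignment per variable (plays the role of a column in adata.var).
clusters = pd.Series([1, 1, 2], index=X.columns)

# Step 1: summarize observations within each group, per variable.
grouped = X.groupby(condition).median()

# Step 2: summarize variables within each cluster, per group.
profiles = grouped.T.groupby(clusters).median().T

print(profiles)
```

Swapping both median() calls for mean() corresponds to method="mean", which the documentation says applies to the grouping step and the profile step alike.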