.hclustv_cluster_ann
- proteopy.tl.hclustv_cluster_ann(adata, k, linkage_key='auto', values_key='auto', inplace=True, key_added=None, verbose=True)
Annotate variables with cluster assignments from hierarchical clustering.
Uses scipy.cluster.hierarchy.fcluster() to cut the dendrogram at k clusters and stores the cluster assignments in .var.
- Parameters:
adata (AnnData) – AnnData with hierarchical clustering results stored in .uns (from hclustv_tree()).
k (int) – Number of clusters to generate (required).
linkage_key (str) – Key in adata.uns containing the linkage matrix. When 'auto', the key is auto-detected if exactly one 'hclustv_linkage;...' key exists. When multiple such keys are present, the key must be specified explicitly.
values_key (str) – Key in adata.uns containing the values DataFrame. When 'auto', the key is auto-detected if exactly one 'hclustv_values;...' key exists. When multiple such keys are present, the key must be specified explicitly.
inplace (bool) – If True, store results in adata.var and return None. If False, return a modified copy of adata.
key_added (str | None) – Custom key for storing results in adata.var. When None, uses the default format 'hclustv_cluster;<group_by>;<var_hash>;<layer>' derived from the linkage key components.
verbose (bool) – Print the storage location key after computation.
- Returns:
If inplace=True, returns None. If inplace=False, returns a copy of adata with cluster annotations stored in .var.
- Return type:
AnnData | None
- Raises:
ValueError – If no hierarchical clustering results are found in adata.uns. If multiple clustering results exist and linkage_key is not specified. If the linkage matrix has an invalid shape. If k < 2 (a single cluster is semantically meaningless). If the auto-generated storage key cannot be derived from a custom linkage key.
TypeError – If the linkage matrix is not a numpy array.
KeyError – If the specified linkage_key is not found in adata.uns.
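The key resolution rules described under Parameters and Raises can be illustrated with a small standalone sketch. This is not proteopy's implementation: resolve_uns_key and default_cluster_key are hypothetical helpers, a plain dict stands in for adata.uns, and the four-part linkage key layout 'hclustv_linkage;<group_by>;<var_hash>;<layer>' is an assumption inferred from the documented default storage key.

```python
def resolve_uns_key(uns, prefix, key="auto"):
    """Hypothetical helper: pick the .uns key for a given prefix."""
    if key != "auto":
        # An explicit key must exist.
        if key not in uns:
            raise KeyError(f"{key!r} not found in adata.uns")
        return key
    # 'auto' only works when exactly one matching key is stored.
    candidates = [
        k for k in uns if isinstance(k, str) and k.startswith(prefix + ";")
    ]
    if not candidates:
        raise ValueError(f"No '{prefix};...' key found in adata.uns")
    if len(candidates) > 1:
        raise ValueError(
            f"Multiple '{prefix};...' keys found; specify one explicitly"
        )
    return candidates[0]


def default_cluster_key(linkage_key):
    """Hypothetical helper: derive the default .var key from a linkage key.

    Assumes the layout 'hclustv_linkage;<group_by>;<var_hash>;<layer>'.
    """
    parts = linkage_key.split(";")
    if len(parts) != 4 or parts[0] != "hclustv_linkage":
        raise ValueError(
            "Cannot derive a storage key from a custom linkage key; "
            "pass key_added explicitly"
        )
    return ";".join(["hclustv_cluster", *parts[1:]])


# A plain dict stands in for adata.uns here.
uns = {
    "hclustv_linkage;condition;a1b2c3d4;X": None,
    "hclustv_values;condition;a1b2c3d4;X": None,
}
linkage_key = resolve_uns_key(uns, "hclustv_linkage")
var_key = default_cluster_key(linkage_key)
print(var_key)
```

Under these assumptions, a second 'hclustv_linkage;...' entry in the dict would make the 'auto' lookup raise ValueError, which is the situation where linkage_key has to be passed explicitly.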
Notes
Cluster assignments are stored at adata.var['hclustv_cluster;<group_by>;<var_hash>;<layer>']. Variables not included in the clustering (e.g., filtered out due to NaN values) will have NaN in this column.
Examples
>>> import proteopy as pr
>>> adata = pr.datasets.karayel_2020()
>>> pr.tl.hclustv_tree(
...     adata, group_by="condition", selected_vars=adata.var_names[0:1000]
... )
>>> pr.tl.hclustv_cluster_ann(adata, 5)
Access cluster assignments:
>>> adata.var['hclustv_cluster;condition;a1b2c3d4;X']
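To make the underlying mechanics concrete, the following is a minimal sketch of cutting a linkage matrix into k clusters with SciPy and aligning the labels to the full variable index, so that variables left out of the clustering receive NaN. It uses only SciPy and pandas rather than proteopy. The toy data, the variable names, the "average" linkage method and the criterion="maxclust" choice are all assumptions made for illustration; the documentation above does not state which criterion hclustv_cluster_ann passes to fcluster().

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

# Toy stand-in for adata.var_names: six variables, two of which
# (P3, P6) were not part of the clustering.
all_vars = pd.Index(["P1", "P2", "P3", "P4", "P5", "P6"])
clustered_vars = ["P1", "P2", "P4", "P5"]

# Toy profiles for the clustered variables (one row per variable).
values = np.array([
    [0.0, 0.1],
    [0.1, 0.0],
    [5.0, 5.1],
    [5.1, 5.0],
])

# Build a linkage matrix, then cut the dendrogram into at most k clusters.
k = 2
Z = linkage(values, method="average")
labels = fcluster(Z, t=k, criterion="maxclust")

# Align labels to the full variable index; unclustered variables get NaN.
assignments = pd.Series(labels, index=clustered_vars).reindex(all_vars)
print(assignments)
```

Because NaN forces a floating-point dtype, the resulting column holds floats rather than integers in this sketch; whether proteopy stores the labels in the same way is not specified here.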