.filter_samples_completeness

proteopy.pp.filter_samples_completeness(adata, min_fraction=None, group_by=None, zero_to_na=False, inplace=True)

Filter observations based on data completeness.

This function filters the AnnData object along the obs axis based on the fraction of non-missing (non-NaN) values in each observation. Filtering can be performed globally or within groups defined by the group_by parameter.
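The global case can be illustrated with plain NumPy. This is a minimal sketch of the completeness idea, not proteopy's implementation: the toy matrix, the threshold value, and the inclusive comparison (>=) are assumptions made for illustration.

```python
import numpy as np

# Toy data matrix: 3 observations (rows) x 4 variables (columns).
X = np.array([
    [1.0, 2.0, np.nan, 4.0],
    [np.nan, np.nan, np.nan, 1.0],
    [5.0, 6.0, 7.0, 8.0],
])

min_fraction = 0.5  # hypothetical threshold

# Fraction of non-NaN entries in each observation.
completeness = (~np.isnan(X)).mean(axis=1)

# Keep observations that meet the threshold (inclusive comparison is assumed).
keep = completeness >= min_fraction
X_filtered = X[keep]
```

Here the second observation has only one observed value out of four, so it falls below the threshold and is dropped.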

Parameters:
  • adata (anndata.AnnData) – The annotated data matrix to filter.

  • min_fraction (float, optional) – The minimum fraction of non-missing values required to keep an observation. If group_by is provided, completeness is computed within each group and the threshold is applied to the maximum completeness across all groups.

  • group_by (str, optional) – A column key in adata.obs (if axis=1) or adata.var (if axis=0) used for grouping before applying the filter. Because this function fixes axis=0, the key is looked up in adata.var. The maximum completeness across the groups is used for filtering.

  • zero_to_na (bool, optional) – If True, zeros in the data matrix are treated as missing values (NaN).

  • inplace (bool, optional) – If True, modifies the adata object in place. Otherwise, returns a filtered copy.
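The interaction of group_by and zero_to_na can be sketched in plain NumPy as follows. This is an illustration under stated assumptions, not the library's code: it assumes the groups partition the variables (one label per column, as a column of adata.var would), that zeros are converted to NaN before completeness is computed, and that the comparison with min_fraction is inclusive. The names var_groups and max_completeness are hypothetical.

```python
import numpy as np

# Toy data matrix: 3 observations (rows) x 4 variables (columns).
X = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 4.0],
    [1.0, 0.0, 0.0, 4.0],
])

# Hypothetical group labels, one per variable.
var_groups = np.array(["a", "a", "b", "b"])
min_fraction = 1.0  # hypothetical threshold

# Analogue of zero_to_na=True: treat zeros as missing.
X_na = np.where(X == 0, np.nan, X)
observed = ~np.isnan(X_na)

# Completeness of each observation within each variable group.
per_group = np.column_stack([
    observed[:, var_groups == g].mean(axis=1)
    for g in np.unique(var_groups)
])

# An observation is kept if its best group meets the threshold.
max_completeness = per_group.max(axis=1)
keep = max_completeness >= min_fraction
```

Only the second observation is fully observed in at least one group (group "b"), so it is the only one kept. The third observation has two observed values overall but only half of each group, which shows how grouped filtering differs from a global threshold.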

Returns:

If inplace=False, returns a new filtered AnnData object. Otherwise, returns None.

Return type:

anndata.AnnData or None

Raises:
  • KeyError – If the group_by key is not found in the corresponding annotation DataFrame.

Note:

This function is a partial of filter_axis, with the following arguments fixed: axis=0, min_count=None.