.remove_zero_variance_vars

proteopy.pp.remove_zero_variance_vars(adata, group_by=None, atol=1e-08, inplace=True)

Remove variables (columns) with near-zero variance, skipping NaN values.

This function removes variables (e.g., peptides, proteins or features) whose variance across observations is less than or equal to a given tolerance. If a grouping variable is provided via group_by, a variable is removed if it has near-zero variance (≤ atol) in any group.
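The removal rule described above can be sketched in plain NumPy. This is an illustration only, not proteopy's implementation: the helper name near_zero_variance_mask and the toy matrix are hypothetical, and the sketch works on a dense array rather than an AnnData object.

```python
import numpy as np


def near_zero_variance_mask(X, groups=None, atol=1e-8):
    """Hypothetical helper: True marks a column that would be removed."""
    X = np.asarray(X, dtype=float)
    if groups is None:
        # Population variance per column, ignoring NaN entries.
        return np.nanvar(X, axis=0) <= atol
    groups = np.asarray(groups)
    remove = np.zeros(X.shape[1], dtype=bool)
    for label in np.unique(groups):
        # A column is flagged as soon as one group has variance <= atol.
        remove |= np.nanvar(X[groups == label], axis=0) <= atol
    return remove


# Toy data: 6 observations (rows) x 3 variables (columns).
X = np.array([
    [1.0, 5.0, 2.0],
    [1.0, 5.0, 4.0],
    [1.0, 5.0, np.nan],
    [1.0, 7.0, 6.0],
    [1.0, 9.0, 8.0],
    [1.0, 8.0, 7.0],
])
groups = np.array(["a", "a", "a", "b", "b", "b"])

global_mask = near_zero_variance_mask(X)
grouped_mask = near_zero_variance_mask(X, groups=groups)

print(global_mask)   # only the constant first column is flagged
print(grouped_mask)  # the second column is also flagged: it is constant within group "a"
```

The second column varies across all observations but is constant within group "a", so it is flagged only when grouping is applied.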

Parameters:
  • adata (anndata.AnnData) – Annotated data matrix.

  • group_by (str or None, optional (default: None)) – Name of the column in adata.obs that defines the groups. If provided, variance is computed separately within each group, and a variable is removed if its variance is ≤ atol within any group. If None, variance is computed across all observations.

  • atol (float, optional (default: 1e-8)) – Absolute tolerance threshold. Variables with variance ≤ atol are considered to have zero variance and are removed.

  • inplace (bool, optional (default: True)) – If True, modifies adata in place. Otherwise, returns a copy with low-variance variables removed.

Returns:

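If inplace=True, returns None and modifies adata in place. Otherwise, returns a new AnnData object containing only the retained variables: those with variance > atol across all observations or, when group_by is set, with variance > atol in every group.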

Return type:

None or anndata.AnnData

Notes

  • NaN values are ignored using np.nanvar (population variance, ddof=0).

  • For sparse matrices, the data is densified for variance computation. Without grouping this happens once on the full matrix; with grouping it happens per-group slice to limit peak memory.

  • If group_by is provided, a variable with variance ≤ atol in at least one group is removed from the entire dataset, not only from that group.
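As a small illustration of the first note, the snippet below shows how np.nanvar behaves with its default ddof=0 when values are missing. It only demonstrates NumPy's behaviour; the arrays are made-up examples.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0])

# NaN is skipped, so the variance is computed over [1.0, 3.0].
pop_var = np.nanvar(values)             # ddof=0 (NumPy default): divide by n
sample_var = np.nanvar(values, ddof=1)  # for comparison: divide by n - 1

# A variable with a single observed value has population variance 0.
single = np.array([4.2, np.nan, np.nan])
single_var = np.nanvar(single)

print(pop_var, sample_var, single_var)
```

Because a single non-NaN value yields a population variance of 0, a variable observed only once overall (or only once within a group) would presumably fall under the ≤ atol rule described above.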