.remove_contaminants
- proteopy.pp.remove_contaminants(adata, contaminant_path, protein_key='protein_id', header_parser=None, inplace=False)
Remove variables whose protein identifier matches an entry in a contaminant list.
- Parameters:
adata (anndata.AnnData) – Annotated data.
contaminant_path (str | Path) – Path to the contaminant list. The file can be a FASTA file, in which case the headers are parsed to extract the contaminant IDs (see header_parser), or a tabular TSV/CSV file, in which case the first column is used as the contaminant IDs.
protein_key (str, optional (default: "protein_id")) – Column in adata.var containing protein identifiers to match.
header_parser (callable, optional) – Function to extract protein IDs from FASTA headers. Defaults to splitting the header on "|" and returning the second element, falling back to the full header if not present.
inplace (bool, optional (default: False)) – If True, modify adata in place. Otherwise, return a filtered view.
- Returns:
None if inplace=True; otherwise the filtered AnnData view.
- Return type:
None or anndata.AnnData
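The ID extraction described under contaminant_path and header_parser can be sketched in plain Python. This is a minimal illustration, not proteopy's actual implementation: the helper names (default_header_parser, contaminant_ids_from_fasta, contaminant_ids_from_table) are hypothetical, and details such as stripping the leading ">" and treating every table row as data (no header row) are assumptions.

```python
import csv
import io


def default_header_parser(header: str) -> str:
    """Return the second '|'-separated field, or the full header if there is none."""
    parts = header.split("|")
    return parts[1] if len(parts) > 1 else header


def contaminant_ids_from_fasta(text, header_parser=default_header_parser):
    """Collect one ID per FASTA header line (lines starting with '>')."""
    ids = set()
    for line in text.splitlines():
        if line.startswith(">"):
            # Assumption: the '>' marker is dropped before parsing.
            ids.add(header_parser(line[1:].strip()))
    return ids


def contaminant_ids_from_table(text, delimiter="\t"):
    """Collect the first column of a TSV/CSV table as IDs (no header row assumed)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {row[0] for row in reader if row}


# Illustrative data only: one UniProt-style header and one header without '|'.
fasta = ">sp|P00761|TRYP_PIG Trypsin\nFPTDDDDK\n>custom_contaminant\nAAAA\n"
ids = contaminant_ids_from_fasta(fasta)

# Stand-in for the identifiers stored in adata.var[protein_key].
protein_ids = ["P00761", "Q9Y6K9", "custom_contaminant"]
kept = [p for p in protein_ids if p not in ids]
```

Passing a custom header_parser is only needed when the contaminant FASTA headers do not follow the "db|accession|name" layout that the default assumes.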