CCP cRAP – Cambridge Centre for Proteomics cRAP

Overview

The CCP cRAP database is a contaminant protein collection maintained by the Cambridge Centre for Proteomics (CCP) as part of their camprotR R package. It contains 125 protein entries and is largely based on the GPM cRAP database, with several additions.

Obtaining the File

ProteoPy does not bundle the CCP cRAP file. Users must obtain it themselves using the camprotR R package. See the cRAP vignette for full details.

library(camprotR)

ccp_crap <- download_ccp_crap(tempfile(fileext = ".fasta"), is_crap = TRUE)

library(Biostrings)
writeXStringSet(ccp_crap, filepath = "/path/to/ccp_crap.fasta")

Entries to Consider Removing

Two entries in the CCP cRAP use placeholder accession numbers (000000) that are not valid UniProt accessions. These will not match any protein in a UniProt-based search database. Depending on your workflow, you may want to manually remove them from the FASTA before use:

  • cRAP126|000000|ENDOP_GLUC – Endoproteinase Glu-C (NEB, P8100S)

  • cRAP127|000000|RECOM_LYSC – recombinant Lys-C (Promega, V167A)

Both are commercial protease products added by CCP as extensions to the GPM cRAP base.

License

The camprotR R package (and its bundled cRAP FASTA) is released under the MIT License by the Cambridge Centre for Proteomics.

Resources