GPM cRAP – common Repository of Adventitious Proteins
Overview
The cRAP (common Repository of Adventitious Proteins, pronounced “cee-RAP”) is a curated database of protein sequences commonly found in proteomics experiments through either accidental contamination or deliberate use as laboratory reagents. The database contains 116 protein entries and is maintained by The Global Proteome Machine Organization.
Library Composition
The website documents 115 entries organized into five categories. The downloadable FASTA contains one additional unlisted entry (see note below).
Category |
Entries |
|---|---|
Sigma-Aldrich Universal Protein Standard (UPS) reference proteins |
48 |
Dust and contact proteins (human skin, hair, saliva, sheep wool, latex gloves) |
38 |
Laboratory reagent proteins (BSA, trypsin, chymotrypsinogen, pepsin, Lys-C, and others) |
19 |
Molecular weight markers and standard proteins (horse cytochrome C, E. coli beta-galactosidase, rabbit aldolase, and others) |
9 |
Common viral contaminants (S. cerevisiae virus L-A coat protein) |
1 |
Unlisted in website (KKA1_ECOLX, E. coli kanamycin nucleotidyltransferase – antibiotic resistance marker) |
1 |
Note
The 48 UPS entries are human proteins from the Sigma-Aldrich Universal Protein Standard, a commercial mixture used for absolute quantification. These are not contaminants per se but are included because they frequently appear in proteomics experiments as spike-in standards.
Organism Breakdown
Entries in the FASTA file originate from the following organisms:
Organism |
Proteins |
|---|---|
Homo sapiens (keratins, UPS standards, saliva proteins) |
68 |
Ovis aries (sheep wool keratins) |
16 |
Bos taurus (BSA, caseins, trypsin, and other reagents) |
13 |
Sus scrofa (porcine pepsin, trypsin) |
4 |
Equus caballus (cytochrome C, myoglobin) |
2 |
Hevea brasiliensis (latex glove proteins) |
2 |
Gallus gallus (lysozyme, ovalbumin) |
2 |
Other organisms (9 species, 1–2 entries each) |
9 |
Category Details
Laboratory proteins include enzymes and reagents routinely used in sample preparation:
Bovine serum albumin (BSA)
Bovine trypsin variants (TRY1, TRY2)
Porcine trypsin and pepsin isoforms (A, B, C)
Bovine chymotrypsinogen variants (A, B)
Lysyl endopeptidase (Lys-C)
Staphylococcus aureus V8 protease (Glu-C)
Dust and contact proteins represent environmental contaminants from human skin and hair (keratins, keratin-associated proteins), sheep wool (from clothing), and latex gloves (Hevea brasiliensis rubber elongation factor and small rubber particle protein).
Molecular weight markers are commonly used calibration standards such as horse cytochrome C, horse myoglobin, chicken ovalbumin and lysozyme, rabbit aldolase, yeast alcohol dehydrogenase, E. coli beta-galactosidase, and bovine glutamate dehydrogenase.
Note
ProteoPy downloads the cRAP database via
pr.download.contaminants(source="gpm_crap").
Resources
Website: thegpm.org/crap
FASTA download: ftp://ftp.thegpm.org/fasta/cRAP/crap.fasta