PISCES: A Protein Sequence Culling Server
PISCES has two ways of producing subsets of sequences from larger sets:
- Subsets of protein chains and sequences culled from the entire PDB according to structure
quality and maximum mutual sequence identity
- Subsets of protein chains and sequences culled from a list of PDB chains or PDB entries input
by the user, according to structure quality and maximum mutual sequence identity
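As a rough illustration of what "maximum mutual sequence identity" means for a culled list, the following Python sketch performs a simple greedy selection. It is not PISCES's actual procedure: the `Chain` type, the `cull` function, the `identity` lookup table, and the example values are all hypothetical, and it assumes that pairwise identities (in percent) have already been computed and that resolution alone stands in for structure quality.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Chain:
    accession: str     # e.g. "1AAAA" (made-up example data)
    resolution: float  # in angstroms; lower is treated as better


def cull(chains: List[Chain],
         identity: Dict[FrozenSet[str], float],
         max_identity: float) -> List[Chain]:
    """Greedy sketch: visit chains from best to worst resolution and keep
    a chain only if its identity to every already-kept chain is at or
    below max_identity. Pairs missing from the table count as 0%."""
    kept: List[Chain] = []
    for chain in sorted(chains, key=lambda c: c.resolution):
        if all(
            identity.get(frozenset((chain.accession, k.accession)), 0.0)
            <= max_identity
            for k in kept
        ):
            kept.append(chain)
    return kept


# Hypothetical input: three chains and their pairwise identities (%).
chains = [Chain("1AAAA", 1.5), Chain("2BBBA", 2.0), Chain("3CCCA", 1.8)]
identity = {
    frozenset(("1AAAA", "2BBBA")): 90.0,
    frozenset(("1AAAA", "3CCCA")): 20.0,
    frozenset(("2BBBA", "3CCCA")): 25.0,
}
culled = cull(chains, identity, max_identity=30.0)
print([c.accession for c in culled])
```

In this made-up example, 2BBBA is dropped because it is 90% identical to the better-resolved 1AAAA, so no two chains in the result exceed the 30% cutoff.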
Important features of this service:
- Sequence identities for PDB sequences are determined by building a hidden Markov model (HMM)
for every unique PDB sequence with the program HHblits (Söding et al.) and then searching
the resulting collection of HMMs with each individual HMM using the program HHsearch. PISCES
can therefore provide meaningful results at low sequence identity (15-30%).
- For each calculated list, the server provides an output list of accession IDs (e.g., 1ABCA) with sequence length,
structure determination method, resolution, and R-factor (if available), and a file
of the sequences in FASTA format. An email containing links to these files is sent to the user upon
completion of the calculation, and the files are stored for at least one week.
- PDB sequences are updated weekly from the PDB mmCIF files.
- PISCES correctly handles multi-character chain IDs, which are now used in very large
structures by the PDB (and some small structures for no good reason).
- PISCES now allows the user to select whether to include X-ray, NMR, or cryo-EM structures in
the output lists. X-ray structures are evaluated with resolution and R-factor criteria, and
EM structures are evaluated by resolution (most EM structures do not have R-factors). NMR
structures are not currently evaluated for quality.
- PISCES has a new option for creating lists of chains without chain breaks (segments with
missing coordinates within the protein chain), or with no disordered regions whatsoever. The output files
are labeled with “noBrks” or “noDsdr” if these options are used.
- If a user uploads a PDB entry or chain list for culling, the server returns a log file of which
entries or chains were eliminated and why (e.g., resolution, sequence identity, chain breaks).
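Two of the features above can be sketched in a few lines of Python: splitting an accession ID such as 1ABCA into its PDB entry and chain ID (including multi-character chain IDs), and applying the per-method quality criteria. This is only an illustrative sketch, not the server's code. The function names, the method labels "XRAY", "EM", and "NMR", and the cutoff parameters are hypothetical. It also assumes a 4-character PDB ID and that an X-ray structure without an R-factor fails the R-factor criterion, neither of which is stated above.

```python
from typing import Optional, Tuple


def split_accession(accession: str) -> Tuple[str, str]:
    """Split e.g. '1ABCA' into ('1ABC', 'A').

    Assumes the first four characters are the PDB ID; everything after
    them is the chain ID, which may be more than one character long."""
    if len(accession) < 5:
        raise ValueError(f"accession too short: {accession!r}")
    return accession[:4], accession[4:]


def passes_quality(method: str,
                   resolution: Optional[float],
                   r_factor: Optional[float],
                   max_resolution: float,
                   max_r_factor: float) -> bool:
    """Apply the per-method criteria described in the feature list."""
    if method == "NMR":
        return True  # NMR structures are not evaluated for quality
    if method not in ("XRAY", "EM"):
        raise ValueError(f"unknown method label: {method!r}")
    if resolution is None or resolution > max_resolution:
        return False
    if method == "XRAY":
        # Assumption: a missing R-factor fails the criterion.
        return r_factor is not None and r_factor <= max_r_factor
    return True  # EM: resolution only


print(split_accession("1ABCA"))
print(split_accession("7XYZAA"))
print(passes_quality("EM", 3.0, None, max_resolution=3.5, max_r_factor=0.3))
```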
Access the server to create your own lists
Download precompiled CulledPDB lists (tar.gz file, ~800 Mbytes)
Download precompiled CulledPDB lists (directory)
Please cite the following in any work that uses lists provided by PISCES
G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.
Institute for Cancer Research
Fox Chase Cancer Center
333 Cottman Avenue
Philadelphia PA 19111
Last modified: March 2, 2022