
Pisces

Culling the PDB by Resolution and Sequence Identity

Last update: November 27, 2020

Pre-compiled culledPDB lists from Pisces

From this page, you can download lists that we have already compiled for various parameter sets (resolution, sequence identity, etc.). The current list of Pisces CulledPDB sets is shown below. You may request other lists by using our Protein Sequence Culling Server.

In the list below, the resolution, percent identity, and R-factor cutoffs are given in each filename. For example, for cullpdb_pc20_res1.8_R0.25_d201125_chains5640, the percent identity cutoff is 20%, the resolution cutoff is 1.8 angstroms, and the R-factor cutoff is 0.25. The date stamp d201125 indicates that the list was generated on November 25, 2020. The number of chains in the list is 5640. Files with "inclNOTXRAY" include sequences from structures not determined by X-ray crystallography (mostly NMR, but also electron diffraction, FTIR, fiber diffraction, etc.). Files with "inclCA" include sequences of structures that contain only backbone CA coordinates.
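Because the cutoffs are encoded in the filename, they can be recovered without opening the file. The following is a minimal sketch, not part of Pisces: the names `CullParams` and `parse_cull_name` are hypothetical, and the six-digit date stamp is read as yymmdd in the 2000s, which is an assumption based on the filenames listed on this page.

```python
import re
from dataclasses import dataclass
from datetime import date

NUM = r"(\d+(?:\.\d+)?)"  # integer or decimal number


@dataclass(frozen=True)
class CullParams:
    max_identity_pct: float
    max_resolution: float
    max_r_factor: float
    generated: date
    n_chains: int
    incl_not_xray: bool
    incl_ca: bool


def _field(pattern: str, name: str) -> str:
    """Return the first captured group of pattern in name, or raise."""
    match = re.search(pattern, name)
    if match is None:
        raise ValueError(f"pattern {pattern!r} not found in {name!r}")
    return match.group(1)


def parse_cull_name(name: str) -> CullParams:
    """Extract the culling parameters from a cullpdb_* filename."""
    # Assumption: the date stamp is yymmdd and the century is 2000.
    stamp = _field(r"_d(\d{6})", name)
    return CullParams(
        max_identity_pct=float(_field(r"_pc" + NUM, name)),
        max_resolution=float(_field(r"_res" + NUM, name)),
        max_r_factor=float(_field(r"_R" + NUM, name)),
        generated=date(2000 + int(stamp[:2]), int(stamp[2:4]), int(stamp[4:])),
        n_chains=int(_field(r"_chains(\d+)", name)),
        # Flags are detected by substring, so their position does not matter.
        incl_not_xray="inclNOTXRAY" in name,
        incl_ca="inclCA" in name,
    )


params = parse_cull_name("cullpdb_pc20_res1.8_R0.25_d201125_chains5640.gz")
print(params)
```

Each field is searched for independently rather than with one positional pattern, so the sketch does not depend on where optional flags such as "inclNOTXRAY" appear in the name.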

Each file gives the PDB entry (four-character code), the chain code ("0" if there is only one chain in the entry), the experimental method (XRAY, NMR, etc.), the number of residues in the chain, the resolution, the R-value, and the free R-value (if available; otherwise NA). The directory also includes FASTA sequence files for each list.
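A row of such a list can be read into typed fields along these lines. This is only a sketch under stated assumptions: columns are taken to be whitespace-separated, the entry and chain codes are taken to be written together (as in 1A01B), and the default column order simply follows the description above. Check all three against the header line of the file you download; `parse_row` and `DEFAULT_COLUMNS` are hypothetical names.

```python
from typing import Dict, Sequence

# Assumption: column order follows the prose description; pass a different
# tuple if the downloaded file orders its columns differently.
DEFAULT_COLUMNS = ("chain_id", "method", "length", "resolution", "r_value", "free_r")
NUMERIC = {"length": int, "resolution": float, "r_value": float, "free_r": float}


def parse_row(line: str, columns: Sequence[str] = DEFAULT_COLUMNS) -> Dict[str, object]:
    """Convert one whitespace-separated row into a dict of typed values."""
    fields = line.split()
    if len(fields) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
    row: Dict[str, object] = {}
    for name, raw in zip(columns, fields):
        if name in NUMERIC:
            # "NA" marks a value that is not available.
            row[name] = None if raw == "NA" else NUMERIC[name](raw)
        else:
            row[name] = raw
    # Assumption: the first four characters are the PDB entry, the rest the chain.
    chain_id = str(row["chain_id"])
    row["entry"], row["chain"] = chain_id[:4], chain_id[4:]
    return row


# Illustrative row only, built from the values shown for 1A01B on this page.
print(parse_row("1A01B XRAY 146 1.80 0.169 0.223"))
```

Passing the column order as a parameter keeps the sketch usable if the real files place, for example, the chain length before the method.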

Culled PDB FTP directory

Gzipped file with all the lists: Cull_d201125.tar.gz

Standalone package

Run Pisces locally in standalone mode: Pisces.tar.gz

PDBAA-related database files for Pisces: BLASTDB.tar.gz. If you have downloaded Pisces.tar.gz before, you only need to download this file to update. Put all unpacked files into Pisces/BLASTDB to complete the update.

BLAST-searchable database files of pdbaa, for use with MolIDE: pdbaa.tar.gz

cullpdb_pc20_res1.6_R0.25_d201125_chains3783.gz cullpdb_pc20_res1.8_R0.25_d201125_chains5640.gz
cullpdb_pc25_res1.6_R0.25_d201125_chains4867.gz cullpdb_pc25_res1.8_R0.25_d201125_chains7486.gz
cullpdb_pc30_res1.6_R0.25_d201125_chains5591.gz cullpdb_pc30_res1.8_R0.25_d201125_chains8944.gz
cullpdb_pc40_res1.6_R0.25_d201125_chains6684.gz cullpdb_pc40_res1.8_R0.25_d201125_chains11069.gz
cullpdb_pc50_res1.6_R0.25_d201125_chains7445.gz cullpdb_pc50_res1.8_R0.25_d201125_chains12535.gz
cullpdb_pc60_res1.6_R0.25_d201125_chains7937.gz cullpdb_pc60_res1.8_R0.25_d201125_chains13553.gz
cullpdb_pc70_res1.6_R0.25_d201125_chains8336.gz cullpdb_pc70_res1.8_R0.25_d201125_chains14367.gz
cullpdb_pc80_res1.6_R0.25_d201125_chains8717.gz cullpdb_pc80_res1.8_R0.25_d201125_chains15121.gz
cullpdb_pc90_res1.6_R0.25_d201125_chains9165.gz cullpdb_pc90_res1.8_R0.25_d201125_chains16109.gz
cullpdb_pc20_res2.0_R0.25_d201125_chains7511.gz cullpdb_pc20_res2.2_R1.0_d201125_chains8750.gz
cullpdb_pc25_res2.0_R0.25_d201125_chains10203.gz cullpdb_pc25_res2.2_R1.0_d201125_chains11971.gz
cullpdb_pc30_res2.0_R0.25_d201125_chains12415.gz cullpdb_pc30_res2.2_R1.0_d201125_chains14668.gz
cullpdb_pc40_res2.0_R0.25_d201125_chains15788.gz cullpdb_pc40_res2.2_R1.0_d201125_chains18974.gz
cullpdb_pc50_res2.0_R0.25_d201125_chains18220.gz cullpdb_pc50_res2.2_R1.0_d201125_chains22058.gz
cullpdb_pc60_res2.0_R0.25_d201125_chains19908.gz cullpdb_pc60_res2.2_R1.0_d201125_chains24249.gz
cullpdb_pc70_res2.0_R0.25_d201125_chains21247.gz cullpdb_pc70_res2.2_R1.0_d201125_chains25979.gz
cullpdb_pc80_res2.0_R0.25_d201125_chains22462.gz cullpdb_pc80_res2.2_R1.0_d201125_chains27599.gz
cullpdb_pc90_res2.0_R0.25_d201125_chains24165.gz cullpdb_pc90_res2.2_R1.0_d201125_chains29819.gz
cullpdb_pc20_res2.5_R1.0_d201125_chains10057.gz cullpdb_pc20_res3.0_R1.0_d201125_chains11269.gz
cullpdb_pc25_res2.5_R1.0_d201125_chains13815.gz cullpdb_pc25_res3.0_R1.0_d201125_chains15423.gz
cullpdb_pc30_res2.5_R1.0_d201125_chains17046.gz cullpdb_pc30_res3.0_R1.0_d201125_chains19096.gz
cullpdb_pc40_res2.5_R1.0_d201125_chains22295.gz cullpdb_pc40_res3.0_R1.0_d201125_chains25146.gz
cullpdb_pc50_res2.5_R1.0_d201125_chains26112.gz cullpdb_pc50_res3.0_R1.0_d201125_chains29603.gz
cullpdb_pc60_res2.5_R1.0_d201125_chains28907.gz cullpdb_pc60_res3.0_R1.0_d201125_chains32940.gz
cullpdb_pc70_res2.5_R1.0_d201125_chains31090.gz cullpdb_pc70_res3.0_R1.0_d201125_chains35584.gz
cullpdb_pc80_res2.5_R1.0_d201125_chains33182.gz cullpdb_pc80_res3.0_R1.0_d201125_chains38182.gz
cullpdb_pc90_res2.5_R1.0_d201125_chains36107.gz cullpdb_pc90_res3.0_R1.0_d201125_chains41981.gz

PDBAA: Sequence files representing the entire PDB

Three gzipped FASTA-format files of all PDB sequences are also available:

  • Every protein chain in every PDB file has a unique entry in pdbaa.gz
    For example, 1A01B and 1A01D are two chains from PDB entry 1A01, and have separate entries in pdbaa.gz although their sequences are identical:
        >1A01B 146 XRAY 1.80 0.169 0.223 HEMOGLOBIN (BETA CHAIN)
        MHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPATQRFFESFGDLST
        PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
        ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
    
        >1A01D 146 XRAY 1.80 0.169 0.223 HEMOGLOBIN (BETA CHAIN)
        MHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPATQRFFESFGDLST
        PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
        ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
        
  • Only non-redundant sequences in each PDB file have unique entries in pdbaaent.gz
    The redundant chain IDs in the PDB file are listed at the end of the title of the representative chain entries. For example, 1A01D does not have an entry in pdbaaent.gz, because it is identical to 1A01B. The title of 1A01B is changed to:
        >1A01B 146 XRAY 1.80 0.169 0.223 HEMOGLOBIN (BETA CHAIN)  ||  1A01D
      
  • Only non-redundant sequences across all PDB files have unique entries in pdbaanr.gz
    The redundant chain IDs from all other PDB files are added at the end of the title of the representative chain entries. Representative chains are selected first by the highest resolution available and then by the best R-values. Non-X-ray structures are considered after X-ray structures. For example, 1A01D, 1A0WB, and 1A0WD do not have entries in pdbaanr.gz, because they are identical in sequence to 1A01B. The title of 1A01B is changed to:
        >1A01B 146 XRAY 1.80 0.169 0.223 HEMOGLOBIN (BETA CHAIN)  ||  1A01D 1A0WB 1A0WD
        
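The header lines shown in the examples above can be split into fields, and the stated ordering of representatives can be mimicked, roughly as follows. This is a sketch under assumptions rather than the Pisces implementation: `parse_header` and `pick_representative` are hypothetical names, the field order is inferred from the three example headers, and "best R-value" is read as the lowest R-value.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Header:
    chain_id: str
    length: int
    method: str
    resolution: Optional[float]
    r_value: Optional[float]
    free_r: Optional[float]
    title: str
    redundant: List[str]


def _num(raw: str) -> Optional[float]:
    """Return raw as a float, or None for placeholders such as NA."""
    try:
        return float(raw)
    except ValueError:
        return None


def parse_header(line: str) -> Header:
    """Parse one '>' header line in the layout of the examples above."""
    # Redundant chain IDs, if any, follow a "||" separator after the title.
    head, _, tail = line.lstrip(">").partition("||")
    parts = head.strip().split(None, 6)
    title = parts[6] if len(parts) > 6 else ""
    return Header(parts[0], int(parts[1]), parts[2], _num(parts[3]),
                  _num(parts[4]), _num(parts[5]), title, tail.split())


def pick_representative(headers: List[Header]) -> Header:
    """X-ray first, then lowest resolution value, then lowest R-value."""
    def key(h: Header):
        return (
            h.method != "XRAY",
            h.resolution if h.resolution is not None else math.inf,
            h.r_value if h.r_value is not None else math.inf,
        )
    return min(headers, key=key)


h = parse_header(
    ">1A01B 146 XRAY 1.80 0.169 0.223 HEMOGLOBIN (BETA CHAIN)  ||  1A01D 1A0WB 1A0WD"
)
print(h.chain_id, h.redundant)
```

Missing numeric values sort last because they are replaced by infinity in the sort key, so a chain without a reported resolution is never preferred over one with a value.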


Contact

Qifang Xu (qifang.xu@fccc.edu)
Roland Dunbrack Jr. (roland.dunbrack@fccc.edu)

Developed by Guoli Wang, Qifang Xu & Roland L. Dunbrack, Jr.