PDBrenum provides PDB structure files in mmCIF and legacy-PDB formats in which residues in the coordinates (and all other fields) are renumbered according to their UniProt sequences. The mapping from each residue in a PDB chain to its UniProt numbering is obtained from the SIFTS database: https://www.ebi.ac.uk/pdbe/docs/sifts/index.html

Sequence tags and insertions that do not have UniProt correspondences are renumbered by adding a large number to their position in the sequence (numbered from 1 to L, the length of the sequence). This number is 50000 for mmCIF-format files and 5000 for legacy-PDB format files.
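
The offset rule above can be illustrated with a minimal sketch. This is not PDBrenum's actual code: the function name, argument names, and the "mmcif"/"pdb" format labels are hypothetical, and only the two offsets (50000 and 5000) come from the text.

```python
# Offsets stated in the text; the dictionary and function names are hypothetical.
UNMAPPED_OFFSET = {"mmcif": 50000, "pdb": 5000}


def renumber_residue(seq_position, uniprot_number, file_format):
    """Return the new number for a single residue.

    seq_position   -- 1-based position in the chain sequence (1..L)
    uniprot_number -- UniProt residue number from SIFTS, or None if the
                      residue has no UniProt correspondence
    file_format    -- "mmcif" or "pdb" (assumed labels)
    """
    if uniprot_number is not None:
        # Mapped residue: take the UniProt numbering directly.
        return uniprot_number
    # Unmapped tag or insertion: sequence position plus the format offset.
    return seq_position + UNMAPPED_OFFSET[file_format]
```

Under these assumptions, an unmapped residue at sequence position 3 would be numbered 50003 in an mmCIF file and 5003 in a legacy-PDB file, while a mapped residue simply keeps its UniProt number.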

Protein chains with no UniProt data in SIFTS (e.g., most peptides and antibodies) and nucleotide chains are not renumbered at all.

Chimeric chains that map to more than one UniProt sequence are numbered according to the UniProt sequence that covers the largest portion of the chain. The only exception to this rule is for some proteins that are used as crystallization chaperones: GFP_AEQVI, GCN4_YEAST, C562_ECOLX, ENLYS_BPT4, and MALE_ECOLI.
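
The main rule for chimeric chains can be sketched as follows. This is an illustration under stated assumptions, not PDBrenum's implementation: the function name and the per-residue input layout are hypothetical, and the crystallization-chaperone exception is deliberately left out because the text does not describe how those chains are handled.

```python
from collections import Counter


def pick_numbering_uniprot(residue_uniprot_ids):
    """Pick the UniProt entry covering the most residues of a chain.

    residue_uniprot_ids -- one item per residue in the chain: the UniProt
                           identifier that residue maps to, or None if it
                           has no UniProt correspondence (assumed layout)
    Returns the identifier with the largest coverage, or None if no
    residue in the chain is mapped.
    """
    counts = Counter(uid for uid in residue_uniprot_ids if uid is not None)
    if not counts:
        return None
    # most_common(1) yields the identifier with the highest residue count.
    return counts.most_common(1)[0][0]


# Hypothetical chimeric chain: 10 residues from one entry, 2 unmapped
# linker residues, then 30 residues from a second entry.
chain = ["UNIPROT_A"] * 10 + [None] * 2 + ["UNIPROT_B"] * 30
selected = pick_numbering_uniprot(chain)
```

In this example the second entry covers more of the chain, so it would be the one used for numbering.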

Last PDBrenum database update was on: 07 Mar 2022

184522 PDB IDs in the PDBrenum database

Developed by Bulat Faezov (bulat.faezov@fccc.edu) and Roland Dunbrack (roland.dunbrack@fccc.edu)
Dr. Roland Dunbrack's Lab
Fox Chase Cancer Center