GDPR statement: No marketing. No 3rd party. Data you submit is kept for debugging purposes.
This site contains several tools with different citations.
To use a ligand or non-canonical amino acid in Rosetta, a params file needs to be provided, which defines the topology of the "residue" (the name in a PDB for a ligand, nucleotide, amino acid residue, ion or water).
This tool does not use the molfile_to_params.py script included in Rosetta, but a module I wrote which parameterises ligands from rdkit.Chem.Mol objects (cf. the Rdkit-to-params repo).
The module does a lot more: it can modify atom names in a pre-existing params file, rename atoms from a PDB file, rename atoms based upon a template molecule, test the file (via PyRosetta), etc.
To mark covalent attachments (CONNECT/UPPER/LOWER entries), this module uses a dummy atom: the * element in SMILES (a dummy atom in RDKit, with an atomic number of zero) or the R element in an MDL file.
NB. Make sure the PDB of your protein has a LINK line for the attachment, or the connection will be ignored!
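As a rough pre-flight check for the LINK requirement, the sketch below scans a PDB text for LINK records and reports which residue names they mention. It relies only on the fixed-width PDB columns for LINK records (residue names in columns 18-20 and 48-50); the function names are hypothetical and not part of the module.

```python
def residues_in_link_records(pdb_text):
    """Return the set of residue names mentioned in LINK records."""
    names = set()
    for line in pdb_text.splitlines():
        if line[:6].strip() == "LINK":
            # Fixed-width PDB columns: resName1 is 18-20, resName2 is 48-50.
            names.add(line[17:20].strip())
            names.add(line[47:50].strip())
    names.discard("")  # short or malformed lines yield empty slices
    return names


def has_link_for(pdb_text, ligand_name):
    """True if at least one LINK record involves the given residue name."""
    return ligand_name in residues_in_link_records(pdb_text)
```

Calling has_link_for(pdb_text, "LIG") before submitting a covalent ligand named LIG (an illustrative name) would flag a protein PDB that lacks the record.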
Specifying an amino acid requires two connections in the backbone, i.e. *NC({sidechain})C(=O)*.
NB. The atom order in the params file is not the atom order in the SMILES, so the backbone can be anywhere in the SMILES (e.g. *NC(C(=O)*){sidechain} or *NC{sidechain}(C(=O)*)CN*).
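To make the template concrete, here is a minimal sketch that fills the {sidechain} placeholder and counts the dummy atoms in a SMILES string. Both helper names are hypothetical, and counting the * characters is a plain string operation rather than a real SMILES parse.

```python
def amino_acid_smiles(sidechain):
    """Fill the backbone template with a sidechain SMILES fragment."""
    return f"*NC({sidechain})C(=O)*"


def count_connections(smiles):
    """Count dummy atoms (*), each of which marks one covalent connection."""
    return smiles.count("*")


# Serine has the sidechain CO, so its template reads *NC(CO)C(=O)*.
serine = amino_acid_smiles("CO")
```

Here count_connections(serine) gives the two backbone connections an amino acid needs, while a free ligand such as CC(=O)[O-] has none.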
The names of the atoms can be specified. Using the atom order as they appear in the SMILES, you can supply either an array/list, with '-' or 'null' marking atoms you do not wish to specify (e.g. ['-', ' CX ', 'CY']), or an object/hash/dictionary whose keys are the atom indices and whose values are the names (e.g. {5: ' CX '}).
Note. This is ignored if you provide a PDB (third option).
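The two accepted shapes carry the same information, which the sketch below shows by converting either one into an index-to-name dictionary. The function name is hypothetical, and zero-based indexing is an assumption (it matches RDKit atom indices, but the text above does not state it).

```python
def normalise_atom_names(names):
    """Convert a list or dict of atom names into an {index: name} dict.

    Assumption: indices are zero-based positions in the SMILES atom order.
    """
    if isinstance(names, dict):
        # JSON objects arrive with string keys, so coerce them to int.
        return {int(index): name for index, name in names.items()}
    skip = {"-", "null", None}  # placeholders for atoms left unnamed
    return {i: name for i, name in enumerate(names) if name not in skip}
```

With this, ['-', ' CX ', 'CY'] becomes {1: ' CX ', 2: 'CY'}, and {5: ' CX '} passes through unchanged.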
A SMILES is a string that represents a molecule. For example, CC is ethane.
Protonation is very important for Rosetta, so make sure it is correct. Specifically, the SMILES CC(=O)O is acetic acid (no charge), while CC(=O)[O-] is acetate (negative, conjugate base), which is more likely to be the required protonation state. Likewise, ethylamine at pH 7 is more likely to be CC[NH3+] than CCN (same as CC[NH2]).
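A quick way to sanity-check the protonation of a SMILES before submitting it is to sum the formal charges written in its bracket atoms. The sketch below does this with a regular expression only; it is not how the module works, the function name is hypothetical, and a real toolkit such as RDKit is the more robust choice.

```python
import re

BRACKET_ATOM = re.compile(r"\[([^\]]+)\]")
CHARGE = re.compile(r"([+-]+)(\d*)")


def net_formal_charge(smiles):
    """Sum the charges written inside bracket atoms such as [O-] or [NH3+]."""
    total = 0
    for content in BRACKET_ATOM.findall(smiles):
        match = CHARGE.search(content)
        if match is None:
            continue  # bracket atom without a charge, e.g. [NH2]
        signs, digits = match.groups()
        # '+2' and '++' both mean a charge of two.
        magnitude = int(digits) if digits else len(signs)
        total += magnitude if signs[0] == "+" else -magnitude
    return total
```

For the examples above, acetic acid CC(=O)O sums to 0, acetate CC(=O)[O-] to -1 and CC[NH3+] to +1.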
If the SMILES provided is in the form *NC({sidechain})C(=O)*, then the atom names will use Greek letters and the compound will be an amino acid. If [NH3+]C({sidechain})C(=O)[O-] is provided, a ligand is made.
A ligand in a PDB is often unprotonated and has no indication of what the formal charges are, nor are the bonds necessarily correct: a double bond is a repeated CONECT entry, but aromatic bonds cannot be specified. This is why the third option has a SMILES input box. If it is omitted, the ligand with the given three-letter name will be extracted and parameterised, whereas in the second option the submitted PDB is assumed to contain only the ligand. Also, this piece of code behaves oddly with CHI entries, so as a temporary fix the resulting CHI value is left empty.
As this home server currently runs on a Raspberry Pi and hosts other apps, I do not wish to overburden it, so the params file is not tested. The module can do this itself (params.test()).
The rotamer file needs to be specified in the params file if you want to use it, via PDB_ROTAMERS filename.pdb.
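Adding that line by hand is a one-line edit, but it can also be scripted. The sketch below works on the text of a params file and ensures exactly one PDB_ROTAMERS entry is present; the function name and the example file names are hypothetical.

```python
def set_pdb_rotamers(params_text, rotamer_filename):
    """Return params text with a single PDB_ROTAMERS line for the given file."""
    # Drop any existing entry so the line is never duplicated.
    lines = [
        line for line in params_text.splitlines()
        if not line.startswith("PDB_ROTAMERS")
    ]
    lines.append(f"PDB_ROTAMERS {rotamer_filename}")
    return "\n".join(lines) + "\n"
```

For instance, set_pdb_rotamers(open("LIG.params").read(), "LIG_conformers.pdb") returns text that can be written back to the params file.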
This is a simple interface to the module. If there is interest (press feedback) I will add