How to dock flexible cyclic molecules with AutoDock and live happy

Soraphen-A antibiotic docked to the acetyl-coenzime A carboxylase  (PDB id 1W96)

AutoDock is not able to manage directly the flexibility associated to bonds comprised in cyclic molecules. The genetic algorithm is indeed not able to modify rotatable bonds in a concerted fashion, then the rotation of an intra-cyclic bond would lead to a distorted structure. For this reason, cyclic portions of the ligands are considered as rigid. Different approaches can be used todock macrocyclic molecules, usually identifying one (or many) low energy conformations and dock them as different ligand(s).

But using an indirect method, is possible to manage the ring as a fully flexible entity and use the AutoDock GA to explore its flexibility. The method was initially developed for the version 3.05 [PubMed], while now is officially implemented since  version 4.2.

The method
The protocol consists basically in converting the cyclic ligand into its corresponding acyclic form by breaking a bond and dock it as fully flexible, letting AutoDock to restore the original cycle structure during the calculation by means of special (G, glue) atom types. The protocol can be subdivided in three main steps:

Ring opening (a): by removing a bond, the ring is opened and the ligand is transformed to an acyclic form.
Ligand pre-processing (b):
the ligand is processed following the standard AutoDockTools protocol, but the edge atoms are replaced with G atoms.
Docking and ring closure (c)
: the ligand is docked applying a 12-2 pseudo-Lennard-Jones potential to the G-atoms that restore the cyclic structure (see the AutoDock User Guide for details).


To convert the molecule into the acyclic form one bond to be broken must be identified. The way the acyclic form is obtained can affect the calculation and the ring closure.
Keep number rotatable bonds low
The ring opening can increase dramatically the total number of rotatable bonds, requiring longer calculation times. Therefore, when less flexible or partially rigid regions are present (i.e. conjugated bonds) they should not be broken. Any further information available (i.e. NMR studies) should be used to keep rotatable bonds fixed.

Break carbon-carbon bonds
The atoms formerly connected by the broken bond will be "transformed" into glue atoms, but they are actual carbon atoms. This will allow to simplify the calculation setup by using a single extra atom type and refer it to the actual C.map file.

Avoid chiral atoms (...whenever possible)
If  a bond between one or more chiral carbon atoms is broken, original chirality is not guaranted in the resulting closed ring. Often cyclic molecules present many chiral centers (i.e. natural compounds, antibiotics) then identify bonds between achiral carbons is not possible. In this case, any carbon-carbon bond can be suitable, while chirality in docking results should be inspected and manually corrected if needed.

Step-by-step tutorial

The 8-member ring cyclo-alkylic inhibitor bound to HIV-2 protease (PDB entry: 5upj) will be used as an example of flexible docking of a cyclic molecule.

-The tutorial assumes a basic knowledge of AutoDock.
- AutoDock 4.2 and AutoDockTools (v.1.5.4 or newer)
are required.
- The input files generated with this tutorial and the docking results are available here.

1. Prepare the structure
Ligand molecule in the acyclic form can be built from scratch or obtained by opportune manual ring opening of the cyclic structure with either ADT or any molecular modeling tool. In this tutorial-

-Open ADT and choose the AD4.2 setup

-Load the 5UPJ complex from the web and remove the water molecules:

[PMV Menu] File->Read->Molecule from Web
PDB ID field: type “5upj”, and press OK

[PMV Menu] Select->Select From String

[Select From String]:
Molecule: 5upj
Residue: HOH*
Click on “Add”

[PMV Menu] Edit->Delete->Delete AtomSet, and press Continue

-Save the ligand and the protein in two separate structure files:

[Select From String]
Molecule: 5upj
Residue: UIN*
Click on “Add”

[PMV Menu] File->Save->Write PDB (save it as ligand_xray.pdb)

[PMV Menu] Edit->Delete->Delete AtomSet, and press Continue

[PMV Menu] File->Save->Write PDB (save it as protein.pdb)

[Select From String] Click on “Dismiss”

-Clean up the session: 

[PMV Menu] Edit->Delete->Delete All molecules, and press Continue

2. Remove the bond using ADT
This step is optional, if the ligand in the acyclic form is prepared using another software.
NOTE: If preparing the molecule with a different tool, if the atoms previously connected in the cycle are not separated by more than their bond distance, they will be considered still bound when imported in  ADT or processed by the prepare_ligand4.py script.

-Load the the ligand structure and label atoms to identify the bonds to be removed :

[PMV Menu] File->Read Molecule (ligand_xray.pdb)

[PMV Menu] Display->Label->by Properties

[LabelAtom by properties] Click on Change PCOM level (Atom), Select Choose one or more properties, (“name”)

-Remove the bond and open the ring:

[PMV Menu] Edit->Bonds->Remove Bonds

[Viewer] Delete the bond CD3-CD4 by clicking with left mouse button on the bond.

The deletion of the CD3-CD4 bond satisfied all the guidelines, leading to two short chains (2 torsions) ending with non chiral C carbon atoms. Breaking the CD1-CA4, for example, involves two different AutoDock atom types (C and A, respectively). The deletion of the CD1-CD2 bond, on the other end, would result in a single long flexible chain of 4 torsions. While being not problematic for such a small ring, longer chains increase the complexity of ring closure process in larger systems.

-Change the torsion angle of the ring chains; the newly created chains will be modified to open up the ring and separate the two edges atoms before adding hydrogens to the molecule:

[PMV Menu] Edit->Torsion angles

[Viewer] click on the four atoms defining the first torsion

[Set Torsion Angles] modify the Set Angle value to pull apart the edge atoms, then click Done.


[Set Torsion Angles] click on New Torsion and repeat the torsion variation for the other chain.

-The original ligand position from the X-ray structure is randomized to avoid any possible bias in the input conformation:
[PMV Buttons] Click on the DeJaVu icon

[DeJaVu] (1) disable the option “mouse transforms apply to “root” object only”

[DeJaVu] (2) select the ligand object “ligand”

[Viewer] Rotate and translate randomly the ligand

[PMV Menu] File->Save->Write PDB (check the “Save Transformed
Coordinates” option, Filename: ligand.pdb)

-Clean up the session:

[PMV Menu] Edit->Delete->Delete All molecules, and press Continue

3. Generate the PDBQT file
-Process the input acyclic structure following the standard AutoDock protocol by adding hydrogens and selecting it for the PDBQT generation:

[PMV Menu] File->Read Molecule (ligand.pdb)

[PMV Menu] Edit->Hydrogens->Add (All Hydrogens, noBondOrder, yes)

[ADT Menu] Ligand->Input->Choose [or Open], select the ligand  molecule name, then “Select Molecule for AutoDock4”

-Check the torsion tree:

[ADT Menu] Ligand->Torsion Tree->Choose torsion

The torsion count window will now report the total number of rotatable bonds (8/32) activated for the ligand, including the chains formerly connected in the ring.
-If the rotatable bonds are correct, the structure can be saved and removed from the working session:

[ADT Menu] Ligand->Output->Save as PDBQT (ligand.pdbqt)

4. Specify the G atoms in the ligand
The carbon atoms previously connected will be manually renamed as "G".

-Make a copy of the original ligand file and name it as ligandG.pdbqt

-Open the file ligandG.pdbqt in a text editor and locate the lines corresponding to the edge atoms previously connected by the bond removed:

HETATM   21  CD4 UIN B 100      -2.919  22.061  19.604  1.00 19.90     0.005 C
HETATM   24  CD3 UIN B 100      -3.821  22.402  20.791  1.00 19.60     0.005 C

-Modify the atom type definition by renaming the “C” atoms as “G” atoms in the last column:

HETATM   21  CD4 UIN B 100      -2.919  22.061  19.604  1.00 19.90     0.005 G
HETATM   24  CD3 UIN B 100      -3.821  22.402  20.791  1.00 19.60     0.005 G

-Save the changes and close the text editor.

5. Prepare the protein and calculate grid maps
The special G atom does not need specific maps being calculated.
For sake of ligand-protein interactions a G atom is a C carbon tout-court, and in this way is parametrized in the AutoDock energy parameters table. If a G map would be calculated with AutoGrid, it will result in a file identical to the C map file. Therefore the C carbon maps are used for G atoms.

-Load the protein structure (“protein.pdb”) and add the hydrogens:

[PMV Menu] File->Read Molecule (protein.pdb)

[Dashboard] Click on the Sel. Button corresponding to the “protein” entry

[PMV Menu] Edit->Hydrogens->Add (All Hydrogens, noBondOrder, yes)

-Generate the PDBQT and set the grid maps:

[ADT Menu] Grid->Macromolecule->Choose->Select “protein”, then click on Select Molecule (press “OK”, and save it as protein.pdbqt)

[ADT Menu] Grid->Set Map Types->Choose Ligand->Select “ligand”, then click on Select Ligand

[ADT Menu] Grid->Grid box...

[Grid Options] Set the number of points as 50, 50, 50 and center the box in the middle of the dimer interface approximatively at coordinates: [0, 18, 20]

[Grid Options] File->Close saving current

[ADT menu] Grid->Output->Save GPF... (save it as protein.gpf)

-Run AutoGrid.

6. Generate the DPF and run the docking job
-Set the protein filename and define the atom type parameters (the G atom type is included automatically):
[ADT Menu] Docking->Macromolecule->Set Rigid Filename->Select protein.pdbqt

[ADT Menu] Docking->Ligand->Open...>Select ligandG.pdbqt (accept the default parameters clicking on “Accept”)

[ADT Menu] Docking->Output->Lamarckian GA(4.2)Ligand-> Save it as ligandG_protein.dpf
-Edit the DPF file to include parameters for the ring closure.
The DPF file must be slightly modified to set the special parametrization of the G atoms for driving back together the edges of the ring during the docking and for tune the search parameters accordingly to the increase of complexity. For more complex searches, the ga_run and ga_num_evals values can also be increased. The file can be modified also with a text editor.

[ADT Menu] Docking->Edit DPF...

Edit the corresponding G atom map entry to point to the C maps:

Modify: “map protein.G.map”  => “map protein.C.map”

Before the "move" keyword, override the internal non-bonded parameter table defining the pseudo-Lennard-Jones potential for the G-G interaction (req = 1.51 A, ε = 10 kcal/mol, n=12, m=2).

Add: “intnbp_r_eps 1.51 10.000000 12 2 G G“

The following parameters are changed only to increase the
Increase the GA population size from the default value to 350:

Modify: “ga_pop_size 50”  => “ga_pop_size 350”

Increase the local search probability from the default value to 0.26 for helping the ring closure

Modify: “ls_search_freq 0.06”  => “ls_search_freq 0.26”

Save the changes by clicking on "OK" and save the DPF

[ADT Menu] Docking->Output...->Lamarckian GA( 4.2)...

The final DPF should look like the following example (modified lines are in red and marked with “<===”):

autodock_parameter_version 4.1       # used by autodock to validate parameter set
outlev 1                             # diagnostic output level
intelec                              # calculate internal electrostatics
seed pid time                        # seeds for random generator
ligand_types A C G HD OA             # atoms types in ligand              <===
fld protein.maps.fld                 # grid_data_file
map protein.A.map                    # atom-specific affinity map
map protein.C.map                    # atom-specific affinity map
map protein.C.map                    # C map used in place of G atom map  <===
map protein.HD.map                   # atom-specific affinity map
map protein.OA.map                   # atom-specific affinity map
elecmap protein.e.map                # electrostatics map
desolvmap protein.d.map              # desolvation map
intnbp_r_eps 1.51 10.000000 12 2 G G # pseudo-LJ potential                <===
move ligandG.pdbqt                   # small molecule
about -0.8665 18.5882 20.1623        # small molecule center
tran0 random                         # initial coordinates/A or random
quat0 random                         # initial quaternion
dihe0 random                         # initial dihedrals (relative) or random
tstep 2.0                            # translation step/A
qstep 50.0                           # quaternion step/deg
dstep 50.0                           # torsion step/deg
torsdof 8                            # torsional degrees of freedom
rmstol 2.0                           # cluster_tolerance/A
extnrg 1000.0                        # external grid energy
e0max 0.0 10000                      # max initial energy; max number of retries
ga_pop_size 350                      # number of individuals in population          <====
ga_num_evals 2500000                 # maximum number of energy evaluations        
ga_num_generations 27000             # maximum number of generations
ga_elitism 1                         # number of top individuals to survive to next generation
ga_mutation_rate 0.02                # rate of gene mutation
ga_crossover_rate 0.8                # rate of crossover
ga_window_size 10                    #
ga_cauchy_alpha 0.0                  # Alpha parameter of Cauchy distribution
ga_cauchy_beta 1.0                   # Beta parameter Cauchy distribution
set_ga                               # set the above parameters for GA or LGA
sw_max_its 300                       # iterations of Solis & Wets local search
sw_max_succ 4                        # consecutive successes before changing rho
sw_max_fail 4                        # consecutive failures before changing rho
sw_rho 1.0                           # size of local search space to sample
sw_lb_rho 0.01                       # lower bound on rho
ls_search_freq 0.26                  # probability of performing local search on individual  <====
set_sw1                              # set the above Solis & Wets parameters
unbound_model bound                  # state of unbound ligand
ga_run 10                            # do this many hybrid GA-LS runs
analysis                             # perform a ranked cluster analysis

Run AutoDock with the DPF:

       autodock4 -p ligandG_protein.dpf -l ligandG_protein.dlg

7. Results analysis
The G atoms are included in the internal energy parameters table AutoDock and are described by default with carbon atoms parameters. For sake of ring closure,  the energy table is modified with the intnb_r_eps DPF keyword to define the pseudo-Lennard-Jones potential. The potential definition is logged in the DLG file:

DPF> intnbp_r_eps  1.51 10.000042 12 2  G G # pseudo-LJ potential

Ring closure distance potential found for atom type G :
    Equilibrium distance   = 1.51 Angstroms
    Equilibrium potential  = 10.000042 Kcal/mol
    Pseudo-LJ coefficients = 12 - 2

Calculating internal non-bonded interaction energies for docking calculation;

Non-bonded parameters for G-G interactions, used in internal energy calculations: 

                   281.0            27.4
    E      =  -----------  -  -----------
      G,G          12               2
                  r               r

The docking results can be compared with the X-ray conformation of the ligand:
-Clean up the session from all previous molecules and load the protein and the reference X-ray ligand:

[PMV Menu] Edit->Delete->Delete All molecules, and press Continue

[PMV Menu] File->Read Molecule (ligand_xray.pdb)

[PMV Menu] File->Read Molecule (protein.pdbqt)

-Load the DLG file and show the clustering results:

[PMV Menu] Edit->Delete->Delete All molecules, and press Continue

[ADT Menu] Analyze->Dockings->Open...->Select ligandG_protein.dlg

[ADT Menu] Analyze->Clustering->Show...->Click on the cluster histogram bars and browse the results with the conformations player.

If the docking has been successful, the docked ligand overlapped with the experimental pose should look like this:

The red sticks shows the X-ray pose, while the green sticks are the docked result. The
two G atoms are not be represented as connected because they are not treated as carbon atoms by ADT.

Despite being very close to the crystal structure, the ring conformation and G-G bond distance are not . For this reason, a molecular mechanics energy minimization is strongly adviced if any detailed analysis and/or measurement is going to be performed.

8. Reference
If you are going to use this protocol, please cite the following paper:

"Lennard-Jones Potential and Dummy Atom Settings to Overcome the AUTODOCK Limitation in Treating Flexible Ring Systems"
Forli, S., Botta, M., J. Chem. Inf. Model., 2007, 47 (4), pp 1481–1492.

All the images are generated with PMV.

For any questions or help about the flexible rings or AutoDock usage, please use the forum or the mailing list.

Go Back to the Home Page

Stefano Forli, TSRI, v0.1