CHARMm Principles



6       Performing Free Energy Simulations

CHARMm supports several methods of performing free energy simulations and includes several implementations of these methods. This chapter describes the methods that are available, the calculations performed for each method, and the implementation of each method.

This chapter explains

Use of the term free energy simulations is a misnomer for two reasons:

For example, calculation of free energies of solvation or drug/enzyme binding is computationally very difficult. If you are satisfied with relative changes in free energies, it is more tractable to transmutate various parts of a system in a way that is usually physically unreasonable, but computationally feasible. The result is one in which relative changes in free energies (ýýA) are calculated in a simulation that is thermodynamically equivalent to the physical process. The relationships used for computing the relative free energy differences are all exact in the statistical mechanical sense.

For an overview of free energy simulation methods, consult Mezei and Beveridge (1986), Straatsma (1987), or Fleischman and Brooks (1987).


Understanding the relative free energy Hamiltonian

In all of the methods for calculating relative free energy in CHARMm, a hybrid Hamiltonian is used:

Where:

HE Environment part of the Hamiltonian

HR Reactant part of the Hamiltonian

HP Product part of the Hamiltonian

Coupling parameter (extent of transformation)

N Integer exponent

The molecular system is divided into four sets of atoms:

Reactant and product designations are arbitrary and are used as a convention to denote the direction in which the molecular system is being transformed. The process starts with reactant and ends with product.

For example, examine the calculation of the relative change in the solvation free energies of methanol and ethane. This example is taken from Fleischman and Brooks (1987).

The system is represented by water molecules (usually a box of 125 or more, using periodic boundary conditions) and the hybrid methanol/ethane system.

The following figure illustrates the transformation of methanol to ethane:

Using the above depiction, the reactant atoms are the hybrids O1 and H1. The product atom is the hybrid C1 methyl group (not depicted). The C2 hybrid molecule's methyl group changes charge as the transformation goes from reactant to product. The co-located charge atom starts with the methanol methyl group charge and ends up (at = 1.0) with the ethyl-methyl group charge. C2 is also considered an environment atom. The atoms of the water molecules constitute the actual environment atoms in this system. If the hybrid molecule were larger, it could contain additional environment atoms.

All potential energy terms involving reactant atoms, as well as electrostatic interactions involving co-located charge atoms with their reactant charges, go into HR. Kinetic energies of the reactant atoms are also included in this term.

Similarly, potential energy terms involving product atoms and co-located product charge electrostatic interactions, along with kinetic energies of the product atoms, go into HP. The rest of the energy terms are incorporated into HE.

Note that for a potential energy term to be included in HR, only one atom in the given interaction has to be a reactant atom. The same is true for product terms.

Electrostatic terms involving co-located charge atoms are calculated twice: once with the reactant charges and again with the product charges. Terms between co-located reactant or product charges and reactant or product atoms are omitted.

CHARMm assumes that when the hybrid molecule is constructed in the residue topology file, no internal coordinate energy terms exist involving reactant and product atoms. Similarly, CHARMm assumes that nonbonded exclusions have been specified between reactant and product atoms. No checking is done in the program.

In CHARMm, the Hamiltonian is constructed exactly as specified in . In other implementations, the dependence of the Hamiltonian is more complex. Complexity is higher in those implementations where force constants and other parameters in the energy terms are factored by rather than by calculating various energy terms and factoring those terms.

In a statistical mechanical sense, no particular reason exists to factor the Hamiltonian one way or another. The thermodynamic interactions hold regardless of path. The equipartition theorem for obtaining kinetic energy works just as well although, in some implementations, factoring of kinetic energy sometimes appears to be ignored.

Certain advantages result from constructing the Hamiltonian as it is done in CHARMm:


Using free energy calculation methods

Three basic methods are available for calculating relative changes in free energy and temperature derivative properties:

Perturbation method

The method that is used most frequently is the thermodynamic perturbation method. For this, the free energy change is given as follows:

Eq. 14     

Notice that all of the averages are at .

To get the total ýA:

Eq. 15     

You must ensure that the whole range is covered. For example, in a methanol to ethane calculation, three values could be used: = 0.125, 0.500, and 0.875. To cover the range completely, you must perform a series of six simulations:

to ': 0.125 to 0.000 and 0.125 to 0.250;

to ': 0.500 to 0.250 and 0.500 to 0.750;

to ': 0.875 to 0.750 and 0.875 to 1.000.

Double-wide sampling

Dynamics calculations are run and two ýA's are calculated: one lower and one higher than the corresponding . This is termed double-wide sampling. Note that the simulations are joined at common s.

Windowing

The technique for running these simulations is called windowing. With windowing, simulations are run at a discrete number of points. If very few points are used, long simulations are required in order to smoothly join the results from the trajectories. If many points are used (tens to hundreds), shorter simulations are required.

In the case of thermodynamic perturbation, the total free energy change is pieced together from perturbations done with each window.

Thermodynamic integration method

For the thermodynamic integration method, the following expression is used:

Eq. 16     

Integration is done by a quadrature method. In this implementation, the ensemble average is fitted as a function of to a cubic spline polynomial and the polynomial is integrated analytically. No extrapolation to end-points is done.

Expressions for energy and entropy changes can be derived for this equation and have been incorporated into CHARMm. The expressions suffer from very high uncertainties due to the presence of ensemble averages over the total energy that are multiplied by ensemble averages over ¹H/¹. You will probably get better results by averaging energies at the endpoints and subtracting.

Slow-growth method

Slow growth is an approximation of the thermodynamic integration method. In this method, instead of being constant for a given trajectory, the parameter varies monotonically with each time step, as indicated in Eq. 17:

Eq. 17     

Where:

i = i+1 - i

H Hamiltonian

0= 0

n= 1

Advantages

Some advantages are associated with this method:

Disadvantages

Some disadvantages are also associated with this method:


Running a free energy simulation

CHARMm includes three implementations of the free energy simulation (FES) calculations:

This flexible implementation allows connectivity to change. Also, the energy restrain terms harmonic, dihedral, and NOE are subject to change, allowing a flexible way to compute free energy differences between conformations. The implementation works with all minimizers and integrators.

Potential energy terms are written to output during a trajectory and, in the case of the window method, trajectories can be combined. Any to ¢ can be calculated following analysis and additional points can be added, as appropriate.

This implementation is described in more detail in the rest of this section.

FES calculation

The calculation is done in three steps:

1.   Setting up a simulation

2.   Running dynamics

3.   Post-processing data

The first two steps occur in the same input file. The last step is generally done with a separate input file because the output of several trajectories is usually used.

Setting up a simulation

To set up the free energy simulation input file, gather or generate the usual files for a dynamics simulation including a PSF, coordinates, parameters, and image or stochastic boundary condition input.

The exact syntax of the free energy commands depends on the method you choose for the calculation. For example, the TSM commands define reactant, product, and co-located charge lists, and the procedure for simulation (slow growth or window). Note that both thermodynamic simulation and thermodynamic integration can be done with the window procedure.

Note that most minimization routines work using the hybrid V() potential. To remove bad contacts, minimization is generally done prior to dynamics, with the hybrid molecule unperturbed.

Running dynamics

After the free energy simulation has been set up, dynamics can be run with no changes in the commands. Normally, you run some thermalization runs with data being discarded. For a thermalization run, the SAVE command in the free energy simulation command sequence is generally not used. For production runs using thermodynamic integration or thermodynamic simulation, the SAVE option must be issued in the free energy simulation input. This will result in the output of V and V¢ in a formatted file.

Example: Setting up an FES simulation and running dynamics

Below is a fragment of the input file for setting up the thermalization of an ethanol to propane hybrid. Windowing is used during the simulation.

The system represented in the figure below is partitioned as follows:

For this example, the extended atom model is used, with aliphatic hydrogens considered part of the carbon to which they are attached. The example also provides for linear scaling of , although non-linear scaling can be specified.


*Ethanol -> Propane
*
! Read topology file
READ RTF CARD
* TOPOLOGY FILE ethanol -> propane
*
20 1! Version number
MASS 1 H 1.00800! hydrogen which can h-bond to neutral atom
MASS 13 CH2E 14.02700! - " - two
MASS 14 CH3E 15.03500 ! - " - three
MASS 53 OH1 15.99940 ! hydroxy oxygen
! This is put in to force the necessity of using a GENERATE
! Noangles in the input file. The standard topology files use
!this statement.
AUTOGENERATE ANGLEs
RESI ETP 0.000
GROU
C1 CH3E 0. ! environment atom
C2 CH2E 0.265 ! co-located charge atom
O1 OH1 -0.7 ! reactant atom
H1 H 0.435 C3 ! reactant atom note the non-bonded exclusion
! with GROU
C3 CH3E 0. ! product atom
BOND C1 C2 !environment term
BOND C2 O1 O1 H1 !reactant terms
BOND C2 C3 !product term
! the angles MUST be specified. Note the absence of O1 C2 C3
! between reactant and product atoms
ANGLe C1 C2 C3 !product term
ANGLe C1 C2 O1 C2 O1 H1 !reactant terms
! this will be a V(R) term.
DIHED C1 C2 O1 H1
! don't really need it but what the heck.
DONO H1 O1
ACCE O1
IC C1 C2 O1 H1 1.54 111. 180. 109.5 0.96
IC C2 O1 H1 BLNK 0. 0. 0. 0. 0.
IC C3 C2 C1 BLNK 0. 0. 0. 0. 0.
PATCH FIRST NONE LAST NONE
END
! Read parameter file
READ PARAM CARD
* parameter file for ETP hybrid.
BOND
CH2E CH3E 225.0 1.54
CH2E OH1 400.0 1.42
OH1 H 450.0 0.96
THETA
CH3E CH2E CH3E 45.0 112.5
CH3E CH2E OH1 45.0 111.0
CH2E OH1 H 35.0 109.5
PHI
CH3E CH2E OH1 H 0.5 3 0.0
NONBONDED NBXMOD 5 ATOM CDIEL SHIFT VATOM VDISTANCE VSWIT -
CUTNB 8.0 CTOFNB 7.5 CTONNB 6.5 EPS 1.0 E14FAC 0.4 WMIN 1.5
! Emin Rmin
! (kcal/mol) (A)
H 0.0440 -0.0498 0.8000
CH2E 1.77 -0.1142 2.235 1.77 -0.1 1.9
CH3E 2.17 -0.1811 2.165 1.77 -0.1 1.9
OH1 0.8400 -0.1591 1.6000
HBOND AEXP 4 REXP 6 HAEX 0 AAEX 0 NOACCEPTORS - HBNOEXCLUSIONS ALL -
CUTHB 0.5 CTOFHB 5.0 CTONHB 4.0 CUTHA 90.0 CTOFHA 90.0 CTONHA 90.0
H* N% -0.00 2.0 ! WER potential adjustment
H* O* -0.00 2.0
END
! read the sequence of one residue, read sequence card
* ETP
*
1
ETP
! Generate the hybrid molecule. Note that we use the NOANGLE
! command because of the AUTOGENERATE ANGLES command in the RTF ! file.
GENERATE ETP SETUP NOANGLE
! determine the geometry and coordinates
IC SEED 1 C1 1 C2 1 O1
IC PARAM
IC PURGE
IC BUILD
! The Hybrid molecule is built. Now set up the FES stuff.
TSM
! Assign reactant list:
REAC sele etp 1 O1 .or. etp 1 H1 end
! Assign product list:
PROD sele etp 1 C2 end
! Set lambda - we will use TI or TP.
! The lambda dependence of the Hamiltonian will be linear.
! This is the default and the POWEr 1 command is actually
! unnecessary.
LAMBda .125 POWEr 1
! The common methyl group is a co-located chargeatom. Since
! the charge in the rtf was for the reactant the RCHArge
! command is actually unnecessary.
co-located chargeETM 1 C2 PCHArge 0. RCHArge 0.265
! This is a thermalization run - so no save statement.
! Just terminate the FES setup with an END statement.
END
! Set up dynamics.
! Since we are interested in the thermodynamic properties and
! not the dynamics, we can use Langevin heat bath dynamics to
! maintain title
* etp: Ethanol To Propane
*
! a simple expedient
shake bond angle
! Set-up Langevin dynamics for temperature control
scalar fbeta set 50.0 sele .not. hydrogen end
! open restart file for output
open unit 3 write form name etp0.res
dynamics langevin timestep 0.001 nstep 10 nprint 2 iprfrq 2 -
firstt 298.0 finalt 298.0 twindl -5.0 twindh 5.0 -
ichecw 1 teminc 60 ihtfrq 20 ieqfrq 200 -
iasors 0 iasvel 1 iscvel 0 -
iunwri 3 nsavc 0 nsavv 0 iunvel 0 -
iunread -1-
!{* Nonbond options *}
inbfrq 10 imgfrq 10 ilbfrq 0 tbath 300.0 rbuffer 0.0 -
eps 1.0 cutnb 8.0 cutim 8.0 ctofnb 7.75
stop
Detailed descriptions of the commands provided in this example can be found in the CHARMm Dictionary.

Setup for slow-growth method

If both reactants and products exist, set up a slow-growth thermalization with the LAMBDA command (not the SLOW command), and use a value a little away from 0 (for the 0 to 1 transformation) or 1 (for the 1 to 0 case). Run slow-growth production dynamics.

You can switch direction in one of two ways. The easiest way is to issue the SLOW command, as follows:


SLOW LFROM 1. LTO 0. TEMP 298. 

Note that the calculation is transforming from the product to the reactant.

The more difficult alternative is to switch the reactant and product designations. If you do this, remember to switch the PCHARGE and RCHARGE values for the co-located charge atoms. Use of the RCHARGE parameter is mandatory because the default is to assume that the charge in the RTF (or set by a SCALAR command) is the REACTANT charge.


Applying scaling

Because approximations to ensemble averages are obtained from finite-length trajectories, determining values of these quantities becomes computationally intractable. In both the thermodynamic integration and thermodynamic perturbation methods, calculating the dynamics trajectory is generally problematical, with large movements of the atoms resulting in bad van der Waals contacts and fraying of bonds with approaching zero or one.

Another way of viewing the situation is that at = 0 or 1, the product or reactant atoms, respectively, do not yet exist. Doing a simulation to ' or viewing the derivative ¹H/¹ as a simulation to + requires having coordinates of atoms that do not exist either yet or any longer.

Thermodynamic integration method

The thermodynamic integration integral over tends to diverge when linear scaling is used. This is the result of the fact that, as approaches zero or one, the affected atoms (mostly product or reactant atoms and sometimes the environment atoms bonded to them) feel forces that approach zero and thus can have positions anywhere in phase space.

Non-linear scaling for the thermodynamic integration method can be used to avoid this difficulty. This scaling method has one major advantage. At = 0, the components of the derivatives ¹H()/¹ due to the product part of the Hamiltonian are identically zero. Similarly, at = 1, the components due to the reactant part of the Hamiltonian are zero. This solves the -goes-to-zero problem.

Another way to avoid the problem is to scale the thermodynamic integration method integral by a function that reduces the weight of the integrand as approaches 0 or 1.

If the thermodynamic integration method with non-linear scaling is used, a command should be issued to delete the product atoms from the hybrid molecule RTF prior to the free energy simulation commands at = 0. For example:


DELEte ATOMs SELEct etp 1 C2 END 

This is a standard CHARMm PSF modification command and is issued after segment generation.

Furthermore, the free energy simulations command sequence changes slightly from the ethane to propane example, as follows:


TSM
REAC sele etp 1 O1.or. etp 1 H1 end
! no product atom at lambda = 0
PROD NONE
! non-linear lambda scaling
LAMBda .125 POWEr 2
END
Because no product atoms exist at = 0, the PROD NONE command is issued. Also, there is no need for the COLO command. For = 1, an equivalent procedure is used.

Thermodynamic perturbation method

To get around the same problem when you use the thermodynamic perturbation method with linear scaling, do not run dynamics at = 0 or 1. Instead, run dynamics at values of a small distance away from 0 or 1 and extrapolate down to the endpoints.

One problem can occur with this procedure. In cases involving transformation of hydrophobic solute in aqueous solution, where water structure rearrangements around the solute are the major contributing factor to the free energy change, not sampling at = 0 or 1 can mean that the significant part of phase space for the rearrangement is not adequately sampled. If, in going from reactant to product (or vice versa), a significant volume becomes newly accessible to the solvent, the presence of the r-12 repulsive forces from the almost (but not completely) disappeared atoms can conceivably prevent the necessary configurations of the water molecules from appearing in the finite length trajectory. This problem has not been fully investigated.

Other considerations

Non-linear scaling may be preferred for sampling efficiency. However, problems can result because the monotonicity of the integrand in the thermodynamic integration method integral is no longer assured. In this method, non-linear scaling forces result in very small perturbations from to ', and the non-linear exponent makes ýV( to ') very large. For example, if the exponent is 6, is 0.5 and ' is 0.25 (a reasonable window), the potential energy term for the product is multiplied by 0.56 (giving 0.16 for ) and by 0.256 (giving 0.00024 for '). You end up with the term exp(-(0.16V¢ - 0.00024V¢) in the ensemble average, causing extremely slow convergence.

For the temperature derivative related properties, you also run into problems with floating-point overflows.

To deal with these difficulties, add a SAVE statement to the free energy simulation commands before the END statement in production runs. Repeat the procedure for more points. Then, do a thermalization run at each value of , followed again by production runs.

The advantage of this implementation is that you can always run additional trajectories at any given value of and add the output to previous runs. You can also insert trajectories at other values of and recalculate thermodynamic properties.


Post-processing data

You must post-process output files so that you can compute free energy changes. For purposes of this discussion, assume you are working with three free energy simulation output files: etp1.prt, etp2.prt, and etp3.prt for = 0.125, 0.500, and 0.875, respectively. For comparison purposes, assume there are three additional files named etp4.prt, etp5.prt, and etp6.prt for = 0.125, 0.500, and 0.875, respectively.

The following input file processes output files:


* Post-processing Example ETP: ethanol -> propane vacuum.
* TP method linear lambda scaling.
*
! open FES data files for input.
open unit 10 form read etp1.prt
open unit 11 form read etp2.prt
open unit 12 form read etp3.prt
open unit 13 form read etp4.prt
open unit 14 form read etp5.prt
open unit 15 form read etp6.prt
!
! now the post-processing input
!
TSM POST PSTAck 6 PLOT
! lambda = .125 -> lambda' = 0.
PROC FIRST 10 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .125 -> lambda' = 0.25
PROC FIRST 10 NUNIT 2 LAMB 0.25 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .5 -> lambda' = 0.25
PROC FIRST 12 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .5 -> lambda' = 0.75
PROC FIRST 12 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .875 -> lambda' = 0.75
PROC FIRST 14 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .875 -> lambda' = 1.0
PROC FIRST 14 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
!
! the END command tells the post-processor to tally everything ! up.
END
STOP
An explanation of the commands in this script can be found in the CHARMm Dictionary.

Removing correlation dependence

A major problem with free energy simulations is that the data are very highly correlated. To get around this, the data are divided into bins, and deviations of the bin averages from the total trajectory average are calculated to get a variance. Using bin averages removes the correlated dependence of the variance.

In the preceding example, a bin size of 100 is used. The estimated error depends on the choice of bin size. You hope to get convergence as a function of bin size. If you don't, trajectory lengths are probably not long enough.

Property variance

A method for determining the variance of a property that uses correlation functions has been developed (Straatsma et al. 1986). This method looks promising as long as you visually monitor the correlation function behavior and extrapolate at the longer lag times. However, an error in the approximation of the correlation function is introduced. The developers of this method use an exponential extrapolation to overcome this error.

When the variance of the ensemble averages is calculated, the uncertainty in thermodynamic properties is determined by propagation of errors. For thermodynamic integration, necessary derivatives are determined by numerical differentiation. Total uncertainties are determined by summing the variances for each window.


Using umbrella sampling

When you want to sample torsional minima separated by barriers that make transitions infrequent (> 1 kT), you can use umbrella or importance sampling. Use a modified potential with lower barriers, then correct the approximation to the ensemble average with the actual potential energy at the end:

Eq. 18     

Where:

w exp(-(Vactual - Vsurrogate))

<A> The corrected ensemble average of property A

The ensemble averages on the right-hand side of the equation are those that result from having the surrogate potential energy term in the Hamiltonian.

In the current implementation, an umbrella correction is available for the ensemble averages used to calculate ýA, ýE, and ýS in the free energy simulations. It is presently limited to modifying the three-fold torsional term.

CHARMm assumes that the surrogate potential is the one in the parameter file. If there are dihedral angles of the same type as the one that is to be subjected to umbrella sampling, and you do not want to treat the angles the same way, you must give different type names to the atoms and modify the parameter file.

Using the umbrella command

Assume your study is the transformation of n-butane into propane in aqueous solution or vacuum and the hybrid molecule has a segment name of btp. The umbrella command looks like this:


UMBR btp 1 C1 btp 1 C2 btp 1 C3 btp 1 C4 VACT 1.6 

The four atoms of the butane dihedral are specified and the 3-fold term for the actual potential is given. The surrogate term is present in the parameter file. If the molecule had more than one dihedral angle around the same central axis, all are specified in separate umbrella commands.

Invoking the umbrella command causes the free energy simulation output to have an additional field (that is, the W term in Eq. 18). The header line after the title in the data file has a field that indicates this additional field is present.

For post-processing, you must add the flag parameter UMBR to each process command to indicate that the umbrella correction is to be applied.


References

Fleischman, S. H. and Brooks III, C. L., "Thermodynamics of Aqueous Solvation: Solution Properties of Alcohols and Alkanes," J. Chem. Phys., 87 3029 (1987).

Mezei, M. and Beveridge, D. L., "Free Energy Simulations," Annals of the New York Academy of Sciences, 482 1 (1986).

Straatsma, T. P., Free Energy Evaluation by Molecular Dynamics Simulations, Ph.D. dissertation, University of Groningen, Netherlands, (1987).

Straatsma, T. P., Berendsen, H. J. C., and Stam, A. J., Molecular Physics, 57 89 (1986).




Last updated September 18, 1998 at 04:03PM PDT.
Copyright © 1997, Molecular Simulations, Inc. All rights reserved.