| CHARMm Principles |

This chapter explains
For an overview of free energy simulation methods, consult Mezei and Beveridge (1986), Straatsma (1987), or Fleischman and Brooks (1987).
In all of the methods for calculating relative free energy in CHARMm, a hybrid Hamiltonian is used:
Understanding the relative free energy Hamiltonian
Coupling parameter (extent of transformation)
For example, examine the calculation of the relative change in the solvation free energies of methanol and ethane. This example is taken from Fleischman and Brooks (1987).
The system is represented by water molecules (usually a box of 125 or more, using periodic boundary conditions) and the hybrid methanol/ethane system.

= 1.0) with the ethyl-methyl group charge. C2 is also considered an environment atom. The atoms of the water molecules constitute the actual environment atoms in this system. If the hybrid molecule were larger, it could contain additional environment atoms.
dependence of the Hamiltonian is more complex. Complexity is higher in those implementations where force constants and other parameters in the energy terms are factored by
rather than by calculating various energy terms and factoring those terms.
In a statistical mechanical sense, no particular reason exists to factor the Hamiltonian one way or another. The thermodynamic interactions hold regardless of path. The equipartition theorem for obtaining kinetic energy works just as well although, in some implementations, factoring of kinetic energy sometimes appears to be ignored.

Notice that all of the averages are at
. To get the total ýA:
You must ensure that the whole
range is covered. For example, in a methanol to ethane calculation, three
values could be used:
= 0.125, 0.500, and 0.875. To cover the range completely, you must perform a series of six simulations:
to
': 0.125 to 0.000 and 0.125 to 0.250;
Dynamics calculations are run and two ýA's are calculated: one lower and one higher than the corresponding
. This is termed double-wide sampling. Note that the simulations are joined at common
s.
The technique for running these simulations is called windowing. With windowing, simulations are run at a discrete number of
points. If very few points are used, long simulations are required in order to smoothly join the results from the trajectories. If many points are used (tens to hundreds), shorter simulations are required.
Thermodynamic integration method
For the thermodynamic integration method, the following expression is used:
Eq. 16
Integration is done by a quadrature method. In this implementation, the ensemble average is fitted as a function of
to a cubic spline polynomial and the polynomial is integrated analytically. No extrapolation to end-points is done.
. You will probably get better results by averaging energies at the endpoints and subtracting.
Slow-growth method
Slow growth is an approximation of the thermodynamic integration method. In this method, instead of
being constant for a given trajectory, the parameter varies monotonically with each time step, as indicated in Eq. 17:

i =
i+1 -
i
Some advantages are associated with this method:
Some disadvantages are also associated with this method:
is changed with every step, no method is available to tack on additional trajectories.

to
¢ can be calculated following analysis and additional points can be added, as appropriate.
The calculation is done in three steps:
1. Setting up a simulation
The first two steps occur in the same input file. The last step is generally done with a separate input file because the output of several trajectories is usually used.
To set up the free energy simulation input file, gather or generate the usual files for a dynamics simulation including a PSF, coordinates, parameters, and image or stochastic boundary condition input.
) potential. To remove bad contacts, minimization is generally done prior to dynamics, with the hybrid molecule unperturbed.
After the free energy simulation has been set up, dynamics can be run with no changes in the commands. Normally, you run some thermalization runs with data being discarded. For a thermalization run, the SAVE command in the free energy simulation command sequence is generally not used. For production runs using thermodynamic integration or thermodynamic simulation, the SAVE option must be issued in the free energy simulation input. This will result in the output of V
and V
¢ in a formatted file.
Example: Setting up an FES simulation and running dynamics
Below is a fragment of the input file for setting up the thermalization of an ethanol to propane hybrid. Windowing is used during the simulation.

For this example, the extended atom model is used, with aliphatic hydrogens considered part of the carbon to which they are attached. The example also provides for linear scaling of
, although non-linear
scaling can be specified.
*Ethanol -> PropaneDetailed descriptions of the commands provided in this example can be found in the CHARMm Dictionary.
*
! Read topology file
READ RTF CARD
* TOPOLOGY FILE ethanol -> propane
*
20 1! Version number
MASS 1 H 1.00800! hydrogen which can h-bond to neutral atom
MASS 13 CH2E 14.02700! - " - two
MASS 14 CH3E 15.03500 ! - " - three
MASS 53 OH1 15.99940 ! hydroxy oxygen
! This is put in to force the necessity of using a GENERATE
! Noangles in the input file. The standard topology files use
!this statement.
AUTOGENERATE ANGLEs
RESI ETP 0.000
GROU
C1 CH3E 0. ! environment atom
C2 CH2E 0.265 ! co-located charge atom
O1 OH1 -0.7 ! reactant atom
H1 H 0.435 C3 ! reactant atom note the non-bonded exclusion
! with GROU
C3 CH3E 0. ! product atom
BOND C1 C2 !environment term
BOND C2 O1 O1 H1 !reactant terms
BOND C2 C3 !product term
! the angles MUST be specified. Note the absence of O1 C2 C3
! between reactant and product atoms
ANGLe C1 C2 C3 !product term
ANGLe C1 C2 O1 C2 O1 H1 !reactant terms
! this will be a V(R) term.
DIHED C1 C2 O1 H1
! don't really need it but what the heck.
DONO H1 O1
ACCE O1
IC C1 C2 O1 H1 1.54 111. 180. 109.5 0.96
IC C2 O1 H1 BLNK 0. 0. 0. 0. 0.
IC C3 C2 C1 BLNK 0. 0. 0. 0. 0.
PATCH FIRST NONE LAST NONE
END
! Read parameter file
READ PARAM CARD
* parameter file for ETP hybrid.
BOND
CH2E CH3E 225.0 1.54
CH2E OH1 400.0 1.42
OH1 H 450.0 0.96
THETA
CH3E CH2E CH3E 45.0 112.5
CH3E CH2E OH1 45.0 111.0
CH2E OH1 H 35.0 109.5
PHI
CH3E CH2E OH1 H 0.5 3 0.0
NONBONDED NBXMOD 5 ATOM CDIEL SHIFT VATOM VDISTANCE VSWIT -
CUTNB 8.0 CTOFNB 7.5 CTONNB 6.5 EPS 1.0 E14FAC 0.4 WMIN 1.5
! Emin Rmin
! (kcal/mol) (A)
H 0.0440 -0.0498 0.8000
CH2E 1.77 -0.1142 2.235 1.77 -0.1 1.9
CH3E 2.17 -0.1811 2.165 1.77 -0.1 1.9
OH1 0.8400 -0.1591 1.6000
HBOND AEXP 4 REXP 6 HAEX 0 AAEX 0 NOACCEPTORS - HBNOEXCLUSIONS ALL -
CUTHB 0.5 CTOFHB 5.0 CTONHB 4.0 CUTHA 90.0 CTOFHA 90.0 CTONHA 90.0
H* N% -0.00 2.0 ! WER potential adjustment
H* O* -0.00 2.0
END
! read the sequence of one residue, read sequence card
* ETP
*
1
ETP
! Generate the hybrid molecule. Note that we use the NOANGLE
! command because of the AUTOGENERATE ANGLES command in the RTF ! file.
GENERATE ETP SETUP NOANGLE
! determine the geometry and coordinates
IC SEED 1 C1 1 C2 1 O1
IC PARAM
IC PURGE
IC BUILD
! The Hybrid molecule is built. Now set up the FES stuff.
TSM
! Assign reactant list:
REAC sele etp 1 O1 .or. etp 1 H1 end
! Assign product list:
PROD sele etp 1 C2 end
! Set lambda - we will use TI or TP.
! The lambda dependence of the Hamiltonian will be linear.
! This is the default and the POWEr 1 command is actually
! unnecessary.
LAMBda .125 POWEr 1
! The common methyl group is a co-located chargeatom. Since
! the charge in the rtf was for the reactant the RCHArge
! command is actually unnecessary.
co-located chargeETM 1 C2 PCHArge 0. RCHArge 0.265
! This is a thermalization run - so no save statement.
! Just terminate the FES setup with an END statement.
END
! Set up dynamics.
! Since we are interested in the thermodynamic properties and
! not the dynamics, we can use Langevin heat bath dynamics to
! maintain title
* etp: Ethanol To Propane
*
! a simple expedient
shake bond angle
! Set-up Langevin dynamics for temperature control
scalar fbeta set 50.0 sele .not. hydrogen end
! open restart file for output
open unit 3 write form name etp0.res
dynamics langevin timestep 0.001 nstep 10 nprint 2 iprfrq 2 -
firstt 298.0 finalt 298.0 twindl -5.0 twindh 5.0 -
ichecw 1 teminc 60 ihtfrq 20 ieqfrq 200 -
iasors 0 iasvel 1 iscvel 0 -
iunwri 3 nsavc 0 nsavv 0 iunvel 0 -
iunread -1-
!{* Nonbond options *}
inbfrq 10 imgfrq 10 ilbfrq 0 tbath 300.0 rbuffer 0.0 -
eps 1.0 cutnb 8.0 cutim 8.0 ctofnb 7.75
stop
Setup for slow-growth method
If both reactants and products exist, set up a slow-growth thermalization with the LAMBDA command (not the SLOW command), and use a
value a little away from 0 (for the 0 to 1 transformation) or 1 (for the 1 to 0 case). Run slow-growth production dynamics.
SLOW LFROM 1. LTO 0. TEMP 298.Note that the calculation is transforming from the product to the reactant.
The more difficult alternative is to switch the reactant and product designations. If you do this, remember to switch the PCHARGE and RCHARGE values for the co-located charge atoms. Use of the RCHARGE parameter is mandatory because the default is to assume that the charge in the RTF (or set by a SCALAR command) is the REACTANT charge.

approaching zero or one. Another way of viewing the situation is that at
= 0 or 1, the product or reactant atoms, respectively, do not yet exist. Doing a simulation to
' or viewing the derivative ¹H/¹
as a simulation to
+ 
requires having coordinates of atoms that do not exist either yet or any longer.Thermodynamic integration method
The thermodynamic integration integral over

tends to diverge when linear scaling is used. This is the result of the fact that, as
approaches zero or one, the affected atoms (mostly product or reactant atoms and sometimes the environment atoms bonded to them) feel forces that approach zero and thus can have positions anywhere in phase space.
= 0, the components of the derivatives ¹H(
)/¹
due to the product part of the Hamiltonian are identically zero. Similarly, at
= 1, the components due to the reactant part of the Hamiltonian are zero. This solves the
-goes-to-zero problem.
approaches 0 or 1.
scaling is used, a command should be issued to delete the product atoms from the hybrid molecule RTF prior to the free energy simulation commands at
= 0. For example:
DELEte ATOMs SELEct etp 1 C2 ENDThis is a standard CHARMm PSF modification command and is issued after segment generation.
Furthermore, the free energy simulations command sequence changes slightly from the ethane to propane example, as follows:
TSMBecause no product atoms exist at
REAC sele etp 1 O1.or. etp 1 H1 end
! no product atom at lambda = 0
PROD NONE
! non-linear lambda scaling
LAMBda .125 POWEr 2
END
= 0, the PROD NONE command is issued. Also, there is no need for the COLO command. For
= 1, an equivalent procedure is used. Thermodynamic perturbation method
To get around the same problem when you use the thermodynamic perturbation method with linear scaling, do not run dynamics at
= 0 or 1. Instead, run dynamics at values of
a small distance away from 0 or 1 and extrapolate down to the endpoints.
= 0 or 1 can mean that the significant part of phase space for the rearrangement is not adequately sampled. If, in going from reactant to product (or vice versa), a significant volume becomes newly accessible to the solvent, the presence of the r-12 repulsive forces from the almost (but not completely) disappeared atoms can conceivably prevent the necessary configurations of the water molecules from appearing in the finite length trajectory. This problem has not been fully investigated.
Non-linear scaling may be preferred for sampling efficiency. However, problems can result because the monotonicity of the integrand in the thermodynamic integration method integral is no longer assured. In this method, non-linear scaling forces result in very small perturbations from
to
', and the non-linear exponent makes ýV(
to
') very large. For example, if the exponent is 6,
is 0.5 and
' is 0.25 (a reasonable window), the potential energy term for the product is multiplied by 0.56 (giving 0.16 for
) and by 0.256 (giving 0.00024 for
'). You end up with the term exp(-
(0.16V
¢ - 0.00024V
¢) in the ensemble average, causing extremely slow convergence.
points. Then, do a thermalization run at each value of
, followed again by production runs.
and add the output to previous runs. You can also insert trajectories at other values of
and recalculate thermodynamic properties.
You must post-process output files so that you can compute free energy changes. For purposes of this discussion, assume you are working with three free energy simulation output files: etp1.prt, etp2.prt, and etp3.prt for 
Post-processing data
= 0.125, 0.500, and 0.875, respectively. For comparison purposes, assume there are three additional files named etp4.prt, etp5.prt, and etp6.prt for
= 0.125, 0.500, and 0.875, respectively.
* Post-processing Example ETP: ethanol -> propane vacuum.An explanation of the commands in this script can be found in the CHARMm Dictionary.
* TP method linear lambda scaling.
*
! open FES data files for input.
open unit 10 form read etp1.prt
open unit 11 form read etp2.prt
open unit 12 form read etp3.prt
open unit 13 form read etp4.prt
open unit 14 form read etp5.prt
open unit 15 form read etp6.prt
!
! now the post-processing input
!
TSM POST PSTAck 6 PLOT
! lambda = .125 -> lambda' = 0.
PROC FIRST 10 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .125 -> lambda' = 0.25
PROC FIRST 10 NUNIT 2 LAMB 0.25 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .5 -> lambda' = 0.25
PROC FIRST 12 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .5 -> lambda' = 0.75
PROC FIRST 12 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .875 -> lambda' = 0.75
PROC FIRST 14 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
! lambda = .875 -> lambda' = 1.0
PROC FIRST 14 NUNIT 2 LAMB 0.0 TEMP 298.0 DELT 10. BINS 100 CTEM
!
! the END command tells the post-processor to tally everything ! up.
END
STOP
Removing correlation dependence
A major problem with free energy simulations is that the data are very highly correlated. To get around this, the data are divided into bins, and deviations of the bin averages from the total trajectory average are calculated to get a variance. Using bin averages removes the correlated dependence of the variance.
A method for determining the variance of a property that uses correlation functions has been developed (Straatsma et al. 1986). This method looks promising as long as you visually monitor the correlation function behavior and extrapolate at the longer lag times. However, an error in the approximation of the correlation function is introduced. The developers of this method use an exponential extrapolation to overcome this error.
When the variance of the ensemble averages is calculated, the uncertainty in thermodynamic properties is determined by propagation of errors. For thermodynamic integration, necessary derivatives are determined by numerical differentiation. Total uncertainties are determined by summing the variances for each window.
When you want to sample torsional minima separated by barriers that make transitions infrequent (> 1 kT), you can use umbrella or importance sampling. Use a modified potential with lower barriers, then correct the approximation to the ensemble average with the actual potential energy at the end:
Using umbrella sampling
Eq. 18
Where:
(Vactual - Vsurrogate))
CHARMm assumes that the surrogate potential is the one in the parameter file. If there are dihedral angles of the same type as the one that is to be subjected to umbrella sampling, and you do not want to treat the angles the same way, you must give different type names to the atoms and modify the parameter file.
Using the umbrella command
UMBR btp 1 C1 btp 1 C2 btp 1 C3 btp 1 C4 VACT 1.6The four atoms of the butane dihedral are specified and the 3-fold term for the actual potential is given. The surrogate term is present in the parameter file. If the molecule had more than one dihedral angle around the same central axis, all are specified in separate umbrella commands.
Invoking the umbrella command causes the free energy simulation output to have an additional field (that is, the W term in Eq. 18). The header line after the title in the data file has a field that indicates this additional field is present.
For post-processing, you must add the flag parameter UMBR to each process command to indicate that the umbrella correction is to be applied.
Fleischman, S. H. and Brooks III, C. L., "Thermodynamics of Aqueous Solvation: Solution Properties of Alcohols and Alkanes," J. Chem. Phys., 87 3029 (1987). 
References