CHARMm Principles



3       Performing Energy and Force Calculations

The potential energy for any molecular structure can be computed in CHARMm given a PSF, a full parameter set with appropriate force constants and equilibrium geometries, and Cartesian coordinates. In addition to the energy, forces on each of the atoms in the molecular system can also be computed. Force calculations are particularly important during molecular dynamics simulations.

This chapter describes the CHARMm energy function and energy minimization algorithms that can be used to refine structures or locate stationary points for conformational searching for dynamics simulations.

This chapter explains

Script examples are included throughout the chapter.


Applying the CHARMm energy function

CHARMm uses a flexible and comprehensive empirical energy function that is a summation of many individual energy terms. The energy function is based on separable internal coordinate terms and pairwise nonbonded interaction terms (see Brooks et al. 1983).

The total energy is expressed by the following equation:

Potential Energy = Ebond + Etheta + Ephi + Eimpr + (internal)
Eelec + Evdw + (external)
Econs + Euser + Eother (special)

Evaluation of the energy of a molecular system is used as a step in minimization or dynamics. Additionally, the computed energy can be used for conformation comparisons, computing thermodynamic properties, calculating interaction energies and forces, projecting forces onto internal coordinates, evaluating structures during conformational searching, and computing sublimation energies.

In order to perform all of these functions, you are afforded a great deal of flexibility in using the empirical energy function. Indeed, almost everything can be customized, from parameters and the specification of individual energy terms to user-supplied energy routines.

Internal energy terms

Internal energy terms are listed as specified in residue topology files and include:

Eq. 1     

Eq. 2     

Eq. 3     

Where n = 1, 2, 3, 4, 6

Eq. 4     

The first two terms account for bond length and bond angle deformations. Harmonic approximation used for bond stretching and angle bending terms is valid at ordinary temperatures and in the absence of chemical reaction.

The dihedral angle (torsion) energy term is a four-atom potential based on the dihedral angle about an axis defined by the middle pair of atoms.

The improper torsional potential is designed to maintain chirality about a tetrahedral extended heavy atom, such as an amino acid alpha carbon without an explicit hydrogen, and to maintain planarity about sp2 hybridized atoms, such as carbonyl carbons with a quadratic distortion potential. Without such a term, out-of-plane potentials tend to be quartic. Additionally, this term provides a better force field near the minimum energy geometry, a consideration that is important for dynamics calculations and vibrational analysis.

External energy terms

External energy terms include:

Eq. 5     

For distance-dependent, shifted groups and extended modifications of this equation, see Brooks et al. (1983).

Eq. 6     

For a detailed explanation of this equation, see Brooks et al. (1983).

Nonbonded energy terms

Electrostatic and van der Waals interactions together are called nonbonded interactions.

Computing electrostatic potential

Electrostatic potential is computed using the partial atomic charges provided in the RTF. Several different approaches are included in CHARMm for treating the electrostatic interactions:

resulting in the electrostatic interaction of the form

.

This option is intended as an approximate way to deal with solvent affects without explicitly including solvent.

This approach is designed particularly for simulations where solvent is being included explicitly and the details of solvation and solvent polarization are of interest.

Computing van der Waals energy

The van der Waals energy is approximated in the CHARMm empirical energy function by the Lennard-Jones potential energy function. This function is often referred to as a 6-12 potential because the attractive force is proportional to R-6, while the repulsive force is modeled by R-12.

Nonbonded list

External terms are defined by nonbonded and hydrogen-bonded lists specified by the user. Nonbonded interactions refer to van der Waals terms and electrostatic interactions between atom pairs.

To maximize the efficiency of the nonbonded calculations, a list is created that contains all pair interactions to be considered. Atom pairs are not included in the list if they are too far apart (beyond the long-range cutoffs), or if they are directly connected (either through a bond or a bond angle). In the latter case, the atoms will be present in the excluded list.

The list generation approach was chosen over other alternatives (such as a pairwise search) for two reasons:

The list of atoms is updated periodically during minimization or dynamics to account for changes in atomic positions. If electrostatic groups are used, the list is stored in terms of group pairs. This leads to improved efficiency and a smaller list, as well as the ability to handle long-range group interactions.

The nonbonded calculation encompasses two steps: list generation and energy computation. The same nonbonded specification should be used for all energy calculations within a given project. Should it be necessary to use separate nonbonded specifications, the implications of these changes must be carefully evaluated.

The nonbonded list is primarily generated with the UPDATE command, although it can be generated or modified by providing nonbond-list specifications with other commands that include the following:

Updating nonbonded lists

During an energy minimization or molecular dynamics simulation, the nonbonded list can be updated according to a frequency that you set. The INBFRQ keyword is used to indicate the frequency for updating the nonbonded list. Three choices are available:

The heuristic test for determining whether the nonbonded list needs updating follows this procedure:

Nonbonded cutoffs

For a typical macromolecule, nonbonded interactions represent the bulk of the energy evaluation time. The efficiency of the nonbonded calculation is increased by including only atom pairs that are closer than the cutoff distance in a nonbonded list. The cutoff distance criterion, while computationally efficient, has some disadvantages. The possibility always exists that the overall nonbonded interactions at long range are not negligible because such interactions are very numerous. However, the slope of the van der Waals and electrostatic potentials in this region is very small. Because the nonbonded interaction vanishes at a greater rate than the total energy, the cutoff makes very little difference for most calculations.

A more significant disadvantage is that a discontinuity in the energy function results at the cutoff distance. For energy minimization of molecular dynamics calculations, this discontinuity can significantly distort computational results. For example, the total energy of the molecular system may not be conserved under such circumstances.

To resolve these difficulties, both the van der Waals and electrostatic energy terms can be modified by techniques that smooth the function around the cutoff distance, as described in the following paragraphs:

Two variables corresponding to the start and finish of the switching regions (ron and roff, respectively) are required to define the switching function. Depending on the value of rij, the following values are used to multiply individual electrostatic or van der Waals energy terms:

However, energy can still be significant at the cut-off distance and this can result in artificially large forces at long range. This is especially true for relatively short (that is, less than 12 Å) cutoff distances and small buffer regions (that is, when roff - ron < 3 Å).

Eq. 7     

One disadvantage to this function is that it has a discontinuity in the second derivatives at the cutoff distance.

All of these options are available as part of the empirical energy function in CHARMm.

Dielectric constant

The dielectric constant is an experimentally derived property of the bulk solvent that reflects the polarizability of the solvent molecules. A polarizable solvent such as water has a greater dielectric constant than less polar liquids. Electrostatic interactions in such high dielectric polarizable solvents are greatly attenuated.

A dielectric constant of 1.0 is sufficient for solvated simulations that include an atomic polarizability term. Because most calculations do not include these factors, a number of models have been used to simulate the dielectric behavior for such systems.

The EPS keyword is used to specify an explicit value for the dielectric constant. A relatively large dielectric constant can be used for simulating the aqueous environment of small systems. However, most calculations on molecular systems use a smaller dielectric constant. For example, a dielectric constant between 2.0 and 10.0 has been employed for simulations in the interior of a protein.

For electrostatic interactions in closely packed molecules, the number of solvent molecules between two interacting charges is usually less than that which would be experienced in the bulk solvent. Thus, the dielectric constant for the micro solvent environment is often not as great as for the bulk solvent. To simulate this effect, a distance-dependent dielectric constant can be used. This constant also acts as an approximate solvent screening term in which the electrostatic interaction experiences a greater and greater attenuation as the two charges are separated. The RDIE keyword is used to specify the use of this distance-dependent dielectric constant in energy calculations.

Hydrogen bonds

The generation of hydrogen bonds is one of the major steps in evaluating the energy of a system. The process of hydrogen bond generation involves looking at all possible pairs of hydrogen bond donors and acceptors and selecting those that are good. The meaning of good is determined by the parameters described below. The generation routine is also responsible for constructing positions for all uncoordinated hydrogens and adding them to the coordinate list.

The selection of hydrogen bonds involves three checks:

Because there are cutoffs involved with the selection of hydrogen bonds, and because the hydrogen bond list must be updated during dynamics and energy must be conserved, switching functions are needed to smooth the transition over a cutoff. The specification of hydrogen bond generation also allows specification of switching function parameters.

However, certain choices for the hydrogen boned selection process will never conserve energy in a dynamics run. This is true, for example, if you use the extended-atom representation for your calculations.

The function used in CHARMm to calculate the hydrogen bond energy is:

Note that hydrogen bond energy is not included as a default energy term. The current CHARMm parameter set has been derived in such a way that hydrogen bond effects are described by the combination of electrostatic and van der Waals forces.

Other energy terms

Other (special) energy terms include:

Atom constraint -- Used to rigidly maintain the position of certain atoms and delete the energy terms involving only these atoms. This can serve as an effective way of simulating a small part of a very large system with relative ease and increased efficiency.

Atom harmonic constraint -- Used primarily to avoid large displacements of atoms when minimizing, often when bad contacts are present, while still allowing the structure to relax.

Dihedral constraint -- Used to maintain certain local conformations or to examine a series of different conformations when making potential energy maps.

NOE constraint -- Used to incorporate NOE constraints (representing experimentally derived inter-proton distances) into the energy function. This is an especially important technique for generating structures for molecules in aqueous solution of lipid membranes as well as for gas phase studies.

Euser -- CHARMm also allows the addition of a user-supplied routine to the energy function that calculates an arbitrary energy term.

Example: Calculating the initial energies of enkephalin

This example uses the same model, enkephalin, as does Example: Constructing a model using generated internal coordinates. In fact, if you created and saved a PSF, this script could be modified to use the same PSF.

This script begins by generating a PSF and coordinates for enkephalin. The coordinates are copied to the COMP array. The energy of the system is then calculated in a number of different ways:

The following files are used in this example:

AMINO.BIN

PARM.BIN


*...
* Copyright (c) 1994
* Molecular Simulations Inc.
* All Rights Reserved
*
! Open and read an all hydrogen amino acid topology file
OPEN READ UNIT 11 FILE NAME "$CHM_DATA/AMINOH.BIN"
READ RTF UNIT 11 FILE
CLOSE UNIT 11
! Open and read the parameter file
OPEN READ UNIT 12 FILE NAME "$CHM_DATA/PARM.BIN"
READ PARA UNIT 12 FILE
! Read the enkephalin sequence
READ SEQUENCE CARD
* Pentapeptide sequence for met-enkephalin
*
TYR GLY GLY PHE MET
! Generate the PSF with a segment identifier of ENKP.
GENERATE ENKP SETUP
! Construct initial Cartesian coordinates
IC PARAMETERS
IC SEED 1 N 1 CA 1 C
! Copy coordinates to comparison set
COOR COPY COMP
! Update nonbonded lists
UPDATE RDIE SHIFT VSHIFT
! Get energy using standard routines
FAST 0
ENERGY
ENERGY COMP
GETE PRINT
FAST 1
! Get energy using optimized routines
ENERGY
! Get nonbonded energy only
SKIPE ALL EXCL VDW ELEC
ENERGY
! Restore default energy terms
SKIPE EXCL ALL
ENERGY
STOP


Minimizing energy

The goal of energy minimization is to find a set of coordinates representing a molecular conformation such that the potential energy of the system is at a minimum. As a consequence of many degrees of freedom for even the simplest of macromolecules, this task can be computationally quite difficult.

CHARMm has five different minimization methods, all working in Cartesian space, that provide a flexible array of iterative methods to facilitate energy minimization. Although the resulting conformation may only represent a local minimum, even macromolecules can be energy minimized efficiently using a number of these techniques.

All of the minimization methods take a molecular structure to a local minimum in the potential energy surface. There is no guarantee that this will be a global minimum. Small molecular systems can be minimized to a global minimum, but multiple runs from different starting points should be done to confirm that a global minimum has indeed been found.

With macromolecules, a very low probability exists that a local minimum will be the global minimum. In fact, a global minimum may never be found because of the complexity of the potential energy surface.

Using minimization

Minimization used as part of a programmed conformational search can yield many unique minima that can be useful in understanding large parts of the molecule's conformational space.

For example, minimization is an important tool in analyzing proteins that are generated through site-directed mutagenesis. After substituting, inserting, or deleting residues in a sequence, minimization, along with side-chain conformation scanning, can be used to determine whether the resulting mutuant structure is very much perturbed with respect to the wild type. If the perturbation is minimal, it is possible to model the structure of the mutant protein without resorting to X-ray diffraction studies.

Minimization methods

Each of the minimization methods available in CHARMm, together with implementation considerations is listed below:

When one or more negative eigenvalues exist, a blind application of the equations will find a saddle point in the potential. To overcome this problem, a single additional energy and gradient determination is performed along the eigenvector displacement for each small or negative eigenvalue. From this additional data, the energy function is approximated by a cubic potential and the step size that minimizes this function is adopted.

The advantages of this algorithm are that it avoids saddle points in the potential energy surface and converges rapidly when the potential is nearly quadratic.

The major disadvantage is that large computational requirements makes this technique time consuming and memory demanding for large molecules. Additionally, it is currently not possible to use SHAKE (described in Chapter 5, Setting Constraints and Periodic Boundaries) or images with this method.

This method performs energy minimization using a Newton-Raphson algorithm applied to a subspace of the coordinate vector spanned by the displacement coordinates of the last positions. The second derivative matrix is constructed numerically from the change in the gradient vectors, and is inverted by an eigenvector analysis that allows the routine to recognize and avoid saddle points in the energy surface. At each step, the residual gradient vector is calculated and used to add a steepest descent step, incorporating new direction into the basis set.

This method is the method of choice for most applications. Because it avoids the large storage requirements of the full NRAP second derivative method, larger systems can be minimized more efficiently.

For a general discussion of minimization methods, see Fletcher (1969) and Press et al (1987). For specific discussion of the conjugate gradient method, see Fletcher and Reeves (1964). For specific discussion of the Powell method, see Powell (1977) and Gunsteren and Karplus (1980).

Convergence criteria

As minimization is proceeding, CHARMm computes the values of several terms that can be monitored for energy convergence. These are:

If any of these terms is smaller than the default or the user-defined tolerance, minimization will stop.

Note that although a zero RMS gradient is a necessary condition for a minimum, it is not a satisfying condition (that is, saddle points on the potential energy surface have zero gradient but are not minima).

Minimization requirements

Because all energy minimizations involve calculating the potential energy of the system, you must have a PSF, coordinates, and a parameter file available. Hydrogen bonded and nonbonded lists must also be created prior to any energy evaluation and subsequent minimization.

Example: Minimizing enkephalin

Energy minimization is used in this example to relax the initial Cartesian coordinates of a pentapeptide generated from idealized internal coordinates.

The following files are used or generated in this example:

AMINO.BIN

PARM.BIN

ENKPINI.CRD


*...
* Copyright (c) 1994
* Molecular Simulations Inc.
* All Rights Reserved
*
! Open and read the topology and parameter files (binary)
OPEN READ UNIT 11 FILE NAME "$CHM_DATA/AMINOH.BIN"
READ RTF UNIT 11 FILE
CLOSE UNIT 11
OPEN READ UNIT 12 FILE NAME "$CHM_DATA/PARM.BIN"
READ PARAMETERS UNIT 12 FILE
CLOSE UNIT 12
! Read the sequence information and generate the PSF
READ SEQUENCE CARDS
* Enkephalin sequence
*
5
TYR GLY GLY PHE MET
GENERATE ENKP SETUP
OPEN READ UNIT 08 CARD NAME ENKPINI.CRD
READ COOR CARD UNIT 08
CLOSE UNIT 08
FAST 1
UPDATE RDIE SHIFT VSHIFT CUTNB 16.0 INBFRQ 10 IHBFRQ 0
! First set of minimization
MINIMIZE CONJ NSTEP 100 NPRINT 10
! Second set of minimization
MINIMIZE ABNR NSTEP 200 NPRINT 10
! Write out the minimize coordinates
OPEN WRITE UNIT 04 CARD NAME ENKPMIN.CRD
WRITE COORDINATES CARD UNIT 04
* Minimized enkephalin coordinates
*
STOP

Example: Minimizing crambin

This example illustrates how CHARMm can be used to take an initial crystal structure for a protein and compute the RMS differences of internal coordinates before and after energy minimization.

The following files are used or generated in this example:

AMINO.BIN

PARM.BIN

CRNINI.CRD

CRNMIN.CRD


*...
* Copyright (c) 1994
* Molecular Simulations Inc.
* All Rights Reserved
*...
* This CHARMm command file generates a PSF for crambin
* and performs several minimizations on the structure
* and computes the changes relative to the starting
* coordinates.
*...
*
! Open and read an all hydrogen amino acid topology file (binary)
! and parameter file
OPEN READ UNIT 11 FILE NAME "$CHM_DATA/AMINOH.BIN"
READ RTF UNIT 11 FILE
CLOSE UNIT 11
! Open and read the parameter file (binary)
OPEN READ UNIT 12 FILE NAME "$CHM_DATA/PARM.BIN"
READ PARA UNIT 12 FILE
CLOSE UNIT 12
! Read the sequence information and generate the PSF
READ SEQUENCE CARDS
* Sequence for crambin as taken from PDB file
*
46
THR THR CYS CYS PRO SER ILE VAL ALA ARG SER ASN PHE
ASN VAL CYS ARG LEU PRO GLY THR PRO GLU ALA ILE CYS
ALA THR TYR THR GLY CYS ILE ILE ILE PRO GLY ALA THR
CYS PRO GLY ASP TYR ALA ASN

! Generate the PSF with a segment identifier of 1CRN.
! and set up the internal coordinate tables.
GENERATE 1CRN SETUP
! Patch disulfide bonds in to the PSF
PATCH DISU 1CRN 3 1CRN 40 SETUP
PATCH DISU 1CRN 4 1CRN 32 SETUP
PATCH DISU 1CRN 16 1CRN 26 SETUP
! Read the PDB coordinates
OPEN READ UNIT 13 CARD NAME 1CRN.PDB
READ COORDINATES PDB UNIT 13
CLOSE UNIT 13
! Build and optimize all hydrogen positions
HBUILD
! Build any remaining coordinates
IC FILL PRESERVE
IC PARAMETERS
IC BUILD
! Setting up comparison coordinate set for future analysis
COOR COPY COMP
! Write the coordinates to disk
OPEN WRITE UNIT 13 CARD NAME CRNINI.CRD
WRITE COORDINATES CARD UNIT 13
* Crambin initial coordinates
*
UPDATE RDIE SHIFT VSHIFT
! First set of minimization
MINIMIZE SD NSTEP 100 INBFRQ -1 CUTNB 30 STEP 0.010 NPRINT 10
! Comparisons based on RMS
COOR RMS MASS
! Second set of minimization
MINIMIZE SD NSTEP 150 INBFRQ -1 CUTNB 30 STEP 0.0050 NPRINT 10
! Comparisons based on RMS
COOR RMS MASS
! Internal coordinate comparisons; select and print
! only the phi/psi internal coordinates
IC FILL
IC DIFF
IC KEEP SECO SELE ATOM * * N .OR. ATOM * * CA END
IC KEEP THIR SELE ATOM * * CA .OR. ATOM * * C END
IC DELE FOUR SELE ATOM * * O END
IC PURG
IC PRIN
! Write out crambin minimized coordinates
OPEN WRITE UNIT 04 CARD NAME CRNMIN.CRD
WRITE COORDINATES CARD UNIT 04

* Crambin coordinates after 250 steps of SD minimization
*
STOP


References

Brooks, B. R., Bruccoleri, R. E., Olason, B. D., States, D. J., Swaminathan, S., and Karplus, M., J Comp. Chem., 4 187 (1983).

Fletcher, R. (ed.), Optimization, Academic Press, New York and London, 1969.

Flectcher, R. and Reeves, C. M., The Computer Journal, 7 149 (1964).

Powell, M. J. D., "Restart Procedure for the Conjugate Gradient Method," Mathematical Programming, 12 241 (1977).

Press, W. H., Flannery, B. P., Teukolshy, S. A., and Vetrrling, W. T., Numerical Recipes: The Art of Scientific Computing, University Press, Cambridge, 1987.

Gunsteren, W. F. and Karplus, M., J. Comp. Chem., 1 266 (1980).




Last updated September 18, 1998 at 04:02PM PDT.
Copyright © 1997, Molecular Simulations, Inc. All rights reserved.