The development of ever-more-powerful computers, combined with the wide dissemination of modeling packages like AMBER, puts the power to perform valuable calculations in the hands of an increasingly large number of scientists. It is tempting to say that, given the increasing sophistication of such programs, all one needs is the appropriate hardware and software to perform good experiments.
But this is not the case. As modeling programs have grown more sophisticated, they have sprouted an ever-increasing array of options-options which must be properly chosen, if worthwhile results are to be obtained. And even if the options are appropriately set, one must ensure that the program is properly suited for the chosen application. Nowhere in AMBER is this more true than the GIBBS free energy module.
Here we discuss several issues which impinge on developing an appropriate GIBBS input file, and on interpreting the results produced. One is also strongly encouraged to review the literature referenced here and in the preface to the GIBBS program.
GIBBS version 4.0 offers five choices of method for calculating the free energy difference between two states. These include the general classes slow growth, free energy perturbation, and thermodynamic integration, as well as dynamically modified variants of the latter two. These were described in the introduction to GIBBS. As yet, it has not been shown conclusively what method is "best" for any particular type of problem.
At this time, the relative rates of convergence of the averaged quantities required by FEP and TI, which will directly impinge on the reliabilities of the two techniques, have not been determined.
Note that we approximate the integral using the trapezoidal algorithm, i.e.

This integration method should be reasonably accurate in most cases. But in case the user wishes to try their own integration scheme, setting ISANDE = 1 with TI will also force reporting of the values of and several other averages at every evaluation point (the other values reported relate to calculating the enthalpy/entropy, as described below).
GIBBS Version 4 allows the user to request that the enthalpy and entropy changes be reported in addition to the free energy (which is always reported). Two different schemes are used to calculate these quantities, depending on the free energy calculation method. Note that in either case, the enthalpy and entropy are necessarily dependent on being able to reliably extract small differences between averages of (often large) total system energies. In the case of free energy, on the other hand, we need only measure the average of a potential difference or a derivative. For this reason, enthalpy/entropy estimates are typically more than an order of magnitude less accurate than their free energy counterpart. One should be very cautious when interpreting them.
For FEP, the approximate equations derived in Ref. 4 are used. These approximate the required temperature derivatives by a finite difference. The equations used are derived from the FEP expression, and the sum of the resulting (enthalpy - T*entropy) will equal the reported free energy.
For TI, the enthalpy and entropy are evaluated using exact-form integral relationships presented in Ref. 5. The (enthalpy - T*entropy) calculated by this method will not necessarily equal the reported total free energy; the difference between the two quantities can be taken as a crude indication of the reliability of the enthalpy/entropy values. The integrals are approximated by the trapezoidal rule, as described above (Equation (5)).
By default (and without exception in older versions of Gibbs),
the optimal interaction
between two atoms i and j is
given by

This is fine when neither atom "vanishes" at either endpoint. But now consider the case where atom i vanishes at Then

Thus,
never gets smaller than
. At
the mixed well depth,
just slightly >0,
and suddenly a steric "gap" between
atoms i and j of
will be required. This can lead to sampling
inefficiencies. A better solution is to shrink
to a user-chosen small value as one of the atoms "vanishes". This is the
effect of variable IDSX0 (line 10).
The theory of DMW, and some exploratory applications, are described in Ref. (3). A sample input for GIBBS is shown below, follow by a few important explanatory notes.
(format compressed to fit page)
Line 14
We set ALMDEL = 0, ISLDYN=+2, IDIFRG=0, NSTMEQ=100, and
NSTMUL=200. This results in dynamically modified window FEP, with
100 steps of equilibration and 100 steps of data collection per window.
Line 14a:
On the next line, we set IAVSLP = 8, IAVSLM=2, and CORRSL=0.8.
This means that, at most, the 8 most recently calculated
accumulated_free_energy) points will be used in approximating the
slope. IAVSLM=2 means that as soon
as 2 points are available, we will calculate the slope from all available
points, until the maximum of 8 is reached. If the best-fit line through
the points fits the data with a correlation coefficient (CC) < 0.8, then
the number of points used in the current slope determination will
be halved, the slope and CC will be recalculated, and the comparison against
CC will be performed again. A minimum of two points are always used to calculate
the slope.
AMXMOV, which is set to 0.01 here, is the target change in free energy per window we are aiming for. The change on the next step is calculated as

Note that since we don't know a priori what the free energy versus curve will look like, we do not know exactly how many steps will be required to complete the simulation. The total number of MD steps required will depend both on AMXMOV and on NSTMEQ and NSTMUL (line 14). NSTLIM can be set to -1 on line 8 to force the program to continue until the total required number of steps have been performed. Also note that the value of AMXMOV used will often depend on the magnitude of the total anticipated free energy change. For example, one would not typically want to use AMXMOV=0.01 and NSTMEQ=NSTMUL=100 if the total energy change is 50 or 100 kcal/mole, as it can be for certain electrostatic changes.
line 14b:
IAVDEL < 0, which means that the
comparisons will not be used in scaling the widths of
windows. The viability and reliability of changes made using these types of
comparisons has not yet been established.
line 14c:
ALMDL0 is set to 1.0D-5. This means that the first IAVSLM window steps (before
we have enough points to calculate a slope) will be made with this
small step size. This step is chosen to be small in case the energy
is changing quickly in this region.
DLMIN is set to 1.0D-10. Typically, a value of DLMIN such as this would have no effect, since it is unlikely that the slope and AMXMOV would be such to require a step this small in the first place. At any rate, steps calculated to be smaller than DLMIN are reset to DLMIN. DLMIN can be valuable in some cases when one wishes to limit how slowly a simulation can progress.
DLMAX is set to 1.0D-2. Setting an appropriate value for DLMAX is important. If the G versus curve has any points of inflection, we might calculate a slope of approximately 0 at one or more points. In this case, the simple formula used to determine the next step size would indicate a very large step (as large as 1.0, the whole simulation length). This would be incorrect, as the slope could clearly turn significantly non-0 in a future range of DLMAX bounds the change in such cases.
AMXRST is set to 0.10. The slope we calculate is only an approximation of the "true" instantaneous slope, and the current slope is only an estimate of the slope over the next interval (window). Thus, it is possible that when we calculate the actual free energy change over the next window, it will be an unacceptable amount larger than the target value. In such a case, we may want to decrease the step size for this window and re-evaluate the energy. AMXRST is the largest allowable energy for a step. If the energy is > AMXRST, the stepsize is reduced, and the energy for the window recalculated. Note that setting AMXRST too close to AMXMOV will result not only in too many windows being reevaluated (inefficient), but can also lead to biased sampling.
ALMSTP(1) is set to -1.0. If 0 < ALMSTP(1) < 1.0, one can prescribe that the values of AMXMOV, DLMIN, DLMAX and AMXRST vary over different ranges in as described in the input discussion.
It is often of interest to determine the free energy difference between two states which differ in conformation, rather than in composition. For example, one might be interested in the free energy profile for rotation about a ring in a protein. Such a profile can be determined by performing a PMF simulation. To perform such a simulation, one must be able to define conformation as a function of lambda within the context of an otherwise free MD simulation. Fortunately, methods have been developed which allow selected internal coordinates to be constrained to chosen values, while otherwise affecting the MD trajectory only minimally. The best known of these is the SHAKE method for bond constraints. The methods of SHAKE have recently been extended to be generally applicable to angles and one can calculate the free energy changes that accumulate as the internal constraints are translated from those of the initial state to those of the final state. If one graphs the free energy changes as a function of the restraint target values (themselves a function of one gets the free energy profile for conformational changes.
Any constraint with a target value which is itself a function of will contribute to the free energy as lambda changes. This means that if SHAKE is used to constrain bonds of the perturbed group, and any of those bonds "grow" or "shrink" during the simulation, there will be a corresponding contribution to the free energy. In earlier work, this contribution has been overlooked, but we have shown that it must be included to reliably calculate free energies using the FEP The contribution in such a case can be calculated simply by setting NCORC=1.
Constraints other than SHAKE-en bonds can be defined by setting INTR > 0 (line 14) and providing the definitions after the standard input (see above). Any internal coordinate can be used; Be aware, however, that any internal coordinate which is part of a closed ring will present a special set of (often tricky) considerations (see below). In typical use, no compositional (or topological) change is performed when a PMF simulation is being carried out. A GIBBS-format topology file is still required from PARM or LEaP, though. An appropriate topology file with no atoms in the "perturbed group" can be generated by using the PERT option in PARM, but with no atoms defined in the pert group; i.e.
In general, PMF calculations within GIBBS may be performed with any method - FEP, TI or slow growth. (Before version 4.1, only FEP could be used for PMF calculations.) Note that there is one scenario where only the TI (with "constraint forces") method may be used: when any constrained internals whose target values change with lambda lie within a closed loop. The loop can either be part of the molecular topology, or as a result of the added topology of the constraint(s). To understand why neither FEP nor TI with "potential forces" can be used in such a case, you must recognize that for these latter methods, part of the procedure for calculating constraint contributions requires that we determine which atoms of the system are affected by a rigid body translation/rotation about the constrained internals. But the requisite set of atoms is not unambiguously defined when the constraint lies within a closed loop. Fortunately, the "constraint force" implementation of TI doesn't require that we make such a determination.
It is important to note that PMF calculations are typically very
compute-intensive. For FEP, Gibbs will determine which non-bonded pairs have an
interatomic distance which varies with one or more constraints, and only these
are re-evaluated in determining
. This helps reduce
the amount of computer time required for a FEP simulation, although the total
amount of time can remain high. The additional cpu overhead for calculating
constraint energies with TI is negligible in all cases.
While we have a good error check for some torsional PMF's (the free energy values after rotating 360° should be the same), we typically have no reliable way of determining that for other simulations enough sampling has carried out to determine a converged PMF curve. Our best guard against spurious results is careful consideration of the specific problem and the inherent relaxation timescales of the surroundings.
One of the thorniest issues related to free energy calculations is estimating the error in the At present, this error is typically estimated in one of four ways:
It must be understood that none of the above methods allows a completely reliable error estimate. At best, they provide a lower bound on the error. A large apparent error is a good indication that the results obtained are not appropriately converged. But a low apparent error does not necessarily indicate a converged and accurate simulation. This is clearly shown in Reference (7).
In "standard" operation, free energy changes in GIBBS are effected by transforming the potential representative of state 1 to that representative of state 2. The topology of the system does not change. To make atoms non-interacting at one of the endpoints, they are assigned zeroed non-bond and electrostatic parameters at this endpoint.
The improved mixing rules which can be used in GIBBS Version 4 (IOLEPS = 0, line 10) allow a second method to be used. One result of these new mixing rules is that if any pair of atoms "exist" only at mutually exclusive endpoints (e.g. atom i exists in state 1 but not state 2; atom j exists in state 2, but not in state 1), then effectively no non-bonded interactions are ever calculated between them. This means that, in lieu of the "standard" method which uses a single topology, we can define dual topologies, one corresponding to the endpoint, and the other corresponding to the endpoint. For the former topology, all non-bonded parameters would be defined to be 0 in the state. Similarly, all non-bonded parameters for the latter topology would be 0 at The two topologies would then never "see" each other at intermediate values of Defining dual topologies can aid in performing free energy calculations where bond connectivities must change. Dual topologies is the method incorporated into the "CHARMM" program.
On an efficiency basis, the relative merits of the two methods have not been established. Additional discussion of the two methods can be found in Ref. (7).