0508:physics0508095/man.tex

1: \documentclass[prb,twocolumn,showpacs,showkeys,preprintnumbers,amsmath,amssymb]{revtex4}

2: %\documentclass[prb,preprint,showpacs,showkeys,preprintnumbers,amsmath,amssymb]{revtex4}

3:

4: \usepackage{bm} %bold math

5: \usepackage{graphicx} %figures

6: \bibliographystyle{apsrev}

7:

8: \newcommand{\df}{$\Delta F$}

9: \newcommand{\dfrs}{$\Delta F_{\rm ref \rightarrow phys}$}

10: \newcommand{\dfsr}{$\Delta F_{\rm phys \rightarrow ref}$}

11:

12: \begin{document}

13:

14: \title{Simple estimation of absolute free energies for biomolecules}

15: \author{ F.\ Marty Ytreberg\footnote{E-mail: fmy1@pitt.edu} }

16: \author{ Daniel M.\ Zuckerman\footnote{E-mail: dmz@ccbb.pitt.edu} }

17: \affiliation{Department of Computational Biology,

18:   School of Medicine, University of Pittsburgh, Pittsburgh, PA 15261}

19: \date{\today}

20:

21: \begin{abstract}

22: One reason that free energy difference calculations are notoriously

23: difficult in molecular systems

24: is insufficient conformational overlap, or similarity, between

25: the two states or systems of interest. The degree of overlap is irrelevant,

26: however, if the absolute free energy of each state can be computed. We present

27: a method for calculating the absolute free energy

28: that employs a simple construction of an exactly computable

29: reference system which

30: possesses high overlap with the state of interest. The approach

31: requires only a physical ensemble of conformations generated via

32: simulation, and an auxiliary

33: calculation of approximately equal central-processing-unit (CPU) cost.

34: Moreover, the calculations can converge to the correct free energy value

35: even when the physical ensemble is incomplete or improperly distributed.

36: As a ``proof of principle,''

37: we use the approach to correctly predict free energies for

38: test systems where the absolute values can be calculated

39: exactly, and also to predict the

40: conformational equilibrium for leucine dipeptide in

41: implicit solvent.

42: \end{abstract}

43: \keywords{free energy,entropy}

44: \pacs{pacs}

45: \maketitle

46:

47: \section{Introduction}

48: Knowledge of the free energy for two different

49: states or systems of interest

50: allows the calculation of solubilities,

51: \cite{grossfield-jacs,vangunsteren-onestep}

52: determines binding affinities of ligands to proteins,

53: \cite{kollman-pnas,vangunsteren-estrogen}

54: and determines conformational equilibria

55: (e.g., Ref.\ \onlinecite{ytreberg-shift}).

56: Free energy differences (\df) therefore have potential

57: application in structure-based drug design where current

58: methods rely on {\it ad hoc} protocols to estimate binding affinities.

59: \cite{shoichet-nature,scheraga}

60:

61: Poor ``overlap,'' the lack of configurational

62: similarity between the two states or systems of interest,

63: is a key cause of computational expense and error in \df\ calculations.

64: The most common approach to improve overlap in free energy

65: calculations (used in thermodynamic integration, and free energy

66: perturbation) is

67: to simulate the system at multiple hybrid, or intermediate stages

68: (e.g., Refs.\ \onlinecite{zwanzig,beveridge,jorgensen,karplus-jcp,mccammon}).

69: However, the simulation of intermediate stages

70: greatly increases the computational cost of the \df\ calculation.

71:

72: Here, we address the overlap problem by calculating the absolute free

73: energy for each of the end states, thus avoiding the need for any

74: configurational overlap. Our method relies on the calculation of

75: the free energy difference between

76: a reference system (where the exact free energy

77: can be calculated, either analytically or numerically)

78: and the system of interest.

79:

80: A reference system with a computable free energy

81: has been used successfully for solids,

82: where the reference system is generally a harmonic or

83: Einstein solid, \cite{hoover71,frenkel}

84: and for liquid systems, where the reference

85: system is usually an ideal gas. \cite{hoover67,reinhardt-absf}

86: The scheme has also been applied to molecular

87: systems by Stoessel and Nowak, using a harmonic

88: solid in Cartesian coordinates as a reference system. \cite{stoessel}

89:

90: Other approaches to calculate the absolute free energies of

91: molecules have been developed.

92: Meirovitch and collaborators calculated

93: absolute free energies for peptides in vacuum,

94: and for liquid argon and water, using the hypothetical

95: scanning method. \cite{meirovitch-deca,meirovitch-argon}

96: Computational cost has thus far limited the approach to

97: peptides with sixty degrees of freedom. \cite{meirovitch-jcp}

98: The ``mining minima'' approach, developed by Gilson and collaborators,

99: estimates the absolute free energy of complex molecules

100: by attempting to enumerate the low-energy conformations

101: and estimating the contribution to the configurational integral

102: for each. \cite{gilson-jpca,gilson-bj}

103: Anharmonic effects can be included. \cite{gilson-jacs}

104: The mining minima method can, in principle, include potential

105: correlations between the torsions and bond angles or lengths,

106: and uses an approximate method to compute local partition functions.

107: Other investigators have estimated absolute free energies for molecules

108: using harmonic or quasi-harmonic approximations,

109: \cite{karplus-deca,gilson-jacs,aqvist-absf}

110: however, as discussed in

111: Refs.\ \onlinecite{gilson-jacs} and \onlinecite{karplus-deca},

112: local minima can deviate substantially

113: from a parabolic shape.

114:

115: We introduce, apparently for the first time, a reference system

116: which is constructed to have high overlap with fairly general

117: molecular systems. The approach

118: can make use of either {\it internal or Cartesian}

119: coordinates. For biomolecules, using internal coordinates greatly

120: enhances the accuracy of the method since internal coordinates

121: are tailored to the description of conformations.

122: Further, {\it all degrees of freedom and their correlations}

123: are explicitly included in the method.

124:

125: Our method differs in several ways from the important study of

126: Stoessel and Nowak: \cite{stoessel}

127: (i) we use internal coordinates

128: for molecules, which are key for optimizing the overlap between

129: the reference system and the system of interest;

130: (ii) we may use a nearly arbitrary reference potential because

131: only a numerical reference free energy value is needed,

132: not an analytic value; and

133: (iii) there is no need, in cases we have studied,

134: to use multi-stage methodology to find

135: the desired free energy due to the overlap built into the

136: reference system.

137:

138: We consider this report a ``proof of principle''

139: for our reference system method.

140: After introducing the method,

141: we test it on single- and double-well two-dimensional systems,

142: and on a methane molecule where absolute free energy

143: estimates can be compared to exact values.

144: The method is then used to compute the absolute free energy

145: of the alpha and beta conformations for

146: leucine dipeptide (ACE-(leu)$_2$-NME) in implicit solvent,

147: {\it using all one-hundred fifteen degrees of freedom},

148: correctly calculating the free energy difference

149: $\Delta F_{\rm alpha \rightarrow beta}$.

150: Extensions of the method to larger systems are

151: then discussed.

152:

153: \section{Reference system method\label{sec-method}}

154: \subsection{The fundamental relations}

155: The absolute free energy of the system of interest (``phys'' for physical)

156: is defined using the partition function $Z_{\rm phys}$

157: \begin{eqnarray}

158:     F_{\rm phys} = -k_BT \ln Z_{\rm phys} = \nonumber \\

159: 	-k_BT \ln \left[

160: 	    \int d \vec{x} \;

161: 	    e^{-\beta \big(U_{\rm phys}(\vec{x})+K_{\rm phys}(\vec{x})\big)}

162: 	\right],

163: \end{eqnarray}

164: where $T$ is the system temperature, $\beta=1/k_BT$,

165: $U_{\rm phys}$ and $K_{\rm phys}$ are, respectively, the

166: physical potential energy (i.e., simulation forcefield)

167: and the kinetic energy,

168: and $\vec{x}$ represents the full set of

169: configurational coordinates (internal or Cartesian).

170: The kinetic energy term can be integrated exactly to obtain

171: \cite{gilson-jpcb}

172: \begin{eqnarray}

173:     Z_{\rm phys} =

174: 	\Bigg[ \frac{1}{h^{3N}}\frac{8\pi^2}{\sigma C^{\circ}}

175: 	\prod_{i=1}^N \big( 2\pi k_B T m_i \big)^{3/2} \Bigg]

176: 	    \int d \vec{x} \; e^{-\beta U_{\rm phys}(\vec{x})},

177:     \label{eq-Zphys}

178: \end{eqnarray}

179: where  $m_i$ is the mass of atom $i$, $h$ is Planck's constant,

180: $C^{\circ}$ is the standard concentration,

181: $\sigma$ is the symmetry number, \cite{gilson-bj}

182: $N$ is the number of particles in the system,

183: and the integral is

184: defined to be the configurational partition function.

185: For the method used in this study, the absolute free energy of the system

186: of interest is calculated using

187: a reference system (``ref''), and the following relationships are used,

188: \begin{eqnarray}

189:     Z_{\rm phys} = Z_{\rm ref} \frac{Z_{\rm phys}}{Z_{\rm ref}}, \nonumber \\

190:     F_{\rm phys} = F_{\rm ref} + \Delta F_{\rm ref \rightarrow phys},

191:     \label{eq-Fphys}

192: \end{eqnarray}

193: where $F_{\rm ref}$ is the trivially computable

194: free energy of the reference system, and \dfrs\ is the free energy

195: difference between the reference and physical system which can

196: be calculated using standard techniques.

197:

198: For this report, we include estimates of the configurational

199: integral only, i.e., the leading constant factor in square brackets in

200: Eq.\ (\ref{eq-Zphys}) is not included in our results. Ignoring

201: the constant is not a limitation since, for the conformational free

202: energies studied here, the term cancels for

203: free energy differences.
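
As a brief check of this cancellation, consider two conformational states of the same molecule, labeled $A$ and $B$ here purely for illustration. The following is a sketch, assuming that both states share the same masses, symmetry number and standard concentration; $C_0$ is our shorthand for the bracketed prefactor of Eq.\ (\ref{eq-Zphys}), and $Z^{\rm conf}_A$ and $Z^{\rm conf}_B$ denote the configurational integrals restricted to each state:
\begin{eqnarray*}
    \Delta F_{A \rightarrow B}
	= -k_BT \ln \frac{C_0 \, Z^{\rm conf}_B}{C_0 \, Z^{\rm conf}_A}
	= -k_BT \ln \frac{Z^{\rm conf}_B}{Z^{\rm conf}_A}.
\end{eqnarray*}
Under these assumptions the prefactor contributes the same additive constant, $-k_BT \ln C_0$, to both absolute free energies and therefore drops out of the difference.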

204:

205: \subsection{The reference energy and its normalization}

206: The trivial identities of Eq.\ (\ref{eq-Fphys}) suggest that arbitrary

207: reference systems can be used in our approach. To be concrete and anticipate

208: the procedure used, our discussion below will assume that a finite-length

209: simulation of the system of interest has been performed---from which

210: histograms of the coordinates have been generated.

211: For the molecular systems studied in this report, ordinary

212: Langevin dynamics simulations are performed using standard

213: forcefields.

214: The reference potential energy can be constructed from a wide

215: variety of histograms, as discussed below. Denoting

216: the computed histograms over all coordinates as $P(\vec{x})$, we define

217: \begin{eqnarray}

218:     U_{\rm ref}(\vec{x}) \equiv -k_BT \ln P(\vec{x}),

219:     \label{eq-Uref}

220: \end{eqnarray}

221: where $P(\vec{x})$ is the normalized probability of a particular

222: configuration (corresponding to a set of histogram bins);

223: see Fig.\ \ref{fig-schematic}.

224: For example, if all coordinates are binned as independent, then

225: \begin{eqnarray}

226:     P(\vec{x})=\prod_{i=1}^{N_{\rm coords}} P_i(x_i),

227:     \label{eq-Pind}

228: \end{eqnarray}

229: where $P_i(x_i)$ is the binned probability distribution (histogram)

230: for the $i^{\rm th}$ coordinate, and there

231: are $N_{\rm coords}$ degrees of freedom in the system.

232: If all coordinates are binned as pairwise correlated, then

233: \begin{eqnarray}

234:     P(\vec{x})=\prod_{ \{ i,j \} } P_{ij}(x_i,x_j),

235:     \label{eq-Pcorr}

236: \end{eqnarray}

237: where $\{ i,j \}$ is a set of pairs in which each coordinate occurs exactly

238: once, and $P_{ij}(x_i,x_j)$ is the probability for two particular coordinate

239: values from the two-dimensional histogram for these coordinates.

240: It is also possible to use an arbitrary combination of independent

241: and correlated coordinates---so long as each coordinate occurs

242: in only one $P$ factor.
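
To illustrate how the binning choice fixes the form of the reference energy, the independent product of Eq.\ (\ref{eq-Pind}) can be substituted into Eq.\ (\ref{eq-Uref}), which gives an additive reference potential:
\begin{eqnarray*}
    U_{\rm ref}(\vec{x}) = -k_BT \sum_{i=1}^{N_{\rm coords}} \ln P_i(x_i).
\end{eqnarray*}
A mixed scheme works the same way. As a hypothetical three-coordinate example (not one of the systems studied here), suppose $x_1$ and $x_2$ are binned as a correlated pair and $x_3$ is binned as independent; then
\begin{eqnarray*}
    P(\vec{x}) = P_{12}(x_1,x_2) \, P_3(x_3), \\
    U_{\rm ref}(\vec{x}) = -k_BT \ln P_{12}(x_1,x_2) - k_BT \ln P_3(x_3).
\end{eqnarray*}
Each coordinate appears in exactly one factor, so each contributes to exactly one term of the sum.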

243:

244: We emphasize that the final computed free energy values include

245: all correlations embodied in the true potential $U_{\rm phys}$. This

246: is true regardless of whether or how coordinates are correlated in the

247: reference potential.

248:

249: \begin{figure}

250:     \includegraphics[scale=0.35]{fig1.eps}

251:     \caption{Depiction of how the reference potential energy

252: 	$U_{\rm ref}$ is calculated for a one-coordinate system.

253: 	First the coordinate is binned, creating a

254: 	histogram $P$ (solid bars) populated according to the physical

255: 	ensemble. Then Eq.\ (\ref{eq-Uref}) is used to

256: 	calculate reference energies for each coordinate bin (dashed bars).

257: 	A hypothetical physical potential is

258: 	shown as a dotted curve for comparison to $U_{\rm ref}$.

259: 	For a multi-coordinate system $U_{\rm ref}$

260: 	would be the sum of the single-coordinate reference

261: 	potential energies.

262: 	\label{fig-schematic}

263:     }

264: \end{figure}

265:

266: A schematic of how $U_{\rm ref}$ is computed

267: for a one-coordinate system is shown in Fig.\ \ref{fig-schematic}.

268: The coordinate histogram is first determined (solid bars)

269: using a simulation trajectory;

270: then Eq.\ (\ref{eq-Uref}) is used to calculate

271: $U_{\rm ref}$ (dashed bars). A possible physical potential is

272: also included (dotted line) for comparison to $U_{\rm ref}$.

273: For a system containing many degrees of freedom,

274: the process is carried out for all coordinates, based

275: on Eq.\ (\ref{eq-Pind}), Eq.\ (\ref{eq-Pcorr}), or another correlation scheme.

276: $U_{\rm ref}$ is the sum

277: of all the appropriate terms,

278: consistent with Eq.\ (\ref{eq-Uref}) and the binning choice.

279:

280: The free energy of the reference system can now

281: be calculated via the reference partition function

282: \begin{eqnarray}

283:     Z_{\rm ref} = \int d\vec{x} \; e^{-\beta U_{\rm ref}(\vec{x})}

284:         = \int d\vec{x} \; P(\vec{x}).

285:     \label{eq-Zref}

286: \end{eqnarray}

287: In practice, we normalize the histogram for each coordinate

288: to one independently by summing

289: over all histogram bins. So, for a particular bond length $r_1$

290: that is binned as independent, we account for the Jacobian

291: factor (see Eq.\ (\ref{eq-jacobian})) by defining $\xi = r_1^3/3$, and then

292: \begin{eqnarray}

293:     Z_{\xi} = \int d\xi \; P(\xi)

294: 	= \sum_{N_{\rm bin}} \; \Delta \xi \; P(\xi) = 1,

295: \end{eqnarray}

296: where $\Delta \xi$ is the histogram bin size, and $N_{\rm bin}$

297: is the number of bins in the $r_1$ histogram.

298: (Binning choices are discussed below.)

299: Similar relationships are used for all coordinates.

300: Thus the reference free energy $F_{\rm ref}=0$

301: and Eq.\ (\ref{eq-Fphys}) becomes

302: \begin{eqnarray}

303:     F_{\rm phys} = \Delta F_{\rm ref \rightarrow phys}

304:     \;\;\;\;\;\; (F_{\rm ref} \equiv 0)

305:     \label{eq-Fphys2}

306: \end{eqnarray}
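
The step from the per-coordinate normalization to $F_{\rm ref}=0$ can be spelled out as follows. This is a sketch for the case in which every coordinate is binned as independent, where $\xi_i$ denotes the Jacobian-transformed variable for coordinate $i$ (for a bond angle we write $\eta=-\cos\theta$, a symbol introduced here only for illustration):
\begin{eqnarray*}
    Z_{\eta} = \int d\eta \; P(\eta)
	= \sum_{N_{\rm bin}} \; \Delta \eta \; P(\eta) = 1, \\
    Z_{\rm ref} = \prod_{i=1}^{N_{\rm coords}} Z_{\xi_i} = 1, \\
    F_{\rm ref} = -k_BT \ln Z_{\rm ref} = 0.
\end{eqnarray*}
The same argument applies to correlated pairs if each two-dimensional histogram is normalized to one over its own bins.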

307:

308: \subsection{Using the physical and reference ensembles}

309: With the reference potential energy $U_{\rm ref}$

310: defined in Eq.\ (\ref{eq-Uref})

311: and the physical potential energy $U_{\rm phys}$

312: given by the forcefield, which may include implicit solvation energies,

313: Boltzmann-distributed snapshots from both the

314: reference and physical systems can be utilized

315: to calculate $F_{\rm phys}$=\dfrs.

316: Here, we simply use free energy perturbation \cite{zwanzig}

317: from the reference to the physical system

318: \begin{eqnarray}

319:     F_{\rm phys} = -k_B T \ln \Big\langle

320: 	e^{-\beta \big( U_{\rm phys}-U_{\rm ref} \big) }

321:     \Big\rangle_{\rm ref} \nonumber \\

322:         \doteq -k_B T \ln \Bigg( \frac{1}{N_{\rm ref}} \sum_{i=1}^{N_{\rm ref}}

323: 	e^{-\beta \big(U_{\rm phys}(\vec{x}_i)-U_{\rm ref}(\vec{x}_i)\big)} \Bigg),

324:     \label{eq-fep}

325: \end{eqnarray}

326: where $N_{\rm ref}$ is the number of structures in the reference ensemble,

327: the ``$\doteq$'' symbol denotes a computational estimate,

328: and $\langle ... \rangle_{\rm ref}$ represents a canonical average

329: using structures from the reference ensemble only.

330: It is important to note that, while other choices for

331: computing $F_{\rm phys}$ are possible, such as Bennett's method,

332: \cite{bennett,shirts-benn,shirts-prl,crooks-pre,lu-jcc,ytreberg-shift}

333: Eq.\ (\ref{eq-fep}) is the only choice which relies solely on

334: configurations drawn from the reference ensemble which

335: are, by construction, sampled canonically and without

336: dynamical trapping.

337: We also note that ``uni-directional'' estimates like that of

338: Eq.\ (\ref{eq-fep}) have been analyzed extensively

339: (e.g., Refs.\ \onlinecite{zuckerman-prl} and \onlinecite{zuckerman-jstat})

340: and may be amenable to error-reduction techniques;

341: \cite{zuckerman-cpl,ytreberg-extrap} however, we have applied the

342: perturbation approach here to keep our initial analysis as straightforward

343: as possible.

344: Staged free energy methods like thermodynamic

345: integration \cite{straatsma-ti} and adaptive integration

346: \cite{swendsen-aim} may also be used.
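
For completeness, the perturbation identity behind Eq.\ (\ref{eq-fep}) can be verified in one line. The reference ensemble is distributed according to $e^{-\beta U_{\rm ref}}/Z_{\rm ref}$, so, with the integrals taken over the region where $P(\vec{x})$ is nonzero,
\begin{eqnarray*}
    \Big\langle e^{-\beta ( U_{\rm phys}-U_{\rm ref} ) } \Big\rangle_{\rm ref}
	= \frac{1}{Z_{\rm ref}} \int d\vec{x} \;
	    e^{-\beta U_{\rm ref}(\vec{x})} \,
	    e^{-\beta ( U_{\rm phys}(\vec{x})-U_{\rm ref}(\vec{x}) )} \\
	= \frac{1}{Z_{\rm ref}} \int d\vec{x} \; e^{-\beta U_{\rm phys}(\vec{x})}.
\end{eqnarray*}
With $Z_{\rm ref}=1$, taking $-k_BT \ln$ of both sides gives the configurational part of $F_{\rm phys}$. When evaluating the sum numerically, one possible precaution (a suggestion on our part, not a step described in this report) is to factor out the smallest energy difference to avoid overflow. Writing $w_i = U_{\rm phys}(\vec{x}_i)-U_{\rm ref}(\vec{x}_i)$ and $w_{\rm min}=\min_i w_i$, both introduced here for illustration, the estimate can be rearranged exactly as
\begin{eqnarray*}
    F_{\rm phys} \doteq w_{\rm min} - k_B T \ln \Bigg(
	\frac{1}{N_{\rm ref}} \sum_{i=1}^{N_{\rm ref}}
	e^{-\beta ( w_i - w_{\rm min} )} \Bigg),
\end{eqnarray*}
in which every exponent is non-positive.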

347:

348: \subsection{The physical ensemble and construction of the reference system}

349: The method used in this report relies on simple

350: histograms for all degrees of freedom

351: (in principle, with internal or Cartesian coordinates)

352: based on a ``physical ensemble'' of

353: conformations generated via molecular dynamics,

354: Monte Carlo or other canonical simulation.

355: The histograms define a reference system with a free energy that is

356: trivially computable, as described in Sec.\ \ref{sec-method}.

357: We emphasize that an analytical

358: solution need not be available; a precise numerical evaluation is

359: more than adequate.

360: A well-sampled ensemble of reference system configurations is then

361: readily generated and used to compute the free energy

362: difference via Eq.\ (\ref{eq-fep}).

363:

364: The first step in our approach to constructing the reference

365: system is to generate a physical

366: ensemble (i.e., a trajectory) by simulating the

367: system of interest using

368: standard molecular dynamics, Monte Carlo, or other

369: canonical sampling techniques.

370: The trajectory produced by the simulation is

371: used to generate histograms for all coordinates

372: as described below.

373: In creating histograms, note that constrained coordinates,

374: such as bond lengths involving hydrogens constrained

375: by RATTLE, \cite{rattle}

376: need not be binned since these coordinates do not change

377: between configurations.

378: Such coordinate constraints are not required in the method, however.

379:

380: If internal coordinates are used (such as for the molecules

381: in this study), care must be taken to

382: account for the Jacobian factors.

383: Using internal coordinates with bond lengths $r$,

384: bond angles $\theta$ and dihedrals $\omega$, the

385: volume element in the configurational integral

386: of Eq.\ (\ref{eq-Zphys}) is given by \cite{gilson-jacs}

387: \begin{eqnarray}

388:     d \vec{x} =

389: 	\prod_{i=1}^{N-1} r_i^2 dr_i \;

390: 	\prod_{i=1}^{N-2} \sin\theta_i d\theta_i \;

391: 	\prod_{i=1}^{N-3} d\omega_i

392:     = \nonumber \\

393: 	\prod_{i=1}^{N-1} d (r_i^3/3) \;

394: 	\prod_{i=1}^{N-2} d (-\cos \theta_i) \;

395: 	\prod_{i=1}^{N-3} d\omega_i,

396:     \label{eq-jacobian}

397: \end{eqnarray}

398: where $N$ is the number of atoms in the system.

399: Thus, when using internal coordinates,

400: the simplest strategy to account for the

401: Jacobian is to bin according to a set of rules:

402: bond lengths are binned according to $r^3/3$,

403: bond angles are binned according to $\cos\theta$,

404: and dihedrals are binned according to $\omega$ (i.e., the

405: same as Cartesian coordinates).

406:

407: \subsection{Generation of the reference ensemble}

408: Once the histograms are constructed and populated using the physical

409: ensemble, the reference ensemble is generated.

410: To generate a single reference structure,

411: for each coordinate one chooses a histogram

412: bin according to the probability associated with that bin. Then a

413: coordinate value is chosen at random uniformly

414: within the bin according

415: to the Jacobian factor in Eq.\ (\ref{eq-jacobian})---e.g., for

416: a bond length $r$, one chooses uniformly in the variable $(r^3/3)$.

417: The process is repeated for every degree of freedom in the system.

418: By repeating the entire procedure, one can generate

419: as many reference structures as desired

420: (i.e., the reference ensemble).
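
The sampling step can be written out explicitly. The following is a sketch for coordinates binned as independent, with notation introduced here for illustration: bin $k$ of a bond-length histogram spans $[\xi_k,\xi_k+\Delta\xi]$ in the variable $\xi=r^3/3$, and $u$ is a random number drawn uniformly from $[0,1)$. The bin is selected with probability
\begin{eqnarray*}
    p_k = \Delta \xi \; P(\xi_k), \;\;\;\;\;\; \sum_{k=1}^{N_{\rm bin}} p_k = 1,
\end{eqnarray*}
and the coordinate value within the chosen bin follows from inverting the Jacobian transformation,
\begin{eqnarray*}
    \xi = \xi_k + u \, \Delta \xi, \;\;\;\;\;\; r = (3\xi)^{1/3}.
\end{eqnarray*}
Bond angles and dihedrals can be treated in the same way. With $\eta=-\cos\theta$ chosen uniformly within its bin, $\theta = \arccos(-\eta)$, and a dihedral is simply $\omega = \omega_k + u \, \Delta\omega$, since its Jacobian factor is unity.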

421:

422: \subsection{Summary of the reference system method}

423: In summary, the method is implemented by first constructing

424: properly normalized histograms for all internal (or Cartesian) coordinates

425: based on a physical ensemble of structures.

426: An ensemble of reference structures is then chosen at random from the

427: histograms.

428: The reference energy ($U_{\rm ref}$ of Eq.\ (\ref{eq-Uref})) and

429: physical energy ($U_{\rm phys}$ from the forcefield) must

430: be calculated for each structure in the reference ensemble.

431: Finally, Eq.\ (\ref{eq-fep}) is used to calculate the

432: desired absolute free energy of the system of interest.

433:

434: The CPU cost of the method, above that of the

435: initial ``physical'' trajectory, is one physical energy evaluation

436: for each of the $N_{\rm ref}$ reference structures, plus the less

437: expensive cost of generating reference structures.

438:

439: \section{Results}

440: To test the effectiveness of the reference system method

441: we first estimated the absolute free energy for three test systems

442: where the free energy is known exactly.

443: We chose the two-dimensional potentials

444: from Ref.\ \onlinecite{ytreberg-seps}, and a methane molecule in vacuum.

445: Finally, we used the method to estimate the absolute free energies

446: of the alpha and beta conformations of the 50-atom

447: leucine dipeptide (ACE-(leu)$_2$-NME), and compared

448: the free energy difference obtained via our method

449: with an independent estimate.

450: In all cases, the free energy estimate computed by our approach

451: is in excellent agreement with independent results.

452:

453: \subsection{Simple test systems}

454: We first studied the two-dimensional

455: single- and double-well potentials from Ref.\ \onlinecite{ytreberg-seps},

456: \begin{eqnarray}

457:   U_{\rm phys}^{\rm single}(x,y)=(x+2)^2+y^2, \nonumber\\

458:   U_{\rm phys}^{\rm double}(x,y)=\frac{1}{10}

459:   \Bigl\{

460:   ((x-1)^2-y^2)^2+ \nonumber \\

461:   10(x^2-5)^2 + (x+y)^4+(x-y)^4

462:   \Bigr\}.

463:   \label{eq-pot}

464: \end{eqnarray}

465:

466: \begin{table}

467:     \begin{tabular}{l|c|c}

468:     \hline \hline

469:     System & Exact & Estimate \\

470:     \hline

471:     two-dimensional single-well \cite{ytreberg-seps} & $-1.1443$ & $-1.1449$ (0.0003) \\

472:     \hline

473:     two-dimensional double-well \cite{ytreberg-seps} & 5.4043 & 5.4058 (0.0003)\\

474:     \hline

475:     Methane molecule & 10.932 & 10.934 (0.002)\\

476:     \hline \hline

477:     \end{tabular}

478:     \caption{

479: 	Absolute free energy estimates obtained using our

480: 	reference system approach for cases where the absolute free

481: 	energy can be determined exactly.

482: 	In all cases, the estimate is in excellent agreement with

483: 	the exact free energy.

484: 	The uncertainty, shown in parentheses

485: 	(e.g., $3.14 \; (0.05) = 3.14 \pm 0.05$), is

486: 	the standard deviation from five independent simulations.

487: 	The results for the two-dimensional systems are in $k_BT$ units

488: 	and methane results have units of kcal/mol.

489: 	The table shows estimates of the configurational

490: 	integral in Eq.\ (\ref{eq-Zphys}),

491: 	i.e., the constant term is not included in the estimate.

492:     \label{tab-results}

493:     }

494: \end{table}

495:

496: Table \ref{tab-results} shows the excellent agreement

497: between the reference system estimates and the exact free energies

498: (obtained analytically) for the

499: two-dimensional potentials used in this study, Eq.\ (\ref{eq-pot}).

500: The ``physical'' simulations used Metropolis Monte Carlo

501: with $k_BT=1.0$ and one

502: million snapshots in the physical and reference ensembles.

503: For all two-dimensional simulations, both coordinates

504: were treated with full

505: correlations---i.e., two-dimensional histograms were used---and

506: the bin sizes were chosen such that the number of bins ranged from

507: 100 to 1000.

508: The error shown in Table \ref{tab-results}

509: in parentheses is the standard deviation from five independent estimates

510: using five separate physical ensembles---and thus five different

511: reference systems.

512: Good estimates were also obtained using fewer snapshots---e.g.,

513: we obtained $F=-1.142 \; (0.003)$

514: for the single-well potential

515: and $F=5.408 \; (0.007)$ for the double-well potential

516: using 10,000 snapshots

517: in both the physical and reference ensembles.
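
As a quick consistency check on the single-well entry, the configurational integral of $U_{\rm phys}^{\rm single}$ in Eq.\ (\ref{eq-pot}) can be evaluated in closed form. The sketch below assumes that the integration extends over the entire plane and that $k_BT=1.0$, as in the simulations:
\begin{eqnarray*}
    Z^{\rm single} = \int_{-\infty}^{\infty} dx \; e^{-(x+2)^2}
	\int_{-\infty}^{\infty} dy \; e^{-y^2}
	= \sqrt{\pi} \, \sqrt{\pi} = \pi, \\
    F^{\rm single} = -\ln \pi \approx -1.1447.
\end{eqnarray*}
Under these assumptions the result lies within $0.0005 \; k_BT$ of both the exact value and the estimate listed in Table \ref{tab-results}.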

518:

519: Table \ref{tab-results} also shows the excellent agreement between the

520: reference system estimates and the exact value of the free

521: energy for methane in vacuum.

522: Methane trajectories were generated

523: using TINKER 4.2 \cite{tinker} with the OPLS-AA forcefield. \cite{oplsaa}

524: The temperature was maintained at 300.0 K using Langevin dynamics with

525: a friction coefficient of 91.0 ${\rm ps}^{-1}$ and a time step of 0.5 fs.

526: The physical ensemble was created by generating five 10.0 ns trajectories

527: with snapshots saved every 0.1 ps.

528: Using the 100,000 methane structures in the physical ensemble,

529: the reference system was generated by binning internal coordinates

530: into histograms. The absolute free energy was then estimated

531: by generating 100,000 structures for the reference ensemble

532: and using Eq.\ (\ref{eq-fep}).

533: All coordinates were binned as independent using

534: one-hundred bins per coordinate, thus only one-dimensional histograms

535: were required.

536: The uncertainty shown in parentheses in Table \ref{tab-results}

537: is the standard deviation from

538: five independent estimates using the five separate methane

539: trajectories---and thus five different reference systems.

540:

541: \begin{figure}

542:     \includegraphics[scale=0.35]{fig2.eps}

543:     \caption{Absolute free energy for methane estimated by

544: 	the reference system

545: 	method as a function of the number of reference

546: 	structures $N_{\rm ref}$ used in the estimate.

547: 	The solid horizontal line

548: 	is the exact free energy obtained by numerical integration.

549: 	Five independent simulations are shown on a log scale to clearly

550: 	show the convergence of the free energy estimate.

551: 	Results shown were obtained using Eq.\ (\ref{eq-fep})

552: 	with one-hundred bins for each degree of freedom, i.e., the estimates

553: 	for the absolute free energy of methane in Table \ref{tab-results}

554: 	are the values shown here for

555: 	$N_{\rm ref}=100,000$.

556: 	\label{fig-converge-meth}

557:     }

558: \end{figure}

559:

560: Figure \ref{fig-converge-meth} shows the convergence

561: behavior of the reference

562: system method for methane. Five independent absolute free energy

563: estimates are shown as a function of the number of reference

564: structures used in the estimate.

565: Each of the five simulations uses the same protocol as described above,

566: i.e., the absolute free energy estimates in Table \ref{tab-results} are

567: the values shown in

568: Fig.\ \ref{fig-converge-meth} for $N_{\rm ref}=100,000$.

569:

570: Methane was chosen as a test system because

571: intra-molecular interactions are due only to bond

572: lengths and angles. In the OPLS-AA forcefield no non-bonded terms

573: are present in the

574: potential energy $U_{\rm phys}$, and thus the exact absolute free energy can

575: be computed numerically without great difficulty.

576: For methane, a configuration is determined by:

577: (i) four bond lengths, which are independent of each other and

578: all other coordinates in the forcefield; and

579: (ii) five bond angles which are correlated to one another but

580: not to the bond lengths.

581: Thus the exact partition function $Z_{\rm meth}$ is a product

582: of four bond length partition functions $Z_r$ and one

583: angular partition function $Z_{\theta}$,

584: \begin{eqnarray}

585:     Z_{\rm meth} = Z_r^4 Z_{\theta}, \nonumber \\

586: 	Z_r = \int_{0}^{\infty} dr\;e^{-\beta U_{\rm phys}(r)},

587: 	    \nonumber \\

588: 	Z_{\theta} = \int_{0}^{\pi}

589: 	    d\theta_1 d\theta_2 d\theta_3 d\theta_4 d\theta_5 \;

590: 	    e^{-\beta U_{\rm phys}

591: 		(\theta_1,\theta_2,\theta_3,\theta_4,\theta_5)

592: 	      }.

593: \end{eqnarray}

594: $U_{\rm phys}(r)$ is harmonic and thus $Z_r$ was computed analytically

595: using parameters from the forcefield.

596: For $U_{\rm phys}(\theta_1,\theta_2,\theta_3,\theta_4,\theta_5)$

597: the correlations between angles must be

598: taken into account, thus $Z_{\theta}$ was estimated numerically using

599: TINKER to evaluate $U_{\rm phys}$ in the five-dimensional integral.

600: We found that $F_{\rm meth}=-k_B T \ln Z_{\rm meth} = 10.932$ kcal/mol

601: as shown in Table \ref{tab-results}.
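
To indicate what the analytic bond-length factor looks like, the following sketch assumes a harmonic form $U_{\rm phys}(r)=\kappa (r-r_0)^2$, where the force constant $\kappa$ and equilibrium length $r_0$ are hypothetical symbols standing in for the forcefield parameters (if the forcefield convention includes a factor of $1/2$, replace $\kappa$ by $\kappa/2$). Using the form of $Z_r$ written above,
\begin{eqnarray*}
    Z_r = \int_{0}^{\infty} dr\;e^{-\beta \kappa (r-r_0)^2}
	= \frac{1}{2} \sqrt{\frac{\pi}{\beta \kappa}}
	\Big[ 1 + {\rm erf} \big( r_0 \sqrt{\beta \kappa} \big) \Big]
	\approx \sqrt{\frac{\pi}{\beta \kappa}},
\end{eqnarray*}
where the final approximation holds for a stiff bond, $r_0 \sqrt{\beta \kappa} \gg 1$, for which the error function is essentially one.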

602:

603: Methane was also used to show that the method correctly computes

604: the free energy even when the physical ensemble is incorrect or incomplete.

605: In our studies we found that the correct free energy

606: is obtained using our method even when the histogram for

607: each coordinate was assumed to be flat, i.e., without the

608: use of a physical ensemble (data not shown).

609:

610: \begin{figure}

611:     \includegraphics[scale=0.35]{fig3.eps}

612:     \caption{Absolute free energy for methane estimated by

613: 	the reference system

614: 	method as a function of the number of histogram bins used for

615: 	each degree of freedom. The plot shows the ``sweet spot'' where

616: 	histogram bins are small enough to reveal histogram features,

617: 	yet large enough to give sufficient population in each bin.

618: 	The results are shown with a vertical scale of

619: 	two kcal/mol and on a log scale to emphasize the

620: 	wide range of bin sizes that produce excellent results for the

621: 	reference system approach.

622: 	Results shown were obtained using Eq.\ (\ref{eq-fep})

623: 	for a methane molecule using $N_{\rm phys}=N_{\rm ref}=10,000$

624: 	(dashed curve)

625: 	and $N_{\rm phys}=N_{\rm ref}=100,000$ (solid curve).

626: 	The solid horizontal line shows the exact

627: 	free energy and the error bars are the standard deviations

628: 	of five independent trials.

629: 	The plot demonstrates at least fifty bins should

630: 	be used for each independent coordinate,

631: 	and that the maximum number of bins

632: 	depends on the number of snapshots in the physical ensemble.

633: 	\label{fig-sweet}

634:     }

635: \end{figure}

636:

637: Choosing the size of the histogram bins

638: is an important consideration.

639: Figure \ref{fig-sweet} shows the large ``sweet spot'' where bins

640: are large enough

641: to be well populated, and yet small enough to reveal

642: histogram features.

The figure shows estimates of the absolute free energy
of a methane molecule, obtained using Eq.\ (\ref{eq-fep}),
as a function of the number of bins used for the histograms
of all degrees of freedom.
Results are given for ten thousand structures
in both the physical and reference ensembles,
$N_{\rm phys}=N_{\rm ref}=10{,}000$ (dashed curve),
and for one hundred thousand structures,
$N_{\rm phys}=N_{\rm ref}=100{,}000$ (solid curve).
The solid horizontal line shows the exact free energy, and
error bars are the standard deviations
of five independent simulations.
The small vertical scale of two kcal/mol and the logarithmic horizontal
scale emphasize that a wide range of bin sizes
produces excellent results for the
reference system approach.
From this plot it is clear that one
should choose at least fifty bins, and that the maximum number of bins
that should be used depends on the number of snapshots in the physical
ensemble: the more snapshots in the physical ensemble,
the more bins one can use for the reference system.
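
The trade-off behind the ``sweet spot'' can be illustrated with a rough estimate.
This is only a sketch: it assumes, purely for illustration, that the
$N_{\rm phys}$ snapshots are spread evenly over the $n_{\rm bin}$ bins of a
one-dimensional histogram and that bin counts fluctuate as in simple counting
statistics. The symbols $n_{\rm bin}$ and $\bar{n}$ are introduced here and
are not used elsewhere in the text. The mean bin population $\bar{n}$ and
its relative fluctuation are then
\[
    \bar{n} = \frac{N_{\rm phys}}{n_{\rm bin}}, \qquad
    \frac{\delta n}{\bar{n}} \sim \frac{1}{\sqrt{\bar{n}}} .
\]
For $N_{\rm phys}=10{,}000$ and $n_{\rm bin}=50$ this gives $\bar{n}=200$ and
a relative fluctuation of about $7\%$, whereas $n_{\rm bin}=1000$ gives
$\bar{n}=10$ and roughly $30\%$. With $N_{\rm phys}=100{,}000$ the same
fluctuation levels are reached at ten times as many bins, in line with the
observation that a larger physical ensemble tolerates more bins.
Real histograms are peaked rather than flat, so sparsely populated tail bins
will be noisier than this estimate suggests.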

662:

663: \begin{table}

664:     \begin{tabular}{l|c|c}

665:     \hline \hline

    System & Estimate (kcal/mol) & Independent estimate (kcal/mol)\\

667:     \hline

668:     $F_{\rm alpha}$ & 87.3 (0.7) & --- \\

669:     \hline

670:     $F_{\rm beta}$  & 86.3 (0.7) & --- \\

671:     \hline

    $\Delta F_{\rm alpha \rightarrow beta}$ & $-1.0$ (0.9) & $-0.85$ (0.05) \\

673:     \hline \hline

674:     \end{tabular}

675:     \caption{

676: 	Absolute free energy estimates of

677: 	the alpha ($F_{\rm alpha}$) and beta ($F_{\rm beta}$) conformations

678: 	obtained using the

679: 	reference system method for leucine dipeptide with

680: 	GBSA solvation, in units of kcal/mol.

	The independent estimate of the free energy difference
	was obtained via a 1.0 $\mu$s unconstrained simulation.
	The uncertainty for the absolute free energies,
	shown in parentheses, is the standard deviation from five
	independent 10.0 ns leucine dipeptide simulations, each using
	one million structures in the reference ensemble.

687: 	The uncertainty

688: 	for the free energy differences is obtained by using every possible

689: 	combination of $F_{\rm alpha}$ and $F_{\rm beta}$,

690: 	i.e., twenty-five independent estimates.

691:         The standard error associated with the

692: 	$\Delta F_{\rm alpha \rightarrow beta}$ reference system

693: 	estimate is 0.18 kcal/mol, reflecting the twenty-five

694: 	independent estimates.

695: 	The table shows estimates of the configurational

696: 	integral in Eq.\ (\ref{eq-Zphys}),

697: 	i.e., the constant term is not included in the estimate.

698:     \label{tab-results2}

699:     }

700: \end{table}

701:

702: \subsection{Leucine dipeptide}

703: Table \ref{tab-results2}

704: shows the agreement for leucine dipeptide

705: (ACE-(leu)$_2$-NME) between the free energy difference

706: $\Delta F_{\rm alpha \rightarrow beta}$

707: as predicted by the reference system method, and as

708: predicted via long simulation.

709: The leucine dipeptide physical ensembles were

710: generated using TINKER 4.2 \cite{tinker} with

711: the OPLS-AA forcefield. \cite{oplsaa}

712: The temperature was maintained at 500.0 K (to enable

713: an independent $\Delta F$ estimate via

714: repeated crossing of the free energy barrier between

715: alpha and beta configurations),

716: using Langevin dynamics with a friction coefficient of

717: 5.0 ${\rm ps}^{-1}$. GBSA \cite{still} implicit

718: solvation was used, and RATTLE was utilized to maintain all bonds involving

719: hydrogens at their ideal lengths \cite{rattle} allowing the use

720: of a 2.0 fs time step.

721:

722: We calculated reference systems and

723: computed absolute free energies of the alpha and

724: beta conformations based on five

725: 10.0 ns trajectories. For all simulations,

726: backbone torsions were constrained using a flat-bottomed

727: harmonic restraint (zero force if the torsion

728: angles were within the allowed range, and harmonic otherwise),

namely, for alpha: $-105^\circ<\phi<-45^\circ \;{\rm and}\; -70^\circ<\psi<-10^\circ$;
and for beta: $-125^\circ<\phi<-65^\circ \;{\rm and}\; 120^\circ<\psi<180^\circ$.

731: The reference system was generated using 100,000 snapshots

732: from the physical ensemble, then free energy estimates were obtained

733: by generating 1,000,000 structures for the reference ensemble for

734: each estimate. All one-hundred fifteen

735: (excludes bond lengths constrained by RATTLE \cite{rattle})

736: internal coordinates were binned as independent

737: with fifty bins for each coordinate.

The uncertainty shown in parentheses is

739: the standard deviation from the

740: five independent estimates using the five separate trajectories, i.e.,

741: five different physical ensembles and five different reference systems.
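
The count of one-hundred fifteen internal coordinates can be checked by
simple bookkeeping. The sketch below assumes an all-atom representation with
$N_{\rm atom}=50$ atoms for ACE-(leu)$_2$-NME (six in each capping group and
nineteen in each leucine residue), of which $N_{\rm H}=29$ are hydrogens.
These atom counts are a tally made here for illustration and are not stated
explicitly in the text. Removing the six rigid-body degrees of freedom and
one bond length per hydrogen (those fixed by RATTLE) gives
\[
    3N_{\rm atom} - 6 - N_{\rm H} = 150 - 6 - 29 = 115 .
\]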

742:

743: Since independent estimates of the absolute free energies

744: of the alpha and beta conformations of leucine dipeptide

745: are not available, we calculated the free

746: energy difference

747: $\Delta F_{\rm alpha \rightarrow beta} = -0.85 \; (0.05)$ kcal/mol

748: via a 1.0 $\mu$s unconstrained simulation.

749: The uncertainty of the independent estimate was obtained using

750: block averages.

751: The temperature was chosen to be 500.0 K which allowed around 1500

752: crossings of the free energy barrier between the alpha and

753: beta conformations, providing an accurate independent estimate.

754: As can be seen in Table \ref{tab-results2}, our estimated free

755: energy difference is in good agreement with the independent

756: value obtained via long simulation.
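
The entries of Table \ref{tab-results2} are related by elementary arithmetic,
spelled out here as a check. The first line uses the rounded table values, and
the second reproduces the quoted standard error from the standard deviation
of the twenty-five combined estimates:
\begin{eqnarray*}
    \Delta F_{\rm alpha \rightarrow beta}
    &=& F_{\rm beta} - F_{\rm alpha} = 86.3 - 87.3 = -1.0 \; {\rm kcal/mol}, \\
    \frac{0.9}{\sqrt{25}} &=& 0.18 \; {\rm kcal/mol}.
\end{eqnarray*}
As a further rough illustration, assume the standard two-state relation
$\Delta F_{\rm alpha \rightarrow beta} = -k_B T \ln (P_{\rm beta}/P_{\rm alpha})$,
where $P_{\rm alpha}$ and $P_{\rm beta}$ denote state populations (symbols
introduced here, not defined above). With $k_B T \approx 0.99$ kcal/mol at
500 K, the independent value of $-0.85$ kcal/mol would correspond to a
population ratio $P_{\rm beta}/P_{\rm alpha} = e^{0.85/0.99}$ of
approximately 2.4.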

757:

We emphasize that the fluctuations of nearly 1 kcal/mol observed in our
leucine dipeptide estimates, although of the same order as the free energy
difference itself, are completely independent of the magnitude
of that difference. That is, for a similar
sized system and similar CPU investment, one would expect similar uncertainty,
even for a very large free energy difference. This, indeed, is the motivation
for performing absolute free energy calculations. We believe, moreover, that
efficiency improvements will be achieved beyond the data in this initial
report.

766:

767: \begin{figure}

768:     \includegraphics[scale=0.35]{fig4.eps}\\

769:     \vspace{12pt}

770:     \includegraphics[scale=0.35]{fig5.eps}

771:     \caption{Free energy for leucine dipeptide estimated by

772: 	the reference system

773: 	method as a function of the number of reference structures

774: 	$N_{\rm ref}$ used in the estimate.

775: 	Five independent simulations are shown on a log scale to demonstrate

776: 	the convergence behavior of the free energy estimate for

777: 	(a) the alpha configuration, and (b) the beta configuration.

778: 	Results shown were obtained using Eq.\ (\ref{eq-fep})

779: 	with fifty bins for each degree of freedom.

780: 	\label{fig-converge-di}

781:     }

782: \end{figure}

783:

784: Figure \ref{fig-converge-di} shows the convergence

785: behavior of the reference

786: system method for leucine dipeptide. Five free energy

787: estimates are shown as a function of the number of reference

788: structures used in the estimate for

789: (a) the alpha configuration, and (b) the beta configuration.

Each of the five simulations uses the same protocol as described above.

791:

792: \begin{figure}

793:     \includegraphics[scale=0.35]{fig6.eps}\\

794:     \vspace{12pt}

795:     \includegraphics[scale=0.35]{fig7.eps}

796:     \caption{Scatter plots of the two $\chi_2$ torsions

797: 	of each residue for leucine dipeptide. Results are shown

798: 	for both physical and reference ensembles containing 100,000

799: 	structures each.

800: 	The figure shows that:

801: 	(i) the reference system has good overlap with the physical system,

802: 	as can be seen by the similarity between the two plots;

803: 	and (ii) the reference system is more broadly distributed

804: 	than the physical

	system, as evidenced by the data at $(-60,-60)$ for the reference

806: 	system that is not present for the physical system.

807: 	\label{fig-scat}

808:     }

809: \end{figure}

810:

811: The leucine dipeptide calculations also demonstrate two important

812: aspects of the particular reference system defined in this study:

813: (i) the reference system has good overlap with the physical system; and

814: (ii) the reference system is broader than the physical system.

815: Figure \ref{fig-scat} shows a scatter plot of the

816: $\chi_2$ torsions of each residue

817: for both the physical and reference ensembles. Each ensemble

818: contains 100,000 structures. The figure clearly shows the

819: excellent overlap between the reference and physical ensemble,

820: as can be seen by the similarity between the two plots. In

821: addition, the reference ensemble scatter plot has data

in the region $(-60,-60)$ which does not exist in the

823: physical ensemble, showing that the reference system is ``broader'' than

824: the physical system.

825:

826: \begin{figure}

827:     \includegraphics[scale=0.35]{fig8.eps}

828:     \caption{Histogram of the distance between the $C_\delta$ of residue one

829: 	and the $C_\alpha$ of residue two for leucine dipeptide. Results are

830: 	shown for both reference and physical ensembles containing 100,000

831: 	structures each.

832: 	The figure shows that:

833: 	(i) the reference system has good overlap with the physical system;

834: 	and (ii) the reference system is broader than the physical system.

835: 	\label{fig-dist}

836:     }

837: \end{figure}

838:

839: Figure \ref{fig-dist} shows a histogram of the distance between the $C_\delta$

840: atom of residue one and the $C_\alpha$ of residue two for

841: the same ensembles as Fig.\ \ref{fig-scat}. The figure again shows

842: how the reference system has both excellent overlap with the

843: physical system and is also broader than the physical system.

844:

845: \section{Discussion}

846: The present results raise a number of questions regarding the reference

847: system approach to computing absolute free energies---in particular, regarding

848: the use of correlations, the importance of the physical ensemble,

849: and the potential for application to larger systems.

850:

851: \subsection{Correlation of Coordinates}

852: How can correlations among coordinates be used to increase the method's

853: effectiveness? One may choose to

854: bin coordinates as independent (i.e., one-dimensional

855: histograms), or with correlations

856: (i.e., multi-dimensional histograms).

857: For example, in peptides, one may choose to bin all

858: sets of backbone $\phi,\psi$ torsions as correlated, and all other

859: coordinates (bond lengths, bond angles, other torsions) as

860: independent. It might always seem advantageous

861: to bin some coordinates (at least backbone torsions)

862: as correlated, since reference structures drawn

863: randomly from the histograms

864: will be less likely to have steric

865: clashes. On the other hand, including correlations with small bin

866: sizes is impractical. As an example, imagine that for the leucine

867: dipeptide molecule used in

868: this study, one binned the four $\phi,\psi$ backbone torsions as

correlated. If fifty bins for each torsion were used (as
suggested by Fig.\ \ref{fig-sweet}), then there
would be $50^4=6{,}250{,}000$ multi-dimensional bins to populate,
far too many to fill reliably from a simulation of practical length.
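
To make the scale of the problem concrete, the following estimate compares
the mean bin population for correlated and independent binning. It assumes
the $N_{\rm phys}=100{,}000$ snapshots used for leucine dipeptide in this
study and, for illustration only, an even spread of snapshots over bins.
The symbol $\bar{n}$ denotes the mean number of snapshots per bin and is
introduced here.
\begin{eqnarray*}
    \bar{n}_{\rm corr} &=& \frac{10^5}{50^4}
        = \frac{10^5}{6.25 \times 10^6} = 0.016, \\
    \bar{n}_{\rm indep} &=& \frac{10^5}{50} = 2000.
\end{eqnarray*}
A four-dimensional histogram would therefore leave the vast majority of its
bins empty, while each one-dimensional histogram remains well populated.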

873:

874: There does appear to be an important advantage to eliminating at

875: least some correlations from the original ``physical'' ensemble:

876: namely, a larger portion of conformational space

877: becomes available to the reference ensemble;

878: see Figs.\ \ref{fig-scat} and \ref{fig-dist}.

879: Since coordinates for the reference structures

880: are drawn randomly and independently, it is

881: possible to generate reference structures that are

882: in entirely different energy basins than those in

883: the physical ensemble. {\it It is thus possible to

884: overcome the inadequacies of the physical ensemble

885: by binning internal coordinates independently}.

The optimal (presumably limited) use of correlations
will be considered in future work.

888:

889: Regardless of the degree of correlations included in $U_{\rm ref}$,

890: we emphasize that final results fully include correlations in the physical

891: potential $U_{\rm phys}$.

892:

893: \subsection{Quality of the physical ensemble}

894: Since the reference ensemble is generated by drawing at random from

895: histograms which, in turn, were generated from the physical ensemble,

896: a natural question to ask is: how complete does the physical

897: ensemble need to be?

898: The surprising answer is that, for our reference system method,

899: the physical ensemble does not need to

900: be complete, or even correct (properly distributed).

901: Since Eqs.\ (\ref{eq-Fphys}) and (\ref{eq-Fphys2}) are

902: valid for arbitrary reference systems,

903: the convergence of the free energy estimate to the correct value

904: is guaranteed, in the limit of infinite sampling

905: ($N_{\rm ref} \rightarrow \infty$), regardless of the

906: quality of the physical ensemble.

907: The ``trick'' is that the ensemble for the reference system must

908: be converged, which can be achieved with much less expense since

909: there is no dynamical trapping.

910: Unlike the typical case for molecular mechanics simulation,

911: we sample the reference ensemble ``perfectly''---there is no possibility

912: of being trapped in a local basin. By construction, since all coordinate

913: values are generated exactly according to the reference distributions,

914: the reference ensemble can only suffer from statistical (but not systematic)

915: error.

916: For example, it was possible to obtain the correct

917: free energy for methane based on 10,000 reference structures

918: even when the histogram for

919: each coordinate was assumed to be flat, i.e., without

920: the use of a physical ensemble (data not shown).
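
The guarantee of convergence for an arbitrary reference system follows from
a standard identity, restated here as a sketch. The notation ($Z$ for
configurational integrals, $\beta = 1/k_B T$, ${\bm x}$ for the full set of
coordinates, and $\langle \cdots \rangle_{\rm ref}$ for an average over the
reference ensemble) is assumed for this illustration and may differ in detail
from that of Eqs.\ (\ref{eq-Fphys}) and (\ref{eq-Fphys2}). Any Jacobian
factors are taken to be absorbed into the potentials.
\begin{eqnarray*}
    \frac{Z_{\rm phys}}{Z_{\rm ref}}
    &=& \frac{\int d{\bm x} \, e^{-\beta U_{\rm ref}({\bm x})} \,
        e^{-\beta [U_{\rm phys}({\bm x}) - U_{\rm ref}({\bm x})]}}
        {\int d{\bm x} \, e^{-\beta U_{\rm ref}({\bm x})}}
    = \left\langle e^{-\beta (U_{\rm phys} - U_{\rm ref})}
        \right\rangle_{\rm ref}, \\
    F_{\rm phys} &=& F_{\rm ref} - k_B T \ln
        \left\langle e^{-\beta (U_{\rm phys} - U_{\rm ref})}
        \right\rangle_{\rm ref}.
\end{eqnarray*}
The identity holds provided the reference distribution is nonzero wherever
$e^{-\beta U_{\rm phys}}$ is nonzero within the defined state. The physical
ensemble enters only through the choice of $U_{\rm ref}$, so it affects how
quickly the average converges but not its limiting value.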

921:

922: It is important to note that, while convergence to the

923: correct free energy is guaranteed for any

924: choice of reference system, the efficiency of the method could

925: be dramatically reduced if the reference system does not overlap

926: well with the physical system.

927:

928: Given the fact that the physical ensemble need not be correct, it

929: is easy to imagine a modified method that does not require

930: simulation, but instead populates the histogram bins using the ``bare''

931: potential for each internal coordinate (e.g., Gaussian histograms

932: for bond lengths and angles). Of course,

933: the conformational state must be defined explicitly,

934: with upper and lower limits for coordinates.

935: Allowed ranges for the torsions (especially

936: $\phi,\psi$) are naturally obtainable via, e.g., Ramachandran

937: propensities (e.g., Ref.\ \onlinecite{richardson}),

938: and reasonable ranges for bond lengths and angles

939: could be chosen to be, e.g., several standard deviations

940: from the mean.

941:

942: \subsection{Extension to larger systems}

943: While the initial results of our reference system method are

944: promising, a naive implementation of the

945: method will find difficulty with large systems (as do

946: all absolute and relative free energy methods).

947: For our method, the difficulty

948: with including a very large number of degrees of freedom

949: is due to the fact that,

950: if one does not treat all correlations in

951: the backbone, then steric clashes will occur frequently when

952: generating the reference ensemble.

953:

954: However, it is possible to extend the method

955: to larger peptides, still include all degrees of freedom, and

956: bin all coordinates independently (important for broadening

957: configurational space, as discussed above), by using

958: a ``segmentation'' technique motivated by earlier work.

959: \cite{gibson-seg,leach-seg}

960: Consider generating reference

961: structures for a ten-residue peptide in the alpha helix conformation.

962: Due to the large number of backbone torsions,

963: most of the reference structures chosen at random

964: will not be energetically favorable.

965: However, if one breaks the peptide into two pieces, then one can

966: generate many structures for each segment, and only

967: ``keep'' energetically likely segment structures.

968: The selected structures

969: may be joined to form full structures which are reasonably likely

970: to have low energy.

971: For example, if one generates $10^5$ structures for each of the

972: two segments and keeps only $10^3$ of those, then one only need

973: evaluate $10^3 \times 10^3 = 10^6$ full structures

974: out of a possible $10^5 \times 10^5 = 10^{10}$.

975: A statistically correct segmentation strategy

976: is currently being investigated by the authors for use in

977: large peptides.
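
The bookkeeping in this example can be summarized as follows. The estimate
is a sketch that assumes two segments, one energy evaluation per generated
segment structure to decide which to keep, and one evaluation per joined
full structure. The symbols $N_{\rm gen}$ and $N_{\rm keep}$ (structures
generated and kept per segment) are introduced here.
\begin{eqnarray*}
    \mbox{evaluations with segmentation} &\approx&
        2 N_{\rm gen} + N_{\rm keep}^2
        = 2 \times 10^5 + 10^6 = 1.2 \times 10^6, \\
    \mbox{possible full structures} &=& N_{\rm gen}^2 = 10^{10}.
\end{eqnarray*}
Under these assumptions the number of evaluations is reduced by roughly four
orders of magnitude relative to evaluating every combination. The selection
step must still be accounted for if the resulting estimate is to remain
statistically correct.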

978:

979: Another strategy which may prove useful for larger systems

980: is to use the reference system method with multi-stage simulation.

981: Multi-stage

982: simulation requires the introduction of a hybrid potential energy

983: parameterized by $\lambda$, e.g.,

984: \begin{eqnarray}

985:     U_{\lambda} = \lambda U_{\rm phys} + (1-\lambda) U_{\rm ref}.

986: \end{eqnarray}

987: Thus, $U_0 = U_{\rm ref}$ and $U_1 = U_{\rm phys}$.

988: Simulations are performed using the hybrid potential energy

989: $U_{\lambda}$ (and thus a hybrid forcefield, if using molecular

990: dynamics) at intermediate $\lambda$ values between 0 and 1.

991: Conventional free energy methods such as thermodynamic integration

992: or free energy perturbation can then be used to

993: obtain $F_{\rm phys}$.
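
For concreteness, the two conventional estimators mentioned above take the
following standard forms for the linear hybrid potential. This is a sketch
under the assumption of $K$ stages at values
$0 = \lambda_0 < \lambda_1 < \cdots < \lambda_K = 1$ (the symbols $k$ and $K$
are introduced here), with $\langle \cdots \rangle_{\lambda}$ denoting an
average over the ensemble generated by $U_{\lambda}$.
\begin{eqnarray*}
    F_{\rm phys} - F_{\rm ref}
    &=& \int_0^1 d\lambda \left\langle
        \frac{\partial U_{\lambda}}{\partial \lambda} \right\rangle_{\lambda}
    = \int_0^1 d\lambda \,
        \left\langle U_{\rm phys} - U_{\rm ref} \right\rangle_{\lambda}, \\
    F_{\rm phys} - F_{\rm ref}
    &=& -k_B T \sum_{k=0}^{K-1} \ln \left\langle
        e^{-(\lambda_{k+1} - \lambda_k)(U_{\rm phys} - U_{\rm ref})/k_B T}
        \right\rangle_{\lambda_k}.
\end{eqnarray*}
The first line is thermodynamic integration and the second is free energy
perturbation applied stage by stage. In both cases the absolute value
$F_{\rm phys}$ follows because $F_{\rm ref}$ is known for the reference system.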

994:

995: We also believe that including correlations, such as suggested

996: by Eq.\ (\ref{eq-Pcorr}) and possibly other ways, may be useful.

997: The inclusion of correlations should improve the overlap between

998: the reference and physical ensembles---thereby reducing the amount

999: of sampling required in the reference system, hence improving efficiency.

1000: This also will be explored in future work.

1001: (We also remind the

1002: reader that the final free energy value includes the full correlations

1003: in $U_{\rm phys}$, regardless of $U_{\rm ref}$.)

1004:

1005: The method could prove useful in future protein-ligand binding

1006: studies. In the simplest approach, one could freeze all degrees of

1007: freedom except for the ligand and side-chain degrees of freedom

1008: in the binding site. While the absolute free energy would be unphysical,

1009: the approach could permit comparison of ligands or protein mutations

1010: with little or no conformational similarity.

1011:

1012: In principle, it is possible to extend the reference

1013: system method to include explicitly solvated biomolecules.

1014: However, as with all absolute free energy methods, the

1015: addition of the solvent degrees of freedom causes

1016: the free energy estimate to converge much more slowly than

1017: without explicit solvent.

1018: Thus, we feel the method described in this study will find

1019: use primarily in implicitly solvated biomolecules.

1020:

1021: \section{Conclusions}

1022: In conclusion, we have introduced and tested a simple

1023: method for calculating absolute

1024: free energies in molecular systems.

1025: The approach relies on the construction of an ensemble of

1026: reference structures (i.e., the reference system)

1027: that is designed to have high overlap with the physical system

1028: of interest.

1029: The method was first shown to reproduce exactly computable

1030: absolute free energies for simple systems, and

1031: then used to correctly predict the stability of leucine

1032: dipeptide conformations

1033: using all one-hundred fifteen degrees of freedom.

1034:

1035: Some strengths of the approach are that:

1036: (i) the reference system is built to have good overlap with the system

1037: of interest by using internal coordinates and by using

1038: a single equilibrium ensemble from Monte Carlo or molecular dynamics;

1039: (ii) the absolute free energy estimate is guaranteed to converge to the

1040: correct value, whether or not the physical ensemble is complete

1041: and, in fact, it is possible to estimate the absolute free energy

1042: without the use of a physical ensemble;

1043: (iii) the method explicitly includes all degrees of freedom employed

1044: in the simulation;

1045: (iv) the reference system need only be numerically

1046: computable, i.e., the exact analytic result is not needed; and

(v) the method can be trivially extended to include the use

1048: of multi-stage simulation.

1049: The CPU cost of the approach, beyond that for short trajectories

1050: of the physical system of interest,

1051: is one energy call for each reference structure, plus

1052: the less expensive cost of generating the reference ensemble.

1053:

1054: In the present ``proof of principle'' report,

1055: our method was used to study conformational

equilibria; however, we feel that the simplicity and flexibility

1057: of the method may find broad use in computational biophysics

1058: and biochemistry for a wide variety of free energy problems.

1059: We have also described a segmentation strategy, currently being

1060: pursued, to use the approach in much larger systems.

1061:

1062: \section*{Acknowledgments}

1063: The authors would like to thank Edward Lyman, Ronald White,

1064: Srinath Cheluvarajah and Hagai Meirovitch for many

1065: fruitful discussions.

1066:

1067: \bibliography{}

1068:

1069: \end{document}

1070: