0506:q-bio0506031/PRL.tex

1: % ****** Start of file template.aps ****** %

2: %%

3: %%

4: %%   This file is part of the APS files in the REVTeX 4 distribution.

5: %%   Version 4.0 of REVTeX, August 2001

6: %%

7: %%

8: %%   Copyright (c) 2001 The American Physical Society.

9: %%

10: %%   See the REVTeX 4 README file for restrictions and more information.

11: %%

12: %

13: % This is a template for producing manuscripts for use with REVTEX 4.0

14: % Copy this file to another name and then work on that file.

15: % That way, you always have this original template file to use.

16: %

17: % Group addresses by affiliation; use superscriptaddress for long

18: % author lists, or if there are many overlapping affiliations.

19: % For Phys. Rev. appearance, change preprint to twocolumn.

20: % Choose pra, prb, prc, prd, pre, prl, prstab, or rmp for journal

21: %  Add 'draft' option to mark overfull boxes with black boxes

22: %  Add 'showpacs' option to make PACS codes appear

23: %  Add 'showkeys' option to make keywords appear

24: \documentclass[aps,prl,preprint,groupedaddress]{revtex4}

25: %\documentclass[aps,prl,preprint,superscriptaddress]{revtex4}

26: %\documentclass[aps,prl,twocolumn,groupedaddress]{revtex4}

27:

28: % You should use BibTeX and apsrev.bst for references

29: % Choosing a journal automatically selects the correct APS

30: % BibTeX style file (bst file), so only uncomment the line

31: % below if necessary.

32: %\bibliographystyle{apsrev}

33:

34: \usepackage{graphicx}% Include figure files

35: \usepackage{dcolumn}% Align table columns on decimal point

36: \usepackage{bm}% bold math

37:

38: \begin{document}

39:

40: \special{papersize=8.5in,11in}

41:

42: % Use the \preprint command to place your local institutional report

43: % number in the upper righthand corner of the title page in preprint mode.

44: % Multiple \preprint commands are allowed.

45: % Use the 'preprintnumbers' class option to override journal defaults

46: % to display numbers if necessary

47: \preprint{LA-UR-05-5164}

48:

49: %Title of paper

50: \title{Allostery in a Coarse-Grained Model of Protein Dynamics}

51:

52: % repeat the \author .. \affiliation  etc. as needed

53: % \email, \thanks, \homepage, \altaffiliation all apply to the current

54: % author. Explanatory text should go in the []'s, actual e-mail

55: % address or url should go in the {}'s for \email and \homepage.

56: % Please use the appropriate macro foreach each type of information

57:

58: % \affiliation command applies to all authors since the last

59: % \affiliation command. The \affiliation command should follow the

60: % other information

61: % \affiliation can be followed by \email, \homepage, \thanks as well.

62:

63: \author{Dengming Ming}

64: %\email[dming@lanl.gov]{Your e-mail address}

65: %\homepage[]{Your web page}

66: %\thanks{}

67: %\altaffiliation{}

68: \affiliation{Computer and Computational Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}

69:

70: \author{Michael E. Wall}

71: \email[Correspondence: ]{mewall@lanl.gov}

72: %\homepage[]{Your web page}

73: %\thanks{}

74: %\altaffiliation{}

75: \affiliation{Computer and Computational Sciences and Bioscience Divisions, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}

76:

77: %Collaboration name if desired (requires use of superscriptaddress

78: %option in \documentclass). \noaffiliation is required (may also be

79: %used with the \author command).

80: %\collaboration can be followed by \email, \homepage, \thanks as well.

81: %\collaboration{}

82: %\noaffiliation

83:

84: \date{\today}

85:

86: \begin{abstract}

87:

88: We propose a criterion for optimal parameter selection in

89: coarse-grained models of proteins, and develop a refined elastic

90: network model (ENM) of bovine trypsinogen. The unimodal density-of-states

91: distribution of the trypsinogen ENM disagrees with the bimodal

92: distribution obtained from an all-atom model; however, the bimodal

93: distribution is recovered by strengthening interactions between atoms

94: that are backbone neighbors. We use the backbone-enhanced model to

95: analyze allosteric mechanisms of trypsinogen, and find relatively strong

96: communication between the regulatory and active sites.

97: \end{abstract}

98:

99: % insert suggested PACS numbers in braces on next line

100: \pacs{}

101: % insert suggested keywords - APS authors don't need to do this

102: %\keywords{}

103:

104: %\maketitle must follow title, authors, abstract, \pacs, and \keywords

105: \maketitle

106:

107: % body of paper here - Use proper section commands

108: % References should be done using the \cite, \ref, and \label commands

109: %\section{}

110: % Put \label in argument of \section for cross-referencing

111: %\section{\label{sec:theory}Theory}

112: %\subsection{\label{sec:theory}Theory}

113:

114: A major challenge of molecular biology is to understand regulatory

115: mechanisms in large protein complexes that are abundant in

116: multi-celluluar organisms. To make simulation of such complexes

117: computationally feasible, coarse-grained models have been developed,

118: in which a subset of the atoms in the complex is used to simulate the

119: large-scale motions. However, principled methods to quantify and

120: optimize the accuracy of coarse-grained models are currently lacking.

121:

122: In one common coarse-graining method, an all-atom model is simplified

123: by considering effective interactions among a subset of the atoms

124: (e.g., just the alpha-carbons). The usual criterion for model accuracy

125: is the ability of a model to reproduce atomic mean-squared

126: displacements (MSDs). However, MSDs are just one aspect of protein

127: dynamics; a stricter criterion for the accuracy of a coarse-grained

128: model is the similarity between the configurational distributions of

129: the selected atoms in the coarse-grained and all-atom models. Such a

130: criterion is also biologically relevant, in part because the

131: conformational distribution is a key determinant of protein activity

132: \cite{Frauenfelder85}.

133:

134: One useful measure of the difference between conformational

135: distributions is the Kullback-Leibler divergence $D_{\bf x}$ (see

136: definition below) \cite{Kullback51,Ming05}. Recently, an analytic

137: expression for $D_{\bf x}$ was obtained for harmonic vibrations of a

138: protein-ligand complex both with and without a protein-ligand

139: interaction \cite{Ming05}. Here we show how an equivalent expression

140: may be applied to refine a coarse-grained model of protein

141: dynamics. Using the expression for $D_{\bf x}$ requires the marginal

142: probability distribution of a subset of the atoms in a protein, which

143: we calculate in the harmonic approximation. We then apply the

144: equations to refine an anisotropic elastic network model (ENM)

145: \cite{Atilgan01} of trypsinogen dynamics with respect to an all-atom

146: model calculated using CHARMM \cite{Brooks83}. The unimodal

147: density-of-states distribution of the ENM disagrees with the bimodal

148: distribution obtained from the all-atom model; however, the bimodal

149: distribution is recovered by strengthening interactions between atoms

150: that are backbone neighbors. Finally, the backbone-enhanced elastic

151: network model (BENM) is used to analyze allosteric mechanisms of

152: trypsinogen, revealing relatively strong communication between the

153: regulatory and active sites.

154:

155: Let $P({\bf x})$ be the probability distribution of the $3N$ atomic

156: coordinates ${\bf x}=(x_1,y_1,z_1,\ldots,x_N,y_N,z_N)$ of a molecular

157: model in the harmonic approximation. Let ${\bf x}=({\bf x}_1,{\bf

158: x}_2)$, where ${\bf x}_1$ contains the $3N_1$ coordinates of a subset of

159: atoms of interest, and ${\bf x}_2$ contains the $3N_2$ coordinates of the

160: remaining atoms. We are interested in calculating the marginal

161: distribution $P({\bf x}_1)$:

162: \begin{equation}

163: P({\bf x}_1) = \int d^{3N_2} {\bf x}_2 \, P({\bf x}_1, {\bf x}_2).

164: \label{eq:px1_general}

165: \end{equation}

166:

167: We now calculate $P({\bf x}_1)$ in a model of molecular vibrations. Consider a harmonic approximation to the potential energy function

168: $U({\bf x})$, where ${\bf x}$ is the deviation from an

169: equilibrium conformation ${\bf x}_0$:

170: \begin{equation}

171: U({\bf x} + {\bf x}_0)\approx U({\bf x}_0)+{1 \over 2}{\bf x}^\dag {\bf H}\,{\bf x}.

172: \end{equation}

173: The matrix ${\bf H}$ is the Hessian of $U$ evaluated at ${\bf x}_0$: $H_{ij}|_{{\bf x}_0}=\partial^2 U / \partial x_i \partial x_j |_{{\bf x}_0}.$ We assume a Boltzmann distribution for $P({\bf x})$, and ignore solvent and pressure effects:

174: \begin{equation}

175: P({\bf x})=Z^{-1}e^{{-{\bf x}^\dag {\bf H}\,{\bf x}}\over{2 k_B T}}=(2\pi k_B T)^{-3N/2}e^{{-\left|{\bf \Omega} {\bf V}^\dag {\bf x}\right|^2} \over {2 k_B T}}\prod_{i=1}^{3N}\omega_i,

176: \label{eq:harmboltzmann}

177: \end{equation}

178: where $Z$ is the partition function, $k_B$ is Boltzmann's constant,

179: $T$ is the temperature, the diagonal elements of the matrix $\left|{\bf \Omega}\right|^2={\rm diag} (\omega_1^2,\ldots,\omega_{3N}^2)$ are the eigenvalues of

180: ${\bf H}$, and the columns of the matrix ${\bf V}$ are the

181: eigenvectors of ${\bf H}$. To calculate $P({\bf x}_1)$ we define the

182: submatrices ${\bf H}_1$, ${\bf H}_2$, and ${\bf G}$ as follows:

183: \begin{eqnarray}

184: {\bf H\,x} = \left(

185: \begin{array}{cc}

186: {\bf H}_1 & {\bf G} \\

187: {\bf G}^\dag & {\bf H}_2

188: \end{array}

189: \right)

190: \left(

191: \begin{array}{c}

192: {\bf x}_1 \\

193: {\bf x}_2

194: \end{array}

195: \right)

196: =

197: \left(

198: \begin{array}{ccc}

199: {\bf H}_1 {\bf x}_1 & + & {\bf G} {\bf x}_2 \\

200: {\bf G}^\dag {\bf x}_1 & + & {\bf H}_2 {\bf x}_2

201: \end{array}

202: \right).

203: \label{eq:Hdecomp}

204: \end{eqnarray}

205: ${\bf H}_1$ couples coordinates from ${\bf x}_1$; ${\bf H}_2$ couples coordinates from ${\bf x}_2$; and ${\bf G}$ couples coordinates between ${\bf x}_1$ and ${\bf x}_2$. Eq.~(\ref{eq:harmboltzmann}) now can be expressed as

206: \begin{equation}

207: P({\bf x})=Z^{-1}e^{{-{\bf x}^\dag {\bf H}\,{\bf x}}\over{2 k_B T}}=(2\pi k_B T)^{-3N/2}e^{{-\left|{\bf \bar{\Omega}} {\bf \bar{V}}^\dag {\bf x}_1\right|^2 - \left|{\bf \Lambda} {\bf U}^\dag {\bf x}_2 + {\bf \Lambda}^{-1}{\bf U}^\dag {\bf G}^\dag {\bf x}_1 \right|^2} \over {2 k_B T}}\prod_{i=1}^{3N}\omega_i,

208: \label{eq:harmsub}

209: \end{equation}

210: where the diagonal elements of the matrix $\left|{\bf \Lambda}\right|^2={\rm

211: diag}(\lambda_1^2,\ldots,\lambda_{3N_2}^2)$ and the columns of the

212: matrix ${\bf U}$ are the eigenvalues and eigenvectors of ${\bf H}_2$,

213: and the diagonal elements of the matrix $\left|{\bf \bar{\Omega}}\right|^2={\rm

214: diag}(\bar{\omega}_1^2,\ldots,\bar{\omega}_{3N_1}^2)$ and the columns of the

215: matrix ${\bf \bar{V}}$ are the eigenvalues and eigenvectors of a matrix ${\bf \bar{H}}$ defined as

216: \begin{equation}

217: {\bf \bar{H}}={\bf H}_1 - {\bf G}{\bf H}_2^{-1}{\bf G}^\dag = {\bf \bar{V}}\left|{\bf \bar{\Omega}}\right|^2{\bf \bar{V}}^\dag.

218: \label{eq:Hbar}

219: \end{equation}

220: Eq.~(\ref{eq:Hbar}) is equivalent to an equation independently derived to

221: study local vibrations in the nucleotide-binding pockets of myosin and

222: kinesin \cite{Zheng05}. Performing the integral in

223: Eq.~(\ref{eq:px1_general}) leads to the desired equation for

224: $P({\bf x}_1)$:

225: \begin{equation}

226: P({\bf x}_1) = (2\pi k_B T)^{-3N_1/2}e^{{-\left|{\bf \bar{\Omega}}{\bf \bar{V}}^\dag {\bf x}_1 \right|^2} \over {2 k_B T}} \prod_{i=1}^{3N_1}\bar{\omega}_i.

227: \label{eq:margin}

228: \end{equation}
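The step from Eq.~(\ref{eq:harmsub}) to Eq.~(\ref{eq:margin}) can be sketched as follows, assuming that ${\bf H}_2$ is positive definite so that ${\bf H}_2^{-1}$ and ${\bf \Lambda}^{-1}$ exist (an assumption made here for illustration). Completing the square in ${\bf x}_2$ gives
\begin{equation}
{\bf x}^\dag {\bf H}\,{\bf x} = {\bf x}_1^\dag {\bf \bar{H}}\,{\bf x}_1 + \left({\bf x}_2 + {\bf H}_2^{-1}{\bf G}^\dag {\bf x}_1\right)^\dag {\bf H}_2 \left({\bf x}_2 + {\bf H}_2^{-1}{\bf G}^\dag {\bf x}_1\right),
\end{equation}
so the integral over ${\bf x}_2$ is a shifted Gaussian integral whose value does not depend on ${\bf x}_1$:
\begin{equation}
\int d^{3N_2}{\bf x}_2 \, e^{{-\left|{\bf \Lambda}{\bf U}^\dag {\bf x}_2 + {\bf \Lambda}^{-1}{\bf U}^\dag {\bf G}^\dag {\bf x}_1\right|^2} \over {2 k_B T}} = (2\pi k_B T)^{3N_2/2}\prod_{i=1}^{3N_2}\lambda_i^{-1}.
\end{equation}
The prefactor of Eq.~(\ref{eq:margin}) then follows from the block-determinant identity $\det {\bf H} = \det {\bf \bar{H}}\,\det {\bf H}_2$, that is, $\prod_{i=1}^{3N}\omega_i = \prod_{i=1}^{3N_1}\bar{\omega}_i \prod_{i=1}^{3N_2}\lambda_i$.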

229:

230: Now consider the problem of optimal selection of the parameters

231: $\Gamma$ of a coarse-grained model of protein dynamics. Let ${\bf

232: x}_\alpha$ be the coordinates of the $N_\alpha$ alpha-carbons in an

233: all-atom model, and ${\bf x}_\alpha^{(\Gamma)}$ be the same

234: coordinates in the coarse-grained model. We define the optimal

235: coarse-grained model as the one for which the Kullback-Leibler

236: divergence between $P^{(\Gamma)}({\bf x}_\alpha)$ and $P({\bf

237: x}_\alpha)$ is minimal, {\em i.e.}, for which $\Gamma$ is chosen such

238: that

239: \begin{equation}

240: D_{{\bf x}_\alpha}^{(\Gamma)}=\int d^{3N_\alpha}{\bf x}_\alpha \,

241: P^{(\Gamma)}({\bf x}_\alpha)\ln {P^{(\Gamma)}({\bf x}_\alpha) \over

242: P({\bf x}_\alpha)}

243: \label{eq:dkl}

244: \end{equation}

245: is minimal. We previously calculated an analytic expression for

246: equations like Eq.~(\ref{eq:dkl}) when $P({\bf x}_\alpha)$ and

247: $P^{(\Gamma)}({\bf x}_\alpha)$ are both governed by harmonic

248: vibrations \cite{Ming05}:

249:

250: \begin{equation}

251: D_{{\bf x}_\alpha}^{(\Gamma)}=\sum_{i=1}^{3N_\alpha}\left(\ln {\omega_{i}^{(\Gamma)} \over \bar{\omega}_{i}} + {1 \over {2 k_B T}}\bar{\omega}_{i}^2 \left| {\bf \bar{v}}^\dag_i \Delta{\bf x}_{\alpha,{0}}\right|^2 + {1 \over 2}\sum_{j=1}^{3N_\alpha}{\bar{\omega}_{j}^2 \over {\omega^{(\Gamma)}_{i}}^2}\left|{{\bf v}^{(\Gamma)}_i}^\dag {\bf \bar{v}}_j\right|^2- {1 \over 2}\right).

252: \label{eq:cgd}

253: \end{equation}

254: In Eq.~(\ref{eq:cgd}), ${\omega^{(\Gamma)}_i}^2$ and ${{\bf

255: v}^{(\Gamma)}_i}$ are the eigenvalue and eigenvector of mode $i$ of

256: the coarse-grained model; ${\bar{\omega}_i}^2$ and ${{\bf \bar{v}}_i}$

257: are the $i^{\rm th}$ eigenvalue and eigenvector of the matrix ${\bf

258: \bar{H}}$ calculated for the alpha-carbon atoms of the all-atom model

259: (Eq.~(\ref{eq:Hbar})), and $\Delta{\bf x}_{\alpha,{0}}={\bf

260: x}^{(\Gamma)}_{\alpha,0}-{\bf x}_{\alpha,0}$ is the difference between

261: the equilibrium coordinates of the coarse-grained and all-atom

262: models. An optimal coarse-grained model of harmonic vibrations is thus

263: one with parameters $\Gamma$ such that $D^{(\Gamma)}_{{\bf x}_\alpha}$

264: calculated using Eq.~(\ref{eq:cgd}) is minimal.
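For orientation, Eq.~(\ref{eq:cgd}) has the structure of the standard Kullback-Leibler divergence between two multivariate Gaussian distributions. As a sketch, write ${\bf H}^{(\Gamma)}$ for the Hessian of the coarse-grained model (a symbol introduced here for illustration), so that the two covariance matrices are $k_B T\,({\bf H}^{(\Gamma)})^{-1}$ and $k_B T\,{\bf \bar{H}}^{-1}$, and assume that both matrices are invertible (for example, because zero-frequency modes have been excluded). Then
\begin{equation}
D_{{\bf x}_\alpha}^{(\Gamma)} = {1 \over 2}\ln{\det {\bf H}^{(\Gamma)} \over \det {\bf \bar{H}}} + {1 \over {2 k_B T}}\Delta{\bf x}_{\alpha,0}^\dag\, {\bf \bar{H}}\,\Delta{\bf x}_{\alpha,0} + {1 \over 2}{\rm Tr}\left[{\bf \bar{H}}\,({\bf H}^{(\Gamma)})^{-1}\right] - {3N_\alpha \over 2}.
\end{equation}
Expanding each term in the eigenvalues and eigenvectors of ${\bf H}^{(\Gamma)}$ and ${\bf \bar{H}}$ reproduces the four terms of Eq.~(\ref{eq:cgd}) in the same order.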

265:

266: In the ENM \cite{Atilgan01}, interacting alpha-carbon atoms are

267: connected by springs aligned with the direction of atomic

268: separation. Following the Tirion model of harmonic vibrations

269: \cite{Tirion96}, each spring has the same force constant $\gamma$. For

270: a given interaction network, the eigenvectors ${\bf v}^{(\gamma)}_i$

271: are independent of $\gamma$, and each eigenvalue

272: ${\omega^{(\gamma)}_i}^2$ is proportional to $\gamma$. The value of

273: $\gamma$ at which $D^{(\gamma)}_{{\bf x}_\alpha}$ is minimal may be

274: calculated using Eq.~(\ref{eq:cgd}):

275: \begin{equation}

276: \gamma={1 \over

277: {3N_\alpha}}\sum_{i=1}^{3N_\alpha}\sum_{j=1}^{3N_\alpha}

278: {\bar{\omega}_j^2 \over a_i^2}\left|{\bf v}_i^{(\gamma) \dag}{\bf

279: \bar{v}}_j\right|^2.

280: \label{eq:dmin}

281: \end{equation}

282: The proportionality constants $a_i^2={\omega^{(\gamma)}_i}^2/\gamma$

283: are determined from the eigenvalue spectrum calculated using an

284: arbitrary value of $\gamma$ (because the eigenvalues

285: $\omega^{(\gamma)2}_i$ are proportional to $\gamma$, the constants

286: $a_i^2$ are independent of $\gamma$). It is easily shown that the

287: third and fourth terms of Eq.~(\ref{eq:cgd}) cancel when $\gamma$

288: assumes the value given by Eq.~(\ref{eq:dmin}).
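The minimization that leads to Eq.~(\ref{eq:dmin}) can be sketched explicitly. Substituting ${\omega^{(\gamma)}_i}^2=\gamma a_i^2$ into Eq.~(\ref{eq:cgd}) and collecting the terms that depend on $\gamma$ gives
\begin{equation}
D^{(\gamma)}_{{\bf x}_\alpha} = {3N_\alpha \over 2}\ln\gamma + {S \over {2\gamma}} + {\rm const}, \qquad S \equiv \sum_{i=1}^{3N_\alpha}\sum_{j=1}^{3N_\alpha}{\bar{\omega}_j^2 \over a_i^2}\left|{\bf v}_i^{(\gamma)\dag}{\bf \bar{v}}_j\right|^2,
\end{equation}
where $S$ is a shorthand introduced here. Setting the derivative to zero,
\begin{equation}
{d D^{(\gamma)}_{{\bf x}_\alpha} \over d\gamma} = {3N_\alpha \over {2\gamma}} - {S \over {2\gamma^2}} = 0,
\end{equation}
yields $\gamma = S/3N_\alpha$, which is Eq.~(\ref{eq:dmin}). The second derivative at this point is $3N_\alpha/2\gamma^2>0$, so the stationary point is a minimum. At this value of $\gamma$ the third term of Eq.~(\ref{eq:cgd}) sums to $S/2\gamma=3N_\alpha/2$, which cancels the fourth term.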

289:

290: The interaction network in an elastic network

291: model is generated by enabling interactions only between pairs of

292: atoms separated by a distance less than or equal to a cutoff distance

293: $r_c$. To optimize the model,

294: the value of $r_c$ for which

295: $D^{(\gamma)}_{{\bf x}_\alpha}$ is minimal is numerically estimated,

296: using values of $\gamma$

297: from Eq.~(\ref{eq:dmin}).
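For concreteness, a commonly used form of the anisotropic elastic network potential is sketched below. The symbols ${\bf r}^0_{ij}$ (the separation vector between alpha-carbons $i$ and $j$ in the reference structure), $\Theta$ (a unit step function with $\Theta(0)=1$), and ${\bf h}_{ij}$ (a $3\times3$ block of the Hessian) are introduced here for illustration, and the exact form should be checked against Ref.~\cite{Atilgan01}:
\begin{equation}
U = {\gamma \over 2}\sum_{i<j}\Theta\left(r_c - \left|{\bf r}^0_{ij}\right|\right)\left(\left|{\bf r}_{ij}\right| - \left|{\bf r}^0_{ij}\right|\right)^2.
\end{equation}
The corresponding off-diagonal blocks of the Hessian are
\begin{equation}
{\bf h}_{ij} = -\gamma\,\Theta\left(r_c - \left|{\bf r}^0_{ij}\right|\right){{\bf r}^0_{ij}\,{\bf r}^{0\,\dag}_{ij} \over \left|{\bf r}^0_{ij}\right|^2}, \qquad i \neq j,
\end{equation}
with diagonal blocks ${\bf h}_{ii} = -\sum_{j\neq i}{\bf h}_{ij}$. In this form every element of the Hessian is linear in $\gamma$, which is consistent with eigenvalues proportional to $\gamma$ and eigenvectors independent of $\gamma$; changing $r_c$ changes which blocks are non-zero and therefore changes both.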

298:

299: As a test case for optimization, we developed a coarse-grained model

300: of bovine trypsinogen from an all-atom model (223 amino acids obtained

301: from PDB entry 4TPI \cite{Bode84}).  CHARMM was used for all-atom

302: simulations using the CHARMM22 force field with default parameter

303: values. HBUILD was used to generate hydrogen positions, and the energy

304: was initially minimized using 2000 steps of relaxation by the

305: adopted basis Newton-Raphson method, gradually reducing the weight of

306: a harmonic restraint to the crystal-structure coordinates. The final

307: minimized structure was obtained through vacuum minimization until a

308: gradient of $10^{-7}$ Kcal/mol\,\AA~was achieved, and the Hessian {\bf

309: H} was calculated in CHARMM. The coordinates of the elastic network

310: model were taken from the alpha-carbon coordinates of the minimized

311: all-atom model.

312:

313: The alpha-carbon vibrations of the all-atom model were calculated by

314: diagonalizing ${\bf \bar{H}}$ from Eq.~(\ref{eq:Hbar}). Interestingly,

315: the distribution of the density-of-states for the vibrations is

316: bimodal (Fig.~\ref{fig:freqs}) with 2/3 of the frequencies in the

317: low-frequency spectrum and 1/3 of the frequencies in the

318: high-frequency spectrum. Calculation of the density-of-states

319: distribution from other globular proteins yields bimodal patterns with

320: a similar 2:1 ratio between the numbers of low- and high-frequency

321: modes (unpublished results).

322:

323: \begin{figure}

324: \includegraphics[width=3.0in]{Figs/Dens}

325: \caption{Density-of-states distribution for all-atom and elastic

326: network models of trypsinogen. Frequency units are $({\rm Kcal}/ {\rm

327: mol} \, {\rm \AA}^2 \, m_p)^{1/2} = 2.04 \times 10^{13} \, {\rm Hz}$,

328: where $m_p$ is the proton mass. Densities were estimated by counting

329: the number of modes in bins of width 0.2, and normalizing the integral

330: to 663, which is the total number of non-zero modes. The ENM ({\em

331: dotted blue}) does not reproduce the bimodal distribution from the

332: all-atom model ({\em solid red}); however, the BENM recovers the

333: bimodal distribution ({\em dashed green}).}

334: \label{fig:freqs}

335: \end{figure}
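The frequency unit quoted in the caption of Fig.~\ref{fig:freqs} can be checked with a short calculation, using standard values of the constants ($1~{\rm kcal}=4184~{\rm J}$, Avogadro's number $6.022\times10^{23}~{\rm mol}^{-1}$, and $m_p=1.673\times10^{-27}~{\rm kg}$). One Kcal/mol corresponds to $4184/6.022\times10^{23}=6.948\times10^{-21}~{\rm J}$ per molecule, so
\begin{equation}
\left({{6.948\times 10^{-21}\,{\rm J}} \over {10^{-20}\,{\rm m}^2 \times 1.673\times 10^{-27}\,{\rm kg}}}\right)^{1/2} = \left(4.15\times 10^{26}\,{\rm s}^{-2}\right)^{1/2} \approx 2.04\times 10^{13}\,{\rm s}^{-1},
\end{equation}
in agreement with the value given in the caption.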

336:

337: The best elastic network model of trypsinogen was obtained using a

338: cutoff distance $r_c$ of approximately 7.75~\AA, for which the optimal

339: value of $\gamma$ is 53.4~Kcal/mol\,\AA$^2$, yielding a value of

340: $D_{{\bf x}_\alpha}=312.9$ in a sharp minimum with respect to

341: $r_c$. The density-of-states distribution for the elastic network

342: model is unimodal, unlike that for the all-atom model

343: (Fig.~\ref{fig:freqs}).

344:

345:

346: Although the ENM treats all alpha-carbon pairs equally,

347: the distribution of distances

348: between successive alpha-carbons along the protein

349: backbone is known to be tightly

350: centered about 3.8~\AA. In addition, two of the six alpha-carbons

351: nearest to a typical alpha-carbon are backbone neighbors, which might

352: explain why 1/3 of the CHARMM-derived modes have significantly higher

353: frequencies than the others. We therefore wondered whether the ENM

354: might be improved by enhancing interactions between backbone

355: neighbors.

356:

357: Indeed, a more accurate coarse-grained model is obtained by using a

358: force constant enhanced by a factor of $\epsilon$ for interactions

359: between alpha-carbons that are neighbors on the backbone. Minimization

360: of $D_{{\bf x}_\alpha}$ for such a backbone-enhanced elastic network

361: model (BENM) with respect to $\epsilon$ and $r_c$ subject to

362: Eq.~(\ref{eq:dmin}) yields a model with $\epsilon=42$, $r_c=10.5$~\AA,

363: and $\gamma=4.26$~Kcal/mol\,\AA$^2$, resulting in a much lower value

364: $D_{{\bf x}_\alpha}=102.3$. The density-of-states distribution for

365: this model agrees quite well with that of the all-atom model

366: (Fig.~\ref{fig:freqs}), especially considering that the model is

367: optimized with respect to $D_{{\bf x}_\alpha}$, which does not

368: directly involve the density-of-states distribution. The agreement is

369: especially good for the high-frequency modes, suggesting that a

370: uniform force constant is a reasonable approximation for interactions

371: between alpha-carbons that are backbone neighbors. Furthermore, the

372: overlap $\sum_{i=1}^N\sum_{j=1}^N|{\bf v}_i^{(\gamma)\dag}\bar{\bf

373: v}_j|^2/N$ for the 223 highest-frequency modes is 0.99, indicating

374: that the spaces of the high-frequency eigenvectors are nearly

375: identical between the BENM and all-atom models. In contrast, the

376: low-frequency distribution of BENM states is narrower than that of the

377: all-atom model, indicating that a uniform force constant is a poorer

378: approximation for interactions between alpha-carbons that are not

379: backbone neighbors.
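The backbone enhancement can be written compactly as a pair-dependent force constant. In the sketch below, $\gamma_{ij}$ and $r_{ij}$ (the distance between alpha-carbons $i$ and $j$ in the reference structure) are symbols introduced here, and residues $i$ and $j$ are taken to be backbone neighbors when $|i-j|=1$ along the chain, an assumption that ignores chain breaks:
\begin{equation}
\gamma_{ij} = \left\{
\begin{array}{ll}
\epsilon\gamma, & |i-j|=1, \\
\gamma, & |i-j|>1 \ {\rm and} \ r_{ij}\leq r_c, \\
0, & {\rm otherwise}.
\end{array}
\right.
\end{equation}
With the optimized values quoted above, the backbone springs have a force constant $\epsilon\gamma = 42\times4.26 \approx 179$~Kcal/mol\,\AA$^2$, while all other springs within 10.5~\AA\ have $\gamma=4.26$~Kcal/mol\,\AA$^2$.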

380:

381: \begin{figure}

382: \includegraphics[width=3.0in]{Figs/fluctuations}

383: \caption{Mean-squared displacements of alpha-carbon positions for

384: trypsinogen residues 10--229 obtained from normal-mode simulations

385: using CHARMM ({\em dashed green}), a BENM with parameters that

386: minimize $D_{{\bf x}_\alpha}$ with respect to CHARMM ({\em dotted

387: blue}), the same BENM but with $\gamma$ adjusted to better agree with

388: CHARMM MSDs ({\em fine-dotted magenta}), and an ENM with parameters

389: adjusted to agree with CHARMM MSDs ({\em dash-dotted cyan}). Values

390: were calculated at $T=300$~K using the equipartition theorem. Harmonic

391: vibrations at thermal equilibrium are known to inadequately

392: model crystallographic MSDs ({\em solid red}), which include other

393: sources of disorder

394: \cite{Go83}.}

395: \label{fig:flucts}

396: \end{figure}

397:

398: Both the BENM and the ENM yield patterns of alpha-carbon MSDs that are

399: similar to that of the all-atom model (Fig.~\ref{fig:flucts}). Because

400: there are fewer low-frequency BENM modes than low-frequency CHARMM

401: modes (Fig.~\ref{fig:freqs}), the BENM MSDs are consistently smaller

402: than the CHARMM MSDs; however, the BENM MSDs may be improved by

403: selecting $\gamma=1.2$~Kcal/mol\,\AA$^2$

404: (Fig.~\ref{fig:flucts}). These improved MSDs come at the cost of a

405: higher value of $D_{{\bf x}_\alpha}=528.4$, and a change in the

406: frequency scale by a factor $(1.2/4.26)^{1/2}=0.53$, resulting in a

407: poor model of the density-of-states distribution. The ENM with

408: parameters that minimize $D_{{\bf x}_\alpha}$ exhibits poor MSDs (not

409: shown); however, an ENM with $r_c=15.4$~\AA\ and

410: $\gamma=0.4$~Kcal/mol\,\AA$^2$ yields MSDs that agree well with those

411: of the CHARMM model (Fig.~\ref{fig:flucts}). In agreement with

412: previous results using the ENM \cite{Atilgan01}, we confirmed that the

413: parameters of both the ENM and BENM may be adjusted to yield a

414: reasonable model of crystallographic MSDs (not shown).
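The MSDs discussed here follow from the Gaussian distribution in Eq.~(\ref{eq:harmboltzmann}), whose covariance matrix is $k_B T\,{\bf H}^{-1}$. As a sketch, let $v_{i,k\mu}$ denote the component of eigenvector ${\bf v}_i$ on coordinate $\mu$ of atom $k$ (notation introduced here), and restrict the sum to modes with non-zero eigenvalue (an assumption about how rigid-body modes are handled). Then
\begin{equation}
\left\langle \left|\Delta {\bf r}_k\right|^2 \right\rangle = k_B T \sum_{i}\sum_{\mu=x,y,z}{\left|v_{i,k\mu}\right|^2 \over \omega_i^2}.
\end{equation}
For an ENM or BENM with a fixed network (fixed $r_c$ and $\epsilon$), each $\omega_i^2$ is proportional to $\gamma$, so every MSD scales as $1/\gamma$. Reducing $\gamma$ from 4.26 to 1.2~Kcal/mol\,\AA$^2$ therefore multiplies every BENM MSD by $4.26/1.2\approx3.6$ and every frequency by $(1.2/4.26)^{1/2}\approx0.53$. At $T=300$~K, $k_B T\approx0.596$~Kcal/mol.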

415:

416: Next consider the problem of quantifying allosteric effects in

417: proteins \cite{Ming05}. In allosteric regulation, molecular

418: interactions cause changes in protein activity through changes in

419: protein conformation. Although the importance of considering

420: continuous conformational distributions in understanding allosteric

421: effects was recognized by Weber \cite{Weber72}, theories of allosteric

422: regulation that consider continuous conformational distributions have

423: been lacking. We began to develop such a theory by defining the

424: allosteric potential as the Kullback-Leibler divergence $\bar{D}_{\bf

425: x}$ between protein conformational distributions before and after

426: ligand binding, and by calculating changes in the conformational

427: distribution of the full protein-ligand complex in the harmonic

428: approximation \cite{Ming05}. Here we use the expression for the

429: marginal distribution in Eq.~(\ref{eq:margin}) to calculate an equation

430: for the allosteric potential in the harmonic approximation, and apply

431: it to analyze allosteric mechanisms in trypsinogen.

432:

433: Let

434: ${\bf x}_p$ be the protein coordinates selected from the coordinates

435: ${\bf x}$ of a protein-ligand complex, and let $P^\prime({\bf x}_p)$ and

436: $P({\bf x}_p)$ be the protein conformational distributions with and

437: without a ligand interaction. Eq.~(\ref{eq:margin}) enables

438: $P^\prime({\bf x}_p)$ to be calculated from the full conformational

439: distribution $P^\prime({\bf x})$ of the protein-ligand complex. The

440: equation for the allosteric potential in the harmonic

441: approximation follows from the theory developed in Ref.~\cite{Ming05}:

442: \begin{equation}

443: \bar{D}_{{\bf x}}=\sum_{i=1}^{3N_p}\left(\ln

444: {\bar{\omega}^{\prime}_{i} \over {\omega}_{i}} + {1 \over {2 k_B

445: T}}{\omega}_{i}^2 \left| {\bf {v}}^\dag_i \Delta{\bf x}_{p,{0}}\right|^2 +

446: {1 \over 2}\sum_{j=1}^{3N_p}{{\omega}_{j}^2 \over

447: {\bar{\omega}^{\prime 2}_{i}}}\left|{{\bf \bar{v}}^{\prime \dag}_i}

448: {\bf {v}}_j\right|^2- {1 \over 2}\right).

449: \label{eq:ap}

450: \end{equation}

451: In Eq.~(\ref{eq:ap}), $\bar{\omega}^{\prime 2}_i$ and ${\bf

452: \bar{v}}^{\prime}_i$ are the $i^{\rm th}$ eigenvalue and eigenvector

453: of the matrix ${\bf \bar{H}}$ calculated for the protein atoms of the

454: protein-ligand complex, $\omega_i^2$ and ${\bf v}_i$ are the

455: eigenvalue and eigenvector of mode $i$ of the apo-protein, and $\Delta

456: {\bf x}_{p,0}={\bf x}^\prime_{p,0}-{\bf x}_{p,0}$ is the difference

457: between the equilibrium coordinates of the protein with and without

458: the ligand interaction. The term

459: $\sum_{i=1}^{3N_p}\ln{\bar{\omega}^\prime_i / \omega_i}$ is

460: proportional to the change in configurational entropy of the protein

461: upon releasing the ligand, and the term

462: $\sum_{i=1}^{3N_p}{\omega}_{i}^2 \left| {\bf {v}}^\dag_i \Delta{\bf

463: x}_{p,{0}}\right|^2 / 2 k_B T$ is proportional to the potential energy

464: required to deform the apo-protein into its equilibrium

465: conformation in the protein-ligand complex.
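The two interpretations above can be made explicit with a short calculation, given here as a sketch. The configurational entropy of a Gaussian distribution with covariance $k_B T\,{\bf H}^{-1}$ and non-zero eigenvalues $\omega_i^2$ is $S = k_B\sum_i\left[{1 \over 2}\ln(2\pi e k_B T) - \ln\omega_i\right]$. Writing $S_{\rm apo}$ and $S_{\rm bound}$ (symbols introduced here) for the entropies of $P({\bf x}_p)$ and $P^\prime({\bf x}_p)$ gives
\begin{equation}
{{S_{\rm apo} - S_{\rm bound}} \over k_B} = \sum_{i=1}^{3N_p}\ln{\bar{\omega}^\prime_i \over \omega_i}.
\end{equation}
Similarly, writing ${\bf H}_p = \sum_i \omega_i^2\,{\bf v}_i{\bf v}_i^\dag$ for the Hessian of the apo-protein (again a symbol introduced here),
\begin{equation}
{1 \over {2 k_B T}}\sum_{i=1}^{3N_p}\omega_i^2\left|{\bf v}_i^\dag\Delta{\bf x}_{p,0}\right|^2 = {1 \over {k_B T}}\left({1 \over 2}\Delta{\bf x}_{p,0}^\dag\,{\bf H}_p\,\Delta{\bf x}_{p,0}\right),
\end{equation}
which is the harmonic energy of the apo-protein displaced by $\Delta{\bf x}_{p,0}$, in units of $k_B T$.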

466:

467: We used Eq.~(\ref{eq:ap}) to calculate changes in the configurational

468: distribution of local regions of trypsinogen upon binding bovine

469: pancreatic trypsin inhibitor (BPTI). BPTI binds in the active site

470: and exerts an allosteric effect, enhancing the affinity of trypsinogen

471: for Val-Val \cite{Bode79}. Alpha-carbon coordinates for 223 residues

472: were obtained from a crystal structure of trypsinogen in complex with

473: BPTI (residues 7--229 from PDB entry 4TPI \cite{Bode84}, including

474: theoretically modeled residues 7--9), and were used directly to

475: construct backbone-enhanced elastic network models of apo-trypsinogen

476: and the trypsinogen-BPTI complex. As suggested by the refined

477: trypsinogen model above, both models used $r_c=10.5$~\AA,

478: $\gamma=4.26$~Kcal/mol\,\AA$^2$, and $\epsilon=42$.

479:

480: Local changes in the conformational distribution of trypsinogen were

481: analyzed by considering changes in the neighborhood of each

482: alpha-carbon atom. A neighborhood was defined by selecting the atom of

483: interest plus its five nearest neighbors, and the matrix ${\bf

484: \bar{H}}$ was calculated for these six atoms in the models both with

485: (yielding ${\bf \bar{H}}^\prime$) and without (yielding ${\bf

486: \bar{H}}$) the BPTI interaction. A local value of $\bar{D}_{\bf x}$

487: was obtained using the eigenvalues and eigenvectors of ${\bf

488: \bar{H}}^\prime$ and ${\bf \bar{H}}$ in a suitably modified version of

489: Eq.~(\ref{eq:ap}).
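One plausible reading of the modified expression, offered here only as a sketch and not confirmed by the text, replaces the apo-protein eigenvalues and eigenvectors in Eq.~(\ref{eq:ap}) by those of the local matrix ${\bf \bar{H}}$, and restricts the sums to the modes of the six-atom neighborhood. Because both models are built from the same crystal coordinates, the displacement $\Delta{\bf x}_{p,0}$ is assumed to vanish, so the second term drops out:
\begin{equation}
\bar{D}_{\bf x}^{\rm local} = \sum_{i=1}^{n}\left(\ln{\bar{\omega}^\prime_i \over \bar{\omega}_i} + {1 \over 2}\sum_{j=1}^{n}{\bar{\omega}_j^2 \over \bar{\omega}^{\prime 2}_i}\left|{\bf \bar{v}}^{\prime\dag}_i{\bf \bar{v}}_j\right|^2 - {1 \over 2}\right).
\end{equation}
Here $\bar{D}_{\bf x}^{\rm local}$ and $n$ are symbols introduced for illustration. A six-atom neighborhood has 18 coordinates; how any zero-frequency modes of ${\bf \bar{H}}$ and ${\bf \bar{H}}^\prime$ are treated is not specified in the text, so $n$ is left as the number of modes retained.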

490:

491: \begin{figure}

492: \includegraphics[width=3in]{Figs/trypsin_left_lores.eps}

493: \includegraphics[width=3in]{Figs/trypsin_right_lores.eps}

494: \caption{Visualization of local sites on the surface of trypsinogen

495: that exhibit a large change in the conformational distribution upon

496: binding BPTI. Values of $\bar{D}_{\bf x}$ are mapped to a logarithmic

497: temperature scale, with red coloring indicating large values. Changes

498: are large both in the BPTI-binding site ({\em left}) and in the

499: Val-Val binding site ({\em right}). There is a $90^\circ$ rotation

500: about the x-axis between the left and right panels.}

501: \label{fig:trypsinogen}

502: \end{figure}

503:

504: Not surprisingly, we found that the local values of $\bar{D}_{\bf x}$

505: were relatively large in the neighborhood of the BPTI-binding site

506: (Fig.~\ref{fig:trypsinogen}, {\em left panel}). Values of

507: $\bar{D}_{\bf x}$ elsewhere on the surface were smaller, with one

508: interesting exception: values in the Val-Val binding site were

509: comparable to those in the BPTI-binding site

510: (Fig.~\ref{fig:trypsinogen}, {\em right panel}).

511:

512: We also calculated local values of $\bar{D}_{\bf x}$ for the Val-Val

513: interaction, which causes the crystal structure of trypsinogen to

514: resemble that of active trypsin \cite{Bode78,Bode84}. We found that

515: values were relatively large in the neighborhood of Ser 195, which is

516: the key catalytic residue for trypsin and other serine proteases: the

517: value of $\bar{D}_{\bf x}$ in this neighborhood was 40$^{\rm th}$

518: highest of 223 residues in the crystal structure; 11$^{\rm th}$ of all

519: residues not directly interacting with the Val-Val in the model; the

520: highest of all residues located at least as far as Ser 195 is from the

521: Val-Val ligand; and greater than that for 20 of 60 residues located

522: closer to the ligand. Calculations for both the BPTI interaction and

523: the Val-Val interaction therefore indicate that there is a relatively

524: strong communication between the regulatory and active sites of

525: trypsinogen.

526:

527: Considering models beyond the ENM and BENM (and even models beyond

528: proteins), the theory presented here leads to a general prescription

529: for modeling harmonic vibrations using coarse-grained models of

530: materials. To optimally model the all-atom conformational

531: distribution, always use an energy scale for interactions that

532: eliminates the discrepancy due to differences in the eigenvectors

533: (Eq.~(\ref{eq:dmin})), and select the

534: coarse-grained model for which the entropy of the conformational

535: distribution is the largest (first term of Eq.~(\ref{eq:cgd})).

536:

537: Although traditional elastic network models can explain

538: characteristics of the functions and dynamics of proteins

539: \cite{Yang05}, the present study shows that they provide a poor

540: approximation to the conformational distribution calculated from

541: all-atom models of harmonic vibrations of proteins. Model accuracy is

542: significantly improved by using a backbone-enhanced elastic network

543: model, which strengthens interactions between atoms that are nearby in

544: terms of covalent linkage. Although the backbone-enhanced model

545: appears to accurately capture the high-frequency alpha-carbon

546: vibrations of an all-atom model, the model less accurately captures

547: the slower, large-scale harmonic vibrations (which in turn are known

548: to poorly approximate the full spectrum of highly nonlinear,

549: large-scale protein motions).

550:

551: We also find that the allosteric potential is a useful tool for

552: computational analysis of allosteric mechanisms in proteins. Using

553: calculations of the allosteric potential, communication between the

554: regulatory and active sites of trypsinogen was observed in a purely

555: mechanical, coarse-grained model of protein harmonic vibrations that

556: does not consider mean conformational changes or amino-acid

557: identities, supporting prior arguments for the possibility of

558: allostery without a mean conformational change \cite{Cooper84}. It will

559: be interesting to perform similar analyses on a wide range of all-atom

560: and coarse-grained models of protein vibrations, and to use more

561: realistic calculations of free-energy landscapes \cite{Garcia01} to

562: more accurately model changes in protein conformational distributions.

563:


638:

639: % If you have acknowledgments, this puts in the proper section head.

640: %\begin{acknowledgments}

641: This work was supported by the US Department of Energy.

642: %\end{acknowledgments}

643:

644: % Create the reference section using BibTeX:

645: \bibliography{PRL}

646:

647: \end{document}

648: %

649: % ****** End of file template.aps ******

650:

651: