1: % ****** Start of file template.aps ****** %
2: %%
3: %%
4: %% This file is part of the APS files in the REVTeX 4 distribution.
5: %% Version 4.0 of REVTeX, August 2001
6: %%
7: %%
8: %% Copyright (c) 2001 The American Physical Society.
9: %%
10: %% See the REVTeX 4 README file for restrictions and more information.
11: %%
12: %
13: % This is a template for producing manuscripts for use with REVTEX 4.0
14: % Copy this file to another name and then work on that file.
15: % That way, you always have this original template file to use.
16: %
17: % Group addresses by affiliation; use superscriptaddress for long
18: % author lists, or if there are many overlapping affiliations.
19: % For Phys. Rev. appearance, change preprint to twocolumn.
20: % Choose pra, prb, prc, prd, pre, prl, prstab, or rmp for journal
21: % Add 'draft' option to mark overfull boxes with black boxes
22: % Add 'showpacs' option to make PACS codes appear
23: % Add 'showkeys' option to make keywords appear
24: \documentclass[aps,prl,preprint,groupedaddress]{revtex4}
25: %\documentclass[aps,prl,preprint,superscriptaddress]{revtex4}
26: %\documentclass[aps,prl,twocolumn,groupedaddress]{revtex4}
27:
28: % You should use BibTeX and apsrev.bst for references
29: % Choosing a journal automatically selects the correct APS
30: % BibTeX style file (bst file), so only uncomment the line
31: % below if necessary.
32: %\bibliographystyle{apsrev}
33:
34: \usepackage{graphicx}% Include figure files
35: \usepackage{dcolumn}% Align table columns on decimal point
36: \usepackage{bm}% bold math
37:
38: \begin{document}
39:
40: \special{papersize=8.5in,11in}
41:
42: % Use the \preprint command to place your local institutional report
43: % number in the upper righthand corner of the title page in preprint mode.
44: % Multiple \preprint commands are allowed.
45: % Use the 'preprintnumbers' class option to override journal defaults
46: % to display numbers if necessary
47: \preprint{LA-UR-05-5164}
48:
49: %Title of paper
50: \title{Allostery in a Coarse-Grained Model of Protein Dynamics}
51:
52: % repeat the \author .. \affiliation etc. as needed
53: % \email, \thanks, \homepage, \altaffiliation all apply to the current
54: % author. Explanatory text should go in the []'s, actual e-mail
55: % address or url should go in the {}'s for \email and \homepage.
56: % Please use the appropriate macro for each type of information
57:
58: % \affiliation command applies to all authors since the last
59: % \affiliation command. The \affiliation command should follow the
60: % other information
61: % \affiliation can be followed by \email, \homepage, \thanks as well.
62:
63: \author{Dengming Ming}
64: %\email[dming@lanl.gov]{Your e-mail address}
65: %\homepage[]{Your web page}
66: %\thanks{}
67: %\altaffiliation{}
68: \affiliation{Computer and Computational Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
69:
70: \author{Michael E. Wall}
71: \email[Correspondence: ]{mewall@lanl.gov}
72: %\homepage[]{Your web page}
73: %\thanks{}
74: %\altaffiliation{}
75: \affiliation{Computer and Computational Sciences and Bioscience Divisions, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
76:
77: %Collaboration name if desired (requires use of superscriptaddress
78: %option in \documentclass). \noaffiliation is required (may also be
79: %used with the \author command).
80: %\collaboration can be followed by \email, \homepage, \thanks as well.
81: %\collaboration{}
82: %\noaffiliation
83:
84: \date{\today}
85:
86: \begin{abstract}
87:
88: We propose a criterion for optimal parameter selection in
89: coarse-grained models of proteins, and develop a refined elastic
90: network model (ENM) of bovine trypsinogen. The unimodal density-of-states
91: distribution of the trypsinogen ENM disagrees with the bimodal
92: distribution obtained from an all-atom model; however, the bimodal
93: distribution is recovered by strengthening interactions between atoms
94: that are backbone neighbors. We use the backbone-enhanced model to
95: analyze allosteric mechanisms of trypsinogen, and find relatively strong
96: communication between the regulatory and active sites.
97: \end{abstract}
98:
99: % insert suggested PACS numbers in braces on next line
100: \pacs{}
101: % insert suggested keywords - APS authors don't need to do this
102: %\keywords{}
103:
104: %\maketitle must follow title, authors, abstract, \pacs, and \keywords
105: \maketitle
106:
107: % body of paper here - Use proper section commands
108: % References should be done using the \cite, \ref, and \label commands
109: %\section{}
110: % Put \label in argument of \section for cross-referencing
111: %\section{\label{sec:theory}Theory}
112: %\subsection{\label{sec:theory}Theory}
113:
114: A major challenge of molecular biology is to understand regulatory
115: mechanisms in large protein complexes that are abundant in
116: multicellular organisms. To make simulation of such complexes
117: computationally feasible, coarse-grained models have been developed,
118: in which a subset of the atoms in the complex is used to simulate the
119: large-scale motions. However, principled methods to quantify and
120: optimize the accuracy of coarse-grained models are currently lacking.
121:
122: In one common coarse-graining method, an all-atom model is simplified
123: by considering effective interactions among a subset of the atoms
124: (e.g., just the alpha-carbons). The usual criterion for model accuracy
125: is the ability of a model to reproduce atomic mean-squared
126: displacements (MSDs). However, MSDs are just one aspect of protein
127: dynamics; a stricter criterion for the accuracy of a coarse-grained
128: model is the similarity between the configurational distributions of
129: the selected atoms in the coarse-grained and all-atom models. Such a
130: criterion is also biologically relevant, in part because the
131: conformational distribution is a key determinant of protein activity
132: \cite{Frauenfelder85}.
133:
134: One useful measure of the difference between conformational
135: distributions is the Kullback-Leibler divergence $D_{\bf x}$ (see
136: definition below) \cite{Kullback51,Ming05}. Recently, an analytic
137: expression for $D_{\bf x}$ was obtained for harmonic vibrations of a
138: protein-ligand complex both with and without a protein-ligand
139: interaction \cite{Ming05}. Here we show how an equivalent expression
140: may be applied to refine a coarse-grained model of protein
141: dynamics. Using the expression for $D_{\bf x}$ requires the marginal
142: probability distribution of a subset of the atoms in a protein, which
143: we calculate in the harmonic approximation. We then apply the
144: equations to refine an anisotropic elastic network model (ENM)
145: \cite{Atilgan01} of trypsinogen dynamics with respect to an all-atom
146: model calculated using CHARMM \cite{Brooks83}. The unimodal
147: density-of-states distribution of the ENM disagrees with the bimodal
148: distribution obtained from the all-atom model; however, the bimodal
149: distribution is recovered by strengthening interactions between atoms
150: that are backbone neighbors. Finally, the backbone-enhanced elastic
151: network model (BENM) is used to analyze allosteric mechanisms of
152: trypsinogen, revealing relatively strong communication between the
153: regulatory and active sites.
154:
155: Let $P({\bf x})$ be the probability distribution of the $3N$ atomic
156: coordinates ${\bf x}=(x_1,y_1,z_1,\ldots,x_N,y_N,z_N)$ of a molecular
157: model in the harmonic approximation. Let ${\bf x}=({\bf x}_1,{\bf
158: x}_2)$, where ${\bf x}_1$ is the $3N_1$ coordinates of a subset of
159: atoms of interest, and ${\bf x}_2$ is the $3N_2$ coordinates of the
160: remaining atoms. We are interested in calculating the marginal
161: distribution $P({\bf x}_1)$:
162: \begin{equation}
163: P({\bf x}_1) = \int d^{3N_2} {\bf x}_2 \, P({\bf x}_1, {\bf x}_2).
164: \label{eq:px1_general}
165: \end{equation}
166:
167: We now calculate $P({\bf x}_1)$ in a model of molecular vibrations. Consider a harmonic approximation to the potential energy function
168: $U({\bf x})$, where ${\bf x}$ is the deviation from an
169: equilibrium conformation ${\bf x}_0$:
170: \begin{equation}
171: U({\bf x} + {\bf x}_0)\approx U({\bf x}_0)+{1 \over 2}{\bf x}^\dag {\bf H}\,{\bf x}.
172: \end{equation}
173: The matrix ${\bf H}$ is the Hessian of $U$ evaluated at ${\bf x}_0$: $H_{ij}|_{{\bf x}_0}=\partial^2 U / \partial x_i \partial x_j |_{{\bf x}_0}.$ We assume a Boltzmann distribution for $P({\bf x})$, and ignore solvent and pressure effects:
174: \begin{equation}
175: P({\bf x})=Z^{-1}e^{{-{\bf x}^\dag {\bf H}\,{\bf x}}\over{2 k_B T}}=(2\pi k_B T)^{-3N/2}e^{{-\left|{\bf \Omega} {\bf V}^\dag {\bf x}\right|^2} \over {2 k_B T}}\prod_{i=1}^{3N}\omega_i,
176: \label{eq:harmboltzmann}
177: \end{equation}
178: where $Z$ is the partition function, $k_B$ is Boltzmann's constant,
179: $T$ is the temperature, the diagonal elements of the matrix $\left|{\bf \Omega}\right|^2={\rm diag} (\omega_1^2,\ldots,\omega_{3N}^2)$ are the eigenvalues of
180: ${\bf H}$, and the columns of the matrix ${\bf V}$ are the
181: eigenvectors of ${\bf H}$. To calculate $P({\bf x}_1)$ we define the
182: submatrices ${\bf H}_1$, ${\bf H}_2$, and ${\bf G}$ as follows:
183: \begin{eqnarray}
184: {\bf H\,x} = \left(
185: \begin{array}{cc}
186: {\bf H}_1 & {\bf G} \\
187: {\bf G}^\dag & {\bf H}_2
188: \end{array}
189: \right)
190: \left(
191: \begin{array}{c}
192: {\bf x}_1 \\
193: {\bf x}_2
194: \end{array}
195: \right)
196: =
197: \left(
198: \begin{array}{ccc}
199: {\bf H}_1 {\bf x}_1 & + & {\bf G} {\bf x}_2 \\
200: {\bf G}^\dag {\bf x}_1 & + & {\bf H}_2 {\bf x}_2
201: \end{array}
202: \right).
203: \label{eq:Hdecomp}
204: \end{eqnarray}
205: ${\bf H}_1$ couples coordinates from ${\bf x}_1$; ${\bf H}_2$ couples coordinates from ${\bf x}_2$; and ${\bf G}$ couples coordinates between ${\bf x}_1$ and ${\bf x}_2$. Eq.~(\ref{eq:harmboltzmann}) now can be expressed as
206: \begin{equation}
207: P({\bf x})=Z^{-1}e^{{-{\bf x}^\dag {\bf H}\,{\bf x}}\over{2 k_B T}}=(2\pi k_B T)^{-3N/2}e^{{-\left|{\bf \bar{\Omega}} {\bf \bar{V}}^\dag {\bf x}_1\right|^2 - \left|{\bf \Lambda} {\bf U}^\dag {\bf x}_2 + {\bf \Lambda}^{-1}{\bf U}^\dag {\bf G}^\dag {\bf x}_1 \right|^2} \over {2 k_B T}}\prod_{i=1}^{3N}\omega_i,
208: \label{eq:harmsub}
209: \end{equation}
210: where the diagonal elements of the matrix $\left|{\bf \Lambda}\right|^2={\rm
211: diag}(\lambda_1^2,\ldots,\lambda_{3N_2}^2)$ and the columns of the
212: matrix ${\bf U}$ are the eigenvalues and eigenvectors of ${\bf H}_2$,
213: and the diagonal elements of the matrix $\left|{\bf \bar{\Omega}}\right|^2={\rm
214: diag}(\bar{\omega}_1^2,\ldots,\bar{\omega}_{3N_1}^2)$ and the columns of the
215: matrix ${\bf \bar{V}}$ are the eigenvalues and eigenvectors of a matrix ${\bf \bar{H}}$ defined as
216: \begin{equation}
217: {\bf \bar{H}}={\bf H}_1 - {\bf G}{\bf H}_2^{-1}{\bf G}^\dag = {\bf \bar{V}}\left|{\bf \bar{\Omega}}\right|^2{\bf \bar{V}}^\dag.
218: \label{eq:Hbar}
219: \end{equation}
220: Eq.~(\ref{eq:Hbar}) is equivalent to an equation independently derived to
221: study local vibrations in the nucleotide-binding pockets of myosin and
222: kinesin \cite{Zheng05}. Performing the integral in
223: Eq.~(\ref{eq:px1_general}) leads to the desired equation for
224: $P({\bf x}_1)$:
225: \begin{equation}
226: P({\bf x}_1) = (2\pi k_B T)^{-3N_1/2}e^{{-\left|{\bf \bar{\Omega}}{\bf \bar{V}}^\dag {\bf x}_1 \right|^2} \over {2 k_B T}} \prod_{i=1}^{3N_1}\bar{\omega}_i.
227: \label{eq:margin}
228: \end{equation}
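
The integration leading to Eq.~(\ref{eq:margin}) can be sketched in two steps. The sketch assumes that ${\bf H}$ and ${\bf H}_2$ are positive definite, so that ${\bf H}_2^{-1}$ and ${\bf \Lambda}^{-1}$ exist. First, completing the square in ${\bf x}_2$ separates the quadratic form in Eq.~(\ref{eq:harmboltzmann}):
\begin{equation}
{\bf x}^\dag {\bf H}\,{\bf x} = {\bf x}_1^\dag {\bf \bar{H}}\,{\bf x}_1 + \left({\bf x}_2 + {\bf H}_2^{-1}{\bf G}^\dag {\bf x}_1\right)^\dag {\bf H}_2 \left({\bf x}_2 + {\bf H}_2^{-1}{\bf G}^\dag {\bf x}_1\right),
\end{equation}
which reproduces the two squared norms in the exponent of Eq.~(\ref{eq:harmsub}). Second, the Gaussian integral over ${\bf x}_2$ is independent of ${\bf x}_1$,
\begin{equation}
\int d^{3N_2}{\bf x}_2 \, e^{{-\left|{\bf \Lambda}{\bf U}^\dag {\bf x}_2 + {\bf \Lambda}^{-1}{\bf U}^\dag {\bf G}^\dag {\bf x}_1\right|^2} \over {2 k_B T}} = (2\pi k_B T)^{3N_2/2}\prod_{i=1}^{3N_2}\lambda_i^{-1},
\end{equation}
and the determinant identity $\det {\bf H} = \det {\bf H}_2 \, \det {\bf \bar{H}}$ gives $\prod_{i=1}^{3N}\omega_i = \prod_{i=1}^{3N_2}\lambda_i \prod_{i=1}^{3N_1}\bar{\omega}_i$, so the factors of $\lambda_i$ cancel and Eq.~(\ref{eq:margin}) follows.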
229:
230: Now consider the problem of optimal selection of the parameters
231: $\Gamma$ of a coarse-grained model of protein dynamics. Let ${\bf
232: x}_\alpha$ be the coordinates of the $N_\alpha$ alpha-carbons in an
233: all-atom model, and ${\bf x}_\alpha^{(\Gamma)}$ be the same
234: coordinates in the coarse-grained model. We define the optimal
235: coarse-grained model as the one for which the Kullback-Leibler
236: divergence between $P^{(\Gamma)}({\bf x}_\alpha)$ and $P({\bf
237: x}_\alpha)$ is minimal, {\em i.e.}, for which $\Gamma$ is chosen such
238: that
239: \begin{equation}
240: D_{{\bf x}_\alpha}^{(\Gamma)}=\int d^{3N_\alpha}{\bf x}_\alpha \,
241: P^{(\Gamma)}({\bf x}_\alpha)\ln {P^{(\Gamma)}({\bf x}_\alpha) \over
242: P({\bf x}_\alpha)}
243: \label{eq:dkl}
244: \end{equation}
245: is minimal. We previously calculated an analytic expression for
246: equations like Eq.~(\ref{eq:dkl}) when $P({\bf x}_\alpha)$ and
247: $P^{(\Gamma)}({\bf x}_\alpha)$ are both governed by harmonic
248: vibrations \cite{Ming05}:
249:
250: \begin{equation}
251: D_{{\bf x}_\alpha}^{(\Gamma)}=\sum_{i=1}^{3N_\alpha}\left(\ln {\omega_{i}^{(\Gamma)} \over \bar{\omega}_{i}} + {1 \over {2 k_B T}}\bar{\omega}_{i}^2 \left| {\bf \bar{v}}^\dag_i \Delta{\bf x}_{\alpha,{0}}\right|^2 + {1 \over 2}\sum_{j=1}^{3N_\alpha}{\bar{\omega}_{j}^2 \over {\omega^{(\Gamma)}_{i}}^2}\left|{{\bf v}^{(\Gamma)}_i}^\dag {\bf \bar{v}}_j\right|^2- {1 \over 2}\right).
252: \label{eq:cgd}
253: \end{equation}
254: In Eq.~(\ref{eq:cgd}), ${\omega^{(\Gamma)}_i}^2$ and ${{\bf
255: v}^{(\Gamma)}_i}$ are the eigenvalue and eigenvector of mode $i$ of
256: the coarse-grained model; ${\bar{\omega}_i}^2$ and ${{\bf \bar{v}}_i}$
257: are the $i^{\rm th}$ eigenvalue and eigenvector of the matrix ${\bf
258: \bar{H}}$ calculated for the alpha-carbon atoms of the all-atom model
259: (Eq.~(\ref{eq:Hbar})), and $\Delta{\bf x}_{\alpha,{0}}={\bf
260: x}^{(\Gamma)}_{\alpha,0}-{\bf x}_{\alpha,0}$ is the difference between
261: the equilibrium coordinates of the coarse-grained and all-atom
262: models. An optimal coarse-grained model of harmonic vibrations is thus
263: one with parameters $\Gamma$ such that $D^{(\Gamma)}_{{\bf x}_\alpha}$
264: calculated using Eq.~(\ref{eq:cgd}) is minimal.
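
For numerical work it can be convenient to write Eq.~(\ref{eq:cgd}) in matrix form. As a sketch, let ${\bf H}^{(\Gamma)}$ denote the Hessian of the coarse-grained model (a symbol introduced here for illustration), with eigenvalues ${\omega^{(\Gamma)}_i}^2$ and eigenvectors ${\bf v}^{(\Gamma)}_i$, and assume that both ${\bf H}^{(\Gamma)}$ and ${\bf \bar{H}}$ are positive definite (for example, after rigid-body zero modes have been removed). The sums over modes then become a log-determinant ratio and a trace:
\begin{equation}
D_{{\bf x}_\alpha}^{(\Gamma)}={1 \over 2}\left[\ln{\det {\bf H}^{(\Gamma)} \over \det {\bf \bar{H}}} + {1 \over {k_B T}}\Delta{\bf x}_{\alpha,0}^\dag {\bf \bar{H}}\,\Delta{\bf x}_{\alpha,0} + {\rm Tr}\left({\bf \bar{H}}\,{{\bf H}^{(\Gamma)}}^{-1}\right) - 3N_\alpha\right].
\end{equation}
This is the standard Kullback-Leibler divergence between two multivariate Gaussian distributions with covariance matrices $k_B T\,{{\bf H}^{(\Gamma)}}^{-1}$ and $k_B T\,{\bf \bar{H}}^{-1}$.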
265:
266: In the ENM \cite{Atilgan01}, interacting alpha-carbon atoms are
267: connected by springs aligned with the direction of atomic
268: separation. Following the Tirion model of harmonic vibrations
269: \cite{Tirion96}, each spring has the same force constant $\gamma$. For
270: a given interaction network, the eigenvectors ${\bf v}^{(\gamma)}_i$
271: are independent of $\gamma$, and each eigenvalue
272: ${\omega^{(\gamma)}_i}^2$ is proportional to $\gamma$. The value of
273: $\gamma$ at which $D^{(\gamma)}_{{\bf x}_\alpha}$ is minimal may be
274: calculated using Eq.~(\ref{eq:cgd}):
275: \begin{equation}
276: \gamma={1 \over
277: {3N_\alpha}}\sum_{i=1}^{3N_\alpha}\sum_{j=1}^{3N_\alpha}
278: {\bar{\omega}_j^2 \over a_i^2}\left|{\bf v}_i^{(\gamma) \dag}{\bf
279: \bar{v}}_j\right|^2.
280: \label{eq:dmin}
281: \end{equation}
282: The proportionality constants $a_i^2={\omega^{(\gamma)}_i}^2/\gamma$
283: are determined from the eigenvalue spectrum calculated using an
284: arbitrary value of $\gamma$ (because the eigenvalues
285: $\omega^{(\gamma)2}_i$ are proportional to $\gamma$, the constants
286: $a_i^2$ are independent of $\gamma$). It is easily shown that the
287: third and fourth terms of Eq.~(\ref{eq:cgd}) cancel when $\gamma$
288: assumes the value given by Eq.~(\ref{eq:dmin}).
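
The steps behind Eq.~(\ref{eq:dmin}) can be sketched as follows. Substituting ${\omega^{(\gamma)}_i}^2=\gamma a_i^2$ into Eq.~(\ref{eq:cgd}) and collecting the terms that depend on $\gamma$ gives
\begin{equation}
D_{{\bf x}_\alpha}^{(\gamma)}={3N_\alpha \over 2}\ln\gamma + {1 \over {2\gamma}}\sum_{i=1}^{3N_\alpha}\sum_{j=1}^{3N_\alpha}{\bar{\omega}_j^2 \over a_i^2}\left|{\bf v}_i^{(\gamma) \dag}{\bf \bar{v}}_j\right|^2 + C,
\end{equation}
where $C$ (a symbol introduced here) collects the terms that are independent of $\gamma$. Setting $\partial D_{{\bf x}_\alpha}^{(\gamma)}/\partial\gamma=0$ yields Eq.~(\ref{eq:dmin}); the second derivative at this point is $3N_\alpha/(2\gamma^2)>0$, so the stationary point is a minimum. At this value of $\gamma$ the third term of Eq.~(\ref{eq:cgd}), summed over $i$, equals $3N_\alpha/2$ and cancels the sum over the constant $-1/2$, leaving
\begin{equation}
D_{{\bf x}_\alpha}^{(\gamma)}=\sum_{i=1}^{3N_\alpha}\left(\ln{\omega^{(\gamma)}_i \over \bar{\omega}_i} + {1 \over {2 k_B T}}\bar{\omega}_i^2\left|{\bf \bar{v}}_i^\dag \Delta{\bf x}_{\alpha,0}\right|^2\right).
\end{equation}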
289:
290: The interaction network in an elastic network
291: model is generated by enabling interactions only between pairs of
292: atoms separated by a distance less than or equal to a cutoff distance
293: $r_c$. To optimize the model,
294: the value of $r_c$ for which
295: $D^{(\gamma)}_{{\bf x}_\alpha}$ is minimal is numerically estimated,
296: using values of $\gamma$
297: from Eq.~(\ref{eq:dmin}).
298:
299: As a test case for optimization, we developed a coarse-grained model
300: of bovine trypsinogen from an all-atom model (223 amino acids obtained
301: from PDB entry 4TPI \cite{Bode84}). CHARMM was used for all-atom
302: simulations using the CHARMM22 force field with default parameter
303: values. HBUILD was used to generate hydrogen positions, and the energy
304: was initially minimized using 2000 steps of relaxation by the
305: adopted basis Newton-Raphson method, gradually reducing the weight of
306: a harmonic restraint to the crystal-structure coordinates. The final
307: minimized structure was obtained through vacuum minimization until a
308: gradient of $10^{-7}$ Kcal/mol\,\AA~was achieved, and the Hessian {\bf
309: H} was calculated in CHARMM. The coordinates of the elastic network
310: model were taken from the alpha-carbon coordinates of the minimized
311: all-atom model.
312:
313: The alpha-carbon vibrations of the all-atom model were calculated by
314: diagonalizing ${\bf \bar{H}}$ from Eq.~(\ref{eq:Hbar}). Interestingly,
315: the distribution of the density-of-states for the vibrations is
316: bimodal (Fig.~\ref{fig:freqs}) with 2/3 of the frequencies in the
317: low-frequency spectrum and 1/3 of the frequencies in the
318: high-frequency spectrum. Calculation of the density-of-states
319: distribution from other globular proteins yields bimodal patterns with
320: a similar 2:1 ratio between the numbers of low- and high-frequency
321: modes (unpublished results).
322:
323: \begin{figure}
324: \includegraphics[width=3.0in]{Figs/Dens}
325: \caption{Density-of-states distribution for all-atom and elastic
326: network models of trypsinogen. Frequency units are $({\rm Kcal}/ {\rm
327: mol} \, {\rm \AA}^2 \, m_p)^{1/2} = 2.04 \times 10^{13} \, {\rm Hz}$,
328: where $m_p$ is the proton mass. Densities were estimated by counting
329: the number of modes in bins of width 0.2, and normalizing the integral
330: to 663, which is the total number of non-zero modes. The ENM ({\em
331: dotted blue}) does not reproduce the bimodal distribution from the
332: all-atom model ({\em solid red}); however, the BENM recovers the
333: bimodal distribution ({\em dashed green}).}
334: \label{fig:freqs}
335: \end{figure}
336:
337: The best elastic network model of trypsinogen was obtained using a
338: cutoff distance $r_c$ of approximately 7.75~\AA, for which the optimal
339: value of $\gamma$ is 53.4~Kcal/mol\,\AA$^2$, yielding a value of
340: $D_{{\bf x}_\alpha}=312.9$ in a sharp minimum with respect to
341: $r_c$. The density-of-states distribution for the elastic network
342: model is unimodal, unlike that for the all-atom model
343: (Fig.~\ref{fig:freqs}).
344:
345:
346: Although the ENM treats all alpha-carbon pairs equally,
347: the distribution of distances
348: between successive alpha-carbons along the protein
349: backbone is known to be tightly
350: centered about 3.8~\AA. In addition, two of the six alpha-carbons
351: nearest to a typical alpha-carbon are backbone neighbors, which might
352: explain why 1/3 of the CHARMM-derived modes have significantly higher
353: frequencies than the others. We therefore wondered whether the ENM
354: might be improved by enhancing interactions between backbone
355: neighbors.
356:
357: Indeed, a more accurate coarse-grained model is obtained by using a
358: force constant enhanced by a factor of $\epsilon$ for interactions
359: between alpha-carbons that are neighbors on the backbone. Minimization
360: of $D_{{\bf x}_\alpha}$ for such a backbone-enhanced elastic network
361: model (BENM) with respect to $\epsilon$ and $r_c$ subject to
362: Eq.~(\ref{eq:dmin}) yields a model with $\epsilon=42$, $r_c=10.5$~\AA,
363: and $\gamma=4.26$~Kcal/mol\,\AA$^2$, resulting in a much lower value
364: $D_{{\bf x}_\alpha}=102.3$. The density-of-states distribution for
365: this model agrees quite well with that of the all-atom model
366: (Fig.~\ref{fig:freqs}), especially considering that the model is
367: optimized with respect to $D_{{\bf x}_\alpha}$, which does not
368: directly involve the density-of-states distribution. The agreement is
369: especially good for the high-frequency modes, suggesting that a
370: uniform force constant is a reasonable approximation for interactions
371: between alpha-carbons that are backbone neighbors. Furthermore, the
372: overlap $\sum_{i=1}^N\sum_{j=1}^N|{\bf v}_i^{(\gamma)\dag}\bar{\bf
373: v}_j|^2/N$ for the 223 highest-frequency modes is 0.99, indicating
374: that the spaces of the high-frequency eigenvectors are nearly
375: identical between the BENM and all-atom models. In contrast, the
376: low-frequency distribution of BENM states is narrower than that of the
377: all-atom model, indicating that a uniform force constant is a poorer
378: approximation for interactions between alpha-carbons that are not
379: backbone neighbors.
380:
381: \begin{figure}
382: \includegraphics[width=3.0in]{Figs/fluctuations}
383: \caption{Mean-squared displacements of alpha-carbon positions for
384: trypsinogen residues 10--229 obtained from normal-modes simulations
385: using CHARMM ({\em dashed green}), a BENM with parameters that
386: minimize $D_{{\bf x}_\alpha}$ with respect to CHARMM ({\em dotted
387: blue}), the same BENM but with $\gamma$ adjusted to better agree with
388: CHARMM MSDs ({\em fine-dotted magenta}), and an ENM with parameters
389: adjusted to agree with CHARMM MSDs ({\em dash-dotted cyan}). Values
390: were calculated at $T=300$~K using the equipartition theorem. Harmonic
391: vibrations at thermal equilibrium are known to inadequately
392: model crystallographic MSDs, which include other
393: sources of disorder ({\em solid red})
394: \cite{Go83}.}
395: \label{fig:flucts}
396: \end{figure}
397:
398: Both the BENM and the ENM yield patterns of alpha-carbon MSDs that are
399: similar to that of the all-atom model (Fig.~\ref{fig:flucts}). Because
400: there are fewer low-frequency BENM modes than low-frequency CHARMM
401: modes (Fig.~\ref{fig:freqs}), the BENM MSDs are consistently smaller
402: than the CHARMM MSDs; however, the BENM MSDs may be improved by
403: selecting $\gamma=1.2$~Kcal/mol\,\AA$^2$
404: (Fig.~\ref{fig:flucts}). These improved MSDs come at the cost of a
405: higher value of $D_{{\bf x}_\alpha}=528.4$, and a change in the
406: frequency scale by a factor $(1.2/4.26)^{1/2}=0.53$, resulting in a
407: poor model of the density-of-states distribution. The ENM with
408: parameters that minimize $D_{{\bf x}_\alpha}$ exhibits poor MSDs (not
409: shown); however, an ENM with $r_c=15.4$~\AA\ and
410: $\gamma=0.4$~Kcal/mol\,\AA$^2$ yields MSDs that agree well with those
411: of the CHARMM model (Fig.~\ref{fig:flucts}). In agreement with
412: previous results using the ENM \cite{Atilgan01}, we confirmed that the
413: parameters of both the ENM and BENM may be adjusted to yield a
414: reasonable model of crystallographic MSDs (not shown).
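
The dependence of the MSDs on $\gamma$ can be made explicit. As a sketch, in the harmonic approximation of Eq.~(\ref{eq:margin}) the covariance matrix of the alpha-carbon coordinates is $k_B T\,{\bf \bar{H}}^{-1}$ (restricted to the non-zero modes), so the MSD of alpha-carbon $n$ is
\begin{equation}
\left\langle \left|\Delta {\bf r}_n\right|^2 \right\rangle = k_B T \sum_{i}{\left|{\bf \bar{v}}_{i,n}\right|^2 \over \bar{\omega}_i^2},
\end{equation}
where the sum runs over the non-zero modes and ${\bf \bar{v}}_{i,n}$ (notation introduced here) denotes the three components of eigenvector ${\bf \bar{v}}_i$ that belong to atom $n$. The same expression holds for the ENM and BENM with ${\omega^{(\gamma)}_i}^2$ and ${\bf v}^{(\gamma)}_i$ in place of $\bar{\omega}_i^2$ and ${\bf \bar{v}}_i$. Because ${\omega^{(\gamma)}_i}^2$ is proportional to $\gamma$ and the eigenvectors are independent of $\gamma$, the MSDs scale as $1/\gamma$ while the frequencies scale as $\gamma^{1/2}$. Reducing $\gamma$ from 4.26 to 1.2~Kcal/mol\,\AA$^2$ therefore increases all BENM MSDs by a factor $4.26/1.2\approx 3.6$, at the cost of the frequency rescaling noted above.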
415:
416: Next consider the problem of quantifying allosteric effects in
417: proteins \cite{Ming05}. In allosteric regulation, molecular
418: interactions cause changes in protein activity through changes in
419: protein conformation. Although the importance of considering
420: continuous conformational distributions in understanding allosteric
421: effects was recognized by Weber \cite{Weber72}, theories of allosteric
422: regulation that consider continuous conformational distributions have
423: been lacking. We began to develop such a theory by defining the
424: allosteric potential as the Kullback-Leibler divergence $\bar{D}_{\bf
425: x}$ between protein conformational distributions before and after
426: ligand binding, and by calculating changes in the conformational
427: distribution of the full protein-ligand complex in the harmonic
428: approximation \cite{Ming05}. Here we use the expression for the
429: marginal distribution in Eq.~(\ref{eq:margin}) to calculate an equation
430: for the allosteric potential in the harmonic approximation, and apply
431: it to analyze allosteric mechanisms in trypsinogen.
432:
433: Let
434: ${\bf x}_p$ be the protein coordinates selected from the coordinates
435: ${\bf x}$ of a protein-ligand complex. $P^\prime({\bf x}_p)$ and
436: $P({\bf x}_p)$ are the protein conformational distributions with and
437: without a ligand interaction. Eq.~(\ref{eq:margin}) enables
438: $P^\prime({\bf x}_p)$ to be calculated from the full conformational
439: distribution $P^\prime({\bf x})$ of the protein-ligand complex. The
440: equation for the allosteric potential in the harmonic
441: approximation follows from the theory developed in Ref.~\cite{Ming05}:
442: \begin{equation}
443: \bar{D}_{{\bf x}}=\sum_{i=1}^{3N_p}\left(\ln
444: {\bar{\omega}^{\prime}_{i} \over {\omega}_{i}} + {1 \over {2 k_B
445: T}}{\omega}_{i}^2 \left| {\bf {v}}^\dag_i \Delta{\bf x}_{p,{0}}\right|^2 +
446: {1 \over 2}\sum_{j=1}^{3N_p}{{\omega}_{j}^2 \over
447: {\bar{\omega}^{\prime 2}_{i}}}\left|{{\bf \bar{v}}^{\prime \dag}_i}
448: {\bf {v}}_j\right|^2- {1 \over 2}\right).
449: \label{eq:ap}
450: \end{equation}
451: In Eq.~(\ref{eq:ap}), $\bar{\omega}^{\prime 2}_i$ and ${\bf
452: \bar{v}}^{\prime}_i$ are the $i^{\rm th}$ eigenvalue and eigenvector
453: of the matrix ${\bf \bar{H}}$ calculated for the protein atoms of the
454: protein-ligand complex, $\omega_i^2$ and ${\bf v}_i$ are the
455: eigenvalue and eigenvector of mode $i$ of the apo-protein, and $\Delta
456: {\bf x}_{p,0}={\bf x}^\prime_{p,0}-{\bf x}_{p,0}$ is the difference
457: between the equilibrium coordinates of the protein with and without
458: the ligand interaction. The term
459: $\sum_{i=1}^{3N_p}\ln{\bar{\omega}^\prime_i / \omega_i}$ is
460: proportional to the change in configurational entropy of the protein
461: upon releasing the ligand, and the term
462: $\sum_{i=1}^{3N_p}{\omega}_{i}^2 \left| {\bf {v}}^\dag_i \Delta{\bf
463: x}_{p,{0}}\right|^2 / 2 k_B T$ is proportional to the potential energy
464: required to deform the apo-protein into its equilibrium
465: conformation in the protein-ligand complex.
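
The entropy interpretation of the first term can be checked directly. As a sketch, the classical configurational entropy of a harmonic distribution of the form of Eq.~(\ref{eq:margin}), written here as $S=-k_B\int d^{3N_p}{\bf x}_p\,P({\bf x}_p)\ln P({\bf x}_p)$ (the symbol $S$ is introduced here), is
\begin{equation}
S = {3N_p \over 2}k_B\left[1+\ln(2\pi k_B T)\right] - k_B\sum_{i=1}^{3N_p}\ln\omega_i .
\end{equation}
Evaluating the same expression for $P^\prime({\bf x}_p)$, with $\bar{\omega}^\prime_i$ in place of $\omega_i$, and subtracting gives
\begin{equation}
S - S^\prime = k_B\sum_{i=1}^{3N_p}\ln{\bar{\omega}^\prime_i \over \omega_i},
\end{equation}
so the first term of Eq.~(\ref{eq:ap}) is the entropy of the apo-protein minus that of the ligand-bound protein, in units of $k_B$.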
466:
467: We used Eq.~(\ref{eq:ap}) to calculate changes in the configurational
468: distribution of local regions of trypsinogen upon binding bovine
469: pancreatic trypsin inhibitor (BPTI). BPTI binds in the active site
470: and exerts an allosteric effect, enhancing the affinity of trypsinogen
471: for Val-Val \cite{Bode79}. Alpha-carbon coordinates for 223 residues
472: were obtained from a crystal structure of trypsinogen in complex with
473: BPTI (residues 7--229 from PDB entry 4TPI \cite{Bode84}, including
474: theoretically modeled residues 7--9), and were used directly to
475: construct backbone-enhanced elastic network models of apo-trypsinogen
476: and the trypsinogen-BPTI complex. As suggested by the refined
477: trypsinogen model above, both models used $r_c=10.5$~\AA,
478: $\gamma=4.26$~Kcal/mol\,\AA$^2$, and $\epsilon=42$.
479:
480: Local changes in the conformational distribution of trypsinogen were
481: analyzed by considering changes in the neighborhood of each
482: alpha-carbon atom. A neighborhood was defined by selecting the atom of
483: interest plus its five nearest neighbors, and the matrix ${\bf
484: \bar{H}}$ was calculated for these six atoms in the models both with
485: (yielding ${\bf \bar{H}}^\prime$) and without (yielding ${\bf
486: \bar{H}}$) the BPTI interaction. A local value of $\bar{D}_{\bf x}$
487: was obtained using the eigenvalues and eigenvectors of ${\bf
488: \bar{H}}^\prime$ and ${\bf \bar{H}}$ in a suitably modified version of
489: Eq.~(\ref{eq:ap}).
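
The modified expression is not written out in the text. One natural reading, stated here as an assumption rather than as the form actually used, is to restrict the sums in Eq.~(\ref{eq:ap}) to the $n$ non-zero modes of the six-atom neighborhood (at most $n=18$, and fewer if rigid-body zero modes are excluded):
\begin{equation}
\bar{D}_{{\bf x}}^{\rm local}=\sum_{i=1}^{n}\left(\ln{\bar{\omega}^{\prime}_{i} \over \bar{\omega}_{i}} + {1 \over {2 k_B T}}\bar{\omega}_{i}^2\left|{\bf \bar{v}}^\dag_i \Delta{\bf x}_{0}\right|^2 + {1 \over 2}\sum_{j=1}^{n}{\bar{\omega}_{j}^2 \over \bar{\omega}^{\prime 2}_{i}}\left|{\bf \bar{v}}^{\prime \dag}_i {\bf \bar{v}}_j\right|^2 - {1 \over 2}\right),
\end{equation}
where $\bar{\omega}_i^2$, ${\bf \bar{v}}_i$ and $\bar{\omega}^{\prime 2}_i$, ${\bf \bar{v}}^\prime_i$ are the eigenvalues and eigenvectors of ${\bf \bar{H}}$ and ${\bf \bar{H}}^\prime$ for the neighborhood, and $\Delta{\bf x}_0$ is the difference between the equilibrium coordinates of the six atoms with and without the ligand interaction. In the models used here both structures are built from the same crystal coordinates, so under this reading the second term would vanish.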
490:
491: \begin{figure}
492: \includegraphics[width=3in]{Figs/trypsin_left_lores.eps}
493: \includegraphics[width=3in]{Figs/trypsin_right_lores.eps}
494: \caption{Visualization of local sites on the surface of trypsinogen
495: that exhibit a large change in the conformational distribution upon
496: binding BPTI. Values of $\bar{D}_{\bf x}$ are mapped to a logarithmic
497: temperature scale, with red coloring indicating large values. Changes
498: are large both in the BPTI-binding site ({\em left}) and in the
499: Val-Val binding site ({\em right}). There is a $90^\circ$ rotation
500: about the x-axis between the left and right panels.}
501: \label{fig:trypsinogen}
502: \end{figure}
503:
504: Not surprisingly, we found that the local values of $\bar{D}_{\bf x}$
505: were relatively large in the neighborhood of the BPTI-binding site
506: (Fig.~\ref{fig:trypsinogen}, {\em left panel}). Values of
507: $\bar{D}_{\bf x}$ elsewhere on the surface were smaller, with one
508: interesting exception: values in the Val-Val binding site were
509: comparable to those in the BPTI-binding site
510: (Fig.~\ref{fig:trypsinogen}, {\em right panel}).
511:
512: We also calculated local values of $\bar{D}_{\bf x}$ for the Val-Val
513: interaction, which causes the crystal structure of trypsinogen to
514: resemble that of active trypsin \cite{Bode78,Bode84}. We found that
515: values were relatively large in the neighborhood of Ser 195, which is
516: the key catalytic residue for trypsin and other serine proteases: the
517: value of $\bar{D}_{\bf x}$ in this neighborhood was 40$^{\rm th}$
518: highest of 223 residues in the crystal structure; 11$^{\rm th}$ of all
519: residues not directly interacting with the Val-Val in the model; the
520: highest of all residues located at least as far as Ser 195 is from the
521: Val-Val ligand; and greater than that for 20 of 60 residues located
522: closer to the ligand. Calculations for both the BPTI interaction and
523: the Val-Val interaction therefore indicate that there is a relatively
524: strong communication between the regulatory and active sites of
525: trypsinogen.
526:
527: Considering models beyond the ENM and BENM (and even models beyond
528: proteins), the theory presented here leads to a general prescription
529: for modeling harmonic vibrations using coarse-grained models of
530: materials. To optimally model the all-atom conformational
531: distribution, always use an energy scale for interactions that
532: eliminates the discrepancy due to differences in the eigenvectors
533: (Eq.~(\ref{eq:dmin})), and select the
534: coarse-grained model for which the entropy of the conformational
535: distribution is the largest (first term of Eq.~(\ref{eq:cgd})).
536:
537: Although traditional elastic network models can explain
538: characteristics of the functions and dynamics of proteins
539: \cite{Yang05}, the present study shows that they provide a poor
540: approximation to the conformational distribution calculated from
541: all-atom models of harmonic vibrations of proteins. Model accuracy is
542: significantly improved by using a backbone-enhanced elastic network
543: model, which strengthens interactions between atoms that are nearby in
544: terms of covalent linkage. Although the backbone-enhanced model
545: appears to accurately capture the high-frequency alpha-carbon
546: vibrations of an all-atom model, the model less accurately captures
547: the slower, large-scale harmonic vibrations (which in turn are known
548: to poorly approximate the full spectrum of highly nonlinear,
549: large-scale protein motions).
550:
551: We also find that the allosteric potential is a useful tool for
552: computational analysis of allosteric mechanisms in proteins. Using
553: calculations of the allosteric potential, communication between the
554: regulatory and active sites of trypsinogen was observed in a purely
555: mechanical, coarse-grained model of protein harmonic vibrations that
556: does not consider mean conformational changes or amino-acid
557: identities, supporting prior arguments for the possibility of
558: allostery without a mean conformational change \cite{Cooper84}. It will
559: be interesting to perform similar analyses on a wide range of all-atom
560: and coarse-grained models of protein vibrations, and to use more
561: realistic calculations of free-energy landscapes \cite{Garcia01} to
562: more accurately model changes in protein conformational distributions.
563:
638:
639: % If you have acknowledgments, this puts in the proper section head.
640: %\begin{acknowledgments}
641: This work was supported by the US Department of Energy.
642: %\end{acknowledgments}
643:
644: % Create the reference section using BibTeX:
645: \bibliography{PRL}
646:
647: \end{document}
648: %
649: % ****** End of file template.aps ******
650:
651: