1:
2: \documentclass[aps,twocolumn,superscriptaddress,showpacs,unsortedaddress]{revtex4}
3:
4: \usepackage{graphicx}
5: \usepackage{amssymb}
6: \setlength{\topmargin}{-.5cm}
7:
8: \newcommand{\lap}[1]{\mathrel{\mathop{\cal L}\limits_{#1}^{}}}
9: \begin{document}
10:
11: %\baselineskip0.9cm
12:
13: \title{Complex folding pathways in a simple $\beta$-hairpin}
14:
15: \author{Guanghong Wei}
16: \affiliation{D\'epartement de physique et GCM, Universit\'e de Montr\'eal, C.P. 6128,
17: succ. centre-ville, Montr\'eal (Qu\'ebec) Canada}
18:
19: \author{Normand Mousseau}
20: %\thanks{Corresponding author}
21: \email{Normand.Mousseau@umontreal.ca}
22: \affiliation{D\'epartement de physique et GCM, Universit\'e de
23: Montr\'eal, C.P. 6128, succ. centre-ville, Montr\'eal (Qu\'ebec) Canada}
24:
25: \author{Philippe Derreumaux}
26: \affiliation{Information Genomique et Structurale, CNRS-UMR 1889,
27: 31 Chemin Joseph Aiguier, 13402
28: Marseille Cedex 20, France}
29:
30: \affiliation{Laboratoire de Biochimie Theorique, UPR 9080 CNRS, Institut de Biologie
31: Physico-Chimique,
32: 13 rue Pierre et Marie Curie, 75005 Paris, France}
33:
34: \date{\today}
35:
36: \begin{abstract}
37: The determination of the folding mechanisms of proteins is critical to
understand the topological change that can propagate Alzheimer's and
Creutzfeldt-Jakob diseases, among others. The computational community
40: has paid considerable attention to this problem; however, the
41: associated time scale, typically on the order of milliseconds or more,
42: represents a formidable challenge. {\it Ab initio} protein folding
43: from long molecular dynamics (MD) simulations or ensemble dynamics is
44: not feasible with ordinary computing facilities and new techniques
45: must be introduced. Here we present a detailed study of the folding of
46: a 16-residue $\beta$-hairpin, described by a generic energy model and
47: using the activation-relaxation technique. From a total of 90
48: trajectories at 300 K, three folding pathways emerge. All involve a
49: simultaneous optimization of the complete hydrophobic and hydrogen
50: bonding interactions. The first two follow closely those observed by
51: previous theoretical studies. The third pathway, never observed by
52: previous all-atom folding, unfolding and equilibrium simulations, can
53: be described as a reptation move of one strand of the $\beta$-sheet
54: with respect to the other. This reptation move indicates that
55: non-native interactions can play a dominant role in the folding of
56: secondary structures. These results point to a more complex folding
57: picture than expected for a simple $\beta$-hairpin.
58:
59: {\bf Key words}: the Activation-Relaxation Technique; protein folding;
60: simulations; $\beta$-hairpin; reptation.
61:
62: \end{abstract}
63:
64: \pacs{}
65: \maketitle
66:
67: %\newpage
68:
69: \section*{\bf INTRODUCTION}
70: As one of the smallest building blocks of proteins, the $\beta$-hairpin
71: and particularly the second $\beta$-hairpin of the domain B1 of protein
72: G (referred to as $\beta$-hairpin2) has been the subject of many
73: theoretical and experimental folding studies. This peptide adopts hairpin
74: structures in solution but overall its flexibility
75: precludes the determination of a high-resolution NMR solution
76: structure.~\cite{NMR95} Fluorescence experiments show that this
77: $\beta$-hairpin folds in isolation with a time constant of 6 microseconds
78: and its folding kinetics is described by the two-state
79: model.~\cite{EATON} Because these data do not provide details of the
80: transition and such a time scale cannot be covered by
81: hundreds of long molecular dynamics (MD)-trajectories at 300 K in explicit
82: solvent,~\cite{DU98} alternative methods have been used in order to
83: characterize the thermodynamics and folding kinetics of the
84: $\beta$-hairpin2.
85:
86: Two folding mechanisms have already been proposed.
87: The first mechanism, suggested by statistical mechanical
88: models~\cite{MUN98} and lattice Monte Carlo (MC) simulations,~\cite{KI99}
89: is that folding starts at the turn and propagates towards the tail by
90: hydrogen bonding interactions, the hydrophobic cluster forming at the
91: end of folding. One variant of this mechanism suggested by Langevin
92: dynamics of an off-lattice model is that the formation of the hydrophobic
93: cluster is followed by zipping of hydrogen bonds (H-bonds), predominantly
94: starting from those near the turn.~\cite{THI00}
95: Another variant suggested by all-atom MD simulations is that the
96: $\beta$-hairpin
97: folds beginning at the turn, followed by hydrophobic collapse and then
98: H-bond formation.~\cite{TSAI02}
99:
100: The second mechanism proposed is that the N- and C-termini first approach
101: each other to form a loop, and the structure propagates from there.
102: This mechanism is apparently independent of all-atom force field details,
103: since it has been recognized by ensemble dynamics at 300 K using
104: implicit solvent,~\cite{PD01} replica exchange method combining MD
105: trajectories
106: with a temperature exchange MC process using SPC model
107: solvent,~\cite{BS01} minimalist Go-folding discontinuous MD
108: simulations,~\cite{ZHOU02} unfolding simulations,~\cite{PD99,LEE01} and
109: multicanonical MC simulations with an implicit solvent model.~\cite{MK99}
110:
111: Along with the difference in folding dynamics within each scenario, three
112: questions are yet to be resolved. The first question is when the native
H-bond network and hydrophobic core form: (i) the hydrophobic core forms
first and the H-bonds appear afterwards,~\cite{GS01,PD99,LEE01,MK99,MA00} (ii)
the H-bonds form first and then the hydrophobic core,~\cite{EATON} or (iii) the final
hydrophobic core and H-bonds form simultaneously.~\cite{ZHOU02,PD01,THI00}
119: %
120: The second question is whether helical
structures exist during the folding process. Berne et al., using the
122: OPLSAA force field, did not find evidence of significant helical structures
123: in their simulations at all temperatures studied,~\cite{BS01} while
124: Garcia and
125: Sanbonmatsu found a helical content of 15\% at low temperatures,~\cite{GS01}
126: Pande et al. detected short-lived semi-helical intermediates at room
127: temperature,~\cite{PD01} and Irb\"ack found a low population of $\alpha$-helix
128: structure at 273 K.~\cite{IRB03}
129: The third question is whether the previously used methods, which fail to
130: detect both pathways, may miss other major folding pathways, as has been
131: discussed recently~\cite{AF02} for ensemble dynamics which uses a large
132: number of short MD simulations of only tens of nanoseconds and a
133: supercluster of thousands of computer processors.
134:
135: To address these issues, we simulate the folding mechanisms of hairpin2
136: using a previously described model --- the Optimized Potential for Efficient
137: peptide-structure Prediction (OPEP), and sampling technique --- the
138: Activation-Relaxation Technique (ART).~\cite{BM96, BM98}
139: %
140: OPEP can be used to
141: simulate any amino acid sequence and works for proteins that do
142: and do not form ordered structures in solution. Ordered structures include
143: three-helix bundles and three-stranded anti-parallel $\beta$-sheet structures,
144: among others.~\cite{FOR01}
145: %
146: For its part, ART generates trajectories on the configurational energy
147: landscape,
148: identifying a series of energy minima separated by first-order saddle points.
The efficiency of ART is not affected by
the height of the activation energy barriers or the complexity of the atomic
rearrangements, and the method thus samples the rugged energy
landscapes of small proteins very efficiently.
154: %
155: The OPEP-ART approach has been applied recently to study the
folding of three sequences adopting an $\alpha$-helix, a three-stranded
antiparallel $\beta$-sheet and a $\beta$-hairpin in solution.~\cite{WMD02}
159: %
160: In this work, 82 folding simulations at 300 K
161: start from a fully extended conformation ($\phi$= $-$180$^{\circ}$,
162: $\psi$= 180$^{\circ}$) using different random-number seeds: 52 use the
163: standard OPEP force field (16 reaching the folded state), 20 use a modified
164: set of OPEP parameters (6 reaching the folded state), and 10 use a biased
165: Go-like potential (10 reaching the folded state). To determine the effect
166: of the
167: starting structure, we also launched 8 independent runs
168: (4 reaching the folded
169: state) at 300 K starting from a semi-helical conformation using the standard
170: OPEP force field.
171:
172: From a total of 90 trajectories at room temperature, 36 found
173: the native state providing a detailed picture of the folding mechanism.
174: Although all these folding trajectories involve a simultaneous optimization of
175: the complete hydrophobic and hydrogen bonding interactions, the 36 folding
176: runs can be described by 3 mechanisms: two of them follow closely
177: those observed
178: by previous theoretical and computational studies, but the third one
179: represents a new
180: folding mechanism for proteins. This mechanism can be described
181: as a reptation move of one strand of the $\beta$-sheet with respect to
182: the other. These three mechanisms offer
183: a complete picture of the $\beta$-hairpin folding, independently of the
184: exact amino acid composition, and help reconcile conflicting theoretical
185: data on the hairpin2 of protein G~\cite{EATON,MK99,THI00,ZHOU02} or
186: between various hairpins, e.g., the first hairpin of tendamistat~\cite{BON00}
and an 11-residue model peptide.~\cite{WJ99} The existence of these
188: three competing mechanisms was presented recently in a short
189: communication;~\cite{WMD03} here, we offer a detailed description of
the folding mechanisms in this simple $\beta$-hairpin. Furthermore, we
present the results of new simulations using either a G\=o-like potential or
semi-helical starting conformations.
195:
196: \section*{\bf METHODS}
197:
198: We have simulated the folding of the C-terminal $\beta$-hairpin from
199: protein G (residues 41-56). The sequence of the peptide is
200: GEWTYDDATKTFTVTE. The energy surface was modeled using the OPEP model
201: and the dynamics was obtained by the activation-relaxation technique.
202:
203:
204: \subsection*{\bf Activation-Relaxation Technique}
205:
206: ART is a generic method to explore the landscape of continuous energy
207: functions through a series of activated steps. The algorithm has
208: evolved considerably over the years and here we apply its most recent
209: version, ART nouveau,~\cite{MM00,MDB01} which uses a recursion method,
the Lanczos algorithm,~\cite{Lan88} to extract the direction of lowest
211: curvature of the landscape leading to a first-order saddle point.
212: Such an approach provides an efficient way to extract a limited
213: spectrum of eigenvectors and eigenvalues without requiring the
214: evaluation and diagonalization of the full Hessian matrix. A similar
215: approach was also introduced in Ref.~\onlinecite{Wales99}.
216: An ART event is defined directly in the space of configurations,
217: which allows for moves of any complexity, and consists of four steps:
218:
219: \begin{enumerate}
220: \item Starting from a local minimum, a configuration is first pushed
221: outside the harmonic well until a negative eigenvalue appears in the
222: Hessian matrix.
223:
224: \item The configuration is then pushed along the eigenvector
225: associated with the negative eigenvalue until the total force is
226: close to zero, indicating a saddle point. The first two steps constitute
227: the activation phase.
228:
\item The configuration is pushed slightly over the saddle point and
is relaxed to a new local minimum, using a standard minimization
technique.
232:
\item Finally, the new configuration is accepted or rejected using the
Metropolis criterion at the desired temperature.
In each of the simulations at hand, this four-step procedure was
repeated 4000 times, taking less than 18 processor-hours on an IBM Power-4
machine.
238: \end{enumerate}
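
The acceptance test of the last step can be made explicit. As a sketch,
assuming the standard form of the Metropolis criterion applied to the
energies of the initial and final local minima (the formula itself is not
spelled out above), a move from a minimum of energy $E_{\rm old}$ to one of
energy $E_{\rm new}$ would be accepted with probability
\begin{eqnarray*}
P_{\rm acc} &=& \min\left[1,
\exp\left(-\frac{E_{\rm new}-E_{\rm old}}{k_B T}\right)\right].
\end{eqnarray*}
With $k_B T \approx 0.60$ kcal/mol at 300 K, an event that raises the energy
by 1 kcal/mol would then be accepted with a probability of about
$\exp(-1.68) \approx 0.19$, while any event that lowers the energy would
always be accepted.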
239:
240: As discussed in our previous work,~\cite{WMD02} the temperature in ART
241: is not a real temperature since ART samples the conformational space from
242: one minimum to another minimum. However, ART generates
243: well-controlled trajectories (more than 83\% of events relax back to within
244: 0.4~\AA\ from their initial minima starting at the saddle points).~\cite{WMD02}
245: A detailed description of the algorithm and implementation of
246: ART can be found in earlier publications.~\cite{BM96,BM98,WMD02}
247:
248: \subsection*{\bf Energy Model}
249:
250: We use a coarse-grained off-lattice model where each amino acid is
251: represented by its N, H, C$\alpha$, C, O and one bead for its side
252: chain. The exact OPEP energy function, which includes solvent effects
implicitly, was obtained by maximizing the energy gap between the native fold
254: and an ensemble of non-native states for six training peptides with
255: 10-38 residues. In this work, the side chain propensities of the 20
256: amino acids for $\alpha_R$ helix, $\beta$-strand and $\alpha_L$ helix
257: ~\cite{DP99,DP00} are neglected. The total energy is thus expressed
258: by:
259: \begin{eqnarray}
260: E &=& w_{L} E_L + w_{H} E_{HB1} + w_{HH} E_{HB2} + w_{SC,SC} E_{SC,SC}
261: \nonumber \\
262: & & + w_{SC,M} E_{SC,M} + w_{M,M} E_{M,M}
263: \end{eqnarray}
264:
The OPEP interaction potential is a weighted sum, with weights $w$, of
the following interactions:
267:
268: \noindent
269: (i) quadratic terms to maintain stereochemistry: bond lengths and bond
270: angles for all particles and improper dihedral angles for the side
271: chains and the peptide bonds $E_L$,
272:
273: \noindent
(ii) an excluded-volume potential for the main chain interactions $
275: E_{M,M}$ and of side chain--main chain interactions $ E_{SC,M}$,
276:
277: \noindent
278: (iii) pairwise contact 6-12 interactions between the side chains
279: considering all 20 amino acid types $E_{SC,SC}$,
280:
281: \noindent
282: (iv) backbone two-body $E_{HB1}$ and four-body $E_{HB2}$ hydrogen
283: bonding interactions. All nonbonded interactions are included (no
284: truncation).
285:
286: The two-body energy of one H-bond between residues $i$ and $j$ is
287: defined by
288: \begin{eqnarray}
289: E_{HB1} &=& \varepsilon_{hb} \sum_{ij}\mu(r_{ij})\nu(\alpha_{ij})
290: \end {eqnarray}
291: where
292: \begin{eqnarray}
\mu(r_{ij}) &=&
5\left(\frac{\sigma}{r_{ij}}\right)^{12}-6\left(\frac{\sigma}{r_{ij}}\right)^{10}
\end{eqnarray}
296:
\begin{eqnarray}
\nu(\alpha_{ij}) &=& \left\{\begin{array}{ll}
\cos^{2} \alpha_{ij} & \quad \alpha_{ij} > 90^{\circ} \\
0 & \quad \mbox{otherwise}
\end{array} \right.
\end{eqnarray}
303: where $r_{ij}$ is the O..H distance between the carbonyl oxygen and
304: amide hydrogen and $\alpha_{ij}$ the NHO angle.
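
As a consistency check on the radial term (a worked derivation from the
definition above, not an additional ingredient of the model),
differentiating $\mu$ gives
\begin{eqnarray*}
\frac{d\mu}{dr_{ij}} &=& \frac{60}{r_{ij}}
\left(\frac{\sigma}{r_{ij}}\right)^{10}
\left[1-\left(\frac{\sigma}{r_{ij}}\right)^{2}\right],
\end{eqnarray*}
which is negative for $r_{ij}<\sigma$, vanishes at $r_{ij}=\sigma$ and is
positive beyond. The radial term therefore has a single minimum at
$r_{ij}=\sigma$, where $\mu(\sigma)=5-6=-1$. For a perfectly linear H-bond
($\alpha_{ij}=180^{\circ}$, $\nu=1$) at this distance, the two-body
contribution is thus $-\varepsilon_{hb}$, so that a positive
$\varepsilon_{hb}$ sets the depth of the H-bond well.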
305:
The cooperative energy between two neighboring H-bonds $ij$ and
307: $kl$ is defined by
308: \begin{eqnarray}
309: E_{HB2} &=& \varepsilon_{2hb} \exp(-(r_{ij}-\sigma)^2/2)
310: \exp(-(r_{kl}-\sigma)^2/2)
311: \nonumber \\
312: & &\Delta(ijkl)
313: \end {eqnarray}
where $\Delta(ijkl) = 1$ if $(k,l) = (i+1, j+1)$, $(i+2, j-2)$ or
$(i+2, j+2)$, and $\Delta(ijkl) = 0$ otherwise. These cases correspond to
the patterns of H-bonds in $\alpha$-helices, anti-parallel and
parallel $\beta$-sheets, respectively.
318:
In this work, unless otherwise specified, we use $\sigma$ = 1.8~\AA,
$\varepsilon_{hb}$ = 1.0 kcal/mol if $j=i+4$ (helix) and 1.5
kcal/mol otherwise, and $\varepsilon_{2hb}$ = $-$2.0.
322: The other parameters can be found in Ref.~\onlinecite{FOR01}.
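
To illustrate the scale of the cooperative term (a worked example only,
assuming that the distances in the exponent are expressed in \AA, so that
the Gaussian width is 1~\AA, which the expression above does not state
explicitly), two neighboring H-bonds at their optimal length,
$r_{ij}=r_{kl}=\sigma$, contribute
\begin{eqnarray*}
E_{HB2} &=& \varepsilon_{2hb},
\end{eqnarray*}
whereas stretching one of them by 1~\AA\ reduces this contribution by a
factor $\exp(-1/2)\approx 0.61$, and stretching both by 1~\AA\ reduces it by
a factor $\exp(-1)\approx 0.37$.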
323:
324: \subsection*{\bf Trajectory Analysis}
325:
326: Our native structure contains six main chain H-bonds excluding the one
327: at the turn since it rarely forms because of geometrical
constraints. Following Karplus~\cite{MK99} and Garcia,~\cite{GS01}
they are numbered from the tail to the turn 42:HN-55:O (H1), 55:HN-42:O (H2),
44:HN-53:O (H3), 53:HN-44:O (H4), 46:HN-51:O (H5),
51:HN-46:O (H6). The expression $i$:HN-$j$:O denotes the atoms
involved in an H-bond between residues $i$ and $j$.
333:
334: To characterize a conformation, we use the number of native H-bonds,
335: the radius of gyration of the hydrophobic core
336: ($Rg_{core}$), and the $C_\alpha$-rmsd of residues 41-56 from the 2GB1
structure.~\cite{GF91} $Rg_{core}$ is calculated using the side chains
of the four residues W43, Y45, F52, and V54. An H-bond is considered formed if it
satisfies the DSSP conditions:~\cite{KS83} the distance between the
carbonyl oxygen and amide hydrogen (O..H) is less than 2.4~\AA, and
the NHO angle is greater than 145$^\circ$.
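
For reference, the two geometric measures can be written out explicitly.
The forms below are the usual definitions and are given only as a sketch:
the text does not specify whether the radius of gyration is mass weighted
(an unweighted average over the $N_c=4$ side-chain beads is assumed here)
nor how the superposition for the rmsd is performed (an optimal rigid-body
fit is assumed).
\begin{eqnarray*}
Rg_{core} &=& \left[\frac{1}{N_c}\sum_{i=1}^{N_c}
\left|{\bf r}_i-{\bf r}_{c}\right|^{2}\right]^{1/2}, \qquad
{\bf r}_{c}=\frac{1}{N_c}\sum_{i=1}^{N_c}{\bf r}_i, \\
{\rm rmsd} &=& \min_{R,\,{\bf t}}\left[\frac{1}{N}\sum_{i=1}^{N}
\left|R\,{\bf r}_i+{\bf t}-{\bf r}_i^{\rm ref}\right|^{2}\right]^{1/2},
\end{eqnarray*}
where ${\bf r}_i$ are the bead positions (the side-chain beads of W43, Y45,
F52 and V54 in the first expression, the $N=16$ C$_\alpha$ atoms in the
second), ${\bf r}_i^{\rm ref}$ are the corresponding 2GB1 positions, and
$R$ and ${\bf t}$ denote a rotation and a translation.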
342:
343: The regions of conformational space that have been sampled by all
344: simulations are clustered as follows. The C$_{\alpha}$-rmsd is calculated
345: for each pair of structures in each simulation. The number of neighbors
346: is then computed for each structure using a C$_{\alpha}$-rmsd cutoff of
347: 1.5 \AA. The conformation with the highest number of neighbors is
348: considered as the center of the first cluster. All the neighbors of this
349: conformation are removed from the ensemble of conformations. The center
of the second cluster is then determined in the same way as for the first cluster,
and this procedure is repeated until each structure is assigned to a cluster.
This yields a list of cluster centers for every folding simulation.
Once all the trajectories are clustered, the central structures of any
two folding trajectories can themselves be clustered in order to identify
the clusters common to both simulations.
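
One pass of this procedure can be summarized compactly. In the notation
below, which is introduced only for illustration, $d_{ij}$ is the
C$_{\alpha}$-rmsd between structures $i$ and $j$, $d_c=1.5$~\AA\ is the
cutoff, $S$ is the set of structures not yet assigned, and $\Theta$ is the
Heaviside step function with $\Theta(0)=1$; whether the cutoff is inclusive
and whether the neighbor counts are recomputed after each removal are not
stated above, and both are assumed here.
\begin{eqnarray*}
n_i &=& \sum_{j\in S,\, j\neq i}\Theta(d_c-d_{ij}), \\
c &=& \mathop{\rm arg\,max}_{i\in S}\; n_i, \\
S &\leftarrow& S\setminus\left(\{c\}\cup\{j\in S:\ d_{cj}\leq d_c\}\right).
\end{eqnarray*}
These three steps are repeated, each pass producing one cluster with center
$c$, until $S$ is empty.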
356:
357: \section*{\bf RESULTS}
358: \subsection*{\bf Native vs. Non-native Hairpin Structures}
359:
360: Cluster analysis of all ART-generated trajectories shows that the
lowest-energy conformation ($E$ = $-$33 kcal/mol) has a
$C_\alpha$-rmsd of less than 1~\AA\ from the hairpin structure within
363: protein G (PDB code 2GB1~\cite{GF91}). For the purpose of our
364: simulation, we define a structure as being fully folded -- or native
365: -- if it satisfies the following four criteria: (i) the six native
366: H-bonds are formed (see Methods); (ii) the
367: backbone dihedral angles ($\phi$, $\psi$) have standard $\beta$-sheet
368: values (around ($-$90$^{\circ}$, 150$^{\circ}$)); (iii) the
369: hydrophobic core is well packed
370: (core radius of gyration is around 4.3 \AA) and (iv) the all-residue
371: $C_\alpha$ rmsd from 2GB1 is less than 2.5 \AA. These conditions for
nativeness are thus more stringent than those of previous folding
simulations.~\cite{PD01, ZHOU02} For Zagrovic and
collaborators,~\cite{PD01} the H-bonds connecting the two
strands are not required to be native and hairpins with asymmetric
strands are considered as native structures. Similarly, for Zhou {\it
et al.},~\cite{ZHOU02} the condition for nativeness is that the all-heavy-atom
rmsd from the global minimum structure is less than 2.5~\AA. The
379: rmsd between their modeled structure and the experimental 2GB1 is not
380: given.
381:
382: \begin{figure*}[ht!]%Figure 1
383: \includegraphics[width=14cm]{fig1.eps}
384: \vspace{0.0cm}
385: \caption{
386: The sampled hairpin structures with different main chain
387: H-bonds and 2GB1 structure. (a) 2GB1 structure; (b) Hairpin structure
388: with 6 native H-bonds (H1-H6), rmsd =1.09 (0.88)~\AA, E = -33 kcal/mol;
389: (c) hairpin structure with 5 key H-bonds (H1-H4, 52:HN-45:O),
390: rmsd = 2.87 (2.15) \AA, E = -29 kcal/mol; (d) hairpin structure
391: with 4 key hydrogen
392: bonds (43:HN-54:O, 45:HN-52:O, 52:HN-45:O, 54:HN-43:O), rmsd = 2.62 (2.39)
393: \AA, E = -28 kcal/mol; (e) hairpin structure with 4 key H-bonds
394: (44:HN-54:O, 46:HN-52:O, 52:HN-46:O, 54:HN-44:O), rmsd = 2.08 (1.87)~\AA,
395: E = -30 kcal/mol; (f) hairpin structure with 5 key H-bonds (43:HN-53:O,
396: 45:HN-51:O, 51:HN-45:O, 53:HN-43:O, 45:HN-41:O), rmsd = 2.42 (2.18)~\AA,
397: E = -29 kcal/mol; (g) hairpin structure with 4 key H-bonds (42:HN-54:O,
398: 44:HN-52:O, 52:HN-44:O, 54:HN-42:O), rmsd = 2.40 (1.98)~\AA, E = -28 kcal/mol;
399: (h) hairpin structure with 4 key H-bonds (43:HN-55:O, 45:HN-53:O,
400: 53:HN-45:O, 55:HN-43:O), rmsd = 3.56 (2.96)~\AA, E = -28 kcal/mol. The
401: rmsd value in parentheses is the C$_\alpha$-rmsd from 2GB1 for residues
43-54. According to our definition of the folded state, only hairpin (b) is
folded.
404: }
405: \label{fig:hairpins}
406: \end{figure*}
407:
408: Figure~\ref{fig:hairpins} (produced by using the MOLMOL software \cite{MOL96})
409: shows the 2GB1 structure (a), our native
410: structure (b) and six non-native hairpin structures (c-h) sampled by
ART. Figures~\ref{fig:hairpins}(e) and \ref{fig:hairpins}(h) show two
hairpins in which the turn (residues 7-11) is shifted toward the C
terminus. Figures~\ref{fig:hairpins}(f) and \ref{fig:hairpins}(g) show
414: two non-native hairpins where the turn (residues 6-10) is shifted
415: toward the N terminus. In this study, only hairpin (b), of lowest
416: energy, is the native state, while, according to the definition of
417: folded state in Ref.~\onlinecite{PD01}, hairpin structures (c)-(h) are
418: also folded states. Our hairpin structures (c) and (d) are very
419: similar to the folded state in Series 17 (two key H-bonds are
420: HB53-44 and HB52-45) and Series 2 (two key H-bonds are HB45-52
421: and HB43-54) of Ref.~\onlinecite{PD01}, respectively. Both of them have
symmetric strands, but different H-bond network
patterns. Moreover, from the key H-bonds in their eight
424: independent folding trajectories, it seems that the hairpin
425: structures in Series 1, 7, 9, 11 are asymmetric about the
426: $\beta$-turn.
427:
428:
429: \begin{figure*}[ht!]%Figure 2
430: \includegraphics[width=14cm]{fig2.eps}
431: \vspace{0.0cm}
432: \caption{
433: The sampled metastable states using the standard OPEP potential,
434: starting from a fully extended state. (a) hairpin structure with
435: 4 key H-bonds
436: (43:HN-54:O, 45:HN-52:O,52:HN-45:O, 54:HN-43:O), rmsd = 2.62~\AA,
437: E = -28 kcal/mol; (b) hairpin structure with 4 key H-bonds (44:HN-54:O,
438: 46:HN-52:O, 52:HN-46:O, 54:HN-44:O), rmsd = 2.08~\AA, E = -30 kcal/mol;
439: (c) hairpin structure with 4 key H-bonds (44:HN-54:O, 46:HN-52:O,
440: 52:HN-46:O, 54:HN-44:O), rmsd = 2.88~\AA, E = -27 kcal/mol; (d) hairpin
441: structure with 5 key H-bonds (43:HN-53:O, 45:HN-51:O, 51:HN-45:O,
442: 53:HN-43:O, 45:HN-41:O), rmsd = 2.6~\AA, E = -29 kcal/mol; (e) hairpin
443: structure with 4 key H-bonds (44:HN-55:O, 46:HN-53:O, 53:HN-46:O,
55:HN-44:O), rmsd = 3.34~\AA, E = -23 kcal/mol; (f) an $\alpha$-$\beta$
structure with a short helix at the C terminus, rmsd = 7.39~\AA,
E = -22 kcal/mol; (g) an $\alpha$-helix structure involving residues 5
to 15, rmsd = 5.89~\AA, E = -23 kcal/mol; (h) a three-stranded $\beta$-sheet structure
with 4 H-bonds (42:HN-49:O, 49:HN-42:O, 48:HN-54:O, 54:HN-48:O),
rmsd = 7.81~\AA, E = -27 kcal/mol. The two metastable states (c) and (d) converge
to the native state within 5000 additional trial events.
451: }
452: \label{fig:non-native}
453: \end{figure*}
454:
455: \subsection*{Analysis of Unfolded Trajectories}
456:
457: From a total of 90 runs, 36 trajectories reach the native state.
458: The other 54 fail to locate the native
459: structure within 4000 ART-events. These runs lead either to non-native
460: hairpin conformations as discussed previously or to other metastable
conformations of various secondary compositions, e.g., $\alpha$-helix
with coil, three-stranded antiparallel $\beta$-sheet and short
$\alpha$-helix with $\beta$-like structures.
464:
465: Figure~\ref{fig:non-native} shows the structural features
466: of the 8 lowest energy metastable structures sampled. The 54 metastable
467: states lie between $-$21 and $-$30 vs. $-$33 kcal/mol for the native
468: state.
469: %
470: It is important to note that these structures do not represent dead
471: ends for the simulation; continuing the simulation at 300~K
472: for two arbitrarily chosen metastable states (c)
473: and (d), we find that it is possible to reach the fully formed hairpin
474: structure within 5000 additional trial events.
475: %
476: In Fig.~\ref{fig:rmsd-E}, we plot the energy vs. rmsd for the
477: lowest-energy structures in all the 52 simulations (16 folded and 36
478: unfolded) using the standard OPEP and starting from the fully extended
479: state. We see that all folded structures appear in a dense region
480: around $-$33 kcal/mol and below an rmsd of 2.0~\AA\ and are well
separated from the non-folded ones; clearly, our potential can
discriminate folded states from metastable states. By visual
483: inspection of the 36 metastable states and comparison with
484: Fig.~\ref{fig:rmsd-E}, we see (i) that the conformations
485: with $E$ between $-$25 and $-$30 kcal/mol and rmsd between
486: 2.0 and 4.1~\AA\ adopt asymmetric $\beta$-sheet structures with
487: different H-bond networks, (ii) that the conformations with $E$
488: between $-$25 and $-$28 kcal/mol and rmsd between 6.0 and
489: 8.5~\AA\ show three-stranded antiparallel $\beta$-sheet structures,
and (iii) that the other conformations with $E$ between
491: $-$21 and $-$25 kcal/mol and rmsd between 3.0 and 8.0~\AA\ adopt
492: $\alpha$-helix with coil, short $\beta$-hairpin with coil, or short
493: $\alpha$-helix with $\beta$-like structures.
494: %
495: As the rmsd of some non-native hairpin structures can be as small as
496: 2.1 \AA\ (see Fig.~\ref{fig:non-native}(b)), it is clear that this
497: sole criterion
498: is not sufficient to differentiate between native and non-native hairpin
499: structures.
500:
501: \begin{figure}[ht!]%Figure 3
502: ~\\ \vspace{-6.0cm}\includegraphics[width=8.8cm]{fig3.eps}
503: \vspace{0.0cm}
504: \caption{
505: Rmsd as a function of energy of the lowest-energy structures
506: generated in the 52 simulations (16 folded and 36 unfolded) using the
507: standard OPEP potential, starting from a fully extended state. Circles,
508: triangles, and crosses are for native $\beta$-hairpin, asymmetric $\beta$-hairpin with
509: different H-bond pattern, and other different structures, respectively.
510: }
511: \label{fig:rmsd-E}
512: \end{figure}
513:
514: \subsection*{Analysis of Folded Trajectories}
515:
516: As reported in Ref.~\onlinecite{WMD03}, the 16 folding simulations using the
517: standard OPEP potential which start from a fully extended state can be
classified into 3 mechanisms: I (4 runs), II (7 runs) and III (5
519: runs).
520:
521: \begin{figure*}[ht!]%Figure 4
522:
523: \vspace{-0.6cm}\includegraphics[width=13cm]{fig4.eps}
524:
525: \vspace{0.0cm}
526: \caption{
527: Detailed analysis of a representative folding trajectory following Mechanism I,
528: simulated at 300 K, starting from a fully extended state. (a) Six snapshots.
529: Only accepted events are shown. (b) C$_\alpha$-rmsd from 2GB1 structure of
530: the hairpin, radius of gyration of the hydrophobic core $Rg_{core}$, and the
531: two-end distance as a function of accepted event number. (c) Total energy, number
532: of all the native and non-native H-bonds in each sampled conformation as a function
533: of accepted event number.
534: }
535: \label{fig:rmsd-extend-300k-s4}
536: \end{figure*}
537: \begin{figure}[ht!]%Figure 5
538: \vspace{0.0cm}\includegraphics[width=8.6cm]{fig5.eps}
539: \caption{
540: Status of the six native H-bonds as a function of accepted event number
541: in the four runs following Mechanism I. Green: not formed, red: formed. Run 1 is the
542: simulation described in Fig. 4.
543: }
544: \label{fig:hb-mechanism-I}
545: \end{figure}
546:
547: A detailed analysis of mechanism I, seen in four folding trajectories,
548: is given in Figure~\ref{fig:rmsd-extend-300k-s4}. This mechanism is
549: similar to that described in
550: Refs.~\onlinecite{KI99, THI00, WJ99}. Starting from a fully extended
551: state, the peptide first collapses into a compact state with the turn
placed in the right section of the chain (residues 7-10) (event 53);
553: this step is characterized by the formation of a partially packed
554: hydrophobic core (radius of gyration of the hydrophobic core,
555: $Rg_{core}$, is 4.8 \AA, see Fig.~\ref{fig:rmsd-extend-300k-s4}(b))
556: and the appearance of several
557: non-native H-bonds (Fig.~\ref{fig:rmsd-extend-300k-s4}(c)). The
558: following steps serve to stabilize its hydrophobic core. At event 80,
559: a native H-bond near the turn (H5) forms, followed rapidly by
560: the formation of H4; 29 events later (event 109), the peptide
561: reorganizes its hydrophobic core to a well packed state ($Rg_{core}$
562: reaches its final value 4.3~\AA). The
563: reorganization of the hydrophobic core allows the formation of new
564: native main chain H-bonds (H3, H2, H6). At that point, the
565: two-end distance fluctuates around 9~\AA\
566: (Fig.~\ref{fig:rmsd-extend-300k-s4}(b)). In spite of these new
567: native H-bonds, the flexibility of the loop remains large and H6, an
568: H-bond near the loop, breaks and reforms many times between
569: events 128 and 300 (see Run 1 in Fig.~\ref{fig:hb-mechanism-I}).
Finally, the H-bond H1, near the end of the
peptide, forms last (event 471), leading to the native state: the
core radius of gyration remains at its final value of 4.3~\AA, the rmsd
from the 2GB1 structure drops to 2.0~\AA\
(Fig.~\ref{fig:rmsd-extend-300k-s4}(b)), the total number of native
H-bonds is six, and the total energy is $-$32 kcal/mol
(Fig.~\ref{fig:rmsd-extend-300k-s4}(c)). The formation times of
the six native H-bonds H1-H6 can be seen in
Fig.~\ref{fig:hb-mechanism-I} for the four folding runs.
579: The number of native and non-native H-bonds in
580: each accepted conformation as a function of event number is given in
581: Fig.~\ref{fig:rmsd-extend-300k-s4}(c). We clearly see that
582: the folding process is a competition between native and non-native
hydrogen bonding interactions, and that in Mechanism I the two ends
of this $\beta$-hairpin gradually approach each other so that H1 forms last, while the
hydrophobic core becomes well packed rapidly.
586:
587: \begin{figure*}[ht!]%Figure 6
588: \vspace{-0.6cm}\includegraphics[width=13cm]{fig6.eps}
589: \vspace{0.0cm}
590: \caption{
591: Detailed analysis of a representative folding trajectory following Mechanism II,
592: simulated at 300 K, starting from a fully extended state. (a) Six snapshots.
593: (b) C$_\alpha$-rmsd from 2GB1 structure of
594: the hairpin, radius of gyration of the hydrophobic core $Rg_{core}$, and the
595: two-end distance as a function of accepted event number. (c) Total energy,
596: number of all the native
597: and non-native H-bonds in each sampled conformation as a function of
598: accepted event number.
599: }
600: \label{fig:rmsd-extend-300k-s5}
601: \end{figure*}
602:
603: \begin{figure}[ht!]%Figure 7
604: \vspace{0.cm}\includegraphics[width=8.6cm]{fig7.eps}
605: \vspace{0.0cm}
606: \caption{
607: Status of the six native H-bonds as a function of accepted event number in
608: the seven runs following Mechanism II. Green: not formed, red: formed. Run
609: 11 is the simulation described in Fig. 6.
610: }
611: \label{fig:hb-mechanism-II}
612: \end{figure}
613:
614: Mechanism II, seen in seven folding trajectories, was observed in
615: simulations of hairpin2~\cite{PD01,BS01,MK99} and of a 10-residue
616: model peptide.~\cite{NK02} Figure~\ref{fig:rmsd-extend-300k-s5} shows
617: the major folding steps associated with this mechanism. Within a few
618: events, the hydrophobic interaction between the four residues W43,
619: Y45, F52, and V54 induces the formation of a partial hydrophobic core
620: ($Rg_{core}$ is 6.4~\AA) resulting in a globular
621: state with three non-native H-bonds (event 55). At the same time, the
622: N- and the C-termini approach each other to form a large loop.
623: Because of the competition between the hydrophobic, native, and
624: non-native hydrogen bonding forces, the peptide rearranges its
625: hydrophobic core by breaking and reforming some non-native
interactions ($Rg_{core}$ increases to 8~\AA). At
event 191, $Rg_{core}$ drops to 5.5~\AA\ and, 16
events later, H-bond H1, near the end of the hairpin, forms (event 207). This
reorganization of the hydrophobic core causes the second (H2 at event 366)
and third (H3 at event 417) H-bonds to form. After a 150-event optimization
of the hydrophobic and hydrogen-bonding interactions, the radius of
gyration of the hydrophobic core reaches its final value (4.3~\AA\ at event
567), and the fourth H-bond, H4, appears. Over the next
few events, H5 (at event 570) and H6 (at event 573) form, and the
peptide satisfies the native conditions (see Fig.~\ref{fig:rmsd-extend-300k-s5}).
Fig.~\ref{fig:hb-mechanism-II} shows the order of formation of
the six native H-bonds H1--H6 in the seven folding trajectories.
As shown in Figs.~\ref{fig:rmsd-extend-300k-s5} and
\ref{fig:hb-mechanism-II}, the two ends of
this $\beta$-hairpin approach each other so that H-bond H1 (or, less
often, H2) forms first, and the hydrophobic core reorganizes slowly to its
well-packed state. The partial helical structure
(events 169 and 280) is not a necessary intermediate: it appears in 4
of the 7 trajectories following Mechanism II. Comparing the two
mechanisms, one can see that they are not mutually
exclusive; many trajectories fall somewhere in between these two
descriptions. In some cases, for example, folding is initiated in the
middle region (H4). The $\beta$-sheet then propagates first outwards
(forming H1--H3) and then inwards (forming H5--H6) (see Run 6 in
Fig.~\ref{fig:hb-mechanism-II}).
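
The bookkeeping behind this kind of classification can be sketched in a few lines of Python. The sketch is illustrative only and is not the analysis code used for this work: the function names and the mapping of H-bonds onto regions (H1--H2: end, H3--H4: middle, H5--H6: turn) are assumptions inferred from the text. The example data are the first-formation events quoted above for the representative Mechanism II run.

\begin{verbatim}
```python
# Illustrative sketch; names and the region mapping are assumptions.
REGION = {"H1": "end", "H2": "end",
          "H3": "middle", "H4": "middle",
          "H5": "turn", "H6": "turn"}


def first_events(status):
    """Map each H-bond to the first accepted event at which it is formed.

    status: dict name -> list of booleans, one per accepted event.
    H-bonds that never form are left out of the result.
    """
    return {name: series.index(True)
            for name, series in status.items() if True in series}


def formation_order(first_event):
    """Return H-bond names sorted by their first-formation event."""
    return sorted(first_event, key=lambda name: first_event[name])


def initiation_region(first_event):
    """Region of the peptide containing the H-bond that forms first."""
    return REGION[formation_order(first_event)[0]]


# First-formation events quoted in the text for the Mechanism II run.
run = {"H1": 207, "H2": 366, "H3": 417,
       "H4": 567, "H5": 570, "H6": 573}
print(formation_order(run))
print(initiation_region(run))
```
\end{verbatim}

For this run the H-bonds form in the order H1 through H6 and the initiating region is the end of the hairpin, as expected for Mechanism II.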
651:
652:
653: \begin{figure*}[ht!]%Figure 8
654:
655: \vspace{-0.6cm}\includegraphics[width=13.0cm]{fig8.eps}
656:
657: \vspace{0.0cm}
658: \caption{
659: Detailed analysis of a folding trajectory following Mechanism III,
660: simulated at 300 K, starting from a fully extended state. Another
661: folding trajectory was presented in a short communication.~\cite{WMD03}
662: (a) Six snapshots.
663: (b) C$_\alpha$-rmsd from 2GB1 structure of
664: the hairpin, radius of gyration of the hydrophobic core $Rg_{core}$, and the
665: two-end distance as a function of accepted event number.
666: (c) Total energy, number of all the native and non-native H-bonds in each
667: sampled conformation as a function of accepted event number.
668: }
669: \label{fig:rmsd-extend-300k-s24}
670: \end{figure*}
671: \begin{figure}[ht!]%Figure 9
672: ~\\ \vspace{-0.2cm}\includegraphics[width=8.2cm]{fig9.eps}
673:
674: \vspace{0.0cm}
675:
676: \caption{
677: Analysis of all trajectories following Mechanism III.
678: (a) Rmsd from the 2GB1 structure,
679: (b) number of non-native H-bonds, and (c) number of native H-bonds
680: in each conformation as a function of accepted event number.
Run 13 is the folding simulation described in Fig. 3 of a short
communication,~\cite{WMD03} and Run 14 is the folding simulation
described in Fig.~\ref{fig:rmsd-extend-300k-s24}. The final rmsd's from the 2GB1 structure are less
than 1.8~\AA\ (see (a)) in the five folding simulations. In these
simulations, the non-native H-bonds typically break at almost
the same time (see (b)), and the native H-bonds then form rapidly, almost
instantaneously (see (c)).
688: }
689: \label{fig:rmsd-and-hb}
690: \end{figure}
691:
692: Mechanism III had not been observed
in previous all-atom folding,~\cite{PD01,ZHOU02} unfolding,~\cite{PD99} or
694: equilibrium simulations.~\cite{MK99,BS01,GS01} This mechanism, seen in
695: five folding trajectories, is characterized by a rapid folding into a
696: collapsed state with a turn at the wrong place, forming an asymmetric
697: hairpin structure stabilized by non-native H-bonds and a partially
packed hydrophobic core. The asymmetry is then slowly corrected, step by
step in a reptation mode, with non-native H-bonds breaking and
reforming as the structure moves closer to the native hairpin. A
representative trajectory is given in Fig.~\ref{fig:rmsd-extend-300k-s24}.
702: Folding begins with the
703: formation of a compact state defined by a partially packed
704: hydrophobic core and two non-native H-bonds 46:HN-55:O and 48:HN-53:O
705: (event 55). Then the number of non-native H-bonds increases to four:
706: 46:HN-55:O, 48:HN-53:O, 53:HN-48:O, 55:HN-46:O, and a short
707: $\beta$-sheet structure (Event 84) appears. Driven by the hydrophobic
708: interactions, at event 99, the reptation motion of the loop causes the
709: four non-native H-bonds to break, and four new non-native H-bonds
44:HN-55:O, 46:HN-53:O, 53:HN-46:O, 55:HN-44:O form. The peptide
now adopts a new asymmetric $\beta$-sheet structure, which is closer to the
symmetric $\beta$-hairpin structure. During the next 200 events, the
peptide stays in this state and reorganizes its hydrophobic
core. After an optimization of the hydrophobic and hydrogen-bonding
interactions, at event 302, the reptation motion of the loop enhances the
longitudinal motion of the two strands, breaking the four non-native
H-bonds and forming four native ones (H1--H4); simultaneously, the core
radius of gyration drops to 4.7~\AA. The hairpin structure forms
rapidly afterwards, with the addition of the fifth and sixth native
H-bonds (H5 and H6), the rmsd dropping to 1.4~\AA\ (event 334), and the
energy approaching $-$33 kcal/mol. Once the native hairpin forms, it
is fairly stable: the total number of H-bonds remains at six, the
radius of gyration of the hydrophobic core stays around 4.3~\AA, the
rmsd fluctuates around 1.4~\AA, and the energy fluctuates around
$-$33 kcal/mol. As shown in Figs.~\ref{fig:rmsd-extend-300k-s24} and
\ref{fig:rmsd-and-hb}, the critical and rate-limiting step in this
mechanism is the breaking, almost in synchrony, of four or five non-native
H-bonds of a structure very close to the native state, followed by the rapid
formation of the native H-bonds. It can also be seen from
Fig.~\ref{fig:rmsd-extend-300k-s24} that the two ends of this $\beta$-hairpin
approach each other slowly, and that the hydrophobic core reorganizes
slowly to its well-packed state.
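
For reference, a radius of gyration such as $Rg_{core}$ can be computed as the root-mean-square distance of a set of sites from their centroid. The following Python sketch assumes equal weights for all sites and takes a plain list of Cartesian coordinates; which atoms or side-chain centers enter the hydrophobic core, and whether mass weighting is applied, are not specified here and are left as assumptions. The function name and the toy coordinates are hypothetical.

\begin{verbatim}
```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.

    Equal weights are an assumption; a mass-weighted definition
    would need the site masses as an extra argument.
    """
    n = len(coords)
    # Centroid of the selected sites.
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    # Mean squared distance of the sites from the centroid.
    msd = sum(sum((p[k] - center[k]) ** 2 for k in range(3))
              for p in coords) / n
    return math.sqrt(msd)


# Toy example: four sites on a unit circle around the origin.
square = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
          (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
print(radius_of_gyration(square))
```
\end{verbatim}

Applied to the four core residues at each accepted event, a function of this kind would yield a curve of the type plotted in panel (b) of the trajectory figures.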
733:
734:
735: \section*{\bf DISCUSSION}
736:
737: \subsection*{Sensitivity of Trajectories}
738:
739: Because various interaction potentials have led previously to
740: conflicting reports regarding the details of folding, it is essential
741: to determine how changes in the force-field and the starting
742: structure affect the folding trajectories.
743:
We consider here two variations of the force field. First, we ran 20
simulations using OPEP with the H-bond energy parameter
$\varepsilon_{hb}$ increased from its standard value of 1.5 to 2.5,
a change that could considerably affect the stability of intermediate
structures. Remarkably, we still recover the three folding mechanisms
among the six successful trajectories. Second, we ran 10
simulations using ART with a G\={o}-like OPEP potential
($E = E(\mbox{OPEP}) + k\,\mbox{rmsd}^2$, with $k=0.5$, 0.55, and 0.6), which favors native contacts in
the total energy. Starting from a fully extended state, all ten
simulations found the folded state following either Mechanism I or II:
754: six simulations fold from the end region (H1 or H2 form first), two
755: from the middle (H3 forms first) and two from the turn (H5 first). As
in earlier G\={o}-model simulations,~\cite{ZHOU02} we find no helical
intermediates in any of the ten folding simulations.
In these simulations, folding can be initiated in the end region, the middle
region, or the turn region of the peptide, in agreement with
the results obtained using only the OPEP potential. Because native
interactions are strongly favored in G\={o}-model simulations, the
asymmetric conformations found in Mechanism III become energetically prohibitive,
which prevents Mechanism III from appearing. The bias of the G\={o}
potential can therefore lead to an incomplete sampling of the folding
paths.
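
To give a sense of the size of this bias, the G\={o}-like term can be evaluated for a hypothetical non-native conformation. Assuming, for illustration only, that $k$ is expressed in kcal/(mol\,\AA$^2$) (the units are not stated above) and taking a conformation at an rmsd of 4~\AA\ from the native structure, the penalty for $k=0.5$ is
\[
\Delta E = k\,\mbox{rmsd}^2 = 0.5 \times 4^2 = 8~\mbox{kcal/mol},
\]
that is, roughly a quarter of the magnitude of the total energy of about $-$33 kcal/mol found for the folded hairpin. Under these assumptions, a compact but misregistered intermediate would be strongly disfavored relative to the unbiased OPEP landscape.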
766:
767: \begin{figure*}[ht!]%Figure 10
768: \vspace{-0.6cm}\includegraphics[width=13.0cm]{fig10.eps}
769: \vspace{0.2cm}
770: \caption{
A detailed analysis of a trajectory resulting in the fully formed
$\beta$-hairpin at 300 K, starting from a semi-helical structure.
773: In this initial state, the 14 pairs of ($\phi$,$\psi$) values, excluding
774: residues Gly41 and Glu56, are
($-63^{\circ}$, $121^{\circ}$), ($-63^{\circ}$, $-27^{\circ}$),
($-65^{\circ}$, $98^{\circ}$), ($-71^{\circ}$, $-10^{\circ}$),
($-48^{\circ}$, $145^{\circ}$),
($-51^{\circ}$, $-61^{\circ}$), ($-56^{\circ}$, $-48^{\circ}$),
($-51^{\circ}$, $-55^{\circ}$),
($-61^{\circ}$, $-51^{\circ}$),
($-61^{\circ}$, $94^{\circ}$), ($-63^{\circ}$, $-48^{\circ}$),
($-54^{\circ}$, $-43^{\circ}$), ($-59^{\circ}$, $-58^{\circ}$),
($-74^{\circ}$, $80^{\circ}$).
($\phi$,$\psi$) values near ($-60^{\circ}$, $-45^{\circ}$)
are typical for $\alpha$-helices.
786: (a) Six snapshots. (b) C$_\alpha$-rmsd from 2GB1 structure of
787: the hairpin, radius of gyration of the hydrophobic core $Rg_{core}$, and the
788: two-end distance as a function of accepted event number.
789: (c) Total energy, number of all the native and non-native
790: H-bonds in each sampled conformation as a function of accepted event number.
791: }
792: \label{fig:rmsd-300k-s6}
793: \end{figure*}
794:
795: To determine the impact of the starting conformation on the
796: ART-trajectories and to address the question of semi-helical
797: intermediates during folding,~\cite{PD01} eight trial simulations were
798: attempted at 300 K starting from a semi-helical structure. Four of
799: those resulted in hairpin structures with six native H-bonds,
800: while the remaining four displayed helical structures or
801: $\beta$-hairpin structures with non-native H-bonds.
802: Interestingly, the four folded trajectories closely follow Mechanism
II. Figure~\ref{fig:rmsd-300k-s6} shows a representative folding
simulation at 300 K. The semi-helical structure rapidly relaxes to a
more compact structure (event 92) in which the two ends of the peptide
approach each other. At this stage, a partially packed hydrophobic
core ($Rg_{core}$ is 6.7~\AA) appears, although no native H-bond has yet formed.
Driven by the strong
hydrophobic interactions among the four hydrophobic residues (Tyr45,
Phe52, Trp43, Val54), the two ends of the peptide come closer and one
native H-bond (H1) forms in the end region of the peptide. After
50 events, the next two H-bonds, H2 and H3, form almost
simultaneously. The peptide stays in this state for 350 events,
slowly reorganizing the hydrophobic core: $Rg_{core}$ oscillates around
5.2~\AA\ and the rmsd from the 2GB1 structure reaches a plateau at
4~\AA. After this slow structural adjustment, the hydrophobic
core forms fully in a large cooperative move at event 500 ($Rg_{core}$
drops to 4.3~\AA), and the remaining native H-bonds set in rapidly
afterwards (H4 and H5 form first, followed by
H6). This event marks the completion of the folding process of the
peptide, with the rmsd reaching 1.5~\AA. This demonstrates that
ART can find the folded state starting from a helical structure and
that this helical structure may exist in the real folding pathway of this
peptide, as discussed in previous simulations~\cite{GS01,PD01,IRB03} and in a recent
energy landscape characterization of $\beta$-hairpin2 and its
isomers~\cite{NUS03}.
827:
828:
829: From the structure of the initial helical state, we can explain why the four
830: folded trajectories only follow Mechanism II. This state is characterized
831: by
a helical structure spanning residues 47--50 and two non-native H-bonds
(50:HN-46:O and 51:HN-47:O). The existence of
this helical segment makes it difficult to form native H-bonds in
the turn or the middle region of the peptide because of geometric
restrictions. The two ends of the peptide, however, are very flexible:
driven by the hydrophobic interactions, they can approach each other
easily and form an H-bond between them first. When the initial state
is fully extended, the peptide has much more freedom to find
its native state, which is why multiple folding pathways are
present.
842:
843: \subsection*{Cluster Analysis of the Folded Trajectories}
844:
845: As we have seen in the previous section, the folding process can be
846: described by a single concept: the competition between three types of
847: interactions. A cluster analysis shows, however, that this does not
848: mean that the folding trajectories can be unified.
849:
850: We perform a cluster analysis following the procedure described in
851: Ref.~\onlinecite{Daura} (see Methods), using a C$_{\alpha}$-rmsd
852: cutoff of 1.5 \AA.
All the accepted conformations in each folded trajectory of the
26 folding simulations obtained with the standard OPEP potential are used
for this procedure. Between 12 and 40 clusters were found for each folded
trajectory, indicating strong variations in the details of the folding trajectories.
Moreover, we find very little overlap of the basins between similar trajectories:
except for the trivial initial and native clusters, it is generally
not possible to match more than one or two clusters between trajectories
following the same mechanism. We obtained the same qualitative results
using other clustering methods.
862:
The failure of the C$_{\alpha}$-rmsd cluster analysis to characterize the three
different folding mechanisms for this $\beta$-hairpin is due to the flexibility of
this small peptide. For example, the asymmetric $\beta$-hairpin structures in
the folding trajectories following the reptation mechanism are very diverse,
presenting a wide range of hydrogen-bond patterns. Moreover, the
C$_{\alpha}$-rmsd between a $\beta$-hairpin with the turn shifted toward the C-terminus
and a $\beta$-hairpin with the turn shifted toward the N-terminus is as large as 4.3~\AA.
Increasing the C$_{\alpha}$-rmsd cutoff would make it difficult to differentiate
the asymmetric $\beta$-hairpin states from the folded state. The classification in terms
of folding mechanisms, as identified by the formation of hydrogen bonds, therefore appears
superior to the clustering method for this small peptide.
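
The neighbor-counting scheme generally associated with Ref.~\onlinecite{Daura} can be sketched as follows: the conformation with the most neighbors within the cutoff defines a cluster together with those neighbors, the cluster is removed from the pool, and the procedure is repeated until the pool is empty. The Python sketch below is a minimal illustration operating on a precomputed distance matrix, not the code used for this work; the function name and the toy one-dimensional data are hypothetical, and ties are broken by lowest index, which is an arbitrary choice.

\begin{verbatim}
```python
def daura_clusters(dist, cutoff):
    """Greedy neighbor-count clustering on a distance matrix.

    dist   : square, symmetric list of lists of pairwise distances
             (for example C-alpha rmsd values in Angstrom)
    cutoff : two structures are neighbors if their distance <= cutoff
    Returns a list of (center_index, member_indices) tuples in the
    order in which the clusters are extracted.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        pool = sorted(remaining)
        best_center, best_members = None, []
        for i in pool:
            # Neighbors of i among structures not yet assigned
            # (i itself is included because dist[i][i] is zero).
            members = [j for j in pool if dist[i][j] <= cutoff]
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, best_members))
        remaining -= set(best_members)
    return clusters


# Toy data: positions on a line stand in for conformations.
positions = [0.0, 0.5, 1.0, 5.0, 5.4, 9.0]
dist = [[abs(a - b) for b in positions] for a in positions]
for center, members in daura_clusters(dist, 1.5):
    print(center, members)
```
\end{verbatim}

With a cutoff of 1.5 the toy data split into three clusters. Matching clusters between two trajectories then amounts to testing whether their centers lie within the same cutoff of each other.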
874:
875:
876: \subsection*{Does the Reptation Folding Trajectory Exist?}
877:
Surprisingly, although present in part in many previous simulations,
the reptation mechanism had not been identified previously. The
asymmetric conformations with only non-native H-bonds, which
characterize the intermediate states in Mechanism III, are found in
882: many simulations. For example, a cluster analysis of the structures
883: produced in an all-atom multicanonical MC simulation of the $\beta$-hairpin2
884: finds that these asymmetric conformations account for 20\% of all
885: conformations.~\cite{MK99} Similarly, Skolnick and collaborators find
886: that a lattice model often folds into asymmetric structures.~\cite{KI99}
In a recent work by Irb\"ack {\it et al.}, a local minimum corresponding to a $\beta$-hairpin
with non-native topology is observed in the energy landscape of
$\beta$-hairpin2.~\cite{IRB03}
890: Finally, most of the folded conformations identified by Zagrovic {\it
891: et al.} in distributed MD simulations (Folding@home) seem
892: asymmetric although this is not explicitly stated.~\cite{PD01}
893: Similar results are found in smaller peptides such
894: as the 11-residue model peptide of Wang {\it et al.} studied by
895: molecular dynamics, which also displays asymmetric $\beta$-hairpin
896: structures.~\cite{WJ99}
897:
Although many simulations found the asymmetric conformation, none
seems to have been able to overcome the rate-limiting step, which requires
breaking all non-native H-bonds at once (Fig.~\ref{fig:rmsd-and-hb}(b)), in
order to form the native state. With OPEP, this barrier is found to be about
12 kcal/mol and corresponds to a time scale on the order of
$\mu$s,~\cite{DS98} near the experimental folding time but well beyond
what can be reached by standard molecular dynamics.
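
A rough feeling for how a barrier of this height maps onto a time scale can be obtained from a simple Arrhenius estimate. This is only an illustrative assumption: the attempt time $\tau_0$ below is not given in the text, and Ref.~\onlinecite{DS98} may rely on a different treatment. With $k_BT \approx 0.596$ kcal/mol at 300 K,
\[
\tau \approx \tau_0 \exp\left(\frac{\Delta E}{k_BT}\right)
 = \tau_0 \exp\left(\frac{12}{0.596}\right) \approx 5.5\times10^{8}\,\tau_0 ,
\]
so that an assumed attempt time of $10^{-14}$ to $10^{-13}$ s gives $\tau$ between about 5 and 55 $\mu$s, that is, in the microsecond to tens-of-microseconds range.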
905:
In addition to displaying a time scale in agreement with experiment,
the reptation mechanism is supported by experimental evidence that many $\beta$-hairpin sequences can
populate two distinct hairpin conformations, with different loop lengths
and pairings of the $\beta$-strands, in solution.~\cite{SE95,RA99}
Fluorescence microscopy also suggests that myosin can induce the
reptation of actin filaments when adenosine triphosphate is
added.~\cite{HD02} The reptation mechanism, which is well
established as a fundamental movement of polymer chains, might
therefore exist in the folding of a simple $\beta$-hairpin.
915:
916: \subsection*{Competition between Hydrophobic and Native Hydrogen
917: Bonding Interactions}
918:
919: The folding mechanisms described above are the result of a strong
920: competition between hydrophobic and native hydrogen-bonding
921: interactions. All cases show that the hydrophobic
922: interactions play a dominant role in the folding process. At the
923: beginning of the folding, a partially packed hydrophobic core always
924: forms before native H-bonds appear (see
925: Fig.~\ref{fig:rmsd-extend-300k-s4}(b) and (c),
926: Fig.~\ref{fig:rmsd-extend-300k-s5}(b) and (c),
927: Fig.~\ref{fig:rmsd-extend-300k-s24}(b) and (c),
Fig.~\ref{fig:rmsd-300k-s6}(b) and (c)). The following step is the
rearrangement of the hydrophobic core and the joint optimization of the
hydrophobic and native hydrogen-bonding interactions. When the
hydrophobic and hydrogen-bonding interactions reach a balance, the
well-packed hydrophobic core forms before
(Fig.~\ref{fig:rmsd-extend-300k-s4}(b) and (c)) or at the same time as
(Fig.~\ref{fig:rmsd-extend-300k-s5}(b) and (c),
Fig.~\ref{fig:rmsd-extend-300k-s24}(b) and (c),
Fig.~\ref{fig:rmsd-300k-s6}(b) and (c)) the native H-bond network.
937:
938: \subsection*{Competition between Native and Non-native Hydrogen
939: Bonding Interactions}
940:
941: Native and non-native hydrogen-bonding interactions also compete strongly
942: during the folding process. The initial collapse is always
943: accompanied by the formation of three to six non-native H-bonds (see
944: Fig.~\ref{fig:rmsd-extend-300k-s4}(c),
945: Fig.~\ref{fig:rmsd-extend-300k-s5}(c),
946: Fig.~\ref{fig:rmsd-and-hb}(b),
947: Fig.~\ref{fig:rmsd-300k-s6}(c)). However, these non-native H-bonds
948: are not stable in the long run; they form, break, and reform in
949: response to the movement of the hydrophobic core. Driven by the
950: hydrophobic interactions, the non-native hydrogen bonding interactions
951: finally become weaker, and native H-bonds form, leading rapidly to
952: the native state. The whole folding process can therefore be described
953: as a balance between hydrophobic, native and non-native hydrogen
954: bonding forces.
955:
956: \section*{\bf CONCLUSIONS}
957:
958: By demonstrating that the folding of a $\beta$-hairpin can be
959: initiated at the end, the middle and the turn region, as well as from
960: an asymmetric conformation, the three folding mechanisms proposed here
961: help reconcile conflicting theoretical data on the hairpin2 of protein
962: G~\cite{EATON,MK99,THI00,ZHOU02} or between various hairpins, e.g. the
963: first hairpin of tendamistat.~\cite{BON00}
964: Using these three mechanisms, we can now propose a complete picture
965: of the folding of $\beta$-hairpins which does not depend on the exact
966: amino-acid composition. The exact folding path followed by a given
967: $\beta$-hairpin should be influenced by its sequence and the solvent
968: conditions; all paths should, however, belong to one of the three
969: mechanisms presented here. The first two mechanisms, with the
970: propagation from either the turn or the end points, had already been
971: described in previous reports on $\beta$-hairpin 2, but no previous method
972: had managed to detect both pathways. The third mechanism, identified in
973: these simulations for the first time, involves folding into an
974: asymmetric state followed by a reptation of one strand over the other
until the peptide reaches its native state. The existence of this
mechanism is supported by a number of experimental and numerical
results, even though the rate-limiting step and the presence of
non-native states place it outside the scope of most simulation
methods, including unfolding, G\={o}, and standard MD approaches. This
last result underlines the importance of direct, unbiased methods,
such as ART, for studying the folding process.
982:
Although complex, presenting a large number of different paths, the
folding of this $\beta$-hairpin can still be described by a single
process of competition between hydrophobic-core, native H-bond, and
non-native H-bond interactions. In particular, it is clear that
987: non-native H-bond interactions can play a critical role in the
988: folding process even though they are absent in the final product.
989:
990: \section*{\bf ACKNOWLEDGEMENTS}
991:
992: GW and NM are supported in part by the {\it Fonds qu\'eb\'ecois pour
993: la formation des chercheurs et l'aide \`a la recherche} and the {\it
994: Natural Sciences and Engineering Research Council} of Canada. Most
995: of the calculations were done on the computers of the {\it R\'eseau
996: qu\'eb\'ecois de calcul de haute performance} (RQCHP). NM is a
Cottrell Scholar of the Research Corporation. We thank Drs. Hue Sun
Chan, Marek Cieplak, and Saraswathi Vishveshwara for useful discussions.
999:
1000:
1001: \begin{thebibliography}{10}
1002:
1003: \bibitem{NMR95}
1004: Blanco FJ, and Serrano L.
1005: \newblock Folding of protein GB1 domain studied by the conformational
1006: characterization of fragments comprising its secondary structure elements.
\newblock {Eur J Biochem} 1995;{230:}634--649.
1008:
1009: \bibitem{EATON}
Mu\~noz V, Thompson PA, Hofrichter J, and Eaton WA.
1011: \newblock Folding dynamics and mechanism of beta-hairpin formation.
1012: \newblock {Nature} 1997;{390:}196--199.
1013:
1014: \bibitem{DU98}
1015: Duan Y, and Kollman PA.
1016: \newblock Pathways to a protein folding intermediate observed in a
1017: 1-microsecond simulation in aqueous solution.
1018: \newblock {Science} 1998;{282:}740--744.
1019:
1020: \bibitem{MUN98}
Mu\~noz V, Henry ER, Hofrichter J, and Eaton WA.
1022: \newblock A statistical mechanical model for beta-hairpin kinetics.
1023: \newblock {Proc Natl Acad Sci USA} 1998;{95:}5872--5879.
1024:
1025: \bibitem{KI99}
1026: Kolinski A, Ilkowski B, and Skolnick J.
1027: \newblock Dynamics and thermodynamics of beta-hairpin assembly: insights from
1028: various simulation techniques.
1029: \newblock {Biophys J} 1999;{77:}2942--2952.
1030:
1031: \bibitem{THI00}
1032: Klimov DK, and Thirumalai D.
1033: \newblock Mechanisms and kinetics of $\beta$-hairpin formation.
1034: \newblock {Proc Natl Acad Sci USA} 2000;{97:}2544--2549.
1035:
1036: \bibitem{TSAI02}
1037: Tsai J, and Levitt M.
1038: \newblock Evidence of turn and salt bridge contributions to $\beta$-hairpin
1039: stability: MD simulations of C-terminal fragment from the B1 domain of
1040: protein G.
1041: \newblock {Biophys Chem} 2002;{101-102:}187--201.
1042:
1043: \bibitem{PD01}
Zagrovic B, Sorin EJ, and Pande V.
1045: \newblock $\beta$-hairpin folding simulations in atomistic detail using an
1046: implicit solvent model.
1047: \newblock {J Mol Biol} 2001;{313:}151--169.
1048:
1049: \bibitem{BS01}
1050: Zhou R, Berne BJ, and Germain R.
1051: \newblock The free energy landscape for $\beta$ hairpin folding in explicit
1052: water.
1053: \newblock {Proc Natl Acad Sci USA} 2001;{98:}14931--14936.
1054:
1055: \bibitem{ZHOU02}
1056: Zhou Y, and Linhananta A.
1057: \newblock Role of hydrophilic and hydrophobic contacts in folding of the second
1058: $\beta$-hairpin fragment of protein G: molecular dynamics simulation studies
1059: of an all-atom model.
1060: \newblock {Proteins} 2002;{47:}154--162.
1061:
1062: \bibitem{PD99}
1063: Pande VS, and Rokhsar DS.
1064: \newblock Molecular dynamics simulations of unfolding and refolding of a
1065: beta-hairpin fragment of protein G.
1066: \newblock {Proc Natl Acad Sci USA} 1999;{96:}9062--9067.
1067:
1068: \bibitem{LEE01}
1069: Lee J, and Shin S.
1070: \newblock Understanding $\beta$-hairpin formation by molecular dynamics
1071: simulations of unfolding.
1072: \newblock {Biophys J} 2001;{81:}2507--2516.
1073:
1074: \bibitem{MK99}
1075: Dinner AR, Lazaridis T, and Karplus M.
1076: \newblock Understanding $\beta$-hairpin formation.
1077: \newblock {Proc Natl Acad Sci USA} 1999;{96:}9068--9073.
1078:
1079: \bibitem{GS01}
1080: Garc\'{i}a AE, and Sanbonmatsu KY.
1081: \newblock Exploring the energy landscape of a $\beta$-hairpin in explicit
1082: solvent.
1083: \newblock {Proteins} 2001;{42:}345--354.
1084:
1085: \bibitem{MA00}
1086: Ma B, and Nussinov R.
1087: \newblock Molecular dynamics simulations of a beta-hairpin fragment of protein
1088: G: balance between side-chain and backbone forces.
1089: \newblock {J Mol Biol} 2000;{296:}1091--1104.
1090:
1091: \bibitem{IRB03}
1092: Irb\"ack A, Samuelsson B, Sjunnesson F, and Wallin S.
1093: \newblock Thermodynamics of $\alpha$- and $\beta$-structure formation in proteins.
1094: \newblock {Biophys J} 2003;{85:}1466--1473.
1095:
1096: \bibitem{AF02}
1097: Fersht AR.
1098: \newblock On the simulation of protein folding by short time scale molecular
1099: dynamics and distributed computing.
1100: \newblock {Proc Natl Acad Sci USA} 2002;{99:}14122--14125.
1101:
1102: \bibitem{BM96}
1103: Barkema GT, and Mousseau N.
1104: \newblock Event-based relaxation of continuous disordered systems.
1105: \newblock {Phys Rev Lett} 1996;{77:}4358--4361.
1106:
1107: \bibitem{BM98}
1108: Barkema GT, and Mousseau N.
1109: \newblock Identification of relaxation and diffusion mechanisms in amorphous
1110: silicon.
1111: \newblock {Phys Rev Lett} 1998;{81:}1865--1868.
1112:
1113: \bibitem{FOR01}
1114: Forcellino F, and Derreumaux P.
1115: \newblock Computer simulations aimed at structure prediction of supersecondary
1116: motifs in proteins.
1117: \newblock {Proteins} 2001;{45:}159--166.
1118:
1119: \bibitem{WMD02}
1120: Wei G, Mousseau N, and Derreumaux P.
1121: \newblock Exploring the energy landscape of proteins: A characterization of the
1122: activation-relaxation technique.
1123: \newblock {J Chem Phys} 2002;{117:}11379--11387.
1124:
1125: \bibitem{BON00}
1126: Bonvin AM, and van Gunsteren WF.
1127: \newblock $\beta$-hairpin stability and folding: molecular dynamics studies of
1128: the first $\beta$-hairpin of tendamistat.
1129: \newblock {J Mol Biol} 2000;{296:}255--268.
1130:
1131: \bibitem{WJ99}
1132: Wang H, Varady J, Ng L, and Sung S.
1133: \newblock Molecular dynamics simulations of $\beta$-hairpin folding.
1134: \newblock {Proteins} 1999;{37:}325--333.
1135:
1136: \bibitem{WMD03}
1137: Wei G, Derreumaux P, and Mousseau N.
1138: \newblock Sampling the complex energy landscape of a simple $\beta$-hairpin.
\newblock {J Chem Phys} 2003;{119:}6403--6406.
1140:
1141: \bibitem{MM00}
1142: Malek R, and Mousseau N.
1143: \newblock Dynamics of Lennard-Jones clusters: A characterization of the
1144: activation-relaxation technique.
1145: \newblock {Phys Rev E} 2000;{62:}7723--7728.
1146:
1147: \bibitem{MDB01}
Mousseau N, Derreumaux P, Barkema GT, and Malek R.
1149: \newblock Sampling activated mechanisms in proteins with the
1150: activation-relaxation technique.
1151: \newblock {J Mol Graph Model} 2001;{19:}78--86.
1152:
1153: \bibitem{Lan88}
1154: Lancz\'os C.
1155: \newblock {Applied Analysis}.
1156: \newblock Dover, New York; 1988.
1157:
1158: \bibitem{Wales99}
1159: Munro LJ, and Wales DJ.
1160: \newblock Defect migration in crystalline silicon.
1161: \newblock {Phys Rev B} 1999;{59:}3969--3980.
1162:
1163: \bibitem{DP99}
1164: Derreumaux P.
1165: \newblock From polypeptide sequences to structures using Monte Carlo
1166: simulations and an optimized potential.
\newblock {J Chem Phys} 1999;{111:}2301--2310.
1168:
1169: \bibitem{DP00}
1170: Derreumaux P.
1171: \newblock Generating ensemble averages for small proteins from extended
conformations by Monte Carlo simulations.
1173: \newblock {Phys Rev Lett} 2000;{85:}206--209.
1174:
1175: \bibitem{GF91}
Gronenborn AM, et~al.
\newblock A novel, highly stable fold of the immunoglobulin binding domain
of streptococcal protein G.
1179: \newblock {Science} 1991;{253:}657--661.
1180:
1181: \bibitem{KS83}
1182: Kabsch W, and Sander C.
1183: \newblock Dictionary of protein secondary structure: pattern recognition of
1184: H-bond and geometrical features.
1185: \newblock {Biopolymers} 1983;{22:}2577--2637.
1186:
1187: \bibitem{MOL96}
Koradi R, Billeter M, and W\"uthrich K.
1189: \newblock Molmol: A program for display and analysis of macromolecular
1190: structures.
1191: \newblock {J Mol Graphics} 1996;{14:}51--55.
1192:
1193: \bibitem{NK02}
1194: Kamiya N, Higo J, and Nakamura H.
1195: \newblock Conformational transition states of $\beta$-hairpin peptide between
1196: the ordered and disordered conformations in explicit water.
1197: \newblock {Protein Sci} 2002;{11:}2297--2307.
1198:
1199: \bibitem{NUS03}
1200: Ma B, and Nussinov R.
1201: \newblock Energy landscape and dynamics of the $\beta$-hairpin G peptide and
1202: its isomers: Topology and sequences.
\newblock {Protein Sci} 2003;{12:}1882--1893.
1204:
1205: \bibitem{Daura}
1206: Daura X, van Gunsteren WF, and Mark AE.
1207: \newblock Folding-unfolding thermodynamics of a heptapeptide from equilibrium
1208: simulations.
1209: \newblock {Proteins} 1999;{34:}269--280.
1210:
1211: \bibitem{DS98}
1212: Derreumaux P, and Schlick T.
1213: \newblock The loop opening/closing motion of the enzyme triosephosphate
1214: isomerase.
1215: \newblock {Biophys J} 1998;{74:}72--81.
1216:
1217: \bibitem{SE95}
1218: Searle MS, Williams DH, and Packman LC.
1219: \newblock A short linear peptide derived from the N-terminal sequence of
1220: ubiquitin folds into a water-stable non-native beta-hairpin.
1221: \newblock {Nat Struct Biol} 1995;{2:}999--1006.
1222:
1223: \bibitem{RA99}
Ramirez-Alvarado M, Kortemme T, Blanco FJ, and Serrano L.
\newblock Beta-hairpin and beta-sheet formation in designed linear peptides.
\newblock {Bioorg Med Chem} 1999;{7:}93--103.
1227:
1228: \bibitem{HD02}
1229: Humphrey D, Duggan C, Saha D, Smith D, and K\"{a}s J.
1230: \newblock Active fluidization of polymer networks through molecular motors.
1231: \newblock {Nature} 2002;{416:}413--416.
1232:
1233: \end{thebibliography}
1234:
1235: \end{document}
1236:
1237: