
1: \documentstyle[12pt]{report}

2: \def\rbuildrel#1\over#2{\mathrel{\mathop{#2}\limits_{#1}}}

3:

4: \catcode`@=11

5: \def\underline#1{\relax\ifmmode\@@underline#1\else

6:         $\@@underline{\hbox{#1}}$\relax\fi}

7:

8: \def\changefootnote{\def\thefootnote{\fnsymbol{footnote}} }

9: \def\titlepage{\pagestyle{empty}\c@page=0

10:       \def\thefootnote{\fnsymbol{footnote}} }

11: \def\endtitlepage{\pagestyle{plain}\c@page=1

12:       \def\thefootnote{\arabic{footnote}} \c@footnote\z@ }

13: \catcode`@=12

14: \def\sfootnote{\def\thefootnote{\fnsymbol{footnote}}}

15:

16: \def\Deltait{{\mit \Delta}}

17:

18: \newskip\humongous \humongous=0pt plus 1000pt minus 1000pt

19: \def\caja{\mathsurround=0pt}

20: \def\eqalign#1{\,\vcenter{\openup1\jot \caja

21:         \ialign{\strut \hfil$\displaystyle{##}$&$

22:         \displaystyle{{}##}$\hfil\crcr#1\crcr}}\,}

23: \newif\ifdtup

24: \def\panorama{\global\dtuptrue \openup1\jot \caja

25:         \everycr{\noalign{\ifdtup \global\dtupfalse

26:         \vskip-\lineskiplimit \vskip\normallineskiplimit

27:         \else \penalty\interdisplaylinepenalty \fi}}}

28: \def\eqalignno#1{\panorama \tabskip=\humongous

29:         \halign to\displaywidth{\hfil$\displaystyle{##}$

30:         \tabskip=0pt&$\displaystyle{{}##}$\hfil

31:         \tabskip=\humongous&\llap{$##$}\tabskip=0pt

32:         \crcr#1\crcr}}

33:

34: \def\Dscr{{\cal D}}

35: \def\DsC{{\cal C}}

36: \def\DsS{{\cal S}}

37: \def\DsN{{\cal N}}

38: \def\DsE{{\cal E}}

39: \def\*{\hskip .06 cm}

40:

41: \def\thebibliography#1{\section*{\ \markboth

42:  {REFERENCES}{REFERENCES}}\list

43:  {{\arabic{enumi}}.}

44:  {\settowidth\labelwidth{{#1}.}\leftmargin\labelwidth

45:  \advance\leftmargin\labelsep

46:  \usecounter{enumi}}

47:  \def\newblock{\hskip .11em plus .33em minus -.07em}

48:  \sloppy

49:  \sfcode`\.=1000\relax}

50: \let\endthebibliography=\endlist

51: %

52: %

53: \def\thebibliographyp#1{\section*{\ \markboth

54:  {Chan \& Dill, $\quad$ Polymer Principles in Protein Structure

55:  and Stability}{Chan \& Dill, $\quad$ Polymer Principles in Protein Structure

56:  and Stability}}\list

57:  {\arabic{enumi}.}{\settowidth\labelwidth{#1.}\leftmargin\labelwidth

58:  \advance\leftmargin\labelsep

59:  \usecounter{enumi}}

60:  \def\newblock{\hskip .11em plus .33em minus -.07em}

61:  \sloppy

62:  \sfcode`\.=1000\relax}

63: \let\endthebibliographyp=\endlist

64:

65: \def\sqr#1#2{{\vcenter{\vbox{\hrule height.#2pt

66: \hbox{\vrule width.#2pt height#1pt \kern#1pt

67: \vrule width.#2pt}

68: \hrule height.#2pt}}}}

69: \def\square{\mathchoice\sqr34\sqr34\sqr{2.1}3\sqr{1.5}3}

70:

71: %\def\@cite#1#2{({#1\if@tempswa , #2\fi})}

72:

73: \renewcommand{\textfraction}{0.0}

74: \renewcommand{\topfraction}{1}

75: \renewcommand{\bottomfraction}{1}

76: \setcounter{topnumber}{50}

77: \setcounter{bottomnumber}{50}

78: \setcounter{totalnumber}{50}

79: \setlength{\floatsep}{\baselineskip}

80: \setlength{\textfloatsep}{\baselineskip}

81: \renewcommand{\thefigure}{\arabic{figure}}

82: %

83: \def\baselinestretch{1.3}

84: \topmargin = +.1 in

85: \textheight 8.75 in

86: \oddsidemargin = 0.2in

87: \textwidth 450 pt

88: %

89: \begin{document}

90: %

91: %\titlepage

92: \noindent

93: $\null$

94: \hfill April 8, 2003

95: %

96:

97: \vskip 0.6in

98: %

99:

100: \begin{center}

101: %

102: {\Large\bf Contact Order Dependent Protein Folding Rates:}\\

103:

104: \vskip 0.3cm

105:

106: {\Large\bf Kinetic Consequences of a Cooperative Interplay}\\

107:

108: \vskip 0.3cm

109:

110: {\Large\bf Between Favorable Nonlocal Interactions and}\\

111:

112: \vskip 0.3cm

113:

114: {\Large\bf Local Conformational Preferences}\\

115:

116:

117:

118:

119: \vskip .5in

120: %

121: {\bf H\"useyin K{\footnotesize{\bf{AYA}}}}

122: and

123: {\bf Hue Sun C{\footnotesize{\bf{HAN}}}}$^\dagger$\\

124: %

125: $\null$

126:

127: Protein Engineering Network of Centres of Excellence (PENCE),\\

128: Department of Biochemistry, and

129: Department of Medical Genetics \& Microbiology,

130: Faculty of Medicine, University of Toronto,

131: Toronto, Ontario M5S 1A8, Canada\\

132:

133:

134: %

135:

136:

137: %{\tt Submitted to ""}

138: %{\tt To appear in ""}

139: %

140:

141: \end{center}

142: %

143:

144: \vskip 1cm

145:

146: \noindent

147: {\bf Running title:} Physics of Contact-Order Dependent Protein Folding \\

148:

149: \vskip 1cm

150:

151: \noindent {\bf Key words:}

152: calorimetry / chevron plot / G\=o models /

153: simple two-state kinetics /\\ single-domain proteins / nonadditivity

154:

155: $\null$\\

156: %

157:

158: %\vskip 0.8in

159:

160: \noindent

161: $^\dagger$ Corresponding author.\\

162: E-mail address of Hue Sun C{\footnotesize{HAN}}:

163: chan@arrhenius.med.toronto.edu\\

164: Tel: (416)978-2697; Fax: (416)978-8548\\

165: Mailing address: Department of Biochemistry, University of Toronto,

166: Medical Sciences Building -- 5th Fl., 1 King's College Circle,

167: Toronto, Ontario M5S 1A8, Canada.

168: %

169:

170: \vfill\eject

171: %\endtitlepage

172: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

173: %

174:

175: \def\thefootnote{\fnsymbol{footnote}}

176:

177: \noindent

178: {\large\bf Abstract}\\

179:

180: \vskip .2 in

181:

182: \noindent

183: Physical mechanisms underlying the empirical correlation between

184: relative contact order (CO) and folding rate among naturally-occurring

185: small single-domain proteins are investigated by evaluating postulated

186: interaction schemes for a set of three-dimensional 27mer lattice

187: protein models with 97 different CO values. Many-body interactions are

188: constructed such that contact energies become more favorable when short

189: chain segments sequentially adjacent to the contacting residues adopt

190: native-like conformations. At a given interaction strength, this

191: scheme leads to folding rates that are logarithmically well correlated

192: with CO (correlation coefficient $r=0.914$) and span more than 2.5 orders of

193: magnitude, whereas folding rates of the corresponding G\=o models with

194: additive contact energies have much less logarithmic correlation with CO

195: and span only approximately one order of magnitude.

196: The present protein chain models also exhibit calorimetric cooperativity

197: and linear chevron plots similar to those observed experimentally for

198: proteins with apparent

199: simple two-state folding/unfolding kinetics. Thus, our findings

200: suggest that CO-dependent folding rates of real proteins may arise

201: partly from a significant positive coupling between nonlocal contact

202: favorabilities and local conformational preferences.

203:

204: %**************************************************************************

205: \vfill\eject

206:

207: $\null$

208: \vskip -1cm

209:

210: %------------------------------------------------------------------------------

211:

212: \centerline{\bf INTRODUCTION}

213:

214: $\null$

215:

216: \noindent

217: {\bf Generic protein properties as energetic constraints}

218:

219: The folding of many small single-domain proteins is well approximated

220: by simple two-state thermodynamics and kinetics.$^{1,2}$ In

221: the past several years, we have shown that fundamental insights

222: into protein energetics can be gained by using these general, apparently

223: mundane properties as experimental constraints on protein chain

224: models.$^{3-10}$ This approach is based on the recognition

225: that model interaction schemes capable of producing these commonly observed

226: experimental properties are, somewhat surprisingly, not entirely

227: straightforward to devise. To date, much progress has been made by

228: coarse-grained modeling of protein folding.$^{7,11-15}$ Nonetheless,

229: the interactions postulated by many existing models are insufficient for

230: calorimetric two-state cooperativity.$^{3,4}$ Furthermore, even common

231: G\=o models are not cooperative enough for simple two-state kinetics,

232: their explicit native biases notwithstanding. Specifically, we recently

233: found that several lattice$^{6,9,10}$ and continuum (off-lattice)$^{8}$

234: G\=o-like formulations with essentially additive interaction schemes all

235: led to chevron rollovers --- a hallmark of folding kinetics that are often

236: operationally referred to as non-two-state.$^{9}$ Apparently, many-body

237: interactions are needed to produce chevron plots with linear folding and

238: unfolding arms consistent with a two-state description of equilibrium

239: thermodynamics.$^{10}$

240: \\

241:

242: Small single-domain proteins are characterized as well by a significant

243: correlation between relative contact order (CO) and folding rate.$^{16}$

244: Therefore, it is only logical to require a model protein interaction

245: scheme to produce a similar correlation.$^{17,18}$ Ising-like$^{19,20}$ and

246: other$^{21,22}$ constructs without explicit chain representations have had

247: successes in this regard.  However, as for thermodynamic and kinetic

248: cooperativities, achieving the CO dependence requirement in models with

249: explicit chain representations appears to be a nontrivial task.

250: Notably, an early lattice model study using a 20-letter alphabet suggested

251: that proteins with higher CO should fold faster,$^{23}$ thus predicting

252: a trend opposite$^{17}$ to that for real single-domain proteins.$^{16,18}$

253: A more recent 20-letter lattice model investigation, on the other hand,

254: found modest correlations between increasing CO and longer logarithmic

255: folding time (correlation coefficient $r\approx 0.70$--$0.79$ for chain

256: lengths $\ge 54$).$^{24}$ An earlier continuum G\=o model study of 18

257: proteins also found a modest correlation between increasing CO and slower

258: logarithmic folding rates ($r=0.69$).$^{25}$ But the corresponding dispersion

259: in simulated folding rates covers only $\approx 1.5$ orders of magnitude,

260: which is much narrower than the $\approx 5$ orders of magnitude covered by

261: the real folding rates of the proteins in the given dataset. When a different

262: potential function was used in a more recent continuum G\=o model analysis,

263: however, no correlation between CO and simulated folding rates was

264: discerned.$^{26}$

265: \\

266:

267: Recently, based on lattice 27mer simulations, Jewett et al.$^{27}$ have

268: proposed that enhanced thermodynamic cooperativity and

269: many-body interactions --- which are basic properties of individual

270: two-state proteins to begin with$^{1-10}$ --- may also be a key to

271: understanding the correlation between CO and folding rate across different

272: proteins. This is an attractive and insightful idea. However, the particular

273: way in which thermodynamic cooperativity was enhanced by these authors

274: led only to modest increases in folding rate dispersion relative to that

275: for the corresponding lattice G\=o models with pairwise additive contact

276: energies. Both the dispersion in folding rates and the correlation of

277: logarithmic folding rate with CO ($r=0.75$) for the most cooperative

278: interaction scheme they reported were

279: similar to those obtained from an earlier continuum G\=o model study,$^{25}$

280: as well as those from a recent simulation of 20-letter lattice models$^{24}$

281: with only pairwise additive contact energies (see above). In our view,

282: these results suggest that while CO-dependent folding may well derive from

283: certain intraprotein interactions that are also responsible for high

284: thermodynamic cooperativity, CO-dependent folding does not arise from

285: thermodynamic cooperativity {\it per se}. In other words, how cooperativity

286: is achieved can be critically important. Many {\it a priori} many-body

287: mechanisms are consistent with high thermodynamic cooperativity. An example

288: is the two rather different interaction schemes we considered

289: in ref.~10 --- one involves local-nonlocal coupling while the other

290: assigns an extra favorable energy to the ground-state structure

291: as a whole. But perhaps not all such mechanisms can mimic experimentally

292: observed CO dependencies to the same degree. Therefore, to shed light on

293: the physical mechanisms of CO-dependent folding, we endeavor to construct

294: an interaction scheme that would provide larger dispersions in folding

295: rates and better correlations with CO.

296: \\

297:

298: $\null$

299:

300: %------------------------------------------------------------------------------

301:

302: \centerline{\bf MODELS AND METHODS}

303:

304: $\null$

305:

306: The present study focuses on the idea of a cooperative interplay

307: between local conformational preferences and the contact-like interactions

308: that drive the packing of the protein core.$^{3,5,6,10}$ We have shown

309: that chain models embodying this idea can lead to calorimetric

310: cooperativity and simple two-state kinetics,$^{10}$ although our

311: exploration thus far has been limited to model proteins that are

312: mostly helical.$^{3,5,6,10}$ Here we consider a general formulation

313: of this idea, the basic ingredients of which are described by Fig.~1A.

314: This hypothesis may be viewed as a synthesis of the local-dominant and

315: the nonlocal-dominant perspectives.$^{28}$ We were motivated by the recognition

316: that both local$^{29,30}$ and nonlocal$^{31,32}$ intraprotein interactions

317: are important determinants of protein structure and stability. Yet local

318: conformational preferences alone are often insufficient for stable secondary

319: structures under physiological conditions. Secondary structure formation

320: is known to be context dependent;$^{33}$ secondary structures are stable when packed in the

321: core of a protein but are usually not stable in isolation (ref.~31 and

322: references therein). Furthermore, conformational space grows exponentially

323: with chain length, even when preferences arising from local excluded

324: volume effects are taken into account.$^{34}$ It follows that a large

325: part of the stability and uniqueness of protein native structures cannot

326: be explained by local interactions alone.$^{35}$ On the other hand,

327: our recent G\=o-model studies have shown that nonlocal contact-like

328: interactions by themselves are not cooperative enough for simple

329: two-state kinetics$^{6,8-10}$ if they are not coupled to local

330: conformational propensities.

331: \\

332:

333: \noindent

334: {\bf A simple model of local-nonlocal coupling}

335:

336: Here we explore the hypothesis in Fig.~1A by incorporating its form of

337: local-nonlocal coupling into a new interaction scheme in Fig.~1B

338: for explicit-chain models configured on three-dimensional simple

339: cubic lattices. This allows the idea to be tested quantitatively. Fig.~1B

340: may be viewed as a generalization of similar constructs we have

341: used previously in the context of helical proteins.$^{3,5,6,10}$

342: As a first step in our inquiry, we make the simplifying assumption

343: that the interactions are native-centric,$^{25-27,31,36-38}$ in that

344: only native interactions are favored, while nonnative interactions

345: are neutral (have zero energy). The local-nonlocal coupling in Fig.~1B

346: involves nonadditive many-body interactions. A chain segment which is

347: locally nativelike (with native bond and torsion angles) but makes no

348: native contact is not stabilized (contributing zero energy). On the other

349: hand, nonlocal contact interactions between monomers far apart along

350: the chain sequence are more favorable when the chain segments around the

351: contacting residues are in their native conformations than when they are not.

352: As such, the present model differs from models that additively combine contact

353: energies and local favorabilities.$^{39}$ The importance of nonadditive

354: many-body effects in protein folding has been recognized,$^{3,5,6,10,40-44}$

355: but they have not been used extensively to model calorimetric two-state

356: cooperativity and linear chevron plots.$^{3-10}$ Our aim here is to

357: utilize extremely coarse-grained representations as a computationally

358: efficient means to explore the general principles linking CO-dependent

359: folding and proteinlike cooperativities. Many structural and

360: energetic details of real proteins are beyond the scope of this work.

361: In particular, the present work does not deal with the microscopic

362: physical origins of local-nonlocal coupling. Instead we just presume

363: that its presence in naturally occurring proteins could arise from

364: evolutionary design. For these reasons, the simple interaction scheme

365: in Fig.~1B should be viewed only as a tentative model in this regard.

366: \\

367:

368: In order to examine the folding rates of a set of model proteins

369: whose native structures cover a diverse range of CO values, we now consider

370: chains of length $n=27$ configured on simple cubic lattices. For these

371: 27mers, there are 103,346 distinct maximally compact conformations

372: (not related by rotations or inversions)$^{45,46}$ confined to a

373: $3\times 3\times 3$ cube. The distribution of CO among these maximally

374: compact conformations covers 97 different values$^{27}$ from

375: CO $=$ $208/756=0.275$ to $402/756=0.532$ (inset of Fig.~2A,

376: where CO is computed using equation~1 of ref.~16). For each CO value,

377: we randomly choose a maximally compact 27mer conformation as

378: the native structure of a model protein (Table~I).\footnote{Since the

379: present choices of structures are independent of that by

380: Jewett et al.,$^{27}$ the structures listed in Table I do not necessarily

381: coincide with those used in their study.}

382: \\
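
\noindent
As a worked illustration of the two end points quoted above, assume the
usual definition of relative contact order (equation~1 of ref.~16), with
$L$ the chain length, $N$ the number of native contacts, and
$\Delta S_{ij}$ the sequence separation between contacting residues $i$
and $j$. Taking $L=27$ and the $N=28$ native contacts of a maximally
compact 27mer, the normalization is $L\cdot N=27\times 28=756$, so that
$$
{\rm CO}={1\over L\cdot N}\sum_{(i,j)}\Delta S_{ij},
\qquad
{\rm CO}_{\rm min}={208\over 756}\approx 0.275,
\qquad
{\rm CO}_{\rm max}={402\over 756}\approx 0.532 .
$$
Because $L\cdot N$ is the same for every structure in this set, ranking
the structures by CO is equivalent to ranking them by the summed sequence
separation $\sum_{(i,j)}\Delta S_{ij}$, which ranges from 208 to 402.
\\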

383:

384: Folding and unfolding kinetics are modeled by standard Monte Carlo

385: simulations using the Metropolis criterion and the elementary

386: chain moves of end flips, corner flips, crankshafts, and rigid

387: rotations. The relative frequencies of attempting these moves are

388: 4.7\%, 58.3\%, 27\%, and 10\%, respectively (cf. ref.~6).\footnote{The

389: following typographical error in ref.~6 should be corrected. The relative

390: attempt frequencies of corner flips and crankshafts used in this prior

391: study of ours were, respectively, 60.6\% and 27\%, not the 27\% and 60.6\%

392: stated on p.~901 of ref.~6.}

393: Time is measured by the number of attempted Monte Carlo moves for a given

394: process. The set of elementary chain moves is chosen to mimic physically

395: plausible processes. Lattice model kinetics are dependent on

396: the choice of move set.$^{12}$ Nonetheless, we expect the general trend

397: predicted by the model to be less sensitive to the move set when kinetics are

398: not dominated by trapping events,$^{12}$ as is the case here and has

399: been verified by Jewett et al.$^{27}$

400: Progress towards the native state is tracked by the fractional

401: number of native contacts $Q$ (ref.~3--6). To ascertain the implications

402: of the local-nonlocal coupling we proposed, results from a highly cooperative

403: interaction scheme with $a=0.1$ are compared with those from the additive

404: scheme ($a=1$) of common G\=o models (cf. Fig.~1B). Folding trajectories

405: are initiated at a randomly generated conformation; folding first passage

406: time is defined by the formation of the $Q=1$ ground-state conformation.

407: Unfolding trajectories are initiated at the ground-state conformation;

408: unfolding first passage time is the time it takes for the chain to

409: be left with three or fewer native contacts ($Q\le 3/28$); $Q=3/28$

410: is chosen to define unfolding because it corresponds approximately

411: to the free energy minimum for the denatured state.

412: \\
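
\noindent
The comparison between the $a=0.1$ and $a=1$ schemes may be summarized
schematically as follows. This is only a sketch under stated assumptions:
the indicator $\lambda_{ij}$ is a hypothetical shorthand introduced here
for illustration, and the precise condition for local nativeness is the
one specified in Fig.~1B. Let the sum run over the native contacts
$(i,j)$ present in a given conformation, and let $\lambda_{ij}=1$ when
the chain segments around the contacting residues are in their native
conformations and $\lambda_{ij}=0$ otherwise. The energy may then be
written as
$$
E={\cal E}\sum_{(i,j)}\left[\,a+(1-a)\lambda_{ij}\right],
$$
so that $a=1$ recovers the additive G\=o scheme, in which every native
contact contributes ${\cal E}$, whereas $a=0$ corresponds to complete
interdependence between nonlocal contact and local structure. Under the
Metropolis criterion, an attempted move that changes the energy by
$\Delta E$ is accepted with probability
$$
P_{\rm acc}=\min\left[1,\ \exp\left(-{\Delta E\over k_{\rm B}T}\right)\right].
$$
\\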

413:

414: $\null$

415:

416: %------------------------------------------------------------------------------

417:

418: \centerline{\bf RESULTS}

419:

420: $\null$

421:

422: \noindent

423: {\bf Sensitivity of folding rate to CO enhanced by local-nonlocal coupling}

424:

425: Fig.~2 provides the correlation between CO and folding rate among

426: our 27mer models. It shows clearly that the local-nonlocal coupling

427: mechanism postulated in Fig.~1 can lead to a significant enhancement of

428: correlation as well as much increased sensitivity of folding rate to CO.

429: Whereas the dispersion in folding rates among the common additive

430: G\=o models in Fig.~2A covers only approximately one order of magnitude

431: (a factor of ten) and the logarithmic folding rates exhibit only a relatively

432: weak correlation with CO (correlation coefficient $r= 0.63$), the corresponding

433: dispersion among the $a=0.1$ cooperative models in Fig.~2B covers

434: approximately 2.5 to 3 orders of magnitude, with a strong correlation

435: between CO and logarithmic folding rate ($r=0.914$) comparable to that

436: observed among

437: a selection of real, small single-domain proteins.$^{18}$ Similar to

438: the corresponding experimental situations,$^{16,18}$ the comparisons in

439: Fig.~2 were performed under conditions for which folding relaxation is

440: essentially single-exponential, as is evident from the good agreements

441: in Fig.~2 between median first passage time divided by $\ln 2$

442: and the corresponding mean first passage time.$^{6,47}$ To better

443: delineate the effects of having weakened contact interactions when

444: the chain segments locally adjacent to the contacting residues are

445: nonnative, several $a$ values other than the $a=0.1$ used for the main

446: plot are compared in the inset of Fig.~2B. It shows CO-dependent

447: folding at different levels of local-nonlocal coupling (different

448: $a$ values) for several 27mers with representative CO's.

449: The $a=0$ case here corresponds to complete interdependence between

450: nonlocal contact and local structure. This inset indicates

451: that sensitivity of folding rate to CO increases

452: (the fitted line has a more negative slope) with decreasing $a$, and that

453: the behavior of the $a=0.1$ models is very similar to that of the

454: $a=0$ models. These results

455: further affirm that local-nonlocal coupling is a key ingredient for

456: the good correlation between CO and folding rate in these models. Nevertheless,

457: as for real proteins,$^{16,18}$ despite the good correlation, CO by itself

458: cannot predict folding rates of the present models with high accuracy.

459: Folding rates here can vary significantly for different structures with

460: the same CO as well. For example, for the particular 27mer with CO

461: $=346/756=0.458$ in Fig.~2B, the datapoint

462: $\log_{10}({\rm folding\ rate})=-5.75$ may be viewed as an ``outlier''

463: vis-\`a-vis the fitted line. However, for two other 27mers that have the

464: same CO but do not belong to the randomly chosen set in Table~I (and

465: therefore not plotted and not used in the correlation analysis of

466: Fig.~2B), we found $\log_{10} ({\rm folding\  rate})=-7.26$ and $-7.60$,

467: which happen to be much closer to the fitted line in Fig.~2B.

468: The reasons behind variations in folding rates among structures

469: with the same CO remain to be elucidated.

470: \\
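
\noindent
The single-exponential test invoked above rests on a standard property of
the exponential distribution, sketched here as an idealization. Assume
that first passage times $t$ are distributed as $P(t)=k\,e^{-kt}$ with a
single rate $k$. The mean first passage time is then
$$
\langle t\rangle=\int_0^\infty t\,k\,e^{-kt}\,dt={1\over k},
$$
while the median $t_{1/2}$ is defined by
$$
\int_0^{t_{1/2}}k\,e^{-kt}\,dt={1\over 2}
\quad\Longrightarrow\quad
t_{1/2}={\ln 2\over k}.
$$
Hence $t_{1/2}/\ln 2=1/k=\langle t\rangle$ for strictly single-exponential
relaxation, and a substantial difference between the two quantities
signals a departure from single-exponential behavior.
\\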

471:

472: \noindent

473: {\bf A consistent model of thermodynamic and kinetic cooperativity}

474:

475: Fig.~3 provides further analyses of the folding/unfolding kinetics of

476: one example 27mer structure we choose to study in more detail.

477: Consistent with our previous results,$^{6,8-10}$ it shows that the model

478: chevron plot$^{48}$ predicted by the common additive G\=o potential (upper

479: plot) deviates significantly from simple two-state kinetics in that it

480: exhibits a severe rollover under only moderately native conditions. More

481: specifically, for this case rollover becomes significant at

482: ${\cal E}/k_{\rm B}T$ values that are only slightly more negative (more

483: favorable to folding) than that of the transition midpoint

484: (${\cal E}/k_{\rm B}T\approx -1.43$). In contrast, the chevron plot

485: predicted by the model with a substantial local-nonlocal coupling (lower

486: plot) is qualitatively similar to that of real, small single-domain

487: proteins that fold and unfold with simple two-state kinetics.$^{10}$

488: In particular, it has essentially linear folding and unfolding arms over an

489: extended range of ${\cal E}/k_{\rm B}T$ values. We have also obtained

490: for this model the equilibrium free energy of unfolding

491: $\Delta G_{\rm u}$ as a function of ${\cal E}/k_{\rm B}T$, where

492: $\Delta G_{\rm u}$ here is taken to be that between the unique $Q=1$

493: conformation and those with $Q\le 3/28$. (The same definition is used for

494: unfolding kinetics as stated above.) Because $\Delta G_{\rm u}$ is

495: essentially linear in ${\cal E}/k_{\rm B}T$, the linearity of the

496: chevron arms over an extended ${\cal E}/k_{\rm B}T$

497: range implies an essentially linear relationship between folding/unfolding

498: rates and $\Delta G_{\rm u}$ within the corresponding regime (i.e., the

499: model parameter ${\cal E}$ may be eliminated in favor of the lower horizontal

500: scale in Fig.~3). Furthermore, comparing the mean first passage times in Fig.~3

501: versus the corresponding median first passage times divided by $\ln 2$ shows

502: that folding or unfolding relaxation for this model is essentially

503: single exponential$^{6,47}$ for $\Delta G_{\rm u}<$ $10k_{\rm B}T$.

504: Essentially single-exponential folding under moderately folding conditions

505: is further demonstrated by an approximately linear logarithmic

506: distribution of first passage time$^{8,9,49}$ shown in the inset.

507: Similar to the cooperative models we recently investigated,$^{10}$

508: for the model with local-nonlocal coupling in Fig.~3, the thermodynamic

509: $\Delta G_{\rm u}$ values match well with the kinetically obtained

510: quantity $-k_{\rm B}T\ln [({\rm folding\ rate})/({\rm unfolding\ rate})]$

511: for $\Delta G_{\rm u}$ ranging from $10k_{\rm B}T$ to $-6k_{\rm B}T$

512: (lower V-shape). In other words, the folding/unfolding kinetics of this

513: model are simple two-state$^{6,8-10}$ within a $\Delta G_{\rm u}$

514: range quite similar to that experimentally accessible to small

515: single-domain proteins.$^{10}$ Finally, the cooperative model in Fig.~3

516: is also calorimetrically two-state. Assuming that the interactions are

517: temperature independent, the model's van't Hoff to calorimetric enthalpy

518: ratio $\Delta H_{\rm vH}/\Delta H_{\rm cal}$ ($\kappa_2$ without baseline

519: subtraction$^{4}$) is determined to be $0.992$ (detailed calculation not

520: shown), satisfying the requirement of

521: $\Delta H_{\rm vH}/\Delta H_{\rm cal}\approx 1$ for two-state

522: thermodynamics.$^{3-5}$ Taken together, the above considerations imply that

523: the local-nonlocal coupling mechanism for enhanced CO-dependent folding

524: in Fig.~2B also provides --- as it should --- a consistent account of

525: thermodynamic and kinetic cooperativities$^{6,8-10}$ in simple two-state

526: proteins (Fig.~3).

527: \\
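
\noindent
The elimination of ${\cal E}$ mentioned above can be made explicit with a
short sketch. The coefficients $g_0$, $g_1$, $c$, and $m$ below are
hypothetical fitting constants introduced only for illustration; no
numerical values are implied. Write $\epsilon\equiv{\cal E}/k_{\rm B}T$
and suppose that, within the regime of interest,
$$
{\Delta G_{\rm u}\over k_{\rm B}T}=g_0+g_1\,\epsilon,
\qquad
\ln({\rm rate})=c+m\,\epsilon ,
$$
where ``rate'' stands for either the folding or the unfolding rate along
its linear chevron arm. Solving the first relation for $\epsilon$ (which
requires $g_1\ne 0$) and substituting into the second gives
$$
\ln({\rm rate})=c+{m\over g_1}\left({\Delta G_{\rm u}\over k_{\rm B}T}-g_0\right),
$$
so that each logarithmic rate is linear in $\Delta G_{\rm u}$ with slope
$m/g_1$ in units of $1/k_{\rm B}T$.
\\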

528:

529: As it stands, the transition midpoints of all 27mers considered here

530: with the local-nonlocal coupling parametrized by $a=0.1$ are very close

531: to one another. This is because the interaction scheme in Fig.~1B assigns

532: the same energy ($=28{\cal E}$) to every ground-state conformation.

533: This is a simplifying assumption in the present modeling setup.

534: Since the thermodynamic stabilities of real, small single-domain proteins

535: are quite diverse,$^{16,18}$ it is important to note that, in a broader

536: perspective, our hypothesis that significant CO-dependent folding can

537: emerge from local-nonlocal coupling is not contingent upon the different

538: proteins in question having very similar thermodynamic stabilities. In

539: more sophisticated models, for example, an extra favorable energy that

540: differs from one 27mer to another may be assigned to the ground-state

541: conformation (i.e., a different $E_{\rm gs}$ term as defined in ref.~10

542: for each 27mer). In that case, the

543: thermodynamic stabilities of different 27mers can be very different,

544: but their folding rates would not be affected by this extra feature of

545: the model. In other words, the correlation between CO and folding rate

546: in Fig.~2B would remain unchanged. As we have recently argued,$^{10}$

547: such extra stabilizing energies for the ground state as a whole are

548: physically plausible because experimental evidence$^{50}$ indicates that

549: in real proteins there is a partial separation between the driving

550: forces for folding kinetics and the interactions responsible for

551: thermodynamic stability.

552: \\

553:

554: $\null$

555:

556: %------------------------------------------------------------------------------

557:

558: \centerline{\bf DISCUSSION}

559:

560: $\null$

561:

562: Energy landscapes of the present models are further characterized in

563: Fig.~4 for three representative structures with low, intermediate,

564: and high CO values. In this figure, the low- and high-CO structures are,

565: respectively, the fastest and slowest folding among the 97 structures

566: in Table~I, whereas the intermediate-CO structure is the one analyzed

567: in Fig.~3. For the common additive G\=o potential, energy $E$ is

568: directly proportional to $Q$ ($E={\cal E}Q$). However, for the

569: cooperative models with local-nonlocal coupling, there are multiple

570: energy levels for each $Q$, with $E={\cal E}Q$ as the lower bound

571: (left panels of Fig.~4). This means that, on average, the energetic

572: separations between non-ground-state and ground-state conformations in the

573: cooperative models with local-nonlocal coupling are larger than those in the

574: additive G\=o models. This feature is demonstrated directly in the right

575: panels of Fig.~4, which show that the number of non-ground-state conformations

576: within a given energy range is smaller for the cooperative models than for

577: the additive G\=o models except for the highest energies ($E\approx 0$).

578: It follows that the overall thermodynamic cooperativities of the models

579: with local-nonlocal coupling are substantially higher than those of the

580: corresponding additive G\=o models. This behavior is expected as well from

581: our recent finding that simple two-state folding/unfolding kinetics (Fig.~3

582: above) requires ``near-Levinthal'' thermodynamic cooperativity.$^{10}$

583: Indeed, for the three models in Fig.~4 with local-nonlocal coupling,

584: the van't Hoff to calorimetric enthalpy ratios

585: $\Delta H_{\rm vH}/\Delta H_{\rm cal}$ are, from top to bottom,

586: $\kappa_2=$ $0.972$, $0.992$, and $0.998$. These values are extremely

587: high for model enthalpy ratios without baseline subtractions.$^{4}$

588: In contrast, the corresponding additive G\=o models are less cooperative,

589: with $\kappa_2=$ $0.751$, $0.861$, and $0.878$. Here it is noteworthy

590: that the additive G\=o models' $\Delta H_{\rm vH}/\Delta H_{\rm cal}$

591: ratios even after empirical baseline subtractions,$^{4}$

592: $\kappa_2^{({\rm s})}=$ $0.885$, $0.961$, and $0.962$, are lower than

593: the $\Delta H_{\rm vH}/\Delta H_{\rm cal}$ ratios of the

594: cooperative models in the absence of baseline subtractions.

595: \\

596:

597: \noindent

598: {\bf Contact-order dependence indicative of special mechanisms

599: of cooperativity}\\

600:

601: Obviously, thermodynamic cooperativity is a necessary ingredient

602: for any protein chain model that purports to rationalize the

603: generic properties of small single-domain proteins.$^{3-10}$

604: For the particular interaction scheme we consider, the above analysis shows

605: that features that give rise to significant CO-dependent folding also

606: lead to high thermodynamic cooperativity. However, the converse

607: is not necessarily true. More in-depth considerations and a comparison

608: of the present results with those of Jewett et al.$^{27}$ indicate that

609: higher thermodynamic cooperativity {\it per se} does not necessarily give

610: rise to more enhanced dependence of folding rate on CO. Our reasoning

611: is as follows. First, for the present set of 27mer structures we have

612: chosen randomly, the correlation between logarithmic folding rate and CO

613: is quantified by $r=0.63$ ($r^2=0.39$) for the additive G\=o interaction

614: scheme. Although this correlation happens to be weaker than that

615: of Jewett et al.'s collection of additive G\=o models (their $r^2=0.51$),

616: after cooperativity is enhanced by local-nonlocal coupling, the correlation

617: between logarithmic folding rate and CO for our $a=0.1$ models is much

618: higher ($r^2=0.84$, see Fig.~2 above, an improvement in $r^2$ value

619: of $0.33$)\footnote{

620: Because all the model chains in the present study have the same length

621: and the same number of native contacts, their correlation coefficient

622: between folding rate and CO is the same as that between folding rate

623: and the total contact distance (TCD) defined in ref.~51.

624: }

625: than the best case reported by Jewett et al.$^{27}$ ($r^2=0.57$

626: for their $s=3$, an improvement in $r^2$

627: value of $0.06$ over that for their additive G\=o models).\footnote{

628: If the $s=3$ interaction scheme of Jewett et al. is applied to the

629: present set of structures and kinetic models, we find $r^2=0.65$

630: for the correlation between CO and folding rate. In

631: this case, the folding rates span $\approx 1.8$ orders of magnitude;

632: see ref.~52 for details.

633: }

634: Second, the folding rates of our cooperative models are much more

635: sensitive to CO, covering 2.5 to 3 orders of magnitude, whereas those

636: of Jewett et al. cover only approximately 1.3 orders of magnitude.

637: This means that the present local-nonlocal coupling mechanism is

638: significantly more effective in enhancing CO dependence than the

639: nonlinear $E$--$Q$ relationship postulated by Jewett et al. (equation~1

640: of ref.~27). Their interaction scheme does not make direct reference to

641: chain conformations as such. Thermodynamic cooperativity is

642: enhanced in their models by stipulating that the total contact energy $E$ (for

643: a given conformation as a whole) does not decrease (does not become more

644: favorable) linearly with increasing $Q$ as in common G\=o models,

645: but rather decreases at progressively faster rates as $Q$

646: approaches unity.\footnote{

647: Jewett et al. suggested that the ``extraordinary

648: cooperativity in protein folding'' may originate from ``three-body

649: interactions.'' But how three-body interactions might lead to their

650: $E$--$Q$ relationship remains to be elucidated.}

651: Third, in fact, if thermodynamic cooperativity is further increased

652: in the interaction scheme of Jewett et al. by increasing their $s$ parameter,

653: the energy landscape will eventually become a Levinthal golf course in the

654: $s\rightarrow\infty$ limit. In that case, folding would be rate-limited

655: by random conformational search and CO-dependence would be all but

656: eliminated. Fourth, in this connection, we have recently considered three

657: 27mer models with CO $=0.28$, $0.40$ and $0.51$ in a separate study. The

658: thermodynamic cooperativity of these models is enhanced by

659: assigning an extra stabilizing energy to the ground state but without

660: local-nonlocal coupling.$^{10}$ For the energetic parameters we considered,

661: the folding rates of these models cover less than an order of

662: magnitude.$^{10}$ The same set of results also indicated that dispersion

663: in folding rates under moderately folding conditions would decrease if

664: thermodynamic cooperativity is increased by assigning an even stronger

665: stabilizing energy to the ground state, in a manner similar to greatly

666: increasing $s$ in Jewett et al.'s formulation. Taken together, these

667: observations lead us to the conclusion that while thermodynamic

668: cooperativity is certainly necessary, by itself it is not sufficient

669: to guarantee CO-dependent folding rates similar to those observed

670: experimentally$^{16,18}$ if the underlying mechanism for thermodynamic

671: cooperativity is not specified.

672: \\
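
The footnoted remark that CO and TCD yield the same correlation coefficient for the present set of structures can be checked in one line. As a sketch, assume that TCD is normalized as $\sum\Delta S_{ij}/n^2$ (an assumption about the convention of ref.~51; any normalization that is constant across the set leads to the same conclusion). With CO $=\sum\Delta S_{ij}/(n\, t_{\rm max})$, one has TCD $=c\,\cdot$ CO with the constant $c=t_{\rm max}/n=28/27$ for every structure. Writing $x_k$ for the CO of structure $k$ and $y_k$ for its logarithmic folding rate (illustrative symbols), the Pearson coefficient satisfies
$$
r(cx,y)=\frac{\sum_k (c\,x_k-c\,\bar x)(y_k-\bar y)}
{\sqrt{\sum_k (c\,x_k-c\,\bar x)^2}\;\sqrt{\sum_k (y_k-\bar y)^2}}
=\frac{c}{\vert c\vert}\; r(x,y)=r(x,y)\qquad (c>0),
$$
so rescaling CO by a positive constant leaves both $r$ and $r^2$ unchanged.
\\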

673:

674: CO-dependent folding highlights the important role of local interactions

675: in determining folding rates.$^{16-18}$ It suggests that the mechanism

676: of folding may involve relatively fast formation of local structure.

677: In this regard, we note that under the general lattice scheme in Fig.~1B,

678: formation of strong (unattenuated) native contacts with contact order

679: $\vert j - i \vert=3$ is relatively easier than formation of strong

680: native contacts with higher contact orders. This is because in the

681: $\vert j - i \vert=3$ case there is an overlap between parts of the two

682: local segments that have to be nativelike in order

683: for the contact to be strong. Physically, how a general

684: mechanism similar to that in Fig.~1 may arise in real proteins from

685: solvent-mediated atomic interactions such as sidechain packing

686: and hydrogen bonding remains to be elucidated.

687: Many basic issues will have to be tackled to address this question.

688: For example, correlations between backbone and sidechain rotamer

689: conformations$^{53}$ may contribute to such a mechanism.

690: Another possibility is that aspects of {\it anti-cooperativity}

691: of certain types of hydrophobic interactions$^{54}$ may help disfavor

692: premature nonspecific hydrophobic collapse (which would lead to kinetic

693: trapping$^{14}$) when the sidechains are locally less well packed than

694: in the native state. If this is the case, it could give rise to

695: local-nonlocal coupling mechanisms similar to the one postulated in Fig.~1.

696: \\
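
The overlap argument for $\vert j-i\vert=3$ contacts can be made explicit. As a short illustration based on the five-residue segments defined for Fig.~1B (the set notation $W_i$ is introduced here only for convenience), let
$$
W_i=\{i-2,\dots,i+2\},\qquad W_j=\{j-2,\dots,j+2\},\qquad
\vert W_i\cap W_j\vert=\max(0,\;5-\vert j-i\vert).
$$
On the simple cubic lattice, two residues on adjacent sites are always separated by an odd number of bonds, so contacts occur only for $\vert j-i\vert=3,5,7,\dots$ For $j=i+3$ the two segments share residues $i+1$ and $i+2$, and only the eight consecutive residues $i-2,\dots,i+5$ are involved in the nativelike condition, instead of ten. For $\vert j-i\vert\ge 5$ the two segments are disjoint and must become nativelike independently.
\\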

697:

698: In summary, while the models used in the present study are rudimentary,

699: they provide strong evidence that a cooperative interplay between local

700: conformational preferences and nonlocal favorable contact-like

701: interactions is an important mechanism in accounting for experimentally

702: observed CO-dependent folding of small single-domain proteins.

703: We are optimistic that more rigorous applications of

704: the CO-dependence constraint as well as the thermodynamic and kinetic

705: cooperativity requirements would help further narrow down theoretical

706: possibilities and thus contribute to a more realistic understanding of protein

707: energetics.

708: \\

709:

710:

711:

712: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

713:

714: %---------------------------------------------------------------------------

715:

716:

717: $\null$

718:

719: %===========================================================================

720:

721: \noindent

722: {\Large Acknowledgments.}

723: We thank Robert L. Baldwin, Alan Davidson, Teresa Head-Gordon, Michael

724: Levitt, Vijay Pande, Kevin Plaxco, Steve Plotkin, Wes Stites and Yaoqi Zhou

725: for helpful discussions, and Vijay Pande and Kevin Plaxco for kindly sharing

726: their work (ref.~27) before publication. The research reported

727: here was partially supported by the Canadian Institutes of Health

728: Research (CIHR grant no. MOP-15323), a Premier's Research Excellence Award

729: from the Province of Ontario, and the Ontario Centre for Genomic Computing

730: at the Hospital for Sick Children in Toronto. H. S. C. is a Canada

731: Research Chair in Biochemistry.

732:

733: %===========================================================================

734: \vfill\eject

735:

736:

737: \par\vfill\eject

738:

739: \noindent

740: {\large\bf References}

741:

742: \kern -1.5cm

743:

744: \begin{thebibliography}{99}

745:

746: %XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

747:

748: \bibitem{1}

749: Jackson SE, Fersht AR.

750: Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state

751: transition. Biochemistry 1991;30:10428--10435.

752:

753: \bibitem{2}

754: Baker D.

755: A surprising simplicity to protein folding. Nature 2000;405:39--42.

756:

757: \bibitem{3}

758: Chan HS.

759: Modeling protein density of states: Additive hydrophobic

760: effects are insufficient for calorimetric two-state cooperativity.

761: Proteins 2000;40:543--571.

762:

763: \bibitem{4}

764: Kaya H, Chan HS.

765: Polymer principles of protein calorimetric

766: two-state cooperativity.

767: Proteins 2000;40:637--661 [Erratum: Proteins 2001;43:523].

768:

769: \bibitem{5}

770: Kaya H, Chan HS.

771: Energetic components of cooperative protein folding.

772: Phys Rev Lett 2000;85:4823--4826.

773:

774: \bibitem{6}

775: Kaya H, Chan HS.

776: Towards a consistent modeling of protein thermodynamic and kinetic

777: cooperativity: How applicable is the transition state picture to

778: folding and unfolding? J Mol Biol 2002;315:899--909.

779:

780: \bibitem{7}

781: Chan HS, Kaya H, Shimizu S. Computational methods

782: for protein folding: scaling a hierarchy of complexities.

783: In: Jiang T, Xu Y, Zhang MQ, editors. Current Topics in Computational

784: Molecular Biology. Cambridge, MA: The MIT Press; 2002. p 403--447.

785:

786: \bibitem{8}

787: Kaya H, Chan HS.

788: Solvation effects and driving forces for protein thermodynamic and

789: kinetic cooperativity: How adequate is native-centric topological

790: modeling? J Mol Biol 2003;326:911--931.

791:

792: \bibitem{9}

793: Kaya H, Chan HS.

794: Origins of chevron rollovers in non-two-state protein folding kinetics.

795: Submitted (2003); [cond-mat/0302305,\\

796: {\tt http://xxx.lanl.gov/abs/cond-mat/0302305}].

797:

798: \bibitem{10}

799: Kaya H, Chan HS.

800: Simple two-state protein folding kinetics requires near-Levinthal

801: thermodynamic cooperativity. Submitted (2003);

802: [cond-mat/0302306,

803: {\tt http://xxx.lanl.gov/abs/cond-mat/0302306}].

804:

805: \bibitem{11}

806: Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG.

807: Funnels, pathways, and the energy landscape of protein folding: A

808: synthesis. Proteins 1995;21:167--195.

809:

810: \bibitem{12}

811: Dill KA, Bromberg S, Yue K, Fiebig KM, Yee DP, Thomas PD,

812: Chan HS.

813: Principles of protein folding --- A perspective from simple

814: exact models. Protein Sci 1995;4:561--602.

815:

816: \bibitem{13}

817: Thirumalai D, Woodson SA. Kinetics of folding of proteins

818: and RNA. Acc Chem Res 1996;29:433--439.

819:

820: \bibitem{14}

821: Chan HS, Dill KA. Protein folding in the landscape

822: perspective: Chevron plots and non-Arrhenius kinetics.

823: Proteins 1998;30:2--33.

824:

825: \bibitem{15}

826: Mirny L, Shakhnovich E.

827: Protein folding theory: From lattice to all-atom models.

828: Annu Rev Biophys Biomol Struct 2001;30:361--396.

829:

830: \bibitem{16}

831: Plaxco KW, Simons KT, Baker D.

832: Contact order, transition state placement and the refolding rates

833: of single domain proteins. J Mol Biol 1998;277:985--994.

834:

835: \bibitem{17}

836: Chan HS.

837: Matching speed and locality. Nature 1998;392:761--763.

838:

839: \bibitem{18}

840: Plaxco KW, Simons KT, Ruczinski I, Baker D. Topology,

841: stability, sequence, and length: Defining the determinants of two-state

842: protein folding kinetics. Biochemistry 2000;39:11177--11183.

843:

844: \bibitem{19}

845: Alm E, Baker D. Prediction of protein-folding mechanisms from

846: free-energy landscapes derived from native structures.

847: Proc Natl Acad Sci USA 1999;96:11305--11310.

848:

849: \bibitem{20}

850: Mu{\~n}oz V, Eaton WA.

851: A simple model for calculating the kinetics of

852: protein folding from three-dimensional structures.

853: Proc Natl Acad Sci USA 1999;96:11311--11316.

854:

855: \bibitem{21}

856: Debe DA, Goddard WA.

857: First principles prediction of protein folding

858: rates. J Mol Biol 1999;294:619--625.

859:

860: \bibitem{22}

861: Makarov DE, Keller CA, Plaxco KW, Metiu H.

862: How the folding rate constant of simple, single-domain proteins depends

863: on the number of native contacts.

864: Proc Natl Acad Sci USA 2002;99:3535--3539.

865:

866: \bibitem{23}

867: Abkevich VI, Gutin AM, Shakhnovich EI.

868: Impact of local and nonlocal interactions on thermodynamics and kinetics

869: of protein folding. J Mol Biol 1995;252:460--471.

870:

871: \bibitem{24}

872: Faisca PFN, Ball RC. Topological complexity, contact order, and

873: protein folding rates. J Chem Phys 2002;117:8587--8591.

874:

875: \bibitem{25}

876: Koga N, Takada S.

877: Roles of native topology and chain-length scaling in protein folding:

878: A simulation study with a G\=o-like model.

879: J Mol Biol 2001;313:171--180.

880:

881: \bibitem{26}

882: Cieplak M, Hoang TX.

883: Universality classes in folding times of proteins.

884: Biophys J 2003;84:475--488.

885:

886: \bibitem{27}

887: Jewett AI, Pande VS, Plaxco KW.

888: Cooperativity, smooth energy landscapes and the origins of

889: topology-dependent protein folding rates. J Mol Biol 2003;326:247--253.

890:

891: \bibitem{28}

892: Uversky VN, Fink AL.

893: The chicken-egg scenario of protein folding revisited.

894: FEBS Lett 2002;515:79--83.

895:

896: \bibitem{29}

897: Baldwin RL, Rose GD.

898: Is protein folding hierarchic? I. Local structure and peptide

899: folding. Trends Biochem Sci 1999;24:26--33.

900:

901: \bibitem{30}

902: Shortle D. Composites of local structure propensities:

903: Evidence for local encoding of

904: long-range structure. Protein Sci 2002;11:18--26.

905:

906: \bibitem{31}

907: G\=o N, Taketomi H.

908: Respective roles of short- and long-range interactions in protein folding.

909: Proc Natl Acad Sci USA 1978;75:559--563.

910:

911: \bibitem{32}

912: Dill KA.

913: Dominant forces in protein folding.

914: Biochemistry 1990;29:7133--7155.

915:

916: \bibitem{33}

917: Minor DL, Kim PS.

918: Context-dependent secondary structure formation of a designed protein

919: sequence. Nature 1996;380:730--734.

920:

921: \bibitem{34}

922: Feldman HJ, Hogue CWV.

923: Probabilistic sampling of protein conformations: New hope for brute force?

924: Proteins 2002;46:8--23.

925:

926: \bibitem{35}

927: Shimizu S, Chan HS.

928: Origins of protein denatured state compactness and hydrophobic

929: clustering in aqueous urea: Inferences from nonpolar potentials of

930: mean force. Proteins 2002;49:560--566.

931:

932: \bibitem{36}

933: Micheletti C, Banavar JR, Maritan A, Seno F.

934: Protein structures and optimal folding from a geometrical variational

935: principle. Phys Rev Lett 1999;82:3372--3375.

936:

937: \bibitem{37}

938: Clementi C, Nymeyer H, Onuchic JN. Topological and energetic

939: factors: What determines the structural details of the transition state

940: ensemble and ``en-route'' intermediates for protein folding? An investigation

941: for small globular proteins. J Mol Biol 2000;298:937--953.

942:

943: \bibitem{38}

944: Linhananta A, Zhou Y.

945: The role of sidechain packing and native contact interactions in folding:

946: Discontinuous molecular dynamics folding simulations of an all-atom

947: G\=o model of fragment B of {\it Staphylococcal} protein A.

948: J Chem Phys 2002;117:8983--8995.

949:

950: \bibitem{39}

951: Thomas PD, Dill KA.

952: Local and nonlocal interactions in globular proteins and

953: mechanisms of alcohol denaturation.

954: Protein Sci 1993;2:2050--2065.

955:

956: \bibitem{40}

957: Kolinski A, Galazka W, Skolnick J.

958: On the origin of the cooperativity of protein folding: Implications

959: from model simulations. Proteins 1996;26:271--287.

960:

961: \bibitem{41}

962: Plotkin SS, Wang J, Wolynes PG.

963: Statistical mechanics of a correlated energy landscape model for protein

964: folding funnels. J Chem Phys 1997;106:2932--2948.

965:

966: \bibitem{42}

967: Liwo A, Kazmierkiewicz R, Czaplewski C, Groth M, Oldziej S, Wawak RJ,

968: Rackovsky S, Pincus MR, Scheraga HA. United-residue force field for

969: off-lattice protein structure simulations: III. Origin of backbone

970: hydrogen-bonding cooperativity in united-residue potentials.

971: J Comput Chem 1998;19:259--276.

972:

973: \bibitem{43}

974: Takada S, Luthey-Schulten Z, Wolynes PG.

975: Folding dynamics with nonadditive forces: A simulation study

976: of a designed helical protein and a random heteropolymer.

977: J Chem Phys 1999;110:11616--11629.

978:

979: \bibitem{44}

980: Eastwood MP, Wolynes PG.

981: Role of explicitly cooperative interactions in protein folding funnels:

982: A simulation study. J Chem Phys 2001;114:4702--4716.

983:

984: \bibitem{45}

985: Chan HS, Dill KA.

986: The effect of internal constraints on the configurations of chain

987: molecules. J Chem Phys 1990;92:3118--3135 [Erratum: J Chem Phys

988: 1997;107:10353].

989:

990: \bibitem{46}

991: Chan HS, Bornberg-Bauer E.

992: Perspectives on protein evolution from simple exact models.

993: Applied Bioinformatics 2002;1:121--144.

994:

995: \bibitem{47}

996: Gutin A, Sali A, Abkevich V, Karplus M, Shakhnovich EI.

997: Temperature dependence of the folding rate in a simple protein model:

998: Search for a ``glass'' transition.

999: J Chem Phys 1998;108:6466--6483.

1000:

1001: \bibitem{48}

1002: Matthews CR.

1003: Effect of point mutations on the folding of globular proteins.

1004: Methods Enzymol 1987;154:498--511.

1005:

1006: \bibitem{49}

1007: Abkevich VI, Gutin AM, Shakhnovich EI. Free energy

1008: landscape for protein folding kinetics: Intermediates, traps, and multiple

1009: pathways in theory and lattice model simulations.

1010: J Chem Phys 1994;101:6052--6062.

1011:

1012: \bibitem{50}

1013: Northey JGB, Di Nardo AA, Davidson AR.

1014: Hydrophobic core packing in the SH3 domain folding transition state.

1015: Nature Struct Biol 2002;9:126--130.

1016:

1017: \bibitem{51}

1018: Zhou HY, Zhou YQ. Folding rate prediction using total contact distance.

1019: Biophys J 2002;82:458--463.

1020:

1021: \bibitem{52}

1022: Chan HS, Shimizu S, Kaya H. Cooperativity principles in protein folding.

1023: Methods Enzymol, in press.

1024:

1025: \bibitem{53}

1026: Dunbrack RL. Rotamer libraries in the 21st century.

1027: Curr Opin Struct Biol 2002;12:431--440.

1028:

1029: \bibitem{54}

1030: Shimizu S, Chan HS. Anti-cooperativity and cooperativity

1031: in hydrophobic interactions: Three-body free energy landscapes and

1032: comparison with implicit-solvent potential functions for proteins.

1033: Proteins 2002;48:15--30 [Erratum: Proteins 2002;49:294].

1034:

1035:

1036: \end{thebibliography}

1037:

1038: %------------------------------------------------------------------------

1039: \vfill\eject

1040:

1041: \centerline{\large \bf Table I}

1042: \vskip .2 in

1043:

1044:

1045: {\footnotesize

1046:

1047: \begin{center}

1048: \begin{tabular}{|cc|cc|}

1049: \hline

1050: $\sum\Delta S_{ij}$ & conformation & $\sum\Delta S_{ij}$ & conformation \\

1051: \hline

1052:  208 & uufddfuurddbuubddruufddfuu  & 306 & uufrrbldrfdflurullddburdbr \\

1053:  210 & uufddfuurbbdffdbbrffuubdbu  & 308 & uufdfrbrbulddrfllfrruublfl \\

1054:  212 & ufdfuubbrddffubufrddbuubdd  & 310 & ufrulblfddrrbbllfuburdrfub \\

1055:  214 & uuffdbdfrbufubbddrffuubdbu  & 312 & uufrbbllffdrrdllbubdrurfdb \\

1056:  216 & ufdfuubbrddfuufddruubbddfu  & 314 & uufdrubbdfdfllbbuffubbrddr \\

1057:  218 & ufdfuubbrdfufddbbruuffddbu  & 316 & uffrddblbruufdllbdffrulubb \\

1058:  220 & uuffddburfdbbuuffrddbuubdd  & 318 & uufrbddbuullffdrrdllbubdru \\

1059:  222 & uuffddburdfuubbddruuffdbdf  & 320 & uufddfrruubbdfdbluuffldrbd \\

1060:  224 & uufddrbufubrfdbdfflurulldd  & 322 & uffdbrbrfufullbbrdrufldfdr \\

1061:  226 & uufddfuurddbubdrffubbulfrf  & 324 & ufrbddlfrflurullbbddffubrr \\

1062:  228 & uffdbrbuffdrbbuffubbllfrfl  & 326 & ufrfddluulddbbuufdrdbrfubu \\

1063:  230 & ufdrbufublffddruurddbuubdd  & 328 & ufrubdbuldldrrffllbufurblb \\

1064:  232 & uufrrblddrufdluldfurdruull  & 330 & uufdrubblddlfubuffddrrbbuf \\

1065:  234 & uufddfuurddrbluurfdbbulddr  & 332 & uuffrddruubbdfllfdbrrbluuf \\

1066:  236 & ufddbbuurrflfrdlbdfrbubldr  & 334 & ufrrdbdfluldbbrruuflbldrfd \\

1067:  238 & uuffdbdfrrblbrulffrulbbrfd  & 336 & uffrbrbuflblffrrddllbrrblu \\

1068:  240 & uffdbrfurdbblufrbuffllbbrf  & 338 & uffurrdldrbblurullfrrdldlf \\

1069:  242 & uufrbddfflburflurrbbdffdbb  & 340 & ufrrbbdffdlbrbllffurbubldr \\

1070:  244 & ufdrrbluulfrdrbuffllddrrul  & 342 & ufrullbrrblldrrdllfufdrrbu \\

1071:  246 & ufdfurbdfruullbrrddblurull  & 344 & uffdrdllbrbluuffdbrrdbuuff \\

1072:  248 & ufdfrbuflurblbrrdldrffuubd  & 346 & uuffddrrbbuufdfuldbubddflu \\

1073:  250 & ufddbrfruublfdbrdblluurdru  & 348 & uufrbbdlulddrrffllbuufdrrb \\

1074:  252 & uffdbrrflurbbdlufufrbbllff  & 350 & ufubrrdfdfuldblfuurrbldbdr \\

1075:  254 & ufddbrblurrdfflubrfulbrbll  & 352 & uffddrbllurrfubbddlluuffdd \\

1076:  256 & ufdfurdruullbbrddrfluurdbu  & 354 & ufrfdrbufubbllfrflddbrburd \\

1077:  258 & ufrbdflfrrbbuullfrrdfulldr  & 356 & uffrrbdbuullffrrbldbdflfrr \\

1078:  260 & uufddfurbbrdlffrbufubblffl  & 358 & uufdrrubddffuulldrdlbrbuuf \\

1079:  262 & uuffdbrbufrfldrdllbrbrfubu  & 360 & ufrddllfrruulldrblubddrruu \\

1080:  264 & ufdrurddbuuldblurrddllffrb  & 362 & ufubrrdffldrbblflfuurrbldb \\

1081:  266 & ufdrurddllbrbluurrfldbrdfu  & 364 & ufrfddlbblffubbuffrdbrdbuu \\

1082:  268 & uuffdbrubrfddbuldflfrrulur  & 366 & ufrfdbdfllbbuufdfurdbdbruu \\

1083:  270 & uuffrrdllbdrbufrulbrddffll  & 368 & uffurrbbddffllbrbuulfrdrfl \\

1084:  272 & uufrdfuldbdfrruubblddfrubd  & 370 & uffurrddbbuufllbrddflfrubr \\

1085:  274 & ufdrubrfddllbbuurrdldrfuld  & 372 & uufdrfdruubbddluufflddbrru \\

1086:  276 & ufdrbdlfrrubdblluurffrbbdl  & 374 & uffrdrbbuullffrrdbuldbdflf \\

1087: \end{tabular}

1088: \end{center}

1089: \vskip .15 in

1090:

1091: }

1092:

1093: \vfill

1094:

1095: $\null$ \hfill $\dots$ {\it Table I to be cont'd}

1096:

1097: \vfill\eject

1098:

1099: \noindent{\large \bf Table I} $\dots$ ({\it cont'd from last page})

1100: \vskip .2 in

1101:

1102: {\footnotesize

1103:

1104: \begin{center}

1105: \begin{tabular}{|cc|cc|}

1106:  278 & ufddrrbllbrrullurrfflbdfrb  & 376 & uffrddllbuubddrfrbuufdlflu \\

1107:  280 & uffrddlubdruubddllfubuffdd  & 378 & ufdrfdlluubbdfdbrfrbuuffld \\

1108:  282 & ufrbdffuldlubbddrrfflbuldf  & 380 & ufrfddlbrbllffubbuffrdbrbu \\

1109:  284 & uufrfldrrubbldrfdblfuldfrr  & 382 & uufdrfdrbbuufdfullddbrbuuf \\

1110:  286 & uffubbrddrffuldlbrurbufflb  & 384 & ufrbbullddfuurrfllddrrbblu \\

1111:  288 & ufdfrrubufldlubbrfdbdfrbuu  & 386 & uffddrbllfuubbddrruuffdbll \\

1112:  290 & uufrbbldrfdbllfubuffddrurd  & 388 & uffrburbddffllbrbuulffrrdb \\

1113:  292 & uffrddbbuufdldblffrulubbdf  & 390 & ufrufddrbbuffubbllffddbrbu \\

1114:  294 & ufdrdfulurbbddlluufddrfluu  & 392 & ufrufddrbbuffubbllffddbrbu \\

1115:  296 & ufdfrbdflbbruuffllddbbuufd  & 394 & ufrrddlbburuflblddffurbrdb \\

1116:  298 & ufrrdblblurfrbddffluldbrbl  & 396 & uffrddblflbufubbddrruufdlf \\

1117:  300 & ufdfrullddbuubddrffrbuubdd  & 400 & ufrfddllubdrrblluuffrdbrbu \\

1118:  302 & ufdfurddlluubbdfdbrfrbuufd  & 402 & ufrufrbbddffllbrbuulffdrrb \\

1119:  304 & ufrdlluurrbbdfdbllfuubdruf  &  &  \\

1120: \hline

1121: \end{tabular}

1122: \end{center}

1123: \vskip .15 in

1124:

1125: }

1126: {\noindent {{\bf Table~I.}}} $\quad$

1127: The ground-state 27mer conformations ($n=27$) used in this investigation are

1128: given by sequences of 26 bond directions, where r = right ($+x$),

1129: l = left ($-x$), f = forward ($+y$), b = backward ($-y$), u = up ($+z$),

1130: d = down ($-z$). A structure is randomly selected for each of the 97 possible

1131: CO values amongst the compact 27mer structures with

1132: $t_{\rm max}=28$ contacts. Each integer $\sum\Delta S_{ij}$ is

1133: the sum of $\vert j-i \vert$ over the $(i,j)$ nearest-neighbor

1134: contacts in the given conformation ($j-i\ge 3$).

1135: Here CO $=\sum\Delta S_{ij}/(n t_{\rm max})$ $=\sum\Delta S_{ij}/756$.
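
As a worked check of the CO definition above (an illustrative calculation, with the first residue placed at the origin by assumption), consider the first entry, {\tt uufddfuurddbuubddruufddfuu}. Its bond directions fill the $3\times 3\times 3$ cube one $x$-layer at a time, nine residues per layer. Each layer contributes four in-layer contacts with $\vert j-i\vert=5,3,3,5$, and each pair of adjacent layers contributes eight contacts with $\vert j-i\vert=3,5,\dots,17$. Hence
$$
\sum\Delta S_{ij}=3\times(5+3+3+5)+2\times(3+5+7+9+11+13+15+17)=48+160=208,
$$
$$
t=3\times 4+2\times 8=28=t_{\rm max},\qquad
{\rm CO}=208/756\approx 0.275,
$$
in agreement with the tabulated value. By the same formula, the last entry ($\sum\Delta S_{ij}=402$) has CO $=402/756\approx 0.532$.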

1136:

1137: %------------------------------------------------------------------------

1138: \vfill\eject

1139:

1140: \noindent

1141: {\large\bf Figure Captions}\\

1142:

1143: \noindent

1144: {\bf Figure 1.} $\quad$

1145: (A) Schematics of local-nonlocal cooperative energetics in protein

1146: folding. The conformation in the solid box represents the native (N)

1147: structure; the two filled circles depict a pair of nonlocal residues

1148: interacting favorably in the native state. The interaction strength

1149: between a residue pair is strong and essentially the same as that in

1150: the native structure if the chain segments sequentially local

1151: to both residues are nativelike, as in (i). [Dotted boxes in (A) are

1152: used to mark nativelike chain segments.] However, the interaction strength is

1153: weakened if one or two chain segments sequentially local to the

1154: interacting residues are not nativelike, as in examples (ii)--(iv).

1155: (B) A lattice implementation of this protein folding scenario. Here

1156: the favorable energy for every contact (between residues $i$ and $j$,

1157: $\vert j-i \vert \ge 3$) in the ground-state native (N) structure is

1158: ${\cal E}$ ($<0$) when the relative positions of the five residues centered

1159: at $i$ (residues $i-2$, $i-1$, $i$, $i+1$, and $i+2$) as well as the relative

1160: positions of five residues centered at $j$ (residues $j-2$, $j-1$,

1161: $j$, $j+1$, and $j+2$) are the same as those in N [solid lines in (i)],

1162: irrespective of the relative orientations of the two five-residue chain

1163: segments.  However, if the local conformation of one or both sets of

1164: five contiguous residues is nonnative, the contact energy is weakened by an

1165: attenuation factor $a$ ($0\le a< 1$). Examples of the latter situation

1166: are given by (ii)--(iv), where nonnative local chain segments are

1167: drawn as broken lines.

1168: \\
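
The lattice scheme in (B) can be summarized in a single expression. This is a sketch in notation introduced here for illustration ($\Delta_{ij}$ and $\theta_i$ are not symbols used elsewhere in the text): let $\Delta_{ij}=1$ if native contact $(i,j)$ is formed in a given conformation and $0$ otherwise, and let $\theta_i=1$ if the five-residue segment centered at $i$ is nativelike and $0$ otherwise. Then
$$
E={\cal E}\sum_{(i,j)\in {\rm N}}\Delta_{ij}\,
\bigl[\theta_i\theta_j+a\,(1-\theta_i\theta_j)\bigr],
$$
where the sum runs over the contacts of the native structure. Setting $a=1$ recovers the additive G\=o potential, $E={\cal E}\times$(number of native contacts formed). For $a=0.1$ and ${\cal E}=-1$, $E$ changes in steps of $0.1$ rather than $1$, which is why the cooperative models have more energy levels than the additive models. How segments are defined for residues within two positions of a chain end is not specified in this caption and is left open here.
\\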

1169:

1170: \noindent

1171: {\bf Figure 2.} $\quad$

1172: Correlation between the common (base 10) logarithm of folding rate

1173: and CO for the 97 structures in Table~I under moderately folding

1174: conditions at ${\cal E}/k_{\rm B}T=-1.47$, using (A) the

1175: common additive G\=o potential and (B) the local-nonlocal

1176: cooperative interaction scheme with $a=0.1$. Solid lines are least-squares

1177: fits. Here folding rate is the reciprocal of mean folding first passage

1178: time (folding rate $=$ 1/MFPT).  Each MFPT is averaged from 500

1179: trajectories. Associated with each value of $\log_{10}(1/{\rm MFPT})$

1180: (filled circle) is an open circle marking the common logarithm of the

1181: median folding first passage time (FPT) divided by $\ln 2$. If the

1182: kinetics is single-exponential, MFPT $=$ (median FPT)/$\ln 2$.

1183: The inset in (A) is the distribution of CO among the

1184: 103,346 maximally compact 27mer conformations, wherein the number of

1185: conformations (vertical scale) is shown as a function of CO (horizontal

1186: scale). The inset in (B) uses six

1187: representative structures with different CO values ($\sum\Delta S_{ij}=$

1188: 208, 224, 268, 310, 348, and 386 entries in Table~I) to illustrate that

1189: $\log_{10}({\rm folding\ rate})$ (vertical scale) is more sensitive to CO

1190: (horizontal scale) when the local-nonlocal coupling is stronger. In this

1191: inset, different symbols denote different $a$ values; the lines fitted

1192: through the symbols are, from top to bottom, for $a=1$, $0.75$, $0.5$,

1193: $0.25$, $0.1$, and $0.0$.

1194: \\
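
The relation between MFPT and median FPT quoted above follows directly from a single-exponential first-passage-time distribution. As a brief derivation (the rate constant $k$ and the median $t_{1/2}$ are symbols introduced here for illustration),
$$
P(t)=k\,e^{-kt},\qquad
{\rm MFPT}=\int_0^\infty t\,P(t)\,dt=\frac{1}{k},\qquad
\int_0^{t_{1/2}}P(t)\,dt=\frac{1}{2}\;\Rightarrow\;t_{1/2}=\frac{\ln 2}{k},
$$
so that MFPT $=t_{1/2}/\ln 2$. A gap between the filled and open circles for a given structure is therefore a simple indicator of deviation from single-exponential kinetics.
\\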

1195:

1196: \noindent

1197: {\bf Figure 3.} $\quad$

1198: Model chevron plots for a CO $=0.410$ structure ($\sum\Delta S_{ij}=310$

1199: entry in Table~I) are given by the negative natural logarithm of MFPT as a

1200: function of ${\cal E}/k_{\rm B}T$ (filled symbols). Values

1201: of (median FPT)/$\ln 2$ are shown by the open symbols. Squares (folding)

1202: and triangles (unfolding) are for the additive G\=o potential

1203: ($a=1$, upper plot), whereas circles (folding) and diamonds (unfolding)

1204: are for the $a=0.1$ local-nonlocal cooperative interaction scheme (lower

1205: plot). Each MFPT is averaged from 500 trajectories, except for the model

1206: with local-nonlocal coupling at ${\cal E}/k_{\rm B}T=-1.47$ (arrow).

1207: For this particular case, 7,500 folding trajectories were simulated to

1208: provide enriched statistics for the FPT distribution in the inset,

1209: wherein $P(t)\Delta t$ is the fraction of trajectories with

1210: $t-\Delta t/2<$ FPT $\le t+\Delta t/2$, and the bin size $\Delta t$ for FPT

1211: is equal to $5\times 10^6$. The free energy of unfolding

1212: $\Delta G_{\rm u}$ for the $a=0.1$ cooperative model is computed using

1213: Monte Carlo histogram techniques based on sampling at the transition

1214: midpoint ${\cal E}/k_{\rm B}T=-1.33$. $\Delta G_{\rm u}$ is essentially

1215: linear in ${\cal E}$ (lower horizontal scale). The dotted V-shape,

1216: which fits well to the kinetic datapoints of the $a=0.1$ cooperative

1217: model over an extended regime, is a hypothetical simple two-state

1218: chevron plot consistent with the dependence of $\Delta G_{\rm u}$ on

1219: ${\cal E}$.

1220: \\
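
The consistency condition behind a simple two-state chevron plot can be stated compactly. The caption does not spell out how the dotted V-shape was constructed, so the following is only the standard two-state relation, given as a sketch with folding and unfolding rates $k_{\rm f}$ and $k_{\rm u}$ taken as the reciprocals of the respective MFPTs (an assumption made here for illustration):
$$
\ln k_{\rm f}-\ln k_{\rm u}=\frac{\Delta G_{\rm u}}{k_{\rm B}T}.
$$
With $\Delta G_{\rm u}$ essentially linear in ${\cal E}$, this relation is satisfied by two straight arms for $\ln k_{\rm f}$ and $\ln k_{\rm u}$ that, when extrapolated, cross where $\Delta G_{\rm u}=0$, i.e., at the transition midpoint ${\cal E}/k_{\rm B}T=-1.33$ of the $a=0.1$ model.
\\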

1221:

1222: \noindent

1223: {\bf Figure 4.} $\quad$

1224: Energy landscapes of three representative models with local-nonlocal

1225: coupling ($a=0.1$, $\sum\Delta S_{ij}=224$, $310$, and $386$ entries

1226: in Table~I; ${\cal E}=-1$). The left panels show the correlation between

1227: $E$ and $Q$; each dot indicates that at least one conformation with the

1228: given $(E,Q)$ was encountered in our sampling. The right panels show these

1229: structures' logarithmic densities of states, where $g(E)$ is the number of

1230: conformations with energy $E$ for the cooperative models ($a=0.1$, dots).

1231: Included for comparison are the $\ln g(E)$ values of the corresponding

1232: additive G\=o models ($a=1$, open circles; ${\cal E}=-1$). The densities of

1233: states here are estimated by Monte Carlo

1234: sampling at the models' transition midpoints ${\cal E}/k_{\rm B}T=-1.33$

1235: ($a=0.1$) and ${\cal E}/k_{\rm B}T=-1.43$ ($a=1$). Note that the cooperative

1236: models have more energy levels than the additive models. Therefore, to

1237: compare their densities of states on an equal footing, the open squares

1238: provide the natural logarithm of the number of conformations in the $a=0.1$

1239: cooperative models with energies in the range $m-0.5\le E< m+0.5$,

1240: where $m=1,0,-1,-2,\dots$ is an integer. Now the densities of states

1241: represented by the open squares ($a=0.1$) are directly comparable to

1242: those represented by the open circles ($a=1$) because their values

1243: are based upon the same unity bin size for $E$.

1244:

1245: %===========================================================================

1246:

1247: \end{document}

1248: