1: %\documentclass{jpsj2}
2: %\documentclass[preprint]{jpsj2}
3: %\documentclass[letter]{jpsj2}
4: \documentclass[twocolumn]{jpsj2}
5: %\usepackage{amsmath}
6: %\usepackage{graphicx}
7: \title{Diversity in Free Energy Landscape of Proteins with the Same Native Topology}
8: \author{Hiroo \textsc{Kenzaki}$^{1,2,3}$\thanks{E-mail address: kenzaki@tbp.cse.nagoya-u.ac.jp}
9: and Macoto \textsc{Kikuchi}$^{2,1}$\thanks{E-mail address: kikuchi@cmc.osaka-u.ac.jp}}
10: \inst{
11: $^{1}$Department of Physics, Osaka University, Toyonaka 560-0043\\
12: $^{2}$Cybermedia Center, Osaka University, Toyonaka 560-0043\\
13: $^{3}$JST-CREST, Nagoya 464-8601}
14:
15:
16: \abst{In order to elucidate the role of the native state topology and the stability of subdomains in protein folding,
we investigate the free energy landscape of human lysozyme,
which is composed of two subdomains, by Monte Carlo simulations.
A realistic lattice model with G\={o}-like interactions is used.
We take the relative interaction strength (in other words, the relative stability)
of the two subdomains as a
variable parameter and study the folding process.
A variety of folding processes is observed, and we obtain a
phase diagram of folding in terms of temperature and the relative stability.
The experimentally observed diversity in the folding processes of c-type lysozymes
is thus understood as a consequence of the difference in the relative stability
of the subdomains.
28: }
29: \kword{protein folding, free energy landscape, native topology, lysozyme, multicanonical ensemble, G\={o} model}
30:
31: \begin{document}
32: \maketitle
33: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Proteins take particular conformations, called the native states, under
physiological conditions.
How proteins fold into their native states through vast
conformational spaces has been a fundamental problem in biophysics.
Recent developments in the energy landscape theory have made it
more and more evident that the folding processes are determined largely
by the chain conformations of the native states (often called the \textit{native topology}).
According to the theory, the conformational space of a protein narrows towards the native state, and the resulting energy landscape is described figuratively as \textit{funnel-shaped}.\cite{onuchic97,pande00}
In other words,
proteins are designed so that energetic frustration is minimized.
44:
The rise and development of the energy landscape theory have led to the recent revival of the G\={o} model,\cite{taketomi75}
which includes interactions only between amino acids that are in contact in the native state (such an amino acid pair is called a \textit{native pair}), so that the native state is automatically the ground state.
Indeed, the G\={o} model is regarded as a minimalistic realization of the funnel-like energy landscape.
Some variants of the G\={o} model
have been shown to reproduce various properties of protein folding.\cite{clementi00,koga01}
Remarkably,
the validity of G\={o}-like models is not limited to
proteins that exhibit simple folding processes;
some proteins having intermediate states in the folding processes
are also described well by G\={o}-like models.
The recent success of G\={o}-like models, which are based only on information about the native state, supports the above-mentioned postulate that the native topology is the most important factor determining the folding process.
In fact, proteins having a similar native structure follow a similar folding path in many cases.
Some proteins with the same native topology, however, are known to fold via different pathways.\cite{zarrine05}
Thus, although
the folding process is governed largely by the native topology,
there are some cases in which other factors dominate.
62:
According to the statistical concept of the energy landscape theory,
the folding route of a protein is determined by the free energy
of the transition state and, if it exists, of the intermediate state.
However, analysis of folding processes along this line has been limited mainly to single-domain proteins.
The validity of G\={o}-like models, as well as of the funnel picture of the energy landscape, for larger multi-domain proteins is still an open question.
Considering that most real proteins consist of two or more subdomains, understanding the folding processes of multi-domain proteins is an important subject.
In this letter, we investigate the folding process of proteins with two subdomains
within the framework of a G\={o}-like model.
In particular, we focus on the influence of inhomogeneity in the energetic stability of the subdomains.
72:
We deal with chicken-type (c-type) lysozymes, a class of two-domain proteins that includes a number of proteins with similar native structures, such as several variants of lysozyme and $\alpha$-lactalbumin.
The folding processes of c-type lysozymes have been extensively investigated by experiments, because they are relatively small among multi-domain proteins.
In fact, they are currently among the \textit{standard} systems for studying the folding processes of multi-domain proteins.
The c-type lysozymes have two subdomains:
the $\alpha$ subdomain, which is composed of the two helical regions on the N-terminal and C-terminal sides,
and the $\beta$ subdomain, which is the middle beta region [Fig. \ref{fig1}(a)].
Many proteins in this class exhibit three-state folding kinetics;
in other words, intermediate states are observed in the folding processes.
The nature of the intermediate states differs from protein to protein.
Hen lysozyme and human lysozyme
have intermediate states in the folding processes (kinetic intermediates),
while no corresponding equilibrium state (equilibrium intermediate)
has been observed when external parameters such as temperature are varied.\cite{radford92,hooke94}
Thus, the intermediate states exist only as thermodynamically metastable states in these cases.
On the other hand, canine milk lysozyme and $\alpha$-lactalbumin have equilibrium intermediates.\cite{koshiba00,schulman97}
Considering the native topology, we intuitively expect that
the $\beta$ subdomain will fold first, followed by the $\alpha$ subdomain, because formation of the $\alpha$ subdomain requires contacts between amino acids that are distantly located along the chain.
A G\={o}-like model with homogeneous interactions predicts the same folding route, as will be shown later.
The experimental evidence, however, indicates the opposite:
the $\alpha$ domain is already partially folded in both the kinetic and the equilibrium intermediates, while the $\beta$ domain is still disordered there.
93:
94: Interactions in G\={o}-like models are often taken to be homogeneous, that is,
95: the same interaction parameter is applied to all the native pairs.
Although such a setting is the simplest choice,
different interaction parameters may also be used for different pairs within the framework of G\={o}-like models, as long as only the native pairs are taken into account.
98: In fact, G\={o}-like models using Miyazawa--Jernigan type heterogeneous interactions\cite{miyazawa96} have also been used,\cite{karanicolas03,ejtehadi04,das05} considering that interactions between amino acid residues are heterogeneous in reality.
99: In this letter, however, we propose a totally different treatment for the interaction parameters.
100: Instead of fixing the interaction parameters to particular values, we regard them as variable parameters, and thereby we discuss diversity of possible folding routes within proteins of the same topology in a unified manner.
101:
102: \begin{figure}[tb]
103: \begin{center}
104: \includegraphics{fig1.eps}
\caption{(a) X-ray crystal structure of human lysozyme.
Blue indicates the $\alpha$ domain (residues 1--38 and 88--130)
and red indicates the $\beta$ domain (residues 39--87).
This image was prepared using
MOLSCRIPT\cite{kraulis91} and Raster3D.\cite{merritt97}
(b) Superposition of the C$^{\alpha}$ trace of the X-ray crystal structure of human lysozyme (red) and its lattice realization (blue).}
111: \label{fig1}
112: \end{center}
113: \end{figure}
114:
115: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
116:
It is widely recognized that a folding simulation of a protein is a difficult task even with today's high-performance computers.
Calculation of the global free energy landscape is even more difficult,
because we need to sample conformations over a wide energy range with accurate weights.
Lattice models are favorable for this purpose from the viewpoint of computational cost.
Some realistic lattice models
that can represent the flexible structures of real proteins to a satisfactory level of approximation have been proposed.\cite{kolinski04}
In the present work, we use the 210-211 hybrid lattice model,
in which an amino acid residue is represented by its C$^{\alpha}$ atom located on a simple cubic lattice;
all other atoms, such as N and H, as well as the side chains, are not considered explicitly.
Consecutive C$^{\alpha}$ atoms are connected by vectors
of the type (2,1,0) or (2,1,1) and all the possible permutations.
In order to take into account the excluded volume effect of amino acid residues, we assume that each amino acid occupies seven lattice sites,
that is, the center site where the C$^{\alpha}$ atom is located and all of its nearest-neighbor sites.
Construction of a G\={o}-like model requires knowledge of the native structure.
For that purpose, we take the X-ray crystal structure of human lysozyme (Protein Data Bank Code: 1jsf), which consists of 130 amino acid residues, as a reference structure;
the native structure of the lattice protein model is then obtained by
fitting to its C$^{\alpha}$ trace.
Figure \ref{fig1}(b) shows the native structure of the lattice model thus obtained and that of human lysozyme.
Their root mean square deviation (rmsd) is $0.85$ \AA,
which we regard as satisfactorily accurate for the present purpose.
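
As a concrete illustration of the bond set (assuming that ``all the possible permutations'' is meant to include sign changes of the components, which is not stated explicitly above), the number of allowed bond vectors can be counted as
\begin{equation}
\underbrace{3!\times 2^{2}}_{(\pm 2,\pm 1,0)\ \mathrm{type}} + \underbrace{3\times 2^{3}}_{(\pm 2,\pm 1,\pm 1)\ \mathrm{type}} = 24 + 24 = 48 ,
\end{equation}
with squared bond lengths $2^2+1^2+0^2=5$ and $2^2+1^2+1^2=6$ in units of the squared lattice spacing.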
137:
We introduce G\={o}-like interactions, which act only between the native pairs.
139: A harmonic-type local interaction is also introduced, which expresses the excess energy due to the bond stretching.
140: Then the
141: \textit{Hamiltonian} for the \textit{homogeneous} case is defined as follows:
142:
143: \begin{align}
144: V_h & = \frac{K_b}{2} \sum_{i=1} (r_{i,i+2} - r_{i,i+2}^{nat})^2- \sum_{j-i>2}
145: \varepsilon C_{i,j} \Delta (r_{i,j}, r_{i,j}^{nat}),\\
146: \Delta(x,y) & = \left\{
147: \begin{array}{lc}
148: 1 & |x^2 - y^2| \le W\\
149: 0, & \textrm{otherwise}
150: \end{array} \right.
151: \end{align}
152: \noindent
153: where $i$ and $j$ indicate the residue number counted along the chain, $r_{i,j}$ is the distance between C$^{\alpha}$ atoms of $i$-th and $j$-th residues,
154: $r_{i,j}^{nat}$ is their native distance,
155: $W$ is the width of the G\={o} potential,
$K_{b}$ is the strength of the local interaction,
$\varepsilon$ is the strength of the G\={o} potential,
and $C_{i,j}$ is assigned the value $1$ or $0$
depending on whether or not the pair is a native pair.
We use $W = 2$, $K_b = 1$ and $\varepsilon = 1$ throughout this work.
A pair of residues is considered to be a native pair
162: if the minimum distance between their heavy atoms
163: is less than $4.5$ \AA{} in the X-ray crystal structure.
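
As a worked example of the contact window (a sketch, assuming that distances are measured in units of the lattice spacing, so that squared distances between lattice sites are integers): for a hypothetical native pair with $(r_{i,j}^{nat})^2 = 14$ and $W=2$,
\begin{equation}
\Delta(r_{i,j}, r_{i,j}^{nat}) = 1 \quad \mathrm{for} \quad 12 \le r_{i,j}^{2} \le 16 ,
\end{equation}
and $\Delta = 0$ otherwise, so that such a pair contributes $-\varepsilon$ to $V_h$ only while its squared distance deviates from the native value by at most two.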
164:
Next, we introduce inhomogeneity into the interactions.
For the purpose of investigating the effect of subdomain stability,
we divide the interaction $V_h$ into two parts, the interactions within the $\alpha$ domain and the rest, and write the new potential as $V = R_{\alpha} V_{\alpha} + R_{\beta} V_{\beta}$, where $R_{\alpha}$ and $R_{\beta}$ are variable parameters.
Note that the inter-subdomain interaction is included in $V_{\beta}$.
For efficient sampling of conformations,
we employ a variant of the two-energy multicanonical ensemble Monte Carlo method,\cite{berg92,lee93,iba98,chikenji99}
in which the two-dimensional histogram in ($V_{\alpha}$, $V_{\beta}$)--space is made flat.
This method enables us to study equilibrium properties for any values of $R_{\alpha}$ and $R_{\beta}$ through the reweighting procedure.
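
The reweighting step can be sketched as follows, in a notation introduced here only for illustration. Let $\Omega(V_{\alpha},V_{\beta})$ denote the density of states estimated from the flat-histogram run, and take the Boltzmann constant as unity (an assumption). The canonical distribution at temperature $T$ for arbitrary $R_{\alpha}$ and $R_{\beta}$ would then follow as
\begin{equation}
P(V_{\alpha},V_{\beta};R_{\alpha},R_{\beta},T) = \frac{\Omega(V_{\alpha},V_{\beta})\, \exp\left[-(R_{\alpha}V_{\alpha}+R_{\beta}V_{\beta})/T\right]}{\sum_{V_{\alpha}',V_{\beta}'}\Omega(V_{\alpha}',V_{\beta}')\, \exp\left[-(R_{\alpha}V_{\alpha}'+R_{\beta}V_{\beta}')/T\right]} ,
\end{equation}
so that a single simulation would suffice to evaluate canonical averages over the whole ($R_{\alpha}$, $R_{\beta}$, $T$) range covered by the sampled histogram.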
173:
174:
175: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
176:
177: \begin{figure}[tb]
178: \begin{center}
179: \includegraphics{fig2.eps}
180: \caption{Phase diagram of folding of lysozyme.
181: Transition points, estimated from the peaks of heat capacity,
182: are shown for various $R_{\alpha/\beta}$.
183: N indicates the native state and R is for the random coil.
184: I$_{\alpha}$ is an intermediate state with $\alpha$ domain formed,
185: and I$_{\beta}$ is an intermediate state with $\beta$ domain formed.
186: Values of $R_{\alpha/\beta}$ and $T$ indicated by the arrows are:
187: (a) $R_{\alpha/\beta} = 0.74$ and $T=0.845$,
188: (b) $R_{\alpha/\beta} = 1.06$ and $T=0.846$,
189: (c) $R_{\alpha/\beta} = 1.28$ and $T=0.853$.}
190: \label{fig2}
191: \end{center}
192: \end{figure}
193:
We investigate how
the weight parameter $R_{\alpha/\beta}\equiv R_{\alpha}/R_{\beta}$
influences the folding behavior.
Figure \ref{fig2} shows the phase diagram in terms of temperature $T$ and $R_{\alpha/\beta}$.
The total energy of the native state is kept fixed as $R_{\alpha/\beta}$ is varied.
We define a transition point
as a temperature at which the heat capacity takes a (local) maximum.
We find that the number of transitions varies between one and three
in the range $0.5 \le R_{\alpha/\beta} \le 2$.
Lysozyme folds into the native state N below the lowest transition temperature and unfolds into a random coil R above the highest transition temperature,
irrespective of the value of $R_{\alpha/\beta}$
(these two transitions coincide in the single-transition case).
For $R_{\alpha/\beta} \le 0.74$, an equilibrium intermediate state, which we call I$_\beta$, is observed in a finite temperature range.
Another equilibrium intermediate state, which we call I$_\alpha$, is observed for $R_{\alpha/\beta} \ge 1.28$.
An additional weak transition is found within the intermediate-state region
when $R_{\alpha/\beta}$ is close to $0.5$ or $2$;
the peak of the heat capacity is relatively low at these additional transitions,
and thus only a slight structural change is considered to take place.
We will not pursue this transition further.
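
For reference, the two ingredients of the phase diagram can be written out under stated assumptions. If fixing the native-state energy is taken to mean that it is held at its homogeneous value (an assumption; the reference value is not specified above), then with $V_{\alpha}^{nat}$ and $V_{\beta}^{nat}$ denoting the native values of $V_{\alpha}$ and $V_{\beta}$,
\begin{equation}
R_{\alpha}V_{\alpha}^{nat}+R_{\beta}V_{\beta}^{nat}=V_{\alpha}^{nat}+V_{\beta}^{nat}
\quad\Longrightarrow\quad
R_{\beta}=\frac{V_{\alpha}^{nat}+V_{\beta}^{nat}}{R_{\alpha/\beta}V_{\alpha}^{nat}+V_{\beta}^{nat}}, \qquad R_{\alpha}=R_{\alpha/\beta}R_{\beta} .
\end{equation}
The heat capacity is assumed here to be given by the standard fluctuation formula (with the Boltzmann constant set to unity),
\begin{equation}
C(T)=\frac{\langle V^{2}\rangle-\langle V\rangle^{2}}{T^{2}}, \qquad V=R_{\alpha}V_{\alpha}+R_{\beta}V_{\beta} ,
\end{equation}
where $\langle\cdot\rangle$ denotes the canonical average at temperature $T$.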
213:
Figures \ref{fig3}(a)--(c) show the free energy landscapes
at the three selected points (a)--(c) indicated by the arrows in Fig. \ref{fig2}, respectively.
Three minima corresponding to R, I$_{\beta}$ and N are observed at (a).
The free energy landscape clearly shows that
the $\beta$ subdomain is formed in the intermediate state I$_\beta$.
Three minima are also seen at (c), which in this case correspond to R, I$_{\alpha}$ and N.
The $\alpha$ subdomain is formed in the intermediate state I$_\alpha$.
Thus different folding pathways are followed in these two cases.
At (b), two pathways between R and N, which are separated by a free energy barrier, have the same weight.
In contrast to the above two cases, neither of the two pathways contains a metastable state in this case, so that the folding at this point should be of two-state type without a kinetic intermediate state.
Considering also the free energy landscapes for other values of $R_{\alpha/\beta}$ (figures not shown), the folding behavior at the lower transition point (folding temperature) is summarized as follows:
(i) For $R_{\alpha/\beta} \le 0.74$ (the value at point (a)),
I$_\beta$ is the equilibrium intermediate state.
Thus it is also a kinetic intermediate at the folding temperature.
(ii) I$_\beta$ becomes a metastable state as $R_{\alpha/\beta}$ increases from $0.74$, and thus I$_\beta$ acts as a kinetic intermediate.
(iii) The metastable state disappears at some $R_{\alpha/\beta}<1.06$ (the value at point (b)).
The folding becomes two-state type and the $\beta$ subdomain forms first in the principal pathway.
(iv) Two-state folding with two equally probable pathways occurs at (b).
(v) Two-state folding with the $\alpha$ subdomain forming first in the principal pathway occurs for $R_{\alpha/\beta}>1.06$.
(vi) The metastable state corresponding to I$_\alpha$ appears at some $R_{\alpha/\beta}>1.06$, and I$_\alpha$ acts as a kinetic intermediate.
(vii) For $R_{\alpha/\beta} > 1.28$ (the value at point (c)),
I$_\alpha$ is the equilibrium intermediate state and thus is also a kinetic intermediate.
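
The landscapes discussed above are presumably obtained in the usual way from the canonical probability $P(V_{\alpha},V_{\beta})$ of observing the pair ($V_{\alpha}$, $V_{\beta}$) at the given $T$ and $R_{\alpha/\beta}$; as a sketch, with the Boltzmann constant set to unity (an assumption),
\begin{equation}
F(V_{\alpha},V_{\beta}) = -T \ln P(V_{\alpha},V_{\beta}) ,
\end{equation}
defined up to an additive constant. In this picture, an equilibrium state corresponds to a global minimum of $F$, a metastable (kinetic) intermediate to a local minimum, and two pathways of equal weight to two saddle regions of comparable height between R and N.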
236:
237:
238: \begin{figure}[tb]
239: \begin{center}
240: \includegraphics{fig3.eps}
241: \caption{Free energy landscape of lysozyme
242: as a function of $V_\alpha$ and $V_\beta$.
243: (a) $R_{\alpha/\beta} = 0.74$ and $T=0.845$,
244: (b) $R_{\alpha/\beta} = 1.06$ and $T=0.846$,
245: (c) $R_{\alpha/\beta} = 1.28$ and $T=0.853$.}
246: \label{fig3}
247: \end{center}
248: \end{figure}
249:
A change in the single parameter $R_{\alpha/\beta}$ thus results in a wide spectrum of folding behavior.
The cases (i) and (vii) may be expected naturally from the energy balance,
but the existence of a finite region of two-state folding with two folding pathways is a rather nontrivial result.
253: %Change of the main folding route in the two-state region si
The homogeneous G\={o}-like model, $R_{\alpha/\beta} = 1$, is included in the case (iii), while the experimental evidence suggests that
human lysozyme actually corresponds to the case (vi).\cite{hooke94}
This discrepancy indicates that the $\alpha$ subdomain has high stability
compared to the $\beta$ subdomain in human lysozyme.
Differences found in the folding behavior of other c-type lysozymes,
such as the existence or nonexistence of an equilibrium intermediate state,
can also be understood as a result of differences in $R_{\alpha/\beta}$.
This postulate is further supported by experimental evidence that
human lysozyme acquires an equilibrium intermediate state
when a part of the protein is replaced by the corresponding part of $\alpha$-lactalbumin;\cite{pardon95}
in the present context, human lysozyme seems to shift from the case (vi) to the case (vii) due to the change of interaction parameters.
265:
Lysozyme is not the only example of a protein in which a subdomain
or, for smaller proteins, a folding unit
composed of both the N-terminal and C-terminal regions, rather than the central region, is formed first in the equilibrium intermediate state.\cite{li99}
We expect that the results obtained above can be generalized to understand the folding mechanism of these proteins, because the existence of the cases (i) and (vii) is considered to be a natural consequence of having two domains.
From this viewpoint, the folding processes of the above proteins correspond to the case (vii).
Of course, details such as the number of pathways in the two-state transition region will differ largely from protein to protein.
Much experimental evidence has also been reported that
the folding processes can be modified by artificial mutations
that leave the native topology unchanged.\cite{zarrine05}
275: %Mutation of only one residue sometimes change the folding process
276: %between two-state (without a folding intermediate) and three-state (with a folding intermediate) kinetics.
Although these experiments are mainly for single-domain proteins,
the mechanism of these changes may be understood as a consequence of
changes in the relative interaction strength of folding units, which are smaller than a subdomain.
280:
281: %To summarize,
282: %we investigated the free energy landscape that describes the folding of lysozyme using a G\={o}-like model on a simple cubic lattice.
283:
To summarize,
we observed a variety of folding processes of lysozyme
by varying the relative strength of the interactions in the two subdomains
while the native conformation was kept unchanged.
This result suggests that the experimentally observed diversity in the folding behavior of c-type lysozymes can be understood as a consequence of differences in subdomain stability.
It should be stressed that the model remains within the framework of G\={o}-like models.
Thus, a variety of folding routes for proteins of the same native topology can be understood within the funnel picture.
We expect that the folding sequences of multi-domain proteins are in general influenced by subdomain stability as well as by the native topology.
In order to describe such a variation of folding processes in a unified manner,
G\={o}-like models with variable interaction strengths will be useful.
294:
295:
296: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
297:
298:
299: \section*{Acknowledgment}
300: We would like to thank M. Sasai, S. Takada and G. Chikenji for critical reading of the manuscript and valuable comments.
The present work is partially supported by the IT-program of the Ministry of Education,
Culture, Sports, Science and Technology, the 21st Century COE program named
``Towards a new basic science: depth and synthesis'', a Grant-in-Aid for Scientific
Research (C) (17540383) from the Japan Society for the Promotion of Science,
and JST-CREST.
306:
307:
308: \begin{thebibliography}{99}
\bibitem{onuchic97} J.N. Onuchic, Z. Luthey-Schulten and P.G. Wolynes: Annu. Rev. Phys. Chem. \textbf{48} (1997) 545.
310: \bibitem{pande00} V.S. Pande, A.Y. Grosberg and T. Tanaka: Rev. Mod. Phys. \textbf{72} (2000) 259.
311: \bibitem{taketomi75} H. Taketomi, Y. Ueda and N. G\={o}: Int. J. Peptide Protein Res. \textbf{7} (1975) 445.
312:
313: \bibitem{clementi00} C. Clementi, H. Nymeyer and J.N. Onuchic: J. Mol. Biol. \textbf{298} (2000) 937.
\bibitem{koga01} N. Koga and S. Takada: J. Mol. Biol. \textbf{313} (2001) 171.
\bibitem{zarrine05} A. Zarrine-Afsar, S.M. Larson and A.R. Davidson: Curr. Opin. Struct. Biol. \textbf{15} (2005) 42.
316:
317: \bibitem{radford92} S.E. Radford, C.M. Dobson and P.A. Evans: Nature \textbf{358} (1992) 302.
318: \bibitem{hooke94} S.D. Hooke, S.E. Radford and C.M. Dobson: Biochemistry \textbf{33} (1994) 5867.
319: \bibitem{koshiba00} M. Kikuchi, K. Kawano and K. Nitta: Protein Sci. \textbf{7} (1998) 2150.
320: \bibitem{schulman97} B.A. Schulman, P.S. Kim, C.M. Dobson and C. Redfield: Nature Struct. Biol. \textbf{4} (1997) 630.
321:
322: %\bibitem{onuchic97} %(arpc97.pdf)
323: %\bibitem{gunasekaran01} K. Gunasekaran, S.J. Eyles, A.T. Hagler and L.M. Gierasch: Curr. Opin. Struct. Biol. \textbf{11} (2001) 83.
324:
325: %\bibitem{dinner00} A.R. Dinner, A. \v{S}ali, L.J. Smith, C.M. Dobson and M. Karplus: Trends Biol. Sci. \textbf{25} (2000) 331.
326: \bibitem{miyazawa96} S. Miyazawa and R.L. Jernigan: J. Mol. Biol. \textbf{256} (1996) 623.
327: \bibitem{karanicolas03} J. Karanicolas and C.L. Brooks III: J. Mol. Biol. \textbf{334} (2003) 309.
328: \bibitem{ejtehadi04} M.R. Ejtehadi, S.P. Avall and S.S. Plotkin: Proc. Natl. Acad. Sci. USA \textbf{101} (2004) 15088.
329: \bibitem{das05} P. Das, S. Matysiak and C. Clementi: Proc. Natl. Acad. Sci. USA \textbf{102} (2005) 10141.
330: \bibitem{kolinski04} A. Kolinski and J. Skolnick: Polymer \textbf{45} (2004) 511.
331: \bibitem{kraulis91} P.J. Kraulis: J. Appl. Crystallogr. \textbf{24} (1991) 946.
\bibitem{merritt97} E.A. Merritt and D.J. Bacon: Methods Enzymol. \textbf{277} (1997) 505.
333: \bibitem{berg92} B.A. Berg and T. Neuhaus: Phys. Rev. Lett. \textbf{68} (1992) 9.
334: \bibitem{lee93} J. Lee: Phys. Rev. Lett. \textbf{71} (1993) 211.
335: \bibitem{iba98} Y. Iba, G. Chikenji and M. Kikuchi: J. Phys. Soc. Jpn. \textbf{67} (1998) 3327.
336: \bibitem{chikenji99} G. Chikenji, M. Kikuchi and Y. Iba: Phys. Rev. Lett. \textbf{83} (1999) 1886.
\bibitem{pardon95} E. Pardon, P. Haezebrouck, A. De Baetselier, S.D. Hooke, K.T. Fancourt, J. Desmet, C.M. Dobson, H. Van Dael and M. Joniau: J. Biol. Chem. \textbf{270} (1995) 10514.
338: \bibitem{li99} R. Li and C. Woodward: Protein Sci. \textbf{8} (1999) 1571.
339: \end{thebibliography}
340:
341: \end{document}
342:
343: