1: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS
2: %\documentclass[preprint,superscriptaddress]{revtex4} % for 1 column
3: \documentclass[aps,prl,twocolumn,groupedaddress]{revtex4} % for 2 columns
4:
5: \newcounter{col}
6: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS
7: %\setcounter{col}{1} % for 1 column
8: \setcounter{col}{2} % for 2 columns
9:
10: %\usepackage[pdftex]{hyperref}
11: \usepackage[pdftex]{graphicx}
12: \usepackage{rotating}
13: \usepackage{subfigure}
14: \usepackage{verbatim}
15: \usepackage{amsmath}
16: \usepackage{amssymb}
17: \usepackage{color}
18: \usepackage{ifthen}
19:
20: \newcommand{\beq}{\begin{equation}}
21: \newcommand{\eeq}{\end{equation}}
22: \newcommand{\beqn}{\begin{eqnarray}}
23: \newcommand{\eeqn}{\end{eqnarray}}
24: \newcommand{\avg}[1]{\langle{#1}\rangle}
25: \newcommand{\ket}[1]{|{#1}\rangle}
26: \newcommand{\bra}[1]{\langle{#1}|}
27: \newcommand{\ip}[2]{\langle{#1}|{#2}\rangle}
28: \renewcommand{\H}{\hat{H}}
29: \newcommand{\medium}{4.in}
30:
31: \begin{document}
32:
33: \title{Statistical properties of multistep enzyme-mediated reactions}
34:
35: \author{Wiet H. de Ronde\footnote{These authors contributed equally to this work}}
36: \email{deronde@amolf.nl}
37: \affiliation{FOM Institute for Atomic and Molecular Physics, Kruislaan 407, 1098 SJ, Amsterdam}
38:
39: \author{Bryan C. Daniels\footnotemark[1]}
40: \email{bcd27@cornell.edu}
41: \affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY 14853, USA}
42:
43: \author{Andrew Mugler\footnotemark[1]}
44: \email{ajm2121@columbia.edu}
45: \affiliation{Department of Physics, Columbia University, New York, NY 10027, USA}
46:
47: \author{Nikolai A. Sinitsyn}
48: \email{nsinitsyn@lanl.gov}
49: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
50:
51: \author{Ilya Nemenman}
52: \email{nemenman@lanl.gov}
53: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
54:
55: \date{\today}
56:
57: \ifthenelse{\value{col} = 1}{\linespread{1}}{}
58: \begin{abstract}
59: Enzyme-mediated reactions may proceed through multiple intermediate
60: conformational states before creating a final product molecule, and
61: one often wishes to identify such intermediate structures from
62: observations of the product creation. In this paper, we address this
63: problem by solving the chemical master equations for various
64: enzymatic reactions. We devise a perturbation theory analogous to
65: that used in quantum mechanics that allows us to determine the first
66: ($\avg{n}$) and the second ($\sigma^2$) cumulants of the
67: distribution of created product molecules as a function of the
68: substrate concentration and the kinetic rates of the intermediate
69: processes. The mean product flux $V=d\avg{n}/dt$ (or
70: ``dose-response'' curve) and the Fano factor $F=\sigma^2/\avg{n}$
71: are both realistically measurable quantities, and while the mean
72: flux can often appear the same for different reaction types, the
73: Fano factor can be quite different. This suggests both qualitative
74: and quantitative ways to discriminate between different reaction
75: schemes, and we explore this possibility in the context of four
76: sample multistep enzymatic reactions. We argue that measuring both
77: the mean flux and the Fano factor can not only discriminate between
78: reaction types, but can also provide some detailed information about
79: the internal, unobserved kinetic rates, and this can be done without
80: measuring single-molecule transition events.
81: \end{abstract}
82: \ifthenelse{\value{col} = 1}{\linespread{1.5}}{}
83:
84: \maketitle
85:
86: %\section{Introduction}
87: Enzyme-mediated reactions are ubiquitous in biology. Traditionally,
88: they have been described as a two-step Michaelis-Menten (MM) process
89: \cite{Michaelis}, in which the enzyme and the substrate form a complex
90: that can decay either back into the enzyme and the substrate, or
91: forward into the enzyme and the product (see Fig.~\ref{cartoon}A). The
92: latter step is usually assumed to be irreversible, leaving three
93: kinetic rates that specify the reaction. To determine these kinetic
94: rates, a typical experiment measures the average rate of product
95: formation (or product ``flux'') $V$ as a function of substrate
96: concentration $S$ (also called a ``dose-response'' curve), producing a
97: plot as in Fig.~\ref{plots}A. Two pieces of information can be
98: extracted from this plot: the saturating reaction rate $V_{\max}$ and
99: the Michaelis constant $K$, the substrate concentration at half of the
maximum rate. Importantly, these two measurements cannot determine the
three underlying kinetic rates, and thus they do not allow for a full
identification of the reaction processes.
103:
104: The MM mechanism is not entirely general: many enzyme-mediated
105: reactions consist of multiple intermediate internal steps (such as
106: conformational changes of either the enzyme or the substrate, enzymes
107: that occur in active and inactive states, etc.), each with its own
108: forward and backward reaction rates. While measurements of
109: substrate-enzyme complex formation and product releases are possible
110: even on a single molecule level in enzymatic kinetics \cite{English}
111: and in ion channel transport \cite{Rostovtseva,Nestorovich},
112: %mathematically equivalent to it,
typical experiments cannot resolve intermediate steps when measuring
only the average reaction rate, since different reaction schemes produce
qualitatively similar curves for $V(S)$. For example, the mean flux through an
116: arbitrary complex ion channel that holds at most one large transported
117: molecule at a time is indistinguishable from that through a simple
118: channel with just two internal states \cite{Bezrukov}.
119:
120: An interesting problem then is to determine which experimental
121: measurements could identify the multistep nature of an enzyme-mediated
122: reaction without requiring measurements at intermediate steps. We
123: suggest that this is possible by measuring not only the mean rate but
124: also the variance in the rate of the creation of product
125: molecules. Modern experiments can clearly perform this task in
126: different experimental systems \cite{English,Golding}.
127:
128: Here we present a general perturbative approach for calculating the
129: cumulants of a product molecule flux for a given enzymatic reaction
130: scheme. To illustrate the method, we first apply it to the usual MM
131: reaction (Fig.~\ref{cartoon}A). In addition to recovering the
132: well-known result for the mean rate of product formation as a function
133: of substrate concentration, we derive the dependence on substrate of
134: the Fano factor, the ratio of the variance in the number of product
135: molecules to the mean. Importantly, our approach is extendible, at
136: least in principle, to an arbitrary enzyme-mediated reaction scheme,
137: and we demonstrate this by analyzing three more complex reaction
138: schemes, shown in Fig.~\ref{cartoon}B-D. In the context of these
139: reactions, we show that the dependence of the Fano factor on the
140: substrate concentration can produce qualitatively different results
141: for different reaction types, allowing one to distinguish them
142: experimentally. In addition, we argue that quantitative features of
143: the Fano factor measurements can constrain the values of the
144: underlying kinetic rate constants more tightly than the mean rate
145: measurements alone. Measurements of higher order product formation
146: cumulants, if experimentally possible, would allow one to constrain
147: properties of the reaction even more strongly.
148:
149: \begin{figure}
150: \begin{tabular}{|c|c|} \hline
151: \ifthenelse{\value{col} = 1}{
152: {\LARGE A} & \scalebox{.5}{\input{cartoon_A.pdftex_t}}\\ \hline
153: {\LARGE B} & \scalebox{.5}{\input{cartoon_B.pdftex_t}}\\ \hline
154: {\LARGE C} & \scalebox{.5}{\input{cartoon_C.pdftex_t}}\\ \hline
155: {\LARGE D} & \scalebox{.5}{\input{cartoon_D.pdftex_t}}\\ \hline
156: }{
157: {\LARGE A} & \scalebox{.28}{\input{cartoon_A.pdftex_t}}\\ \hline
158: {\LARGE B} & \scalebox{.28}{\input{cartoon_B.pdftex_t}}\\ \hline
159: {\LARGE C} & \scalebox{.28}{\input{cartoon_C.pdftex_t}}\\ \hline
160: {\LARGE D} & \scalebox{.28}{\input{cartoon_D.pdftex_t}}\\ \hline
161: }
162: \end{tabular}
163: \linespread{1}
164: \caption{Potential schemes for an enzyme-mediated reaction, in which
165: substrate $S$ is converted to product $P$. {\bf A:} A simple
Michaelis-Menten (MM) reaction. {\bf B:} An MM reaction with an
167: additional intermediate state (e.g.\ if the complex undergoes a
168: conformational change before creating the product). {\bf C:} A
169: scheme in which the enzyme must become active (e.g., through
170: phosphorylation) before mediating the reaction. {\bf D:} A scheme
171: in which the enzyme must become active before mediating the
172: reaction, and the reaction leaves the enzyme inactive.}
173: \label{cartoon}
174: \end{figure}
175:
176:
177: \section{Methods: The Michaelis-Menten Model}
178:
179: Going beyond a simple description of the mean production of a
180: particular molecule and making predictions about the intrinsic noise
181: requires a stochastic description, such as the chemical master
182: equation (CME) \cite{vanKampen}. The CME describes the evolution in
183: time of the joint probability distribution for the copy numbers of all
184: species involved in a reaction scheme. For the enzyme-mediated
185: reactions we consider, we make the assumption that each enzyme acts
186: independently, that is, the substrate concentration is much larger
187: than the enzyme concentration. This is equivalent to treating the
188: process as if only one enzyme were present at a time. Furthermore, we
189: assume that the concentration of the substrate is constant during each
190: experimental measurement, and thus our master equation needs only to
191: keep track of the enzyme's state and the number of created product
192: molecules $n$. We note that both of these assumptions can be relaxed
193: using recently developed techniques
194: \cite{Sinitsyn,Sinitsyn2}. Finally, we only search for the
195: distribution of the number of product molecules at times much longer
196: than a typical enzymatic turnover time.
197:
198: We begin by demonstrating our method on the simple Michaelis-Menten
199: (MM) reaction in Fig.\ \ref{cartoon}A. In the MM reaction, the enzyme
200: will be in either a free state $E$ or a bound state $ES$. Therefore
201: we partition the joint probability distribution into two parts:
202: $P^E_n$, the probability that $n$ product molecules have been created
203: {\it and} the enzyme is free, and $P^{ES}_n$, the probability that $n$
204: product molecules have been created {\it and} the enzyme is bound,
205: yielding the CME \cite{vanKampen} \beqn
206: \label{ma1}
207: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{ES}_{n-1}\\
208: \label{ma2}
209: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_2)P^{ES}_n \eeqn where the
210: rates are defined in Fig.~\ref{cartoon}A, and $S$ is the number of
211: substrate molecules. (Note that $S$ can equivalently be thought of as
212: the concentration of substrate as long as one appropriately rescales
213: the rates). The total probability of having $n$ product molecules is
214: then $P_n=P^E_n+P^{ES}_n$.
215:
216: We note that the situation where the product molecules are created and
217: never destroyed or transformed back into the substrate is not
218: physical, and additional reactions that degrade the product in some
219: way are needed. However, as long as we are interested in how many
220: product molecules have been created, rather than are present at a
given time, the creation reactions, Eqns.~(\ref{ma1}, \ref{ma2}), and the decay
222: reactions can be considered independently.
223:
224: Similar to Refs.~\cite{Bagrets,Sinitsyn,Sinitsyn2,Gopich,Hornos} and
225: others, we begin our solution of Eqns.\ (\ref{ma1}-\ref{ma2}) by
226: defining the generating function \beq G^z(\chi) = \sum_{n=0}^{\infty}
227: P^z_n e^{i\chi n} \eeq with $z \in \{E, ES\}$. Defining the vector
228: $\ket{G}=(G^E,G^{ES})^T$, we may write the total generating function
229: as \beq G(\chi) = \ip{\hat{1}}{G} = G^E+G^{ES} \eeq where
230: $\bra{\hat{1}}=(1,1)$ (note that we are adopting ``bra-ket'' vector
231: notation commonly used in quantum mechanics literature). The
232: advantage of this formalism is that the mean $\avg{n}$ and variance
233: $\sigma^2$ of the distribution of product molecules $P_n$ can be
234: calculated from $G(\chi)$ via \beq
235: \label{cu}
236: \avg{n} = \left.\frac{d(\ln G)}{d(i\chi)}\right|_{\chi=0}, \qquad
237: \sigma^2 = \left.\frac{d^2(\ln G)}{d(i\chi)^2}\right|_{\chi=0}.
238: \eeq
239: Furthermore we note that having $N$ (independently acting) enzymes is equivalent to taking $G$ to $G^N$, so that extension to larger concentrations of enzymes is straightforward.
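
As a short illustration of this last statement (a sketch under the same
independence assumption, with $G_N$ introduced here only to denote the
generating function for $N$ enzymes), taking $G_N=G^N$ in Eqn.~(\ref{cu})
gives
\beq
\ln G_N = N\ln G \quad\Rightarrow\quad
\avg{n}_N = N\avg{n}, \qquad \sigma^2_N = N\sigma^2,
\eeq
so that the mean and the variance both scale linearly with $N$, while
their ratio $\sigma^2_N/\avg{n}_N$ does not depend on $N$.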
240:
241: Now multiplying Eqns.\ (\ref{ma1}-\ref{ma2}) by $e^{i\chi n}$ and
242: summing over $n$ produces
243: \beq
244: \label{eom}
245: \frac{d\ket{G}}{dt}=\H \ket{G},
246: \eeq
247: where, for the MM reaction,
248: \beq
249: \H=\H_A= \begin{pmatrix}
250: -k_1 S & k_{-1} + k_2e^{i\chi} \\
251: k_1 S & -(k_{-1} + k_2)
252: \end{pmatrix}.
253: \eeq
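
For concreteness, the only term that requires care in this step is the
product-creation term of Eqn.~(\ref{ma1}), whose index is shifted.
Assuming $P^{ES}_{-1}=0$ (there are no negative product counts) and
relabeling the summation index as $n'=n-1$ (a dummy index introduced
here), we have
\beq
\sum_{n=0}^{\infty} k_2 P^{ES}_{n-1} e^{i\chi n}
= k_2 e^{i\chi}\sum_{n'=0}^{\infty} P^{ES}_{n'} e^{i\chi n'}
= k_2 e^{i\chi} G^{ES},
\eeq
which is the origin of the factor $e^{i\chi}$ in the upper right entry
of $\H_A$. All other terms carry no index shift and transform without
additional factors.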
254:
255: Eqn.\ (\ref{eom}) is solved by \beq \ket{G(t)} = e^{\H t}\ket{G_0},
256: \eeq with an initial condition $\ket{G_0}$. If we write the matrix
257: exponential in terms of the eigenvalues $\lambda_j$ and eigenvectors
258: $\ket{u_j}$ of $\H$ as \footnote{Note that since $\H$ is not symmetric,
259: the eigenvectors do not satisfy $\ket{u_j}=\bra{u_j}^T$, but rather they
260: solve $\H\ket{u_j}=\lambda_j\ket{u_j}$ and $\bra{u_j}\H=\lambda_j\bra{u_j}$,
261: respectively.} \beq e^{\H t} = \sum_j e^{\lambda_j t}
262: \ket{u_j}\bra{u_j}, \eeq then, at $t$ much larger than the typical enzyme
263: turnover time, $G(\chi)$ becomes \beq G(\chi) = \sum_j e^{\lambda_j t}
264: \ip{\hat{1}}{u_j}\ip{u_j}{G_0} \approx e^{\lambda_0 t}
265: \ip{\hat{1}}{u_0}\ip{u_0}{G_0}, \eeq where $\lambda_0$ is the eigenvalue
266: with the least negative real part. Taking the
267: log, we get \beq
268: \label{lnG}
269: \ln G(\chi) = \lambda_0 t + \ln\left(\ip{\hat{1}}{u_0}\ip{u_0}{G_0}\right)
270: \approx \lambda_0 t, \eeq since again, in the long-time limit, the
271: first term dominates the second (for any bounded $G_0$), and the
272: initial number of product molecules is forgotten. Recalling Eqn.\
273: (\ref{cu}), it is clear now that one only needs to find the
274: $\chi$-dependence of the least negative eigenvalue $\lambda_0$ of the
275: matrix $\H_A$ in order to compute the cumulants of the product molecule
276: distribution. In fact, writing $\lambda_0$ as a power series, \beq
277: \lambda_0 = \sum_{m=0}^\infty \lambda_0^{(m)} \frac{(i\chi)^m}{m!},
278: \eeq it is clear that one only needs to know the coefficients up to
279: $m=2$ in order to compute the mean and variance of the distribution;
280: i.e.\ \beqn
281: \label{p}
282: \avg{n} &=& \lambda_0^{(1)} t, \\
283: \label{sig}
284: \sigma^2 &=& \lambda_0^{(2)} t, \eeqn and higher order terms are
285: needed for higher cumulants only. Since Eqn.~(\ref{cu})
286: takes $\chi\to 0$, this permits a perturbative approach similar to
287: that used in quantum mechanics \cite{Griffiths}, with $\chi$ treated as a small parameter.
288:
289: Specifically, we write $\H=\H^{(0)}+\H^{(1)}\sum_{m=1}^\infty(i\chi)^m/m!$
290: where (for the MM case)
291: \ifthenelse{\value{col} = 1}{
292: \beq
293: \label{MMH}
294: \H_A^{(0)}= \begin{pmatrix}
295: -k_1 S & k_{-1} + k_2 \\
296: k_1 S & -(k_{-1} + k_2)
297: \end{pmatrix},
298: \qquad
299: \H_A^{(1)}= k_2\begin{pmatrix}
300: 0 & 1\\
301: 0 & 0
302: \end{pmatrix}, \eeq
303: }{
304: \beqn
305: \label{MMH}
306: \H_A^{(0)}&=& \begin{pmatrix}
307: -k_1 S & k_{-1} + k_2 \\
308: k_1 S & -(k_{-1} + k_2)
309: \end{pmatrix},\\
310: \H_A^{(1)}&=& k_2\begin{pmatrix}
311: 0 & 1\\
312: 0 & 0
313: \end{pmatrix}, \eeqn
314: }
and we truncate at $m=2$. We emphasize that this truncation does not
introduce any further approximation if one is interested only in the
mean and the variance of the product molecule distribution.
318: The least negative eigenvalue of $\H^{(0)}$ is $\lambda_0^{(0)}=0$
\footnote{More precisely, $\H^{(0)}$ is a propensity matrix whose columns
sum to zero, which means one of its eigenvalues is zero and the rest
have negative real parts \cite{vanKampen}.}, and the higher order corrections are
322: given by \cite{Griffiths}
323: \beqn
324: \label{l1}
325: \lambda_0^{(1)} &=& \bra{u_0^{(0)}}\H^{(1)}\ket{u_0^{(0)}},\\
326: \label{l2}
\lambda_0^{(2)} &=& \lambda_0^{(1)}-2\sum_{j\ne 0}\frac{1}{\lambda_j^{(0)}}\bra{u_0^{(0)}}\H^{(1)}\ket{u_j^{(0)}}\bra{u_j^{(0)}}\H^{(1)}\ket{u_0^{(0)}}.
328: \eeqn
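
The appearance of $\lambda_0^{(1)}$ in the second-order coefficient can
be traced to the nonlinear dependence of the perturbation on $\chi$. As
a sketch (the symbols $\epsilon$, $E_1$ and $E_2$ are introduced here
for illustration only), write $\epsilon=e^{i\chi}-1$, so that
$\H=\H^{(0)}+\epsilon\H^{(1)}$, and let $E_1$ and $E_2$ denote the usual
first- and second-order perturbation coefficients in $\epsilon$, i.e.\
$\lambda_0=\epsilon E_1+\epsilon^2E_2+O(\epsilon^3)$. Since
$\epsilon=i\chi+(i\chi)^2/2+\dots$, this becomes
\beq
\lambda_0 = E_1\, i\chi + \left(E_1+2E_2\right)\frac{(i\chi)^2}{2}
+ O(\chi^3),
\eeq
so that $\lambda_0^{(1)}=E_1$ and
$\lambda_0^{(2)}=E_1+2E_2=\lambda_0^{(1)}+2E_2$.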
329:
330: Noting Eqns.\ (\ref{p}-\ref{sig}), the rate of product formation
331: $V=d\avg{n}/dt$ and the Fano factor $F=\sigma^2/\avg{n}$ can now be
332: written: \beqn
333: \label{V}
334: V&=&\lambda_0^{(1)},\\
335: \label{F}
336: F&=&\lambda_0^{(2)}/\lambda_0^{(1)}. \eeqn
337: For the MM case (Fig.~\ref{cartoon}A), this gives \beqn
338: \label{VMM}
339: V_A&=&V_A^{\rm max}\frac{S}{S+K_A},\\
340: \label{FMM}
341: F_A&=&1-\alpha_A\frac{S}{(S+K_A)^2}, \eeqn where $V_A^{\rm max}=k_2$,
342: $K_A=(k_2+k_{-1})/k_1$, and $\alpha_A=2k_2/k_1$. The expression for
343: mean flux $V_A$ is well-known \cite{Michaelis}, and $K_A$ is called
344: the Michaelis constant; the expression for the Fano factor $F_A$ is
345: less familiar.
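
As a check, the perturbative calculation can be carried out explicitly
for the MM case. The following is a sketch in which the shorthand
$a\equiv k_1S$ and $b\equiv k_{-1}+k_2$ is introduced only for
compactness. The matrix $\H_A^{(0)}$ has eigenvalues $\lambda_0^{(0)}=0$
and $\lambda_1^{(0)}=-(a+b)$, with right and left eigenvectors
\beqn
\ket{u_0^{(0)}}&=&\frac{1}{a+b}\begin{pmatrix} b \\ a \end{pmatrix},
\qquad \bra{u_0^{(0)}}=(1,\,1),\\
\ket{u_1^{(0)}}&=&\begin{pmatrix} 1 \\ -1 \end{pmatrix},
\qquad \bra{u_1^{(0)}}=\frac{1}{a+b}\,(a,\,-b),
\eeqn
normalized such that $\ip{u_i^{(0)}}{u_j^{(0)}}=\delta_{ij}$. The
required matrix elements of $\H_A^{(1)}$ are
\beqn
\bra{u_0^{(0)}}\H_A^{(1)}\ket{u_0^{(0)}}&=&\frac{k_2a}{a+b},\\
\bra{u_0^{(0)}}\H_A^{(1)}\ket{u_1^{(0)}}&=&-k_2,\\
\bra{u_1^{(0)}}\H_A^{(1)}\ket{u_0^{(0)}}&=&\frac{k_2a^2}{(a+b)^2}.
\eeqn
The first of these is $\lambda_0^{(1)}$, and the second-order
correction, built from the product of the last two divided by
$\lambda_1^{(0)}$, gives
\beq
\lambda_0^{(2)}=\frac{k_2a}{a+b}-\frac{2k_2^2a^2}{(a+b)^3},
\qquad
\frac{\lambda_0^{(2)}}{\lambda_0^{(1)}}=1-\frac{2k_2a}{(a+b)^2}.
\eeq
Substituting $a+b=k_1(S+K_A)$ reproduces Eqns.\ (\ref{VMM}-\ref{FMM}).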
346:
347: This procedure is fully extendible to other more complicated
348: enzyme-mediated reactions. The reaction scheme determines the master
349: equation and thus $\H^{(0)}$ and $\H^{(1)}$. Specifically, $\hat{H}^{(0)}$
350: is given by the Markov transition matrix for the enzymatic states
351: (disregarding the $n$ variable), and $\hat{H}^{(1)}$ has a $1$ marking
352: every rate where the product gets created, and a $-1$ where it is
353: destroyed. Then Eqns.\ (\ref{V}-\ref{F}) give the product formation
354: rate and the Fano factor, and higher orders in perturbation theory
355: would provide more cumulants. To illustrate the breadth of the method,
356: in the next section, we apply this procedure to three reaction schemes
357: that include multiple intermediate reaction steps.
358:
359:
360: \section{Results: Complex enzymatic reactions}
361:
362: \subsection{Product distribution statistics}
363:
364: Many enzyme-mediated reactions involve intermediate steps, and it is
365: instructive to illustrate our approach with three prototypical examples, shown in Fig.~\ref{cartoon}B-D.
366:
367: \subsubsection{Reaction scheme B}
368:
369: Fig.~\ref{cartoon}B depicts a case in which the complex undergoes an
370: intermediate step, such as a conformational change, before creating
371: the product \cite{frenzen}. This kinetic scheme is also equivalent to
372: certain ion channels \cite{Bezrukov}. Such multistep enzymatic
373: reactions have been shown (including via our method here) to reduce
374: noise in chemical reactions \cite{Doan}. The master equation
375: describing this system is \beqn
376: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{EP}_{n-1},\\
377: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_+)P^{ES}_n+k_{-}P^{EP}_n,\\
378: \frac{dP^{EP}_n}{dt}&=&k_+P^{ES}_n-(k_{-}+k_2)P^{EP}_n, \eeqn which
379: yields
380: \ifthenelse{\value{col} = 1}{
381: \beq \H_B^{(0)} = \begin{pmatrix}
382: -k_1 S & k_{-1} & k_2 \\
383: k_1 S & -(k_{-1} + k_{+}) & k_{-} \\
384: 0 & k_+ & -(k_{-} + k_2)
385: \end{pmatrix},
386: \qquad
387: \H_B^{(1)} =
388: k_2\begin{pmatrix}
389: 0 & 0 & 1 \\
390: 0 & 0 & 0 \\
391: 0 & 0 & 0
392: \end{pmatrix}.
393: \eeq
394: }{
395: \beqn
396: \H_B^{(0)} &=& \begin{pmatrix}
397: -k_1 S & k_{-1} & k_2 \\
398: k_1 S & -(k_{-1} + k_{+}) & k_{-} \\
399: 0 & k_+ & -(k_{-} + k_2)
400: \end{pmatrix},\\
401: \H_B^{(1)} &=&
402: k_2\begin{pmatrix}
403: 0 & 0 & 1 \\
404: 0 & 0 & 0 \\
405: 0 & 0 & 0
406: \end{pmatrix}.
407: \eeqn
408: }
409: The product flux and Fano factor are then
410: \beqn
411: \label{VB}
412: V_B&=&V_B^{\rm max}\frac{S}{S+K_B}\\
413: \label{FB}
414: F_B&=&1-\alpha_B\frac{S(S+K'_B)}{(S+K_B)^2}
415: \eeqn
416: where $V_B^{\rm max}=k_2k_+/(k_2+k_++k_{-})$,
417: $K_B=(k_2k_++k_2k_{-1}+k_{-1}k_{-})/(k_1(k_2+k_++k_{-}))$,
418: $\alpha_B=2k_2k_+/(k_2+k_++k_{-})^2$,
419: and $K'_B=(k_2+k_++k_{-}+k_{-1})/k_1$.
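
As a consistency check of $V_B^{\rm max}$ (a sketch; the stationary
occupancies $p_{ES}$ and $p_{EP}$ are notation introduced here), note
that at saturating $S$ the free enzyme rebinds substrate immediately, so
the enzyme is effectively always in the state $ES$ or $EP$.
Stationarity of the $EP$ occupancy then requires
\beq
k_+p_{ES}=(k_{-}+k_2)\,p_{EP}, \qquad p_{ES}+p_{EP}=1,
\eeq
which gives $p_{EP}=k_+/(k_2+k_++k_{-})$ and therefore
$V=k_2p_{EP}=k_2k_+/(k_2+k_++k_{-})$, in agreement with the value
quoted above.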
420:
421: \subsubsection{Reaction scheme C}
422:
423: Fig.~\ref{cartoon}C depicts a case in which the enzyme exists in an
424: inactive and an active state. The enzyme switches autonomously between
425: these states, but can only react with the substrate in its active
426: form. Note that in this case we have two isolated reactions, since the
427: enzyme remains in the active state when a product is produced. This
428: scheme can be interpreted as a toy model for a voltage-gated ion
429: channel that can only transmit a single molecule at a time
430: \cite{hh}. Alternatively, this scheme could be a model for the
431: production-degradation and subsequent translation of mRNA ($E^*$) by
ribosomes ($S$) into protein ($P$). Finally, this is also an extreme model of an enzyme that has internal states with different rates of product formation, such as that studied in \cite{English}. For this scheme we can write the
433: following master equation: \beqn
434: \frac{dP^E_n}{dt}&=&-k_+P^E_n + k_{-}P^{E^*}_n\\
435: \frac{dP^{E^*}_n}{dt}&=&k_+P^E_n-k_{-}P^{E^*}_n+k_2P^{E^*S}_{n-1}
436: \ifthenelse{\value{col} = 1}{}{\\ &&}
437: -k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n\\
438: \frac{dP^{E^*S}_n}{dt}&=&-k_2P^{E^*S}_{n}-k_{-1}P^{E^*S}_n+k_1SP^{E^*}_n
439: \eeqn which yields
440: \ifthenelse{\value{col} = 1}{
441: \beq \H_C^{(0)} =
442: \begin{pmatrix}
443: -k_+ & k_{-} & 0 \\
444: k_+ & -(k_{-}+k_1 S) & k_{-1} + k_2 \\
445: 0 & k_1 S & -(k_{-1} + k_2)
446: \end{pmatrix},
447: \qquad
448: \H_C^{(1)} =
449: k_2\begin{pmatrix}
450: 0 & 0 & 0 \\
451: 0 & 0 & 1 \\
452: 0 & 0 & 0
453: \end{pmatrix}.
454: \eeq
455: }{
456: \beqn \H_C^{(0)} &=&
457: \begin{pmatrix}
458: -k_+ & k_{-} & 0 \\
459: k_+ & -(k_{-}+k_1 S) & k_{-1} + k_2 \\
460: 0 & k_1 S & -(k_{-1} + k_2)
461: \end{pmatrix},\\
462: \H_C^{(1)} &=&
463: k_2\begin{pmatrix}
464: 0 & 0 & 0 \\
465: 0 & 0 & 1 \\
466: 0 & 0 & 0
467: \end{pmatrix}.
468: \eeqn
469: }
470: The product flux and Fano factor are then
471: \beqn
472: \label{VC}
473: V_C &=& V^{\rm max}_C \frac{S}{S+K_C},\\
474: \label{FC}
475: F_C &=& 1 - \alpha_C\frac{S}{(S+K_C)^2}, \eeqn
476: where $V^{\rm max}_C=k_2$,
477: $K_C=(k_+{+}k_{-})(k_2+k_{-1})/(k_{+}k_1)$, and
478: $\alpha_C=2k_2[1+k_-(k_+-k_2-k_{-1})/k_+^2]/k_1$. Note that these expressions
479: reduce to those for the MM reaction (Eqns.\ (\ref{VMM}-\ref{FMM})) for
480: $k_-\to 0$, since this limit corresponds to the enzyme always being in
481: the active state. Note also that since $\alpha_C$ can be negative,
482: $F_C$ can be greater than 1 (and in fact it is infinite in the limit
483: of rare activation $k_+\to 0$) due to the compounded noise from the
484: independent stochastic processes of enzyme activation and complex
485: formation. Under the interpretation of this scheme as protein
486: translation, $F\gg1$ corresponds to many proteins in a translation
487: burst from a single rare mRNA.
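
The mean flux in Eqn.\ (\ref{VC}) can also be checked directly from the
stationary occupancies of the three enzymatic states (a sketch; the
symbols $p_E$, $p_{E^*}$ and $p_{E^*S}$ are introduced here for the
occupancies summed over $n$). Setting the time derivatives in the master
equation to zero gives
\beq
k_+p_E=k_{-}p_{E^*}, \qquad k_1S\,p_{E^*}=(k_{-1}+k_2)\,p_{E^*S},
\eeq
which, together with the normalization $p_E+p_{E^*}+p_{E^*S}=1$, yields
\beq
V_C=k_2\,p_{E^*S}
=\frac{k_2\,k_1S/(k_{-1}+k_2)}{1+k_{-}/k_++k_1S/(k_{-1}+k_2)}
=k_2\frac{S}{S+K_C},
\eeq
with $K_C=(k_+{+}k_{-})(k_2+k_{-1})/(k_+k_1)$ as quoted above.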
488:
489: \subsubsection{Reaction scheme D}
490:
491: Figure \ref{cartoon}D shows a third example of a more complex reaction
492: scheme, in which an active enzyme transforms a substrate into a
493: product and, in contrast to scheme C, returns to its inactive state in
494: the process. The enzyme must switch back to its active state for a
495: new reaction to occur. Similar dynamics have been found for the
496: $\beta$-galactosidase enzyme \cite{English}. Alternatively, this can be a model for an enzyme that transfers a phosphate group to a
497: substrate, and needs to reacquire a new phosphate group before
498: continuing to function as an enzyme. For this scheme, we can write the
499: following master equation: \beqn
500: \frac{dP^E_n}{dt}&=&-k_{+}P^E_n + k_{2}P^{E^*S}_{n-1},\\
501: \frac{dP^{E^*}_n}{dt}&=&k_{+}P^E_n-k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n,\\
502: \frac{dP^{E^*S}_n}{dt}&=&k_1SP^{E^*}_n-k_{-1}P^{E^*S}_{n}-k_2P^{E^*S}_n,
503: \eeqn which yields
504: \ifthenelse{\value{col} = 1}{
\beq \H_D^{(0)} =
\begin{pmatrix}
-k_+ & 0 & k_2 \\
k_+ & -k_1S & k_{-1} \\
0 & k_1S & -(k_{-1} + k_2)
\end{pmatrix},
\qquad
\H_D^{(1)} =
k_2\begin{pmatrix}
0 & 0 & 1 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}.
\eeq
519: }{
520: \beqn \H_D^{(0)} &=&
521: \begin{pmatrix}
522: -k_+ & 0 & k_2 \\
523: k_+ & -k_1S & k_{-1} \\
524: 0 & k_1S & -(k_{-1} + k_2)
525: \end{pmatrix},\\
526: \H_D^{(1)} &=&
527: k_2\begin{pmatrix}
528: 0 & 0 & 1 \\
529: 0 & 0 & 0 \\
530: 0 & 0 & 0
531: \end{pmatrix}.
532: \eeqn
533: }
534: The product flux and the Fano factor are then
535: \beqn
536: \label{VD}
537: V_D &=& V^{\rm max}_D \frac{S}{S+K_D},\\
538: \label{FD}
539: F_D &=& 1 - \alpha_D\frac{S(S+K'_D)}{(S+K_D)^2}, \eeqn where $V^{\rm
540: max}_D=k_2k_{+}/(k_2+k_{+})$,
541: $K_D=k_+(k_2+k_{-1})/(k_1(k_2+k_{+}))$,
542: $\alpha_D=2k_2k_+/(k_2+k_{+})^2$ and $K'_D=(k_2+k_{+}+k_{-1})/k_1$.
543: Note that these expressions reduce to those for the MM reaction
544: (Eqns.\ (\ref{VMM}-\ref{FMM})) for $k_+\to\infty$, since this limit
545: corresponds to the immediate reversion of the enzyme to its active
546: state following a product formation.
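
To make this limit explicit (a brief sketch at fixed $S$), we may split
the Fano factor in Eqn.\ (\ref{FD}) as
\beq
F_D=1-\alpha_D\frac{S^2}{(S+K_D)^2}-\alpha_DK'_D\frac{S}{(S+K_D)^2}.
\eeq
For $k_+\to\infty$ we have $V_D^{\rm max}\to k_2$,
$K_D\to(k_2+k_{-1})/k_1=K_A$, and $\alpha_D\to 0$, while the product
\beq
\alpha_DK'_D=\frac{2k_2k_+(k_2+k_++k_{-1})}{k_1(k_2+k_+)^2}
\to\frac{2k_2}{k_1}=\alpha_A
\eeq
stays finite. The second term of $F_D$ therefore vanishes and the third
reduces to the MM form of Eqn.\ (\ref{FMM}).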
547:
548: All four reactions in Fig.~\ref{cartoon} use an enzyme to convert a
549: substrate into a product, but as we have derived using the present
550: method, the statistical properties of the product molecule
551: distributions differ among the cases.
552:
553: \subsection{Measurable differences between reaction schemes}
554: Since different reactions have different statistical properties, it
555: should be possible to use our methods and results to differentiate
556: among the underlying reactions based on experimental observations.
557: Here we demonstrate how basic measurements can differentiate among the
558: four reaction schemes presented above.
559:
560: The mean product formation rates $V$ for all four reaction schemes A,
561: B, C and D shown in Fig.~\ref{cartoon}, Eqns.\ (\ref{VMM}, \ref{VB},
562: \ref{VC}, \ref{VD}), are qualitatively similar functions of substrate
563: concentration $S$, and it would not be possible to differentiate the
564: schemes based on mean data alone (see Fig.\ \ref{plots}). Measurement
565: of the Fano factor $F$ [Eqns.\ (\ref{FMM}, \ref{FB}, \ref{FC},
566: \ref{FD})], however, can reveal qualitative and quantitative features
567: that can differentiate among these schemes, which we outline here and
568: summarize in Table \ref{tab}.
569:
570: First, a distinction is possible based on the asymptotic value of $F$
571: as the substrate concentration $S$ saturates. For reaction schemes A
572: and C, \beq F_{A,C}(S\rightarrow\infty)=1, \eeq whereas for reaction
573: schemes B and D, \beq F_{B,D}(S\rightarrow\infty)=1-\alpha_{B,D}, \eeq
574: where $\alpha_B$ and $\alpha_D$ are defined following Eqns.\
575: (\ref{FB}) and (\ref{FD}) respectively. This expression has a minimum
576: value $1/2$ in the limits $k_2=k_{+}\gg k_{-}$ for B
577: and $k_2=k_{+}$ for D. Thus a saturation value of $F$ that is
578: significantly less than 1 offers evidence for reaction scheme B or D
579: over A or C (see Fig.~\ref{plots}).
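
As a numerical illustration, for the rates used in Fig.~\ref{plots}
($k_1=k_{-1}=k_2=1$, $k_+=0.1$, $k_-=0.01$), the definitions of
$\alpha_B$ and $\alpha_D$ give
\beqn
\alpha_B&=&\frac{2(1)(0.1)}{(1+0.1+0.01)^2}\approx 0.162,
\qquad F_B(S\rightarrow\infty)\approx 0.838,\\
\alpha_D&=&\frac{2(1)(0.1)}{(1+0.1)^2}\approx 0.165,
\qquad F_D(S\rightarrow\infty)\approx 0.835,
\eeqn
whereas schemes A and C saturate at $F=1$ for the same rates.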
580:
581: Second, distinctions are possible based on the value $F^*$ at the
582: extremum of
the Fano factor as a function of substrate concentration $S$. For an
MM reaction (case A), there is a minimum:\beq
585: \label{FstarA}
586: F^*_A=1-\frac{\alpha_A}{4K_A} = 1-\frac{1}{2}\frac{k_2}{k_2+k_{-1}} ,
587: \eeq
588: which is always
589: between $1/2$ (for $k_2\gg k_{-1}$) and 1 (for $k_{-1}\gg k_2$).
590: Similarly, for reaction scheme C, we obtain
591: \beq
592: \label{FstarC}
593: F^*_C=1-\frac{\alpha_C}{4K_C},
594: \eeq where $\alpha_C$
595: and $K_C$ are defined following Eqn.\ (\ref{FC}). This expression
596: also has a minimum value of $1/2$ (for $k_+\gg k_{-}$ and $k_2 \gg
597: k_{-1}$), but, unlike in the MM case, it can become larger than 1 if
598: $k_+(k_++k_-)<k_-(k_2+k_{-1})$ (see Fig.\ \ref{plots}). Indeed, as
599: mentioned, in the limit of rare activation $k_+\to 0$, we find
600: $F^*\rightarrow\infty$.
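
The extremum for schemes A and C follows from a one-line calculation,
sketched here for a generic constant $K$ standing for $K_A$ or $K_C$:
\beq
\frac{d}{dS}\left[\frac{S}{(S+K)^2}\right]=\frac{K-S}{(S+K)^3}=0
\quad\Rightarrow\quad S^*=K, \qquad
\left.\frac{S}{(S+K)^2}\right|_{S^*}=\frac{1}{4K},
\eeq
so that $F^*=1-\alpha/(4K)$ is a minimum for $\alpha>0$ and a maximum
for $\alpha<0$. As a numerical illustration with the rates used in
Fig.~\ref{plots} ($k_1=k_{-1}=k_2=1$, $k_+=0.1$, $k_-=0.01$), we find
$K_A=2$ and $F^*_A=0.75$ for scheme A, while for scheme C
\beq
\alpha_C=2\left[1+\frac{0.01\,(0.1-2)}{(0.1)^2}\right]=-1.8, \qquad
K_C=\frac{(0.11)(2)}{0.1}=2.2, \qquad
F^*_C=1+\frac{1.8}{8.8}\approx 1.20,
\eeq
consistent with the condition $k_+(k_++k_-)=0.011<k_-(k_2+k_{-1})=0.02$
for a maximum above 1.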
601:
602: Depending on the kinetic rates, reaction schemes B and D
603: may or may not have a minimum for positive $S$
604: (see Fig.\ \ref{plots} for an example of each).
605: In the cases for which a minimum exists, \beq
606: \label{FstarB}
607: F^*_{B,D}=1-\frac{\alpha_{B,D}}{4}\frac{{K'}_{B,D}^2}{K_{B,D}(K'_{B,D}-K_{B,D})},
608: \eeq where $\alpha_B$, $K_B$, and $K'_B$ are defined following Eqn.\
609: (\ref{FB}) and $\alpha_D$, $K_D$, and $K'_D$ are defined following
610: Eqn.\ (\ref{FD}). This expression has the minimum value $1/3$ in the
611: limit $k_+=k_2\gg k_{-1}$ for both schemes (and additionally $k_{+}\gg
612: k_{-}$ for B). In the reaction scheme B, these limits reduce the
613: system to a linear irreversible three-step cascade; an $L$-step
614: irreversible cascade has minimum $F^*$ of $1/L$ in the analogous
615: limits \cite{Doan}. Comparing with the MM minimum value of $F^*=1/2$,
616: it is clear that a measured value of $F^*$ less than $1/2$ is a strong
617: indication that more than one intermediate step is present
\footnote{In all schemes A, B, C, and D, $F^*$ is independent of $k_1$:
the rate $k_1$ enters only through $S^*$, which explicitly ensures that $k_1S=k_2$; this is a
620: commonly known result, and it may be used by nature to suppress
621: noise in natural signaling systems such as phototransduction
622: \cite{Doan}.}.
623:
624: Lastly, distinctions can be made based on measurement of $S^*$, the
625: substrate concentration at which an extremum in $F$ occurs.
626: For cases A and C, \beq \frac{S^*_{A,C}}{K_{A,C}}=1, \eeq where $K_A$
627: and $K_C$ are defined following Eqns.\ (\ref{FMM}) and (\ref{FC})
628: respectively, and, as in all four cases, $K$ is the concentration at
629: which $V$ is half-maximal. For cases B and D, on the other hand (when
630: there is a minimum), \beq
631: \label{SstarB}
632: \frac{S^*_{B,D}}{K_{B,D}}=\frac{K'_{B,D}}{K'_{B,D}-2K_{B,D}}, \eeq
633: where $K_B$ and $K'_B$ are defined following Eqn.\ (\ref{FB}) and
634: $K_D$ and $K'_D$ are defined following Eqn.\ (\ref{FD}). This
635: expression is bounded from below by 1
636: %(e.g.\ for $k_+\gg k_2 \gg k_{-1}$ for both schemes, and additionally $k_+\gg k_-$ for B),
637: (e.g.\ for $k_+\gg \{k_-,k_2,k_{-1}\}$ for B, or for $k_+\gg \{k_2,k_{-1}\}$ for D),
638: but can potentially be infinite
639: %(e.g.\ for $k_-=k_{-1}\gg k_2$ and $k_-\gg k_+$ for B, or for $k_{-1} \gg k_2=k_+$ for D).
640: (e.g.\ for $k_-=k_{-1}\gg \{k_2,k_+\}$ for B, or for $k_{-1} \gg k_2=k_+$ for D).
641: This implies that if
642: an extremum of the Fano factor occurs at a substrate concentration
643: significantly different from that at which the mean product formation
644: rate is half-maximal, it is a strong indication that more than one
645: intermediate step is present.
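
For completeness, the location and value of the extremum for schemes B
and D can be obtained as follows (a sketch for generic constants $K$ and
$K'$ standing for $K_{B,D}$ and $K'_{B,D}$). Differentiating the
$S$-dependent factor of the Fano factor gives
\beq
\frac{d}{dS}\left[\frac{S(S+K')}{(S+K)^2}\right]
=\frac{KK'-S(K'-2K)}{(S+K)^3},
\eeq
which vanishes at $S^*=KK'/(K'-2K)$. A positive $S^*$, and hence a
minimum of $F$ for $\alpha>0$, therefore exists only if $K'>2K$;
otherwise $F$ decreases monotonically toward $1-\alpha$. Using
$S^*+K'=K'(K'-K)/(K'-2K)$ and $S^*+K=2K(K'-K)/(K'-2K)$, we obtain
\beq
\left.\frac{S(S+K')}{(S+K)^2}\right|_{S^*}=\frac{{K'}^2}{4K(K'-K)},
\eeq
which reproduces Eqns.\ (\ref{FstarB}) and (\ref{SstarB}).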
646:
647: Table \ref{tab} summarizes these distinctions, and Figure \ref{plots}
648: showcases the qualitative differences in the Fano factor curves among
649: the four reaction schemes caused by differences in the underlying
650: kinetics. For more complicated reaction schemes, such as multiple
651: substrate binding by the enzyme, modeled by a high Hill coefficient,
652: the Fano factor curve would gain even more distinguishing features,
653: such as additional extrema and/or inflection points
654: \footnote{We leave this as an exercise for future q-bio Summer School students.}.
655:
656: \begin{figure}
657: \centering
658: \ifthenelse{\value{col} = 1}{
659: \includegraphics[width = .8\textwidth]{Vplot_inset_New.pdf}
660: \includegraphics[width = .8\textwidth]{Fplot_inset_New.pdf}
661: }{
662: \includegraphics[width = .47\textwidth]{Vplot_inset_New.pdf}
663: \includegraphics[width = .47\textwidth]{Fplot_inset_New.pdf}
664: }
665: \linespread{1}
666: \caption{\label{plots} Mean product flux (also called dose-response
667: curve) $V$ and the Fano factor $F$ versus substrate concentration
668: $S$ for the four cases in Fig.~\ref{cartoon}: A solid, B dashed, C
669: dotted, and D dot-dashed. Plots are of Eqns.\ (\ref{VMM}, \ref{VB},
670: \ref{VC}, \ref{VD}) for $V$ and Eqns.\ (\ref{FMM}, \ref{FB},
671: \ref{FC}, \ref{FD}) for $F$, with $k_1=1$, $k_{-1}=1$,
672: $k_2=1$, $k_+=0.1$, and $k_-=0.01$. Note that while there are no
673: qualitative differences in $V$ (and in fact all curves collapse when
674: $V$ is normalized by $V^{\rm max}$ and $S$ by $K$, as seen in the
675: inset), features can appear in $F$ that signify that a process is
676: more complicated than the single-intermediate case A.}
677: \end{figure}
678:
679: \begin{table}
680: \centering
681: \begin{tabular}{|c|c|c|c|c|}
682: \hline
683: &A &B &C &D \\ \hline
684: $F(S\rightarrow\infty)$ &$1$ &$\left[\frac{1}{2}, 1\right]$ &$1$ &$\left[\frac{1}{2}, 1\right]$ \\ \hline
685: $F^*$ &$\left[\frac{1}{2}, 1\right]$ &$\left[\frac{1}{3}, 1\right]$ &$\left[\frac{1}{2}, \infty\right)$ &$\left[\frac{1}{3}, 1\right]$\\ \hline
686: $S^*/K$ &$1$ &$\left[1, \infty\right)$ &$1$ &$\left[1, \infty\right)$ \\ \hline
687: \end{tabular}
688: \linespread{1}
689: \caption{Bounds on experimentally measurable quantities that are useful in
690: distinguishing among schemes for enzyme-mediated reactions. A, B,
691: C, and D refer to reaction schemes in Fig.~\ref{cartoon}.
692: Star ($^*$) denotes the extremum of the Fano factor, such that $F^*$ is the
693: minimum or maximum value and $S^*$ is the substrate concentration at which
694: it occurs.
695: $K$ is the substrate concentration at which product formation rate $V$ is
half-maximal. Generally speaking, lower bounds on all three quantities
occur when forward reaction rates dominate backward rates, and upper
bounds occur when backward rates dominate forward rates; see text for
more details.}
700: \label{tab}
701: \end{table}
702:
703: \subsection{Extracting reaction rates from data}
704: In addition to helping one distinguish among competing reaction
705: schemes, experimental measurement of the dose-response curve $V(S)$
706: and the Fano curve $F(S)$ can be used to determine the kinetic rates
707: of the underlying biochemical reactions. If the structure of the
708: biochemical reaction is known, analytical expressions for both curves
709: in terms of the kinetic rates and the substrate concentration can be
obtained using our method [see e.g.\ Eqns.\ (\ref{VMM}--\ref{FMM},
\ref{VB}--\ref{FB}, \ref{VC}--\ref{FC}, \ref{VD}--\ref{FD})] and can be
fit to experimental data. Often, measurements of the
713: qualitative features of both curves (such as those highlighted in
714: Table \ref{tab}) are enough to extract the kinetic rates; for more
715: complex reactions a full fit to the data would be necessary.
Additionally, we note that performing full fits of experimental data to
717: the analytical expressions may also help in the original task of
718: distinguishing among (or at least eliminating) different biochemical
719: reaction schemes.
720:
721: The MM reaction is an example of a case in which measurement of the qualitative
722: features is enough to extract all kinetic rates. However, it is
723: important to note that in order to do this, one needs both the dose-response
724: curve and the Fano curve. In particular, one needs only to measure
725: the reaction rate at saturation $V^{\rm max}_A$, the substrate
concentration $K_A$ at which the rate is half-maximal, and the minimum
727: value of the Fano curve $F_A^*$. Then, from Eqn.\ (\ref{FstarA}) and
728: the expressions following Eqn.\ (\ref{FMM}), one obtains \beqn
729: k_2&=&V_A^{\rm max},\\
730: k_{-1}&=&\frac{F_A^*-1/2}{1-F_A^*}V_A^{\rm max},\\
731: k_1&=&\frac{V_A^{\rm max}}{2K_A(1-F_A^*)}. \eeqn Instead of obtaining
732: only $k_2$ and a combination of $k_1$ and $k_{-1}$ by measuring only
733: the dose-response curve (as is traditionally done for MM reactions),
734: we now have analytical expressions for all three rates.
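
As a consistency check, the three expressions above can be verified
with the rates used in Fig.~\ref{plots}, $k_1=k_{-1}=k_2=1$. The sketch
below assumes the forms $K_A=(k_{-1}+k_2)/k_1$ and
$F_A^*=1-k_2/[2(k_{-1}+k_2)]$, which are what one obtains by solving
the three expressions above for $K_A$ and $F_A^*$. The forward
calculation gives
\begin{eqnarray*}
V_A^{\rm max}&=&k_2=1,\\
K_A&=&\frac{1+1}{1}=2,\\
F_A^*&=&1-\frac{1}{2(1+1)}=\frac{3}{4},
\end{eqnarray*}
and inserting these three numbers into the inversion formulas returns
\begin{eqnarray*}
k_2&=&1,\\
k_{-1}&=&\frac{3/4-1/2}{1-3/4}\times 1=1,\\
k_1&=&\frac{1}{2\times 2\times(1-3/4)}=1,
\end{eqnarray*}
recovering the input rates. The expression for $k_{-1}$ also shows that
$k_{-1}\geq 0$ requires $F_A^*\geq 1/2$, matching the lower bound for
case A in Table \ref{tab}.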
735:
736: For more complex reaction schemes, a similar analysis can be performed
737: to obtain analytical expressions for the kinetic rates in terms of the
738: experimental data. However, it can be the case that not all rates can
739: be determined unambiguously from measurements of $V$ and $F$ (for the
740: reaction scheme B, for example, symmetries in the inverted expressions
741: imply that measurements of $V$ and $F$ do not always uniquely
742: determine the five unknown kinetic rates).
743: %In these cases, a full fit of
744: %the dose-response and the Fano curves to the data would be required to
745: %extract the rates.
746: When experimentally feasible, one
747: may also compare higher moments of the measured product molecule
748: distributions with those calculated via our method.
749:
750: \section{Discussion}
751: We have developed a method of solving chemical master equations for
752: multistep enzymatic reactions using a perturbation theory approach
753: analogous to that encountered in quantum mechanics. With this method,
754: finding cumulants of the distribution of product molecules is
755: equivalent to diagonalizing a matrix with dimensionality equal to the
756: number of internal states in the kinetic diagram of the reaction.
Then, obtaining the first $m$ cumulants of the reaction can be done by
solving the perturbation theory to $m$th order, which is
straightforward. In particular, the first two cumulants, the mean
$\avg{n}$ and the variance $\sigma^2$, together define the
dose-response curve $V=d\avg{n}/dt$ and
the Fano factor $F=\sigma^2/\avg{n}$. As both are currently measurable
762: in a variety of systems, comparing the calculated $F$ to experimental
763: data can be used to identify the underlying structure of molecular
764: reactions.
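
As a reference point for interpreting $F$ (a standard renewal-process
result, quoted here as an aside and not derived with the method of this
paper), consider a hypothetical chain of $N$ irreversible sequential
steps with a common rate $k$. The waiting time between successive
product molecules is then a sum of $N$ independent exponential times,
with mean $N/k$ and variance $N/k^2$, so that at long times
\begin{eqnarray*}
V&=&\frac{k}{N},\\
F&=&\frac{N/k^2}{(N/k)^2}=\frac{1}{N}.
\end{eqnarray*}
Thus $N=1$ gives the Poisson value $F=1$, while $N=2$ and $N=3$ give
$1/2$ and $1/3$, which coincide numerically with the lower bounds
listed in Table \ref{tab}.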
765:
766: We have applied this perturbation theory approach to four different
767: reaction schemes, starting with the simplest Michaelis-Menten
768: kinetics, and progressing to more complicated kinetic schemes with
769: internal states. We calculated the dose-response curve and the Fano
770: factor for each as functions of the substrate
771: concentration. Importantly, while the dose-response curves for all of
772: the considered reactions are qualitatively similar, prominent
qualitative features of the Fano factor curve (such as its value at
large substrate concentrations, as well as the position and value
of its extremum) allow us to disambiguate the considered reaction
776: schemes. Performing detailed fits of the curves to experimental data
777: (when feasible) can be an ultimate test for whether the underlying
778: kinetic structure is known.
779:
For the MM reaction, knowing just a handful of features of the $V(S)$
and $F(S)$ curves allows us to derive all three rates that completely
define the kinetic scheme, while the entire dose-response curve alone is
insufficient for this purpose. Similar results hold for the reactions
784: with intermediate steps, but here the analytical treatment is more
785: difficult, and often qualitative properties of $F$ alone do not define all of the
786: underlying kinetic parameters. Instead, a quantitative fit of
787: derived expressions for $F(S)$ to experimental data would be required.
788:
789: We stress that the kinetic schemes analyzed in this article are simple
790: toy models only. However, extending our analysis to more complicated
791: schemes to derive the first few cumulants of the product number
792: distribution is not difficult, and it can be automated with just a
793: simple linear-algebra solver. In particular, calculation of the Fano
794: factor for a signaling cascade as in \cite{Doan} or for a complex
network of single protein conformations \cite{Li} is
796: straightforward. It should be noted, however, that generating
797: sufficient experimental data to distinguish minute details of
798: competing kinetic schemes is not easy. Our approach simplifies the
799: problem somewhat since it does not require single-molecule kinetic
800: data, as in \cite{English,Li}, but it is based on measuring a
801: mesoscopic, fluctuating flux. Still, ideally qualitative differences
802: would dominate the disambiguation task, as emphasized with the toy
803: models considered here.
804:
805: \section*{Acknowledgments}
806: The authors would like to thank the organizers, the lecturers, the
807: participants, and the sponsors of the $2^{\rm nd}$ q-bio Summer School
808: on Cellular Information Processing in Los Alamos, NM. We are also
809: thankful to Michael Wall for careful reading of the manuscript. WdR
810: was supported by the research program of the ``Stichting voor
811: Fundamenteel Onderzoek der Materie,'' which is financially supported
812: by The Netherlands Organization for Scientific Research. BCD was
813: supported by NSF Grant DMR-0705167. AM was supported by NSF Grant
814: DGE-0742450. NAS and IN were supported by DOE under Contract No.\
815: DE-AC52-06NA25396. IN was further supported by NSF Grant No.\
816: ECS-0425850.
817:
818: \bibliographystyle{apsrev}
819: \bibliography{mm}
820:
821: %\section{Supplementary information}
822: %\input{SI.tex}
823: \end{document}
824:
825: