1: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS
2: %\documentclass[preprint,superscriptaddress]{revtex4} % for 1 column
3: \documentclass[aps,prl,twocolumn,groupedaddress]{revtex4} % for 2 columns
4: 
5: \newcounter{col}
6: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS
7: %\setcounter{col}{1} % for 1 column
8: \setcounter{col}{2} % for 2 columns
9: 
10: %\usepackage[pdftex]{hyperref}
11: \usepackage[pdftex]{graphicx}
12: \usepackage{rotating}
13: \usepackage{subfigure}
14: \usepackage{verbatim}
15: \usepackage{amsmath}
16: \usepackage{amssymb}
17: \usepackage{color}
18: \usepackage{ifthen}
19: 
20: \newcommand{\beq}{\begin{equation}}
21: \newcommand{\eeq}{\end{equation}}
22: \newcommand{\beqn}{\begin{eqnarray}}
23: \newcommand{\eeqn}{\end{eqnarray}}
24: \newcommand{\avg}[1]{\langle{#1}\rangle}
25: \newcommand{\ket}[1]{|{#1}\rangle}
26: \newcommand{\bra}[1]{\langle{#1}|}
27: \newcommand{\ip}[2]{\langle{#1}|{#2}\rangle}
28: \renewcommand{\H}{\hat{H}}
29: \newcommand{\medium}{4.in}
30: 
31: \begin{document}
32: 
33: \title{Statistical properties of multistep enzyme-mediated reactions}
34: 
35: \author{Wiet H. de Ronde\footnote{These authors contributed equally to this work}}
36: \email{deronde@amolf.nl}
37: \affiliation{FOM Institute for Atomic and Molecular Physics, Kruislaan 407, 1098 SJ, Amsterdam}
38: 
39: \author{Bryan C. Daniels\footnotemark[1]}
40: \email{bcd27@cornell.edu}
41: \affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY 14853, USA}
42: 
43: \author{Andrew Mugler\footnotemark[1]}
44: \email{ajm2121@columbia.edu}
45: \affiliation{Department of Physics, Columbia University, New York, NY 10027, USA}
46: 
47: \author{Nikolai A. Sinitsyn}
48: \email{nsinitsyn@lanl.gov}
49: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
50: 
51: \author{Ilya Nemenman}
52: \email{nemenman@lanl.gov}
53: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}
54: 
55: \date{\today}
56: 
57: \ifthenelse{\value{col} = 1}{\linespread{1}}{}
58: \begin{abstract}
59:   Enzyme-mediated reactions may proceed through multiple intermediate
60:   conformational states before creating a final product molecule, and
61:   one often wishes to identify such intermediate structures from
62:   observations of the product creation. In this paper, we address this
63:   problem by solving the chemical master equations for various
64:   enzymatic reactions. We devise a perturbation theory analogous to
65:   that used in quantum mechanics that allows us to determine the first
66:   ($\avg{n}$) and the second ($\sigma^2$) cumulants of the
67:   distribution of created product molecules as a function of the
68:   substrate concentration and the kinetic rates of the intermediate
69:   processes. The mean product flux $V=d\avg{n}/dt$ (or
70:   ``dose-response'' curve) and the Fano factor $F=\sigma^2/\avg{n}$
71:   are both realistically measurable quantities, and while the mean
72:   flux can often appear the same for different reaction types, the
73:   Fano factor can be quite different. This suggests both qualitative
74:   and quantitative ways to discriminate between different reaction
75:   schemes, and we explore this possibility in the context of four
76:   sample multistep enzymatic reactions. We argue that measuring both
77:   the mean flux and the Fano factor can not only discriminate between
78:   reaction types, but can also provide some detailed information about
79:   the internal, unobserved kinetic rates, and this can be done without
80:   measuring single-molecule transition events.
81: \end{abstract}
82: \ifthenelse{\value{col} = 1}{\linespread{1.5}}{}
83: 
84: \maketitle
85: 
86: %\section{Introduction}
87: Enzyme-mediated reactions are ubiquitous in biology. Traditionally,
88: they have been described as a two-step Michaelis-Menten (MM) process
89: \cite{Michaelis}, in which the enzyme and the substrate form a complex
90: that can decay either back into the enzyme and the substrate, or
91: forward into the enzyme and the product (see Fig.~\ref{cartoon}A). The
92: latter step is usually assumed to be irreversible, leaving three
93: kinetic rates that specify the reaction. To determine these kinetic
94: rates, a typical experiment measures the average rate of product
95: formation (or product ``flux'') $V$ as a function of substrate
96: concentration $S$ (also called a ``dose-response'' curve), producing a
97: plot as in Fig.~\ref{plots}A.  Two pieces of information can be
98: extracted from this plot: the saturating reaction rate $V_{\max}$ and
99: the Michaelis constant $K$, the substrate concentration at half of the
100: maximum rate.  Importantly, these two measurements do not specify the
101: three underlying kinetic rates, thus they do not allow for a full
102: identification of the reaction processes.
103: 
104: The MM mechanism is not entirely general: many enzyme-mediated
105: reactions consist of multiple intermediate internal steps (such as
106: conformational changes of either the enzyme or the substrate, enzymes
107: that occur in active and inactive states, etc.), each with its own
108: forward and backward reaction rates. While measurements of
109: substrate-enzyme complex formation and product releases are possible
110: even on a single molecule level in enzymatic kinetics \cite{English}
111: and in ion channel transport \cite{Rostovtseva,Nestorovich},
112: %mathematically equivalent to it,
113: typical experiments cannot resolve intermediate steps when measuring
114: only the average reaction rate, since different multistep schemes produce
115: qualitatively similar curves for $V(S)$.  For example, the mean flux through an
116: arbitrary complex ion channel that holds at most one large transported
117: molecule at a time is indistinguishable from that through a simple
118: channel with just two internal states \cite{Bezrukov}.
119: 
120: An interesting problem then is to determine which experimental
121: measurements could identify the multistep nature of an enzyme-mediated
122: reaction without requiring measurements at intermediate steps. We
123: suggest that this is possible by measuring not only the mean rate but
124: also the variance in the rate of the creation of product
125: molecules. Modern experiments can clearly perform this task in
126: different experimental systems \cite{English,Golding}.
127: 
128: Here we present a general perturbative approach for calculating the
129: cumulants of a product molecule flux for a given enzymatic reaction
130: scheme. To illustrate the method, we first apply it to the usual MM
131: reaction (Fig.~\ref{cartoon}A).  In addition to recovering the
132: well-known result for the mean rate of product formation as a function
133: of substrate concentration, we derive the dependence on substrate of
134: the Fano factor, the ratio of the variance in the number of product
135: molecules to the mean.  Importantly, our approach is extendible, at
136: least in principle, to an arbitrary enzyme-mediated reaction scheme,
137: and we demonstrate this by analyzing three more complex reaction
138: schemes, shown in Fig.~\ref{cartoon}B-D.  In the context of these
139: reactions, we show that the dependence of the Fano factor on the
140: substrate concentration can produce qualitatively different results
141: for different reaction types, allowing one to distinguish them
142: experimentally.  In addition, we argue that quantitative features of
143: the Fano factor measurements can constrain the values of the
144: underlying kinetic rate constants more tightly than the mean rate
145: measurements alone. Measurements of higher order product formation
146: cumulants, if experimentally possible, would allow one to constrain
147: properties of the reaction even more strongly.
148: 
149: \begin{figure}
150: \begin{tabular}{|c|c|} \hline
151: \ifthenelse{\value{col} = 1}{
152: {\LARGE A} & \scalebox{.5}{\input{cartoon_A.pdftex_t}}\\ \hline
153: {\LARGE B} & \scalebox{.5}{\input{cartoon_B.pdftex_t}}\\ \hline
154: {\LARGE C} & \scalebox{.5}{\input{cartoon_C.pdftex_t}}\\ \hline
155: {\LARGE D} & \scalebox{.5}{\input{cartoon_D.pdftex_t}}\\ \hline
156: }{
157: {\LARGE A} & \scalebox{.28}{\input{cartoon_A.pdftex_t}}\\ \hline
158: {\LARGE B} & \scalebox{.28}{\input{cartoon_B.pdftex_t}}\\ \hline
159: {\LARGE C} & \scalebox{.28}{\input{cartoon_C.pdftex_t}}\\ \hline
160: {\LARGE D} & \scalebox{.28}{\input{cartoon_D.pdftex_t}}\\ \hline
161: }
162: \end{tabular}
163: \linespread{1}
164: \caption{Potential schemes for an enzyme-mediated reaction, in which
165:   substrate $S$ is converted to product $P$.  {\bf A:} A simple
166:   Michaelis-Menten (MM) reaction.  {\bf B:} A MM reaction with an
167:   additional intermediate state (e.g.\ if the complex undergoes a
168:   conformational change before creating the product).  {\bf C:} A
169:   scheme in which the enzyme must become active (e.g., through
170:   phosphorylation) before mediating the reaction.  {\bf D:} A scheme
171:   in which the enzyme must become active before mediating the
172:   reaction, and the reaction leaves the enzyme inactive.}
173: \label{cartoon}
174: \end{figure}
175: 
176: 
177: \section{Methods: The Michaelis-Menten Model}
178: 
179: Going beyond a simple description of the mean production of a
180: particular molecule and making predictions about the intrinsic noise
181: requires a stochastic description, such as the chemical master
182: equation (CME) \cite{vanKampen}.  The CME describes the evolution in
183: time of the joint probability distribution for the copy numbers of all
184: species involved in a reaction scheme.  For the enzyme-mediated
185: reactions we consider, we make the assumption that each enzyme acts
186: independently, that is, the substrate concentration is much larger
187: than the enzyme concentration. This is equivalent to treating the
188: process as if only one enzyme were present at a time.  Furthermore, we
189: assume that the concentration of the substrate is constant during each
190: experimental measurement, and thus our master equation needs only to
191: keep track of the enzyme's state and the number of created product
192: molecules $n$. We note that both of these assumptions can be relaxed
193: using recently developed techniques
194: \cite{Sinitsyn,Sinitsyn2}. Finally, we only search for the
195: distribution of the number of product molecules at times much longer
196: than a typical enzymatic turnover time.
197: 
198: We begin by demonstrating our method on the simple Michaelis-Menten
199: (MM) reaction in Fig.\ \ref{cartoon}A.  In the MM reaction, the enzyme
200: will be in either a free state $E$ or a bound state $ES$.  Therefore
201: we partition the joint probability distribution into two parts:
202: $P^E_n$, the probability that $n$ product molecules have been created
203: {\it and} the enzyme is free, and $P^{ES}_n$, the probability that $n$
204: product molecules have been created {\it and} the enzyme is bound,
205: yielding the CME \cite{vanKampen} \beqn
206: \label{ma1}
207: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{ES}_{n-1}\\
208: \label{ma2}
209: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_2)P^{ES}_n \eeqn where the
210: rates are defined in Fig.~\ref{cartoon}A, and $S$ is the number of
211: substrate molecules. (Note that $S$ can equivalently be thought of as
212: the concentration of substrate as long as one appropriately rescales
213: the rates).  The total probability of having $n$ product molecules is
214: then $P_n=P^E_n+P^{ES}_n$.
215: 
216: We note that the situation where the product molecules are created and
217: never destroyed or transformed back into the substrate is not
218: physical, and additional reactions that degrade the product in some
219: way are needed. However, as long as we are interested in how many
220: product molecules have been created, rather than how many are present at a
221: given time, the creation reactions, Eqns.~(\ref{ma1}, \ref{ma2}), and the decay
222: reactions can be considered independently.
223: 
224: Similar to Refs.~\cite{Bagrets,Sinitsyn,Sinitsyn2,Gopich,Hornos} and
225: others, we begin our solution of Eqns.\ (\ref{ma1}-\ref{ma2}) by
226: defining the generating function \beq G^z(\chi) = \sum_{n=0}^{\infty}
227: P^z_n e^{i\chi n} \eeq with $z \in \{E, ES\}$.  Defining the vector
228: $\ket{G}=(G^E,G^{ES})^T$, we may write the total generating function
229: as \beq G(\chi) = \ip{\hat{1}}{G} = G^E+G^{ES} \eeq where
230: $\bra{\hat{1}}=(1,1)$ (note that we are adopting ``bra-ket'' vector
231: notation commonly used in quantum mechanics literature).  The
232: advantage of this formalism is that the mean $\avg{n}$ and variance
233: $\sigma^2$ of the distribution of product molecules $P_n$ can be
234: calculated from $G(\chi)$ via \beq
235: \label{cu}
236: \avg{n} = \left.\frac{d(\ln G)}{d(i\chi)}\right|_{\chi=0}, \qquad
237: \sigma^2 = \left.\frac{d^2(\ln G)}{d(i\chi)^2}\right|_{\chi=0}.
238: \eeq
239: Furthermore we note that having $N$ (independently acting) enzymes is equivalent to taking $G$ to $G^N$, so that extension to larger concentrations of enzymes is straightforward.
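
As a brief reminder of why Eqn.~(\ref{cu}) holds (a standard property of generating functions, included here only for orientation and assuming the normalization $G(0)=1$), note that $G(\chi)=\sum_n P_n e^{i\chi n}=\avg{e^{i\chi n}}$, so that
\beq
\ln G(\chi) = \avg{n}\,(i\chi) + \sigma^2\,\frac{(i\chi)^2}{2} + O(\chi^3).
\eeq
For $N$ independent enzymes, $\ln G^N = N\ln G$, so $\avg{n}$ and $\sigma^2$ are both multiplied by $N$ and the ratio $\sigma^2/\avg{n}$ is unchanged.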
240: 
241: Now multiplying Eqns.\ (\ref{ma1}-\ref{ma2}) by $e^{i\chi n}$ and
242: summing over $n$ produces 
243: \beq
244: \label{eom}
245: \frac{d\ket{G}}{dt}=\H \ket{G},
246: \eeq
247: where, for the MM reaction,
248: \beq
249: \H=\H_A=	\begin{pmatrix}
250: 	-k_1 S	& k_{-1} + k_2e^{i\chi}	\\
251: 	k_1 S	& -(k_{-1} + k_2)
252: 	\end{pmatrix}.
253: \eeq
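
The only step that requires care is the shifted index in Eqn.~(\ref{ma1}). As a sketch, assuming $P^{ES}_{-1}=0$ (no negative product counts),
\beq
\sum_{n=0}^{\infty} k_2 P^{ES}_{n-1}e^{i\chi n} = k_2 e^{i\chi}\sum_{n'=0}^{\infty}P^{ES}_{n'}e^{i\chi n'} = k_2 e^{i\chi}G^{ES},
\eeq
which is the origin of the factor $e^{i\chi}$ in the upper right entry of $\H_A$. All other terms carry no index shift and transform simply as $P^z_n\to G^z$.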
254: 
255: Eqn.\ (\ref{eom}) is solved by \beq \ket{G(t)} = e^{\H t}\ket{G_0},
256: \eeq with an initial condition $\ket{G_0}$.  If we write the matrix
257: exponential in terms of the eigenvalues $\lambda_j$ and eigenvectors
258: $\ket{u_j}$ of $\H$ as \footnote{Note that since $\H$ is not symmetric,
259:   the eigenvectors do not satisfy $\ket{u_j}=\bra{u_j}^T$, but rather they
260:   solve $\H\ket{u_j}=\lambda_j\ket{u_j}$ and $\bra{u_j}\H=\lambda_j\bra{u_j}$,
261:   respectively.} \beq e^{\H t} = \sum_j e^{\lambda_j t}
262: \ket{u_j}\bra{u_j}, \eeq then, at $t$ much larger than the typical enzyme
263: turnover time, $G(\chi)$ becomes \beq G(\chi) = \sum_j e^{\lambda_j t}
264: \ip{\hat{1}}{u_j}\ip{u_j}{G_0} \approx e^{\lambda_0 t}
265: \ip{\hat{1}}{u_0}\ip{u_0}{G_0}, \eeq where $\lambda_0$ is the eigenvalue
266: with the least negative real part.  Taking the
267: log, we get \beq
268: \label{lnG}
269: \ln G(\chi) = \lambda_0 t + \ln\left(\ip{\hat{1}}{u_0}\ip{u_0}{G_0}\right)
270: \approx \lambda_0 t, \eeq since again, in the long-time limit, the
271: first term dominates the second (for any bounded $G_0$), and the
272: initial number of product molecules is forgotten.  Recalling Eqn.\
273: (\ref{cu}), it is clear now that one only needs to find the
274: $\chi$-dependence of the least negative eigenvalue $\lambda_0$ of the
275: matrix $\H_A$ in order to compute the cumulants of the product molecule
276: distribution.  In fact, writing $\lambda_0$ as a power series, \beq
277: \lambda_0 = \sum_{m=0}^\infty \lambda_0^{(m)} \frac{(i\chi)^m}{m!},
278: \eeq it is clear that one only needs to know the coefficients up to
279: $m=2$ in order to compute the mean and variance of the distribution;
280: i.e.\ \beqn
281: \label{p}
282: \avg{n} &=& \lambda_0^{(1)} t, \\
283: \label{sig}
284: \sigma^2 &=& \lambda_0^{(2)} t, \eeqn and higher order terms are
285: needed for higher cumulants only.  Since Eqn.~(\ref{cu})
286: takes $\chi\to 0$, this permits a perturbative approach similar to
287: that used in quantum mechanics \cite{Griffiths}, with $\chi$ treated as a small parameter.
288: 
289: Specifically, we write $\H=\H^{(0)}+\H^{(1)}\sum_{m=1}^\infty(i\chi)^m/m!$ 
290: where (for the MM case)
291: \ifthenelse{\value{col} = 1}{
292: \beq
293: \label{MMH}
294: \H_A^{(0)}=	\begin{pmatrix}
295: 	-k_1 S	& k_{-1} + k_2	\\
296: 	k_1 S	& -(k_{-1} + k_2)
297: 	\end{pmatrix},
298: \qquad
299: \H_A^{(1)}=	k_2\begin{pmatrix}
300: 	0	& 1\\
301: 	0	& 0
302: \end{pmatrix}, \eeq
303: }{
304: \beqn
305: \label{MMH}
306: \H_A^{(0)}&=&	\begin{pmatrix}
307: 	-k_1 S	& k_{-1} + k_2	\\
308: 	k_1 S	& -(k_{-1} + k_2)
309: 	\end{pmatrix},\\
310: \H_A^{(1)}&=&	k_2\begin{pmatrix}
311: 	0	& 1\\
312: 	0	& 0
313: \end{pmatrix}, \eeqn
314: }
315: and we truncate at $m=2$.  We emphasize that this truncation does not
316: introduce any further approximation if one is interested only in the
317: first and second cumulants of the product molecule distribution.  
318: The least negative eigenvalue of $\H^{(0)}$ is $\lambda_0^{(0)}=0$
319: \footnote{More precisely, $\H^{(0)}$ is a propensity matrix whose columns
320:   sum to zero, which means one of its eigenvalues is zero and the rest
321:   have negative real parts \cite{vanKampen}.}, and the higher order corrections are
322: given by \cite{Griffiths}
323: \beqn
324: \label{l1}
325: \lambda_0^{(1)} &=& \bra{u_0^{(0)}}\H^{(1)}\ket{u_0^{(0)}},\\
326: \label{l2}
327: \lambda_0^{(2)} &=& \lambda_0^{(1)}-2\sum_{j\ne 0}\frac{1}{\lambda_j^{(0)}}\bra{u_0^{(0)}}\H^{(1)}\ket{u_j^{(0)}}\bra{u_j^{(0)}}\H^{(1)}\ket{u_0^{(0)}}.
328: \eeqn
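
The structure of Eqns.\ (\ref{l1}-\ref{l2}) can be seen by collecting powers of $i\chi$. The following is a sketch under two assumptions: the left and right eigenvectors are normalized biorthogonally, $\ip{u_j^{(0)}}{u_k^{(0)}}=\delta_{jk}$, and the second-order matrix element in Eqn.~(\ref{l2}) is understood as the product of the left and right matrix elements, as is appropriate for a nonsymmetric $\H$. Writing the perturbation as $\H^{(1)}[(i\chi)+(i\chi)^2/2]$, standard second-order perturbation theory gives
\beqn
\lambda_0 &\approx& (i\chi)\bra{u_0^{(0)}}\H^{(1)}\ket{u_0^{(0)}} + \frac{(i\chi)^2}{2}\bra{u_0^{(0)}}\H^{(1)}\ket{u_0^{(0)}} \nonumber\\
&& + (i\chi)^2\sum_{j\ne0}\frac{\bra{u_0^{(0)}}\H^{(1)}\ket{u_j^{(0)}}\bra{u_j^{(0)}}\H^{(1)}\ket{u_0^{(0)}}}{\lambda_0^{(0)}-\lambda_j^{(0)}}.
\eeqn
Comparing with $\lambda_0=\sum_m\lambda_0^{(m)}(i\chi)^m/m!$ and using $\lambda_0^{(0)}=0$ reproduces the two coefficients above: the first term in $\lambda_0^{(2)}$ comes from the $(i\chi)^2/2$ part of the perturbation, and the factor of 2 in the second term comes from the $1/2!$ in the power series.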
329: 
330: Noting Eqns.\ (\ref{p}-\ref{sig}), the rate of product formation
331: $V=d\avg{n}/dt$ and the Fano factor $F=\sigma^2/\avg{n}$ can now be
332: written: \beqn
333: \label{V}
334: V&=&\lambda_0^{(1)},\\
335: \label{F}
336: F&=&\lambda_0^{(2)}/\lambda_0^{(1)}.  \eeqn
337: For the MM case (Fig.~\ref{cartoon}A), this gives \beqn
338: \label{VMM}
339: V_A&=&V_A^{\rm max}\frac{S}{S+K_A},\\
340: \label{FMM}
341: F_A&=&1-\alpha_A\frac{S}{(S+K_A)^2}, \eeqn where $V_A^{\rm max}=k_2$,
342: $K_A=(k_2+k_{-1})/k_1$, and $\alpha_A=2k_2/k_1$.  The expression for
343: mean flux $V_A$ is well-known \cite{Michaelis}, and $K_A$ is called
344: the Michaelis constant; the expression for the Fano factor $F_A$ is
345: less familiar.
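
As an illustrative check of Eqns.\ (\ref{VMM}-\ref{FMM}), the eigenvectors of $\H_A^{(0)}$ can be written explicitly. In the shorthand $a=k_1S$ and $b=k_{-1}+k_2$ (introduced here only for brevity), the eigenvalues are $\lambda_0^{(0)}=0$ and $\lambda_1^{(0)}=-(a+b)$, with biorthonormal eigenvectors
\beqn
\ket{u_0^{(0)}}&=&\frac{1}{a+b}\begin{pmatrix} b\\ a\end{pmatrix},\quad \bra{u_0^{(0)}}=(1,1),\nonumber\\
\ket{u_1^{(0)}}&=&\begin{pmatrix} 1\\ -1\end{pmatrix},\quad \bra{u_1^{(0)}}=\frac{1}{a+b}(a,-b).
\eeqn
Then $\lambda_0^{(1)}=k_2a/(a+b)$. Taking the second-order matrix element as the product $\bra{u_0^{(0)}}\H^{(1)}\ket{u_1^{(0)}}\bra{u_1^{(0)}}\H^{(1)}\ket{u_0^{(0)}}=-k_2^2a^2/(a+b)^2$, one finds
\beq
\lambda_0^{(2)} = \frac{k_2 a}{a+b} - \frac{2k_2^2a^2}{(a+b)^3}, \qquad
F_A = 1-\frac{2k_2a}{(a+b)^2}.
\eeq
Substituting $a=k_1S$ and $a+b=k_1(S+K_A)$ recovers $V_A=k_2S/(S+K_A)$ and $\alpha_A=2k_2/k_1$.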
346: 
347: This procedure is fully extendible to other more complicated
348: enzyme-mediated reactions.  The reaction scheme determines the master
349: equation and thus $\H^{(0)}$ and $\H^{(1)}$. Specifically, $\hat{H}^{(0)}$
350: is given by the Markov transition matrix for the enzymatic states
351: (disregarding the $n$ variable), and $\hat{H}^{(1)}$ has a $1$ marking
352: every rate where the product gets created, and a $-1$ where it is
353: destroyed. Then Eqns.\ (\ref{V}-\ref{F}) give the product formation
354: rate and the Fano factor, and higher orders in perturbation theory
355: would provide more cumulants. To illustrate the breadth of the method,
356: in the next section, we apply this procedure to three reaction schemes
357: that include multiple intermediate reaction steps.
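
One consequence, noted here as a sketch rather than as a new result, is that the mean flux does not require the full eigenvalue problem when products are only created. Since the columns of $\H^{(0)}$ sum to zero, $\bra{u_0^{(0)}}=\bra{\hat{1}}$, and $\ket{u_0^{(0)}}$ is the vector of stationary probabilities $p_z$ of the enzymatic states, so that Eqn.~(\ref{l1}) reduces to
\beq
V = \bra{\hat{1}}\H^{(1)}\ket{u_0^{(0)}} = \sum_{z} r_z\, p_z,
\eeq
where $r_z$ (our notation) is the total rate at which state $z$ releases a product. For the MM scheme, $r_{ES}=k_2$ and $p_{ES}=S/(S+K_A)$, which gives Eqn.~(\ref{VMM}).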
358: 
359: 
360: \section{Results: Complex enzymatic reactions}
361: 
362: \subsection{Product distribution statistics}
363: 
364: Many enzyme-mediated reactions involve intermediate steps, and it is
365: instructive to illustrate our approach with three prototypical examples, shown in Fig.~\ref{cartoon}B-D.
366: 
367: \subsubsection{Reaction scheme B}
368:  
369: Fig.~\ref{cartoon}B depicts a case in which the complex undergoes an
370: intermediate step, such as a conformational change, before creating
371: the product \cite{frenzen}. This kinetic scheme is also equivalent to
372: that of certain ion channels \cite{Bezrukov}.  Such multistep enzymatic
373: reactions have been shown (including via our method here) to reduce
374: noise in chemical reactions \cite{Doan}.  The master equation
375: describing this system is \beqn
376: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{EP}_{n-1},\\
377: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_+)P^{ES}_n+k_{-}P^{EP}_n,\\
378: \frac{dP^{EP}_n}{dt}&=&k_+P^{ES}_n-(k_{-}+k_2)P^{EP}_n, \eeqn which
379: yields
380: \ifthenelse{\value{col} = 1}{
381: \beq \H_B^{(0)} = \begin{pmatrix}
382:   -k_1 S & k_{-1}	   & k_2			\\
383:   k_1 S	 & -(k_{-1} + k_{+}) & k_{-}			\\
384:   0 & k_+ & -(k_{-} + k_2)
385: \end{pmatrix},
386: \qquad
387: \H_B^{(1)} = 
388: k_2\begin{pmatrix}
389: 0	& 0		& 1	\\
390: 0	& 0		& 0	\\
391: 0	& 0		& 0
392: \end{pmatrix}.
393: \eeq
394: }{
395: \beqn
396:  \H_B^{(0)} &=& \begin{pmatrix}
397:   -k_1 S & k_{-1}	   & k_2			\\
398:   k_1 S	 & -(k_{-1} + k_{+}) & k_{-}			\\
399:   0 & k_+ & -(k_{-} + k_2)
400: \end{pmatrix},\\
401: \H_B^{(1)} &=& 
402: k_2\begin{pmatrix}
403: 0	& 0		& 1	\\
404: 0	& 0		& 0	\\
405: 0	& 0		& 0
406: \end{pmatrix}.
407: \eeqn
408: }
409: The product flux and Fano factor are then
410: \beqn
411: \label{VB}
412: V_B&=&V_B^{\rm max}\frac{S}{S+K_B}\\
413: \label{FB}
414: F_B&=&1-\alpha_B\frac{S(S+K'_B)}{(S+K_B)^2}
415: \eeqn
416: where $V_B^{\rm max}=k_2k_+/(k_2+k_++k_{-})$, 
417: $K_B=(k_2k_++k_2k_{-1}+k_{-1}k_{-})/(k_1(k_2+k_++k_{-}))$, 
418: $\alpha_B=2k_2k_+/(k_2+k_++k_{-})^2$,
419: and $K'_B=(k_2+k_++k_{-}+k_{-1})/k_1$.
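
As a consistency check on Eqn.~(\ref{VB}) (a sketch that uses only the stationary state of $\H_B^{(0)}$), one can verify by direct substitution that the unnormalized null vector is
\beq
\begin{pmatrix} p_E\\ p_{ES}\\ p_{EP}\end{pmatrix} \propto
\begin{pmatrix} k_2k_++k_2k_{-1}+k_{-1}k_{-}\\ k_1S(k_2+k_{-})\\ k_1Sk_+\end{pmatrix}.
\eeq
The flux is $V_B=k_2p_{EP}$ after normalization,
\beq
V_B=\frac{k_1k_2k_+S}{k_2k_++k_2k_{-1}+k_{-1}k_{-}+k_1S(k_2+k_++k_{-})},
\eeq
and dividing the numerator and denominator by $k_1(k_2+k_++k_{-})$ gives the stated $V_B^{\rm max}$ and $K_B$.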
420: 
421: \subsubsection{Reaction scheme C}
422: 
423: Fig.~\ref{cartoon}C depicts a case in which the enzyme exists in an
424: inactive and an active state. The enzyme switches autonomously between
425: these states, but can only react with the substrate in its active
426: form. Note that in this case we have two isolated reactions, since the
427: enzyme remains in the active state when a product is produced.  This
428: scheme can be interpreted as a toy model for a voltage-gated ion
429: channel that can only transmit a single molecule at a time
430: \cite{hh}. Alternatively, this scheme could be a model for the
431: production-degradation and subsequent translation of mRNA ($E^*$) by
432: ribosomes ($S$) into protein ($P$). Finally, this is also an extreme model of an enzyme that has internal states with different rates of product formation, such as those studied in \cite{English}. For this scheme we can write the
433: following master equation: \beqn
434: \frac{dP^E_n}{dt}&=&-k_+P^E_n + k_{-}P^{E^*}_n\\
435: \frac{dP^{E^*}_n}{dt}&=&k_+P^E_n-k_{-}P^{E^*}_n+k_2P^{E^*S}_{n-1}
436: \ifthenelse{\value{col} = 1}{}{\\ &&}
437: -k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n\\
438: \frac{dP^{E^*S}_n}{dt}&=&-k_2P^{E^*S}_{n}-k_{-1}P^{E^*S}_n+k_1SP^{E^*}_n
439: \eeqn which yields
440: \ifthenelse{\value{col} = 1}{
441: \beq \H_C^{(0)} =
442: \begin{pmatrix}
443:   -k_+ 		& k_{-}			& 0						\\
444:   k_+ 		& -(k_{-}+k_1 S)	& k_{-1} + k_2			\\
445:   0		& k_1 S		& -(k_{-1} + k_2)
446: \end{pmatrix},
447: \qquad
448: \H_C^{(1)} = 
449: k_2\begin{pmatrix}
450: 0	& 0		& 0	\\
451: 0	& 0		& 1	\\
452: 0	& 0		& 0
453: \end{pmatrix}.
454: \eeq
455: }{
456: \beqn \H_C^{(0)} &=&
457: \begin{pmatrix}
458:   -k_+ 		& k_{-}			& 0						\\
459:   k_+ 		& -(k_{-}+k_1 S)	& k_{-1} + k_2			\\
460:   0		& k_1 S		& -(k_{-1} + k_2)
461: \end{pmatrix},\\
462: \H_C^{(1)} &=& 
463: k_2\begin{pmatrix}
464: 0	& 0		& 0	\\
465: 0	& 0		& 1	\\
466: 0	& 0		& 0
467: \end{pmatrix}.
468: \eeqn
469: }
470: The product flux and Fano factor are then
471: \beqn
472: \label{VC}
473: V_C &=& V^{\rm max}_C \frac{S}{S+K_C},\\
474: \label{FC}
475: F_C &=& 1 - \alpha_C\frac{S}{(S+K_C)^2}, \eeqn
476: where $V^{\rm max}_C=k_2$,
477: $K_C=(k_+{+}k_{-})(k_2+k_{-1})/(k_{+}k_1)$, and
478: $\alpha_C=2k_2[1+k_-(k_+-k_2-k_{-1})/k_+^2]/k_1$.  Note that these expressions
479: reduce to those for the MM reaction (Eqns.\ (\ref{VMM}-\ref{FMM})) for
480: $k_-\to 0$, since this limit corresponds to the enzyme always being in
481: the active state.  Note also that since $\alpha_C$ can be negative,
482: $F_C$ can be greater than 1 (and in fact it diverges in the limit
483: of rare activation $k_+\to 0$) due to the compounded noise from the
484: independent stochastic processes of enzyme activation and complex
485: formation. Under the interpretation of this scheme as protein
486: translation, $F\gg1$ corresponds to many proteins in a translation
487: burst from a single rare mRNA.
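
The form of $K_C$ can be checked from the stationary state alone (a sketch, with $p_z$ denoting stationary probabilities as our notation). The first and third rows of $\H_C^{(0)}$ give the balance conditions $k_+p_E=k_{-}p_{E^*}$ and $k_1Sp_{E^*}=(k_{-1}+k_2)p_{E^*S}$, and normalization then yields
\beq
V_C=k_2p_{E^*S}=k_2\left[1+\frac{k_2+k_{-1}}{k_1S}\left(1+\frac{k_{-}}{k_+}\right)\right]^{-1},
\eeq
which is Eqn.~(\ref{VC}) with $K_C=(k_++k_{-})(k_2+k_{-1})/(k_+k_1)$. In this scheme inactivation occurs only from the free state $E^*$, so it rescales $K_C$ by $(1+k_{-}/k_+)$ relative to $K_A$ but leaves $V_C^{\rm max}=k_2$ unchanged.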
488: 
489: \subsubsection{Reaction scheme D}
490: 
491: Figure \ref{cartoon}D shows a third example of a more complex reaction
492: scheme, in which an active enzyme transforms a substrate into a
493: product and, in contrast to scheme C, returns to its inactive state in
494: the process.  The enzyme must switch back to its active state for a
495: new reaction to occur. Similar dynamics have been found for the
496: $\beta$-galactosidase enzyme \cite{English}. Alternatively, this can be a model for an enzyme that transfers a phosphate group to a
497: substrate, and needs to reacquire a new phosphate group before
498: continuing to function as an enzyme.  For this scheme, we can write the
499: following master equation: \beqn
500: \frac{dP^E_n}{dt}&=&-k_{+}P^E_n + k_{2}P^{E^*S}_{n-1},\\
501: \frac{dP^{E^*}_n}{dt}&=&k_{+}P^E_n-k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n,\\
502: \frac{dP^{E^*S}_n}{dt}&=&k_1SP^{E^*}_n-k_{-1}P^{E^*S}_{n}-k_2P^{E^*S}_n,
503: \eeqn which yields
504: \ifthenelse{\value{col} = 1}{
505: \beqn \H_D^{(0)} =
506: \begin{pmatrix}
507:   -k_+	& 0  		& k_2			\\
508:   k_+		& -k_1S	& k_{-1}		\\
509:   0		& k_1S	& -(k_{-1} + k_2)
510: \end{pmatrix},
511: \qquad
512: \H_D^{(1)} = 
513: k_2\begin{pmatrix}
514: 0	& 0		& 1	\\
515: 0	& 0		& 0	\\
516: 0	& 0		& 0
517: \end{pmatrix}.
518: \eeqn
519: }{
520: \beqn \H_D^{(0)} &=&
521: \begin{pmatrix}
522:   -k_+	& 0  		& k_2			\\
523:   k_+		& -k_1S	& k_{-1}		\\
524:   0		& k_1S	& -(k_{-1} + k_2)
525: \end{pmatrix},\\
526: \H_D^{(1)} &=& 
527: k_2\begin{pmatrix}
528: 0	& 0		& 1	\\
529: 0	& 0		& 0	\\
530: 0	& 0		& 0
531: \end{pmatrix}.
532: \eeqn
533: }
534: The product flux and the Fano factor are then
535: \beqn
536: \label{VD}
537: V_D &=& V^{\rm max}_D \frac{S}{S+K_D},\\
538: \label{FD}
539: F_D &=& 1 - \alpha_D\frac{S(S+K'_D)}{(S+K_D)^2}, \eeqn where $V^{\rm
540:   max}_D=k_2k_{+}/(k_2+k_{+})$,
541: $K_D=k_+(k_2+k_{-1})/(k_1(k_2+k_{+}))$,
542: $\alpha_D=2k_2k_+/(k_2+k_{+})^2$ and $K'_D=(k_2+k_{+}+k_{-1})/k_1$.
543: Note that these expressions reduce to those for the MM reaction
544: (Eqns.\ (\ref{VMM}-\ref{FMM})) for $k_+\to\infty$, since this limit
545: corresponds to the immediate reversion of the enzyme to its active
546: state following a product formation.
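
The limit is slightly subtle for the Fano factor, because $\alpha_D\to 0$ while $K'_D\to\infty$. As a short check, at fixed $S$,
\beq
\alpha_D(S+K'_D)=\frac{2k_2k_+}{(k_2+k_+)^2}\left(S+\frac{k_2+k_++k_{-1}}{k_1}\right)\;\to\;\frac{2k_2}{k_1}=\alpha_A
\eeq
as $k_+\to\infty$, while $V_D^{\rm max}\to k_2$ and $K_D\to(k_2+k_{-1})/k_1=K_A$, so that Eqn.~(\ref{FD}) indeed approaches Eqn.~(\ref{FMM}).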
547: 
548: All four reactions in Fig.~\ref{cartoon} use an enzyme to convert a
549: substrate into a product, but as we have derived using the present
550: method, the statistical properties of the product molecule
551: distributions differ among the cases.
552: 
553: \subsection{Measurable differences between reaction schemes}
554: Since different reactions have different statistical properties, it
555: should be possible to use our methods and results to differentiate
556: among the underlying reactions based on experimental observations.
557: Here we demonstrate how basic measurements can differentiate among the
558: four reaction schemes presented above.
559: 
560: The mean product formation rates $V$ for all four reaction schemes A,
561: B, C and D shown in Fig.~\ref{cartoon}, Eqns.\ (\ref{VMM}, \ref{VB},
562: \ref{VC}, \ref{VD}), are qualitatively similar functions of substrate
563: concentration $S$, and it would not be possible to differentiate the
564: schemes based on mean data alone (see Fig.\ \ref{plots}).  Measurement
565: of the Fano factor $F$ [Eqns.\ (\ref{FMM}, \ref{FB}, \ref{FC},
566: \ref{FD})], however, can reveal qualitative and quantitative features
567: that can differentiate among these schemes, which we outline here and
568: summarize in Table \ref{tab}.
569: 
570: First, a distinction is possible based on the asymptotic value of $F$
571: as the substrate concentration $S$ saturates.  For reaction schemes A
572: and C, \beq F_{A,C}(S\rightarrow\infty)=1, \eeq whereas for reaction
573: schemes B and D, \beq F_{B,D}(S\rightarrow\infty)=1-\alpha_{B,D}, \eeq
574: where $\alpha_B$ and $\alpha_D$ are defined following Eqns.\
575: (\ref{FB}) and (\ref{FD}) respectively.  This expression has a minimum
576: value of $1/2$, attained for $k_2=k_{+}\gg k_{-}$ in scheme B
577: and for $k_2=k_{+}$ in scheme D.  Thus a saturation value of $F$ that is
578: significantly less than 1 offers evidence for reaction scheme B or D
579: over A or C (see Fig.~\ref{plots}).
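
The bound of $1/2$ can be seen in one line. As a sketch, with the shorthand $x=k_2/k_+$ (introduced here), scheme D has
\beq
\alpha_D=\frac{2x}{(1+x)^2},\qquad \frac{d\alpha_D}{dx}=\frac{2(1-x)}{(1+x)^3},
\eeq
so $\alpha_D$ is maximal at $x=1$, where $\alpha_D=1/2$. For scheme B, $\alpha_B=2k_2k_+/(k_2+k_++k_{-})^2\le 2k_2k_+/(k_2+k_+)^2$ for any $k_{-}\ge 0$, so the same bound applies and is approached only for $k_{-}\ll k_2=k_+$.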
580: 
581: Second, distinctions are possible based on the value $F^*$ at the
582: extremum of
583: the Fano factor as a function of substrate concentration $S$.  For a
584: MM reaction (case A), there is a minimum:\beq
585: \label{FstarA}
586: F^*_A=1-\frac{\alpha_A}{4K_A} = 1-\frac{1}{2}\frac{k_2}{k_2+k_{-1}} ,
587: \eeq
588: which is always
589: between $1/2$ (for $k_2\gg k_{-1}$) and 1 (for $k_{-1}\gg k_2$).
590: Similarly, for reaction scheme C, we obtain
591: \beq 
592: \label{FstarC}
593: F^*_C=1-\frac{\alpha_C}{4K_C},
594: \eeq where $\alpha_C$
595: and $K_C$ are defined following Eqn.\ (\ref{FC}).  This expression
596: also has a minimum value of $1/2$ (for $k_+\gg k_{-}$ and $k_2 \gg
597: k_{-1}$), but, unlike in the MM case, it can become larger than 1 if
598: $k_+(k_++k_-)<k_-(k_2+k_{-1})$ (see Fig.\ \ref{plots}).  Indeed, as
599: mentioned, in the limit of rare activation $k_+\to 0$, we find
600: $F^*\rightarrow\infty$.
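
For reference, substituting $\alpha_C$ and $K_C$ into Eqn.~(\ref{FstarC}) gives the following explicit form (our rearrangement of the expressions above, not an independent result):
\beq
F^*_C=1-\frac{k_2\left[k_+(k_++k_{-})-k_{-}(k_2+k_{-1})\right]}{2k_+(k_++k_{-})(k_2+k_{-1})}.
\eeq
The sign of the bracket makes the condition for $F^*_C>1$ explicit. For $k_{-}\to 0$ the expression reduces to Eqn.~(\ref{FstarA}), and for $k_+\to 0$ at fixed $k_{-}$ it behaves as $F^*_C\approx 1+k_2/(2k_+)$, which shows how the divergence arises.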
601: 
602: Depending on the kinetic rates, reaction schemes B and D
603: may or may not have a minimum for positive $S$ 
604: (see Fig.\ \ref{plots} for an example of each).
605: In the cases for which a minimum exists, \beq
606: \label{FstarB}
607: F^*_{B,D}=1-\frac{\alpha_{B,D}}{4}\frac{{K'}_{B,D}^2}{K_{B,D}(K'_{B,D}-K_{B,D})},
608: \eeq where $\alpha_B$, $K_B$, and $K'_B$ are defined following Eqn.\
609: (\ref{FB}) and $\alpha_D$, $K_D$, and $K'_D$ are defined following
610: Eqn.\ (\ref{FD}).  This expression has the minimum value $1/3$ in the
611: limit $k_+=k_2\gg k_{-1}$ for both schemes (and additionally $k_{+}\gg
612: k_{-}$ for B).  In the reaction scheme B, these limits reduce the
613: system to a linear irreversible three-step cascade; an $L$-step
614: irreversible cascade has minimum $F^*$ of $1/L$ in the analogous
615: limits \cite{Doan}.  Comparing with the MM minimum value of $F^*=1/2$,
616: it is clear that a measured value of $F^*$ less than $1/2$ is a strong
617: indication that more than one intermediate step is present
618: \footnote{In all schemes A, B, C, and D, $F^*$ depends on $k_1$ only
619:   through $S^*$, and in the limits above the minimum occurs where $k_1S^*=k_2$; this is a
620:   commonly known result, and it may be used by nature to suppress
621:   noise in natural signaling systems such as phototransduction
622:   \cite{Doan}.}.
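
As a worked instance of the $1/3$ bound (a sketch for scheme D, with $k_+=k_2\equiv k$ and $k_{-1}\to 0$, where $k$ is shorthand introduced here), one has $\alpha_D=1/2$, $K'_D=2k/k_1$, and $K_D=k/(2k_1)$, so Eqn.~(\ref{FstarB}) gives
\beq
F^*_D=1-\frac{1}{8}\,\frac{4k^2/k_1^2}{\left(k/2k_1\right)\left(3k/2k_1\right)}=1-\frac{1}{8}\cdot\frac{16}{3}=\frac{1}{3}.
\eeq
In the same limit, Eqn.~(\ref{SstarB}) gives $S^*_D=k/k_1$, that is, $k_1S^*_D=k_2$.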
623: 
624: Lastly, distinctions can be made based on measurement of $S^*$, the
625: substrate concentration at which an extremum in $F$ occurs.
626: For cases A and C, \beq \frac{S^*_{A,C}}{K_{A,C}}=1, \eeq where $K_A$
627: and $K_C$ are defined following Eqns.\ (\ref{FMM}) and (\ref{FC})
628: respectively, and, as in all four cases, $K$ is the concentration at
629: which $V$ is half-maximal. For cases B and D, on the other hand (when
630: there is a minimum), \beq
631: \label{SstarB}
632: \frac{S^*_{B,D}}{K_{B,D}}=\frac{K'_{B,D}}{K'_{B,D}-2K_{B,D}}, \eeq
633: where $K_B$ and $K'_B$ are defined following Eqn.\ (\ref{FB}) and
634: $K_D$ and $K'_D$ are defined following Eqn.\ (\ref{FD}).  This
635: expression is bounded from below by 1
636: %(e.g.\ for $k_+\gg k_2 \gg k_{-1}$ for both schemes, and additionally $k_+\gg k_-$ for B),
637: (e.g.\ for $k_+\gg \{k_-,k_2,k_{-1}\}$ for B, or for $k_+\gg \{k_2,k_{-1}\}$ for D),
638: but can potentially be infinite
639: %(e.g.\ for $k_-=k_{-1}\gg k_2$ and $k_-\gg k_+$ for B, or for $k_{-1} \gg k_2=k_+$ for D).
640: (e.g.\ for $k_-=k_{-1}\gg \{k_2,k_+\}$ for B, or for $k_{-1} \gg k_2=k_+$ for D).
641: This implies that if
642: an extremum of the Fano factor occurs at a substrate concentration
643: significantly different from that at which the mean product formation
644: rate is half-maximal, it is a strong indication that more than one
645: intermediate step is present.
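
Eqns.\ (\ref{FstarB}) and (\ref{SstarB}) follow from elementary calculus. As a sketch, write $F=1-\alpha f(S)$ with $f(S)=S(S+K')/(S+K)^2$ ($f$ is our notation, and the scheme subscripts are dropped). Then
\beq
\frac{df}{dS}=\frac{KK'-S(K'-2K)}{(S+K)^3},
\eeq
which vanishes at $S^*=KK'/(K'-2K)$. This is positive, so that $F$ has a minimum at finite $S$, only if $K'>2K$. Using $S^*+K=2K(K'-K)/(K'-2K)$ and $S^*+K'=K'(K'-K)/(K'-2K)$, one obtains
\beq
f(S^*)=\frac{{K'}^2}{4K(K'-K)},
\eeq
which is Eqn.~(\ref{FstarB}). For schemes A and C, $f(S)=S/(S+K)^2$ and $df/dS=(K-S)/(S+K)^3$, so $S^*=K$ and $f(S^*)=1/(4K)$.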
646: 
647: Table \ref{tab} summarizes these distinctions, and Figure \ref{plots}
648: showcases the qualitative differences in the Fano factor curves among
649: the four reaction schemes caused by differences in the underlying
650: kinetics. For more complicated reaction schemes, such as multiple
651: substrate binding by the enzyme, modeled by a high Hill coefficient,
652: the Fano factor curve would gain even more distinguishing features,
653: such as additional extrema and/or inflection points
654: \footnote{We leave this as an exercise for future q-bio Summer School students.}.
655: 
656: \begin{figure}
657: \centering
658: \ifthenelse{\value{col} = 1}{
659: \includegraphics[width = .8\textwidth]{Vplot_inset_New.pdf}
660: \includegraphics[width = .8\textwidth]{Fplot_inset_New.pdf}
661: }{
662: \includegraphics[width = .47\textwidth]{Vplot_inset_New.pdf}
663: \includegraphics[width = .47\textwidth]{Fplot_inset_New.pdf}
664: }
665: \linespread{1}
666: \caption{\label{plots} Mean product flux (also called dose-response
667:   curve) $V$ and the Fano factor $F$ versus substrate concentration
668:   $S$ for the four cases in Fig.~\ref{cartoon}: A solid, B dashed, C
669:   dotted, and D dot-dashed.  Plots are of Eqns.\ (\ref{VMM}, \ref{VB},
670:   \ref{VC}, \ref{VD}) for $V$ and Eqns.\ (\ref{FMM}, \ref{FB},
671:   \ref{FC}, \ref{FD}) for $F$, with $k_1=1$, $k_{-1}=1$,
672:   $k_2=1$, $k_+=0.1$, and $k_-=0.01$.  Note that while there are no
673:   qualitative differences in $V$ (and in fact all curves collapse when
674:   $V$ is normalized by $V^{\rm max}$ and $S$ by $K$, as seen in the
675:   inset), features can appear in $F$ that signify that a process is
676:   more complicated than the single-intermediate case A.}
677: \end{figure}
678: 
679: \begin{table}
680: \centering
681: \begin{tabular}{|c|c|c|c|c|}
682: \hline
683: &A		&B		&C	&D				\\ \hline
684: $F(S\rightarrow\infty)$	&$1$	&$\left[\frac{1}{2}, 1\right]$	&$1$	&$\left[\frac{1}{2}, 1\right]$		\\ \hline
685: $F^*$		&$\left[\frac{1}{2}, 1\right]$	&$\left[\frac{1}{3}, 1\right]$	&$\left[\frac{1}{2}, \infty\right)$	&$\left[\frac{1}{3}, 1\right]$\\ \hline
686: $S^*/K$		&$1$		&$\left[1, \infty\right)$	&$1$   &$\left[1, \infty\right)$					\\ \hline
687: \end{tabular}
688: \linespread{1}
689: \caption{Bounds on experimentally measurable quantities that are useful in 
690:   distinguishing among schemes for enzyme-mediated reactions.  A, B, 
691:   C, and D refer to reaction schemes in Fig.~\ref{cartoon}.  
692:   Star ($^*$) denotes the extremum of the Fano factor, such that $F^*$ is the 
693:   minimum or maximum value and $S^*$ is the substrate concentration at which 
694:   it occurs.  
695:   $K$ is the substrate concentration at which product formation rate $V$ is 
696:   half-maximal.  Generally speaking, minimum bounds on all three quantities 
697:   occur when forward reaction rates dominate backward rates, and maximum 
698:   bounds occur when backward rates dominate forward rates; see text for 
699:   more details.}
700: \label{tab}
701: \end{table}
702: 
703: \subsection{Extracting reaction rates from data}
704: In addition to helping one distinguish among competing reaction
705: schemes, experimental measurement of the dose-response curve $V(S)$
706: and the Fano curve $F(S)$ can be used to determine the kinetic rates
707: of the underlying biochemical reactions.  If the structure of the
708: biochemical reaction is known, analytical expressions for both curves
709: in terms of the kinetic rates and the substrate concentration can be
obtained using our method [see e.g.\ Eqns.\ (\ref{VMM}--\ref{FMM},
\ref{VB}--\ref{FB}, \ref{VC}--\ref{FC}, \ref{VD}--\ref{FD})] and can be
fit to experimental data.  Often, measurements of the
713: qualitative features of both curves (such as those highlighted in
714: Table \ref{tab}) are enough to extract the kinetic rates; for more
715: complex reactions a full fit to the data would be necessary.
Additionally, we note that performing full fits of experimental data to
717: the analytical expressions may also help in the original task of
718: distinguishing among (or at least eliminating) different biochemical
719: reaction schemes.
720: 
721: The MM reaction is an example of a case in which measurement of the qualitative
722: features is enough to extract all kinetic rates.  However, it is
723: important to note that in order to do this, one needs both the dose-response
724: curve and the Fano curve.  In particular, one needs only to measure
725: the reaction rate at saturation $V^{\rm max}_A$, the substrate
concentration $K_A$ at which the rate is half-maximal, and the minimum
727: value of the Fano curve $F_A^*$.  Then, from Eqn.\ (\ref{FstarA}) and
728: the expressions following Eqn.\ (\ref{FMM}), one obtains \beqn
729: k_2&=&V_A^{\rm max},\\
730: k_{-1}&=&\frac{F_A^*-1/2}{1-F_A^*}V_A^{\rm max},\\
731: k_1&=&\frac{V_A^{\rm max}}{2K_A(1-F_A^*)}.  \eeqn Instead of obtaining
732: only $k_2$ and a combination of $k_1$ and $k_{-1}$ by measuring only
733: the dose-response curve (as is traditionally done for MM reactions),
734: we now have analytical expressions for all three rates.
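
As a consistency check, and only as a sketch based on the three
expressions above, inverting them gives \beqn
V_A^{\rm max}&=&k_2,\\
K_A&=&\frac{k_{-1}+k_2}{k_1},\\
F_A^*&=&\frac{k_{-1}+k_2/2}{k_{-1}+k_2}. \eeqn
For the rates used in Fig.~\ref{plots}, $k_1=k_{-1}=k_2=1$, these give
$V_A^{\rm max}=1$, $K_A=2$, and $F_A^*=3/4$; substituting back returns
$k_2=1$, $k_{-1}=(3/4-1/2)/(1-3/4)=1$, and $k_1=1/(2\cdot 2\cdot
1/4)=1$, as expected.  The limits $k_{-1}\rightarrow 0$ and
$k_{-1}\gg k_2$ give $F_A^*\rightarrow 1/2$ and $F_A^*\rightarrow 1$,
respectively, consistent with the bounds for scheme A in Table
\ref{tab}.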
735: 
736: For more complex reaction schemes, a similar analysis can be performed
737: to obtain analytical expressions for the kinetic rates in terms of the
738: experimental data.  However, it can be the case that not all rates can
739: be determined unambiguously from measurements of $V$ and $F$ (for the
740: reaction scheme B, for example, symmetries in the inverted expressions
741: imply that measurements of $V$ and $F$ do not always uniquely 
742: determine the five unknown kinetic rates).  
743: %In these cases, a full fit of
744: %the dose-response and the Fano curves to the data would be required to
745: %extract the rates.  
746: When experimentally feasible, one
747: may also compare higher moments of the measured product molecule
748: distributions with those calculated via our method.
749: 
750: \section{Discussion}
751: We have developed a method of solving chemical master equations for
752: multistep enzymatic reactions using a perturbation theory approach
753: analogous to that encountered in quantum mechanics. With this method,
754: finding cumulants of the distribution of product molecules is
755: equivalent to diagonalizing a matrix with dimensionality equal to the
756: number of internal states in the kinetic diagram of the reaction.
Obtaining the first $m$ cumulants of the product distribution then
amounts to carrying the perturbation theory to $m$th order, which is
straightforward. In particular, the first two cumulants $\avg{n}$ and
760: $\sigma^2$ together define the dose-response curve $V=d\avg{n}/dt$ and
761: the Fano factor $F=\sigma^2/\avg{n}$. As both are currently measurable
762: in a variety of systems, comparing the calculated $F$ to experimental
763: data can be used to identify the underlying structure of molecular
764: reactions.
765: 
766: We have applied this perturbation theory approach to four different
767: reaction schemes, starting with the simplest Michaelis-Menten
768: kinetics, and progressing to more complicated kinetic schemes with
769: internal states. We calculated the dose-response curve and the Fano
770: factor for each as functions of the substrate
771: concentration. Importantly, while the dose-response curves for all of
772: the considered reactions are qualitatively similar, prominent
qualitative features of the Fano factor curve (such as its value at
large substrate concentrations, as well as the position and value
of its extremum) allow us to disambiguate the considered reaction
776: schemes.  Performing detailed fits of the curves to experimental data
777: (when feasible) can be an ultimate test for whether the underlying
778: kinetic structure is known.
779: 
For the MM reaction, supplementing $V^{\rm max}$ and $K$ with a single
feature of the $F(S)$ curve (its minimum value) allows us to derive all
three rates that completely define the kinetic scheme, while the entire
dose-response curve alone is insufficient for this purpose. Similar
results hold for the reactions
784: with intermediate steps, but here the analytical treatment is more
785: difficult, and often qualitative properties of $F$ alone do not define all of the
786: underlying kinetic parameters. Instead, a quantitative fit of
787: derived expressions for $F(S)$ to experimental data would be required.
788: 
789: We stress that the kinetic schemes analyzed in this article are simple
790: toy models only. However, extending our analysis to more complicated
791: schemes to derive the first few cumulants of the product number
792: distribution is not difficult, and it can be automated with just a
793: simple linear-algebra solver. In particular, calculation of the Fano
794: factor for a signaling cascade as in \cite{Doan} or for a complex
network of single-protein conformations \cite{Li} is
796: straightforward. It should be noted, however, that generating
797: sufficient experimental data to distinguish minute details of
competing kinetic schemes is not easy. Our approach simplifies the
problem somewhat, since it does not require single-molecule kinetic
data, as in \cite{English,Li}, but is instead based on measuring a
mesoscopic, fluctuating flux. Still, ideally qualitative differences
802: would dominate the disambiguation task, as emphasized with the toy
803: models considered here.
804: 
805: \section*{Acknowledgments}
806: The authors would like to thank the organizers, the lecturers, the
807: participants, and the sponsors of the $2^{\rm nd}$ q-bio Summer School
808: on Cellular Information Processing in Los Alamos, NM. We are also
809: thankful to Michael Wall for careful reading of the manuscript. WdR
810: was supported by the research program of the ``Stichting voor
811: Fundamenteel Onderzoek der Materie,'' which is financially supported
812: by The Netherlands Organization for Scientific Research.  BCD was
813: supported by NSF Grant DMR-0705167. AM was supported by NSF Grant
814: DGE-0742450.  NAS and IN were supported by DOE under Contract No.\
815: DE-AC52-06NA25396. IN was further supported by NSF Grant No.\
816: ECS-0425850.
817: 
818: \bibliographystyle{apsrev}
819: \bibliography{mm}
820: 
821: %\section{Supplementary information}
822: %\input{SI.tex}
823: \end{document}
824: 
825: