
1: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS

2: %\documentclass[preprint,superscriptaddress]{revtex4} % for 1 column

3: \documentclass[aps,prl,twocolumn,groupedaddress]{revtex4} % for 2 columns

4:

5: \newcounter{col}

6: %%% SWITCH BETWEEN THESE FOR 1 OR 2 COLUMNS

7: %\setcounter{col}{1} % for 1 column

8: \setcounter{col}{2} % for 2 columns

9:

10: %\usepackage[pdftex]{hyperref}

11: \usepackage[pdftex]{graphicx}

12: \usepackage{rotating}

13: \usepackage{subfigure}

14: \usepackage{verbatim}

15: \usepackage{amsmath}

16: \usepackage{amssymb}

17: \usepackage{color}

18: \usepackage{ifthen}

19:

20: \newcommand{\beq}{\begin{equation}}

21: \newcommand{\eeq}{\end{equation}}

22: \newcommand{\beqn}{\begin{eqnarray}}

23: \newcommand{\eeqn}{\end{eqnarray}}

24: \newcommand{\avg}[1]{\langle{#1}\rangle}

25: \newcommand{\ket}[1]{|{#1}\rangle}

26: \newcommand{\bra}[1]{\langle{#1}|}

27: \newcommand{\ip}[2]{\langle{#1}|{#2}\rangle}

28: \renewcommand{\H}{\hat{H}}

29: \newcommand{\medium}{4.in}

30:

31: \begin{document}

32:

33: \title{Statistical properties of multistep enzyme-mediated reactions}

34:

35: \author{Wiet H. de Ronde\footnote{These authors contributed equally to this work}}

36: \email{deronde@amolf.nl}

37: \affiliation{FOM Institute for Atomic and Molecular Physics, Kruislaan 407, 1098 SJ, Amsterdam}

38:

39: \author{Bryan C. Daniels\footnotemark[1]}

40: \email{bcd27@cornell.edu}

41: \affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY 14853, USA}

42:

43: \author{Andrew Mugler\footnotemark[1]}

44: \email{ajm2121@columbia.edu}

45: \affiliation{Department of Physics, Columbia University, New York, NY 10027, USA}

46:

47: \author{Nikolai A. Sinitsyn}

48: \email{nsinitsyn@lanl.gov}

49: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}

50:

51: \author{Ilya Nemenman}

52: \email{nemenman@lanl.gov}

53: \affiliation{Computer, Computational and Statistical Sciences Division, Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA}

54:

55: \date{\today}

56:

57: \ifthenelse{\value{col} = 1}{\linespread{1}}{}

58: \begin{abstract}

59:   Enzyme-mediated reactions may proceed through multiple intermediate

60:   conformational states before creating a final product molecule, and

61:   one often wishes to identify such intermediate structures from

62:   observations of the product creation. In this paper, we address this

63:   problem by solving the chemical master equations for various

64:   enzymatic reactions. We devise a perturbation theory analogous to

65:   that used in quantum mechanics that allows us to determine the first

66:   ($\avg{n}$) and the second ($\sigma^2$) cumulants of the

67:   distribution of created product molecules as a function of the

68:   substrate concentration and the kinetic rates of the intermediate

69:   processes. The mean product flux $V=d\avg{n}/dt$ (or

70:   ``dose-response'' curve) and the Fano factor $F=\sigma^2/\avg{n}$

71:   are both realistically measurable quantities, and while the mean

72:   flux can often appear the same for different reaction types, the

73:   Fano factor can be quite different. This suggests both qualitative

74:   and quantitative ways to discriminate between different reaction

75:   schemes, and we explore this possibility in the context of four

76:   sample multistep enzymatic reactions. We argue that measuring both

77:   the mean flux and the Fano factor can not only discriminate between

78:   reaction types, but can also provide some detailed information about

79:   the internal, unobserved kinetic rates, and this can be done without

80:   measuring single-molecule transition events.

81: \end{abstract}

82: \ifthenelse{\value{col} = 1}{\linespread{1.5}}{}

83:

84: \maketitle

85:

86: %\section{Introduction}

87: Enzyme-mediated reactions are ubiquitous in biology. Traditionally,

88: they have been described as a two-step Michaelis-Menten (MM) process

89: \cite{Michaelis}, in which the enzyme and the substrate form a complex

90: that can decay either back into the enzyme and the substrate, or

91: forward into the enzyme and the product (see Fig.~\ref{cartoon}A). The

92: latter step is usually assumed to be irreversible, leaving three

93: kinetic rates that specify the reaction. To determine these kinetic

94: rates, a typical experiment measures the average rate of product

95: formation (or product ``flux'') $V$ as a function of substrate

96: concentration $S$ (also called a ``dose-response'' curve), producing a

97: plot as in Fig.~\ref{plots}A.  Two pieces of information can be

98: extracted from this plot: the saturating reaction rate $V_{\max}$ and

99: the Michaelis constant $K$, the substrate concentration at half of the

100: maximum rate.  Importantly, these two measurements do not specify the

101: three underlying kinetic rates, and thus they do not allow for a full

102: identification of the reaction processes.

103:

104: The MM mechanism is not entirely general: many enzyme-mediated

105: reactions consist of multiple intermediate internal steps (such as

106: conformational changes of either the enzyme or the substrate, enzymes

107: that occur in active and inactive states, etc.), each with its own

108: forward and backward reaction rates. While measurements of

109: substrate-enzyme complex formation and product releases are possible

110: even on a single molecule level in enzymatic kinetics \cite{English}

111: and in ion channel transport \cite{Rostovtseva,Nestorovich},

112: %mathematically equivalent to it,

113: typical experiments cannot resolve intermediate steps when measuring

114: only the average reaction rate since they produce qualitatively

115: similar curves for $V(S)$.  For example, the mean flux through an

116: arbitrary complex ion channel that holds at most one large transported

117: molecule at a time is indistinguishable from that through a simple

118: channel with just two internal states \cite{Bezrukov}.

119:

120: An interesting problem then is to determine which experimental

121: measurements could identify the multistep nature of an enzyme-mediated

122: reaction without requiring measurements at intermediate steps. We

123: suggest that this is possible by measuring not only the mean rate but

124: also the variance in the rate of the creation of product

125: molecules. Modern experiments can clearly perform this task in

126: different experimental systems \cite{English,Golding}.

127:

128: Here we present a general perturbative approach for calculating the

129: cumulants of a product molecule flux for a given enzymatic reaction

130: scheme. To illustrate the method, we first apply it to the usual MM

131: reaction (Fig.~\ref{cartoon}A).  In addition to recovering the

132: well-known result for the mean rate of product formation as a function

133: of substrate concentration, we derive the dependence on substrate of

134: the Fano factor, the ratio of the variance in the number of product

135: molecules to the mean.  Importantly, our approach is extendible, at

136: least in principle, to an arbitrary enzyme-mediated reaction scheme,

137: and we demonstrate this by analyzing three more complex reaction

138: schemes, shown in Fig.~\ref{cartoon}B-D.  In the context of these

139: reactions, we show that the dependence of the Fano factor on the

140: substrate concentration can produce qualitatively different results

141: for different reaction types, allowing one to distinguish them

142: experimentally.  In addition, we argue that quantitative features of

143: the Fano factor measurements can constrain the values of the

144: underlying kinetic rate constants more tightly than the mean rate

145: measurements alone. Measurements of higher order product formation

146: cumulants, if experimentally possible, would allow one to constrain

147: properties of the reaction even more strongly.

148:

149: \begin{figure}

150: \begin{tabular}{|c|c|} \hline

151: \ifthenelse{\value{col} = 1}{

152: {\LARGE A} & \scalebox{.5}{\input{cartoon_A.pdftex_t}}\\ \hline

153: {\LARGE B} & \scalebox{.5}{\input{cartoon_B.pdftex_t}}\\ \hline

154: {\LARGE C} & \scalebox{.5}{\input{cartoon_C.pdftex_t}}\\ \hline

155: {\LARGE D} & \scalebox{.5}{\input{cartoon_D.pdftex_t}}\\ \hline

156: }{

157: {\LARGE A} & \scalebox{.28}{\input{cartoon_A.pdftex_t}}\\ \hline

158: {\LARGE B} & \scalebox{.28}{\input{cartoon_B.pdftex_t}}\\ \hline

159: {\LARGE C} & \scalebox{.28}{\input{cartoon_C.pdftex_t}}\\ \hline

160: {\LARGE D} & \scalebox{.28}{\input{cartoon_D.pdftex_t}}\\ \hline

161: }

162: \end{tabular}

163: \linespread{1}

164: \caption{Potential schemes for an enzyme-mediated reaction, in which

165:   substrate $S$ is converted to product $P$.  {\bf A:} A simple

166:   Michaelis-Menten (MM) reaction.  {\bf B:} An MM reaction with an

167:   additional intermediate state (e.g.\ if the complex undergoes a

168:   conformational change before creating the product).  {\bf C:} A

169:   scheme in which the enzyme must become active (e.g., through

170:   phosphorylation) before mediating the reaction.  {\bf D:} A scheme

171:   in which the enzyme must become active before mediating the

172:   reaction, and the reaction leaves the enzyme inactive.}

173: \label{cartoon}

174: \end{figure}

175:

176:

177: \section{Methods: The Michaelis-Menten Model}

178:

179: Going beyond a simple description of the mean production of a

180: particular molecule and making predictions about the intrinsic noise

181: requires a stochastic description, such as the chemical master

182: equation (CME) \cite{vanKampen}.  The CME describes the evolution in

183: time of the joint probability distribution for the copy numbers of all

184: species involved in a reaction scheme.  For the enzyme-mediated

185: reactions we consider, we make the assumption that each enzyme acts

186: independently, that is, the substrate concentration is much larger

187: than the enzyme concentration. This is equivalent to treating the

188: process as if only one enzyme were present at a time.  Furthermore, we

189: assume that the concentration of the substrate is constant during each

190: experimental measurement, and thus our master equation needs only to

191: keep track of the enzyme's state and the number of created product

192: molecules $n$. We note that both of these assumptions can be relaxed

193: using recently developed techniques

194: \cite{Sinitsyn,Sinitsyn2}. Finally, we only search for the

195: distribution of the number of product molecules at times much longer

196: than a typical enzymatic turnover time.

197:

198: We begin by demonstrating our method on the simple Michaelis-Menten

199: (MM) reaction in Fig.\ \ref{cartoon}A.  In the MM reaction, the enzyme

200: will be in either a free state $E$ or a bound state $ES$.  Therefore

201: we partition the joint probability distribution into two parts:

202: $P^E_n$, the probability that $n$ product molecules have been created

203: {\it and} the enzyme is free, and $P^{ES}_n$, the probability that $n$

204: product molecules have been created {\it and} the enzyme is bound,

205: yielding the CME \cite{vanKampen} \beqn

206: \label{ma1}

207: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{ES}_{n-1}\\

208: \label{ma2}

209: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_2)P^{ES}_n \eeqn where the

210: rates are defined in Fig.~\ref{cartoon}A, and $S$ is the number of

211: substrate molecules. (Note that $S$ can equivalently be thought of as

212: the concentration of substrate as long as one appropriately rescales

213: the rates).  The total probability of having $n$ product molecules is

214: then $P_n=P^E_n+P^{ES}_n$.

215:

216: We note that the situation where the product molecules are created and

217: never destroyed or transformed back into the substrate is not

218: physical, and additional reactions that degrade the product in some

219: way are needed. However, as long as we are interested in how many

220: product molecules have been created, rather than are present at a

221: given time, the creation, Eqns.~(\ref{ma1}, \ref{ma2}), and the decay

222: reactions can be considered independently.

223:

224: Similar to Refs.~\cite{Bagrets,Sinitsyn,Sinitsyn2,Gopich,Hornos} and

225: others, we begin our solution of Eqns.\ (\ref{ma1}-\ref{ma2}) by

226: defining the generating function \beq G^z(\chi) = \sum_{n=0}^{\infty}

227: P^z_n e^{i\chi n} \eeq with $z \in \{E, ES\}$.  Defining the vector

228: $\ket{G}=(G^E,G^{ES})^T$, we may write the total generating function

229: as \beq G(\chi) = \ip{\hat{1}}{G} = G^E+G^{ES} \eeq where

230: $\bra{\hat{1}}=(1,1)$ (note that we are adopting ``bra-ket'' vector

231: notation commonly used in quantum mechanics literature).  The

232: advantage of this formalism is that the mean $\avg{n}$ and variance

233: $\sigma^2$ of the distribution of product molecules $P_n$ can be

234: calculated from $G(\chi)$ via \beq

235: \label{cu}

236: \avg{n} = \left.\frac{d(\ln G)}{d(i\chi)}\right|_{\chi=0}, \qquad

237: \sigma^2 = \left.\frac{d^2(\ln G)}{d(i\chi)^2}\right|_{\chi=0}.

238: \eeq

239: Furthermore we note that having $N$ (independently acting) enzymes is equivalent to taking $G$ to $G^N$, so that extension to larger concentrations of enzymes is straightforward.
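
As a quick illustration of Eqn.~(\ref{cu}) (a standard check added here, not a result specific to the reactions considered), take $n$ to be Poisson distributed with mean $\mu$, where $\mu$ is a symbol introduced only for this example. Then
\[
G(\chi)=\sum_{n=0}^{\infty}\frac{\mu^n e^{-\mu}}{n!}\,e^{i\chi n}
=\exp\left[\mu\left(e^{i\chi}-1\right)\right],
\qquad
\ln G=\mu\left(e^{i\chi}-1\right),
\]
so both derivatives in Eqn.~(\ref{cu}) equal $\mu$ and $\sigma^2/\avg{n}=1$. A ratio below 1 therefore indicates a product distribution narrower than Poisson, and a ratio above 1 a broader one. Likewise, since $\ln G^N=N\ln G$, both cumulants scale with $N$ and their ratio is unchanged.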

240:

241: Now multiplying Eqns.\ (\ref{ma1}-\ref{ma2}) by $e^{i\chi n}$ and

242: summing over $n$ produces

243: \beq

244: \label{eom}

245: \frac{d\ket{G}}{dt}=\H \ket{G},

246: \eeq

247: where, for the MM reaction,

248: \beq

249: \H=\H_A=	\begin{pmatrix}

250: 	-k_1 S	& k_{-1} + k_2e^{i\chi}	\\

251: 	k_1 S	& -(k_{-1} + k_2)

252: 	\end{pmatrix}.

253: \eeq
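
The step from Eqns.\ (\ref{ma1}-\ref{ma2}) to Eqn.\ (\ref{eom}) can be spelled out as follows (a sketch, assuming $P^{ES}_{-1}=0$, i.e.\ that the count of created product molecules cannot be negative). Only the product-creation term carries a shifted index:
\[
\sum_{n=0}^{\infty}P^{ES}_{n-1}e^{i\chi n}
=e^{i\chi}\sum_{m=0}^{\infty}P^{ES}_{m}e^{i\chi m}
=e^{i\chi}G^{ES},
\]
while every other term simply maps $P^z_n$ to $G^z$. The two components of Eqn.\ (\ref{eom}) are therefore
\begin{eqnarray*}
\frac{dG^E}{dt}&=&-k_1SG^E+\left(k_{-1}+k_2e^{i\chi}\right)G^{ES},\\
\frac{dG^{ES}}{dt}&=&k_1SG^E-(k_{-1}+k_2)G^{ES},
\end{eqnarray*}
from which the entries of $\H_A$ can be read off.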

254:

255: Eqn.\ (\ref{eom}) is solved by \beq \ket{G(t)} = e^{\H t}\ket{G_0},

256: \eeq with an initial condition $\ket{G_0}$.  If we write the matrix

257: exponential in terms of the eigenvalues $\lambda_j$ and eigenvectors

258: $\ket{u_j}$ of $\H$ as \footnote{Note that since $\H$ is not symmetric,

259:   the eigenvectors do not satisfy $\ket{u_j}=\bra{u_j}^T$, but rather they

260:   solve $\H\ket{u_j}=\lambda_j\ket{u_j}$ and $\bra{u_j}\H=\lambda_j\bra{u_j}$,

261:   respectively.} \beq e^{\H t} = \sum_j e^{\lambda_j t}

262: \ket{u_j}\bra{u_j}, \eeq then, at $t$ much larger than the typical enzyme

263: turnover time, $G(\chi)$ becomes \beq G(\chi) = \sum_j e^{\lambda_j t}

264: \ip{\hat{1}}{u_j}\ip{u_j}{G_0} \approx e^{\lambda_0 t}

265: \ip{\hat{1}}{u_0}\ip{u_0}{G_0}, \eeq where $\lambda_0$ is the eigenvalue

266: with the least negative real part.  Taking the

267: log, we get \beq

268: \label{lnG}

269: \ln G(\chi) = \lambda_0 t + \ln\left(\ip{\hat{1}}{u_0}\ip{u_0}{G_0}\right)

270: \approx \lambda_0 t, \eeq since again, in the long-time limit, the

271: first term dominates the second (for any bounded $G_0$), and the

272: initial number of product molecules is forgotten.  Recalling Eqn.\

273: (\ref{cu}), it is clear now that one only needs to find the

274: $\chi$-dependence of the least negative eigenvalue $\lambda_0$ of the

275: matrix $\H_A$ in order to compute the cumulants of the product molecule

276: distribution.  In fact, writing $\lambda_0$ as a power series, \beq

277: \lambda_0 = \sum_{m=0}^\infty \lambda_0^{(m)} \frac{(i\chi)^m}{m!},

278: \eeq it is clear that one only needs to know the coefficients up to

279: $m=2$ in order to compute the mean and variance of the distribution;

280: i.e.\ \beqn

281: \label{p}

282: \avg{n} &=& \lambda_0^{(1)} t, \\

283: \label{sig}

284: \sigma^2 &=& \lambda_0^{(2)} t, \eeqn and higher order terms are

285: needed for higher cumulants only.  Since Eqn.~(\ref{cu})

286: takes $\chi\to 0$, this permits a perturbative approach similar to

287: that used in quantum mechanics \cite{Griffiths}, with $\chi$ treated as a small parameter.

288:

289: Specifically, we write $\H=\H^{(0)}+\H^{(1)}\sum_{m=1}^\infty(i\chi)^m/m!$

290: where (for the MM case)

291: \ifthenelse{\value{col} = 1}{

292: \beq

293: \label{MMH}

294: \H_A^{(0)}=	\begin{pmatrix}

295: 	-k_1 S	& k_{-1} + k_2	\\

296: 	k_1 S	& -(k_{-1} + k_2)

297: 	\end{pmatrix},

298: \qquad

299: \H_A^{(1)}=	k_2\begin{pmatrix}

300: 	0	& 1\\

301: 	0	& 0

302: \end{pmatrix}, \eeq

303: }{

304: \beqn

305: \label{MMH}

306: \H_A^{(0)}&=&	\begin{pmatrix}

307: 	-k_1 S	& k_{-1} + k_2	\\

308: 	k_1 S	& -(k_{-1} + k_2)

309: 	\end{pmatrix},\\

310: \H_A^{(1)}&=&	k_2\begin{pmatrix}

311: 	0	& 1\\

312: 	0	& 0

313: \end{pmatrix}, \eeqn

314: }

315: and we truncate at $m=2$.  We emphasize that this truncation does not

316: introduce any further approximation if one is interested only in the

317: first and second cumulants of the product molecule distribution.

318: The least negative eigenvalue of $\H^{(0)}$ is $\lambda_0^{(0)}=0$

319: \footnote{More precisely, $\H^{(0)}$ is a propensity matrix whose columns

320:   sum to zero, which means one of its eigenvalues is zero and the rest

321:   are negative \cite{vanKampen}.}, and the higher order corrections are

322: given by \cite{Griffiths}

323: \beqn

324: \label{l1}

325: \lambda_0^{(1)} &=& \bra{u_0^{(0)}}\H^{(1)}\ket{u_0^{(0)}},\\

326: \label{l2}

327: \lambda_0^{(2)} &=& \lambda_0^{(1)}-2\sum_{j\ne 0}\frac{1}{\lambda_j^{(0)}}\bra{u_0^{(0)}}\H^{(1)}\ket{u_j^{(0)}}\bra{u_j^{(0)}}\H^{(1)}\ket{u_0^{(0)}}.

328: \eeqn

329:

330: Noting Eqns.\ (\ref{p}-\ref{sig}), the rate of product formation

331: $V=d\avg{n}/dt$ and the Fano factor $F=\sigma^2/\avg{n}$ can now be

332: written: \beqn

333: \label{V}

334: V&=&\lambda_0^{(1)},\\

335: \label{F}

336: F&=&\lambda_0^{(2)}/\lambda_0^{(1)}.  \eeqn

337: For the MM case (Fig.~\ref{cartoon}A), this gives \beqn

338: \label{VMM}

339: V_A&=&V_A^{\rm max}\frac{S}{S+K_A},\\

340: \label{FMM}

341: F_A&=&1-\alpha_A\frac{S}{(S+K_A)^2}, \eeqn where $V_A^{\rm max}=k_2$,

342: $K_A=(k_2+k_{-1})/k_1$, and $\alpha_A=2k_2/k_1$.  The expression for

343: mean flux $V_A$ is well-known \cite{Michaelis}, and $K_A$ is called

344: the Michaelis constant; the expression for the Fano factor $F_A$ is

345: less familiar.
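
As a check on Eqns.\ (\ref{VMM}-\ref{FMM}), the perturbative sums can be evaluated by hand for the MM case. The following is a sketch rather than part of the original derivation: the shorthand $a=k_1S$ and $b=k_{-1}+k_2$ is introduced here for brevity, the eigenvectors are normalized so that $\ip{u_j^{(0)}}{u_l^{(0)}}=\delta_{jl}$, and the second-order term is evaluated with left and right eigenvectors kept distinct, i.e.\ as the product $\bra{u_0^{(0)}}\H^{(1)}\ket{u_j^{(0)}}\bra{u_j^{(0)}}\H^{(1)}\ket{u_0^{(0)}}$. The eigenpairs of $\H_A^{(0)}$ are
\begin{eqnarray*}
\lambda_0^{(0)}=0: && \ket{u_0^{(0)}}=\frac{1}{a+b}\begin{pmatrix} b\\ a\end{pmatrix},\quad \bra{u_0^{(0)}}=(1,1),\\
\lambda_1^{(0)}=-(a+b): && \ket{u_1^{(0)}}=\begin{pmatrix} 1\\ -1\end{pmatrix},\quad \bra{u_1^{(0)}}=\frac{1}{a+b}\,(a,-b).
\end{eqnarray*}
With $\H_A^{(1)}$ from Eqn.\ (\ref{MMH}), the required matrix elements are
\begin{eqnarray*}
\bra{u_0^{(0)}}\H_A^{(1)}\ket{u_0^{(0)}}&=&\frac{k_2a}{a+b},\\
\bra{u_1^{(0)}}\H_A^{(1)}\ket{u_0^{(0)}}&=&\frac{k_2a^2}{(a+b)^2},\\
\bra{u_0^{(0)}}\H_A^{(1)}\ket{u_1^{(0)}}&=&-k_2,
\end{eqnarray*}
so that
\[
\lambda_0^{(1)}=\frac{k_2a}{a+b},\qquad
\lambda_0^{(2)}=\lambda_0^{(1)}-\frac{2k_2^2a^2}{(a+b)^3}.
\]
Substituting $a=k_1S$ and $a+b=k_1(S+K_A)$ gives $V_A=k_2S/(S+K_A)$ and $F_A=1-(2k_2/k_1)S/(S+K_A)^2$, in agreement with Eqns.\ (\ref{VMM}-\ref{FMM}).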

346:

347: This procedure is fully extendible to other more complicated

348: enzyme-mediated reactions.  The reaction scheme determines the master

349: equation and thus $\H^{(0)}$ and $\H^{(1)}$. Specifically, $\hat{H}^{(0)}$

350: is given by the Markov transition matrix for the enzymatic states

351: (disregarding the $n$ variable), and $\hat{H}^{(1)}$ has a $1$ marking

352: every rate where the product gets created, and a $-1$ where it is

353: destroyed. Then Eqns.\ (\ref{V}-\ref{F}) give the product formation

354: rate and the Fano factor, and higher orders in perturbation theory

355: would provide more cumulants. To illustrate the breadth of the method,

356: in the next section, we apply this procedure to three reaction schemes

357: that include multiple intermediate reaction steps.

358:

359:

360: \section{Results: Complex enzymatic reactions}

361:

362: \subsection{Product distribution statistics}

363:

364: Many enzyme-mediated reactions involve intermediate steps, and it is

365: instructive to illustrate our approach with three prototypical examples, shown in Fig.~\ref{cartoon}B-D.

366:

367: \subsubsection{Reaction scheme B}

368:

369: Fig.~\ref{cartoon}B depicts a case in which the complex undergoes an

370: intermediate step, such as a conformational change, before creating

371: the product \cite{frenzen}. This kinetic scheme is also equivalent to

372: certain ion channels \cite{Bezrukov}.  Such multistep enzymatic

373: reactions have been shown (including via our method here) to reduce

374: noise in chemical reactions \cite{Doan}.  The master equation

375: describing this system is \beqn

376: \frac{dP^E_n}{dt}&=&-k_1SP^E_n+k_{-1}P^{ES}_n+k_2P^{EP}_{n-1},\\

377: \frac{dP^{ES}_n}{dt}&=&k_1SP^E_n-(k_{-1}+k_+)P^{ES}_n+k_{-}P^{EP}_n,\\

378: \frac{dP^{EP}_n}{dt}&=&k_+P^{ES}_n-(k_{-}+k_2)P^{EP}_n, \eeqn which

379: yields

380: \ifthenelse{\value{col} = 1}{

381: \beq \H_B^{(0)} = \begin{pmatrix}

382:   -k_1 S & k_{-1}	   & k_2			\\

383:   k_1 S	 & -(k_{-1} + k_{+}) & k_{-}			\\

384:   0 & k_+ & -(k_{-} + k_2)

385: \end{pmatrix},

386: \qquad

387: \H_B^{(1)} =

388: k_2\begin{pmatrix}

389: 0	& 0		& 1	\\

390: 0	& 0		& 0	\\

391: 0	& 0		& 0

392: \end{pmatrix}.

393: \eeq

394: }{

395: \beqn

396:  \H_B^{(0)} &=& \begin{pmatrix}

397:   -k_1 S & k_{-1}	   & k_2			\\

398:   k_1 S	 & -(k_{-1} + k_{+}) & k_{-}			\\

399:   0 & k_+ & -(k_{-} + k_2)

400: \end{pmatrix},\\

401: \H_B^{(1)} &=&

402: k_2\begin{pmatrix}

403: 0	& 0		& 1	\\

404: 0	& 0		& 0	\\

405: 0	& 0		& 0

406: \end{pmatrix}.

407: \eeqn

408: }

409: The product flux and Fano factor are then

410: \beqn

411: \label{VB}

412: V_B&=&V_B^{\rm max}\frac{S}{S+K_B}\\

413: \label{FB}

414: F_B&=&1-\alpha_B\frac{S(S+K'_B)}{(S+K_B)^2}

415: \eeqn

416: where $V_B^{\rm max}=k_2k_+/(k_2+k_++k_{-})$,

417: $K_B=(k_2k_++k_2k_{-1}+k_{-1}k_{-})/(k_1(k_2+k_++k_{-}))$,

418: $\alpha_B=2k_2k_+/(k_2+k_++k_{-})^2$,

419: and $K'_B=(k_2+k_++k_{-}+k_{-1})/k_1$.

420:

421: \subsubsection{Reaction scheme C}

422:

423: Fig.~\ref{cartoon}C depicts a case in which the enzyme exists in an

424: inactive and an active state. The enzyme switches autonomously between

425: these states, but can only react with the substrate in its active

426: form. Note that in this case we have two isolated reactions, since the

427: enzyme remains in the active state when a product is produced.  This

428: scheme can be interpreted as a toy model for a voltage-gated ion

429: channel that can only transmit a single molecule at a time

430: \cite{hh}. Alternatively, this scheme could be a model for the

431: production-degradation and subsequent translation of mRNA ($E^*$) by

432: ribosomes ($S$) into protein ($P$). Finally, this is also an extreme model of an enzyme that has internal states with different rates of product formation, such as studied in \cite{English}. For this scheme we can write the

433: following master equation: \beqn

434: \frac{dP^E_n}{dt}&=&-k_+P^E_n + k_{-}P^{E^*}_n\\

435: \frac{dP^{E^*}_n}{dt}&=&k_+P^E_n-k_{-}P^{E^*}_n+k_2P^{E^*S}_{n-1}

436: \ifthenelse{\value{col} = 1}{}{\\ &&}

437: -k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n\\

438: \frac{dP^{E^*S}_n}{dt}&=&-k_2P^{E^*S}_{n}-k_{-1}P^{E^*S}_n+k_1SP^{E^*}_n

439: \eeqn which yields

440: \ifthenelse{\value{col} = 1}{

441: \beq \H_C^{(0)} =

442: \begin{pmatrix}

443:   -k_+ 		& k_{-}			& 0						\\

444:   k_+ 		& -(k_{-}+k_1 S)	& k_{-1} + k_2			\\

445:   0		& k_1 S		& -(k_{-1} + k_2)

446: \end{pmatrix},

447: \qquad

448: \H_C^{(1)} =

449: k_2\begin{pmatrix}

450: 0	& 0		& 0	\\

451: 0	& 0		& 1	\\

452: 0	& 0		& 0

453: \end{pmatrix}.

454: \eeq

455: }{

456: \beqn \H_C^{(0)} &=&

457: \begin{pmatrix}

458:   -k_+ 		& k_{-}			& 0						\\

459:   k_+ 		& -(k_{-}+k_1 S)	& k_{-1} + k_2			\\

460:   0		& k_1 S		& -(k_{-1} + k_2)

461: \end{pmatrix},\\

462: \H_C^{(1)} &=&

463: k_2\begin{pmatrix}

464: 0	& 0		& 0	\\

465: 0	& 0		& 1	\\

466: 0	& 0		& 0

467: \end{pmatrix}.

468: \eeqn

469: }

470: The product flux and Fano factor are then

471: \beqn

472: \label{VC}

473: V_C &=& V^{\rm max}_C \frac{S}{S+K_C},\\

474: \label{FC}

475: F_C &=& 1 - \alpha_C\frac{S}{(S+K_C)^2}, \eeqn

476: where $V^{\rm max}_C=k_2$,

477: $K_C=(k_+{+}k_{-})(k_2+k_{-1})/(k_{+}k_1)$, and

478: $\alpha_C=2k_2[1+k_-(k_+-k_2-k_{-1})/k_+^2]/k_1$.  Note that these expressions

479: reduce to those for the MM reaction (Eqns.\ (\ref{VMM}-\ref{FMM})) for

480: $k_-\to 0$, since this limit corresponds to the enzyme always being in

481: the active state.  Note also that since $\alpha_C$ can be negative,

482: $F_C$ can be greater than 1 (and in fact it is infinite in the limit

483: of rare activation $k_+\to 0$) due to the compounded noise from the

484: independent stochastic processes of enzyme activation and complex

485: formation. Under the interpretation of this scheme as protein

486: translation, $F\gg1$ corresponds to many proteins in a translation

487: burst from a single rare mRNA.

488:

489: \subsubsection{Reaction scheme D}

490:

491: Figure \ref{cartoon}D shows a third example of a more complex reaction

492: scheme, in which an active enzyme transforms a substrate into a

493: product and, in contrast to scheme C, returns to its inactive state in

494: the process.  The enzyme must switch back to its active state for a

495: new reaction to occur. Similar dynamics have been found for the

496: $\beta$-galactosidase enzyme \cite{English}. Alternatively, this can be a model for an enzyme that transfers a phosphate group to a

497: substrate, and needs to reacquire a new phosphate group before

498: continuing to function as an enzyme.  For this scheme, we can write the

499: following master equation: \beqn

500: \frac{dP^E_n}{dt}&=&-k_{+}P^E_n + k_{2}P^{E^*S}_{n-1},\\

501: \frac{dP^{E^*}_n}{dt}&=&k_{+}P^E_n-k_1SP^{E^*}_n+k_{-1}P^{E^*S}_n,\\

502: \frac{dP^{E^*S}_n}{dt}&=&k_1SP^{E^*}_n-k_{-1}P^{E^*S}_{n}-k_2P^{E^*S}_n,

503: \eeqn which yields

504: \ifthenelse{\value{col} = 1}{

505: \beq \H_D^{(0)} =

506: \begin{pmatrix}

507:   -k_+	& 0  		& k_2			\\

508:   k_+		& -k_1S	& k_{-1}		\\

509:   0		& k_1S	& -(k_{-1} + k_2)

510: \end{pmatrix},

511: \qquad

512: \H_D^{(1)} =

513: k_2\begin{pmatrix}

514: 0	& 0		& 1	\\

515: 0	& 0		& 0	\\

516: 0	& 0		& 0

517: \end{pmatrix}.

518: \eeq

519: }{

520: \beqn \H_D^{(0)} &=&

521: \begin{pmatrix}

522:   -k_+	& 0  		& k_2			\\

523:   k_+		& -k_1S	& k_{-1}		\\

524:   0		& k_1S	& -(k_{-1} + k_2)

525: \end{pmatrix},\\

526: \H_D^{(1)} &=&

527: k_2\begin{pmatrix}

528: 0	& 0		& 1	\\

529: 0	& 0		& 0	\\

530: 0	& 0		& 0

531: \end{pmatrix}.

532: \eeqn

533: }

534: The product flux and the Fano factor are then

535: \beqn

536: \label{VD}

537: V_D &=& V^{\rm max}_D \frac{S}{S+K_D},\\

538: \label{FD}

539: F_D &=& 1 - \alpha_D\frac{S(S+K'_D)}{(S+K_D)^2}, \eeqn where $V^{\rm

540:   max}_D=k_2k_{+}/(k_2+k_{+})$,

541: $K_D=k_+(k_2+k_{-1})/(k_1(k_2+k_{+}))$,

542: $\alpha_D=2k_2k_+/(k_2+k_{+})^2$ and $K'_D=(k_2+k_{+}+k_{-1})/k_1$.

543: Note that these expressions reduce to those for the MM reaction

544: (Eqns.\ (\ref{VMM}-\ref{FMM})) for $k_+\to\infty$, since this limit

545: corresponds to the immediate reversion of the enzyme to its active

546: state following a product formation.

547:

548: All four reactions in Fig.~\ref{cartoon} use an enzyme to convert a

549: substrate into a product, but as we have derived using the present

550: method, the statistical properties of the product molecule

551: distributions differ among the cases.

552:

553: \subsection{Measurable differences between reaction schemes}

554: Since different reactions have different statistical properties, it

555: should be possible to use our methods and results to differentiate

556: among the underlying reactions based on experimental observations.

557: Here we demonstrate how basic measurements can differentiate among the

558: four reaction schemes presented above.

559:

560: The mean product formation rates $V$ for all four reaction schemes A,

561: B, C and D shown in Fig.~\ref{cartoon}, Eqns.\ (\ref{VMM}, \ref{VB},

562: \ref{VC}, \ref{VD}), are qualitatively similar functions of substrate

563: concentration $S$, and it would not be possible to differentiate the

564: schemes based on mean data alone (see Fig.\ \ref{plots}).  Measurement

565: of the Fano factor $F$ [Eqns.\ (\ref{FMM}, \ref{FB}, \ref{FC},

566: \ref{FD})], however, can reveal qualitative and quantitative features

567: that can differentiate among these schemes, which we outline here and

568: summarize in Table \ref{tab}.

569:

570: First, a distinction is possible based on the asymptotic value of $F$

571: at saturating substrate concentration, $S\to\infty$.  For reaction schemes A

572: and C, \beq F_{A,C}(S\rightarrow\infty)=1, \eeq whereas for reaction

573: schemes B and D, \beq F_{B,D}(S\rightarrow\infty)=1-\alpha_{B,D}, \eeq

574: where $\alpha_B$ and $\alpha_D$ are defined following Eqns.\

575: (\ref{FB}) and (\ref{FD}) respectively.  This expression has a minimum

576: value $1/2$ in the limits $k_2=k_{+}\gg k_{-}$ for B

577: and $k_2=k_{+}$ for D.  Thus a saturation value of $F$ that is

578: significantly less than 1 offers evidence for reaction scheme B or D

579: over A or C (see Fig.~\ref{plots}).
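
The bound on the saturation value follows from the inequality $(x+y)^2\ge 4xy$ (a short check added here for completeness). For scheme D,
\[
\alpha_D=\frac{2k_2k_+}{(k_2+k_+)^2}\le\frac{1}{2},
\]
with equality only for $k_2=k_+$, so that $1/2\le F_D(S\to\infty)<1$. For scheme B the additional $k_-$ in the denominator can only decrease $\alpha_B$, so the same bound holds and is approached for $k_2=k_+\gg k_-$.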

580:

581: Second, distinctions are possible based on the value $F^*$ at the

582: extremum of

583: the Fano factor as a function of substrate concentration $S$.  For an

584: MM reaction (case A), there is a minimum:\beq

585: \label{FstarA}

586: F^*_A=1-\frac{\alpha_A}{4K_A} = 1-\frac{1}{2}\frac{k_2}{k_2+k_{-1}} ,

587: \eeq

588: which is always

589: between $1/2$ (for $k_2\gg k_{-1}$) and 1 (for $k_{-1}\gg k_2$).

590: Similarly, for reaction scheme C, we obtain

591: \beq

592: \label{FstarC}

593: F^*_C=1-\frac{\alpha_C}{4K_C},

594: \eeq where $\alpha_C$

595: and $K_C$ are defined following Eqn.\ (\ref{FC}).  This expression

596: also has a minimum value of $1/2$ (for $k_+\gg k_{-}$ and $k_2 \gg

597: k_{-1}$), but, unlike in the MM case, it can become larger than 1 if

598: $k_+(k_++k_-)<k_-(k_2+k_{-1})$ (see Fig.\ \ref{plots}).  Indeed, as

599: mentioned, in the limit of rare activation $k_+\to 0$, we find

600: $F^*\rightarrow\infty$.
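
The location and depth of this extremum follow from a one-line calculation (a sketch added for completeness; $S^*$ denotes the substrate concentration at the extremum, and scheme subscripts are dropped). For schemes A and C, $F=1-\alpha S/(S+K)^2$, and
\[
\frac{d}{dS}\,\frac{S}{(S+K)^2}=\frac{K-S}{(S+K)^3},
\]
which vanishes at $S^*=K$, where $S/(S+K)^2=1/(4K)$. This gives $F^*=1-\alpha/(4K)$, as in Eqns.\ (\ref{FstarA}) and (\ref{FstarC}): a minimum when $\alpha>0$ and a maximum when $\alpha<0$. For scheme A, $\alpha_A/(4K_A)=(2k_2/k_1)\,k_1/[4(k_2+k_{-1})]=k_2/[2(k_2+k_{-1})]$.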

601:

602: Depending on the kinetic rates, reaction schemes B and D

603: may or may not have a minimum for positive $S$

604: (see Fig.\ \ref{plots} for an example of each).

605: In the cases for which a minimum exists, \beq

606: \label{FstarB}

607: F^*_{B,D}=1-\frac{\alpha_{B,D}}{4}\frac{{K'}_{B,D}^2}{K_{B,D}(K'_{B,D}-K_{B,D})},

608: \eeq where $\alpha_B$, $K_B$, and $K'_B$ are defined following Eqn.\

609: (\ref{FB}) and $\alpha_D$, $K_D$, and $K'_D$ are defined following

610: Eqn.\ (\ref{FD}).  This expression has the minimum value $1/3$ in the

611: limit $k_+=k_2\gg k_{-1}$ for both schemes (and additionally $k_{+}\gg

612: k_{-}$ for B).  In the reaction scheme B, these limits reduce the

613: system to a linear irreversible three-step cascade; an $L$-step

614: irreversible cascade has minimum $F^*$ of $1/L$ in the analogous

615: limits \cite{Doan}.  Comparing with the MM minimum value of $F^*=1/2$,

616: it is clear that a measured value of $F^*$ less than $1/2$ is a strong

617: indication that more than one intermediate step is present

618: \footnote{In all schemes A, B, C, and D, $F^*$ is dependent on $k_1$

619:   through $S^*$, which explicitly ensures that $k_1S=k_2$; this is a

620:   commonly known result, and it may be used by nature to suppress

621:   noise in natural signaling systems such as phototransduction

622:   \cite{Doan}.}.

623:

624: Lastly, distinctions can be made based on measurement of $S^*$, the

625: substrate concentration at which an extremum in $F$ occurs.

626: For cases A and C, \beq \frac{S^*_{A,C}}{K_{A,C}}=1, \eeq where $K_A$

627: and $K_C$ are defined following Eqns.\ (\ref{FMM}) and (\ref{FC})

628: respectively, and, as in all four cases, $K$ is the concentration at

629: which $V$ is half-maximal. For cases B and D, on the other hand (when

630: there is a minimum), \beq

631: \label{SstarB}

632: \frac{S^*_{B,D}}{K_{B,D}}=\frac{K'_{B,D}}{K'_{B,D}-2K_{B,D}}, \eeq

633: where $K_B$ and $K'_B$ are defined following Eqn.\ (\ref{FB}) and

634: $K_D$ and $K'_D$ are defined following Eqn.\ (\ref{FD}).  This

635: expression is bounded from below by 1

636: %(e.g.\ for $k_+\gg k_2 \gg k_{-1}$ for both schemes, and additionally $k_+\gg k_-$ for B),

637: (e.g.\ for $k_+\gg \{k_-,k_2,k_{-1}\}$ for B, or for $k_+\gg \{k_2,k_{-1}\}$ for D),

638: but can potentially be infinite

639: %(e.g.\ for $k_-=k_{-1}\gg k_2$ and $k_-\gg k_+$ for B, or for $k_{-1} \gg k_2=k_+$ for D).

640: (e.g.\ for $k_-=k_{-1}\gg \{k_2,k_+\}$ for B, or for $k_{-1} \gg k_2=k_+$ for D).

641: This implies that if

642: an extremum of the Fano factor occurs at a substrate concentration

643: significantly different from that at which the mean product formation

644: rate is half-maximal, it is a strong indication that more than one

645: intermediate step is present.
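
To illustrate Eqn.\ (\ref{SstarB}) with purely hypothetical numbers, suppose that a fit yields $K'_B=4K_B$. Then
\beq
\frac{S^*_B}{K_B}=\frac{4K_B}{4K_B-2K_B}=2,
\eeq
so the minimum of $F$ sits at twice the half-maximal concentration. Taking instead $K'_B=2.5K_B$ gives $S^*_B/K_B=2.5/0.5=5$, and the ratio grows without bound as $K'_B\rightarrow 2K_B$ from above, while $K'_B\gg K_B$ returns the lower bound $S^*_B/K_B\rightarrow 1$.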

646:

647: Table \ref{tab} summarizes these distinctions, and Figure \ref{plots}

648: showcases the qualitative differences in the Fano factor curves among

649: the four reaction schemes caused by differences in the underlying

650: kinetics. For more complicated reaction schemes, such as multiple

651: substrate binding by the enzyme, modeled by a high Hill coefficient,

652: the Fano factor curve would gain even more distinguishing features,

653: such as additional extrema and/or inflection points

654: \footnote{We leave this as an exercise for future q-bio Summer School students.}.

655:

656: \begin{figure}

657: \centering

658: \ifthenelse{\value{col} = 1}{

659: \includegraphics[width = .8\textwidth]{Vplot_inset_New.pdf}

660: \includegraphics[width = .8\textwidth]{Fplot_inset_New.pdf}

661: }{

662: \includegraphics[width = .47\textwidth]{Vplot_inset_New.pdf}

663: \includegraphics[width = .47\textwidth]{Fplot_inset_New.pdf}

664: }

665: \linespread{1}

666: \caption{\label{plots} Mean product flux (also called dose-response

667:   curve) $V$ and the Fano factor $F$ versus substrate concentration

668:   $S$ for the four cases in Fig.~\ref{cartoon}: A solid, B dashed, C

669:   dotted, and D dot-dashed.  Plots are of Eqns.\ (\ref{VMM}, \ref{VB},

670:   \ref{VC}, \ref{VD}) for $V$ and Eqns.\ (\ref{FMM}, \ref{FB},

671:   \ref{FC}, \ref{FD}) for $F$, with $k_1=1$, $k_{-1}=1$,

672:   $k_2=1$, $k_+=0.1$, and $k_-=0.01$.  Note that while there are no

673:   qualitative differences in $V$ (and in fact all curves collapse when

674:   $V$ is normalized by $V^{\rm max}$ and $S$ by $K$, as seen in the

675:   inset), features can appear in $F$ that signify that a process is

676:   more complicated than the single-intermediate case A.}

677: \end{figure}

678:

679: \begin{table}

680: \centering

681: \begin{tabular}{|c|c|c|c|c|}

682: \hline

683: &A		&B		&C	&D				\\ \hline

684: $F(S\rightarrow\infty)$	&$1$	&$\left[\frac{1}{2}, 1\right]$	&$1$	&$\left[\frac{1}{2}, 1\right]$		\\ \hline

685: $F^*$		&$\left[\frac{1}{2}, 1\right]$	&$\left[\frac{1}{3}, 1\right]$	&$\left[\frac{1}{2}, \infty\right)$	&$\left[\frac{1}{3}, 1\right]$\\ \hline

686: $S^*/K$		&$1$		&$\left[1, \infty\right)$	&$1$   &$\left[1, \infty\right)$					\\ \hline

687: \end{tabular}

688: \linespread{1}

689: \caption{Bounds on experimentally measurable quantities that are useful in

690:   distinguishing among schemes for enzyme-mediated reactions.  A, B,

691:   C, and D refer to reaction schemes in Fig.~\ref{cartoon}.

692:   Star ($^*$) denotes the extremum of the Fano factor, such that $F^*$ is the

693:   minimum or maximum value and $S^*$ is the substrate concentration at which

694:   it occurs.

695:   $K$ is the substrate concentration at which product formation rate $V$ is

696:   half-maximal.  Generally speaking, minimum bounds on all three quantities

697:   occur when forward reaction rates dominate backward rates, and maximum

698:   bounds occur when backward rates dominate forward rates; see text for

699:   more details.}

700: \label{tab}

701: \end{table}

702:

703: \subsection{Extracting reaction rates from data}

704: In addition to helping one distinguish among competing reaction

705: schemes, experimental measurement of the dose-response curve $V(S)$

706: and the Fano curve $F(S)$ can be used to determine the kinetic rates

707: of the underlying biochemical reactions.  If the structure of the

708: biochemical reaction is known, analytical expressions for both curves

709: in terms of the kinetic rates and the substrate concentration can be

710: obtained using our method [see e.g.\ Eqns.\ (\ref{VMM}-\ref{FMM},

711: \ref{VB}-\ref{FB}, \ref{VC}-\ref{FC}, \ref{VD}-\ref{FD})] and can be

712: fit to experimental data.  Often, measurements of the

713: qualitative features of both curves (such as those highlighted in

714: Table \ref{tab}) are enough to extract the kinetic rates; for more

715: complex reactions a full fit to the data would be necessary.

716: Additionally we note that performing full fits of experimental data to

717: the analytical expressions may also help in the original task of

718: distinguishing among (or at least eliminating) different biochemical

719: reaction schemes.

720:

721: The MM reaction is an example of a case in which measurement of the qualitative

722: features is enough to extract all kinetic rates.  However, it is

723: important to note that in order to do this, one needs both the dose-response

724: curve and the Fano curve.  In particular, one needs only to measure

725: the reaction rate at saturation $V^{\rm max}_A$, the substrate

726: concentration $K_A$ at which the rate is half maximal, and the minimum

727: value of the Fano curve $F_A^*$.  Then, from Eqn.\ (\ref{FstarA}) and

728: the expressions following Eqn.\ (\ref{FMM}), one obtains \beqn

729: k_2&=&V_A^{\rm max},\\

730: k_{-1}&=&\frac{F_A^*-1/2}{1-F_A^*}V_A^{\rm max},\\

731: k_1&=&\frac{V_A^{\rm max}}{2K_A(1-F_A^*)}.  \eeqn Instead of obtaining

732: only $k_2$ and a combination of $k_1$ and $k_{-1}$ by measuring only

733: the dose-response curve (as is traditionally done for MM reactions),

734: we now have analytical expressions for all three rates.
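
As a worked illustration with hypothetical measurements (the numbers are chosen for convenience and are not taken from data), suppose $V_A^{\rm max}=10\,{\rm s}^{-1}$, $K_A=2\,\mu{\rm M}$, and $F_A^*=3/4$. The three expressions above then give
\beqn
k_2&=&10\,{\rm s}^{-1},\\
k_{-1}&=&\frac{3/4-1/2}{1-3/4}\times 10\,{\rm s}^{-1}=10\,{\rm s}^{-1},\\
k_1&=&\frac{10\,{\rm s}^{-1}}{2\times 2\,\mu{\rm M}\times(1-3/4)}=10\,\mu{\rm M}^{-1}\,{\rm s}^{-1}.
\eeqn
As a consistency check, assuming the standard MM half-saturation constant $K_A=(k_2+k_{-1})/k_1$, these rates return $K_A=20/10=2\,\mu{\rm M}$, as required.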

735:

736: For more complex reaction schemes, a similar analysis can be performed

737: to obtain analytical expressions for the kinetic rates in terms of the

738: experimental data.  However, it can be the case that not all rates can

739: be determined unambiguously from measurements of $V$ and $F$ (for the

740: reaction scheme B, for example, symmetries in the inverted expressions

741: imply that measurements of $V$ and $F$ do not always uniquely

742: determine the five unknown kinetic rates).

743: %In these cases, a full fit of

744: %the dose-response and the Fano curves to the data would be required to

745: %extract the rates.

746: When experimentally feasible, one

747: may also compare higher moments of the measured product molecule

748: distributions with those calculated via our method.

749:

750: \section{Discussion}

751: We have developed a method of solving chemical master equations for

752: multistep enzymatic reactions using a perturbation theory approach

753: analogous to that encountered in quantum mechanics. With this method,

754: finding cumulants of the distribution of product molecules is

755: equivalent to diagonalizing a matrix with dimensionality equal to the

756: number of internal states in the kinetic diagram of the reaction.

757: Then obtaining the first $m$ cumulants of the reaction can be done by

758: solving the perturbation theory to $m$th order, which is

759: straightforward. In particular, the first two cumulants $\avg{n}$ and

760: $\sigma^2$ together define the dose-response curve $V=d\avg{n}/dt$ and

761: the Fano factor $F=\sigma^2/\avg{n}$. As both are currently measurable

762: in a variety of systems, comparing the calculated $F$ to experimental

763: data can be used to identify the underlying structure of molecular

764: reactions.

765:

766: We have applied this perturbation theory approach to four different

767: reaction schemes, starting with the simplest Michaelis-Menten

768: kinetics, and progressing to more complicated kinetic schemes with

769: internal states. We calculated the dose-response curve and the Fano

770: factor for each as functions of the substrate

771: concentration. Importantly, while the dose-response curves for all of

772: the considered reactions are qualitatively similar, prominent

773: qualitative features of the Fano factor curve (such as its values at

774: large substrate concentrations, as well as the position and value

775: of its extremum) allow us to disambiguate the considered reaction

776: schemes.  Performing detailed fits of the curves to experimental data

777: (when feasible) can be an ultimate test for whether the underlying

778: kinetic structure is known.

779:

780: For the MM reaction, knowing just a handful of features of the $V(S)$

781: and $F(S)$ curves allows us to derive all three rates that completely

782: define the kinetic scheme, while the entire dose-response curve alone is

783: insufficient for this purpose. Similar results hold for the reactions

784: with intermediate steps, but here the analytical treatment is more

785: difficult, and often qualitative properties of $F$ alone do not define all of the

786: underlying kinetic parameters. Instead, a quantitative fit of

787: derived expressions for $F(S)$ to experimental data would be required.

788:

789: We stress that the kinetic schemes analyzed in this article are simple

790: toy models only. However, extending our analysis to more complicated

791: schemes to derive the first few cumulants of the product number

792: distribution is not difficult, and it can be automated with just a

793: simple linear-algebra solver. In particular, calculation of the Fano

794: factor for a signaling cascade as in \cite{Doan} or for a complex

795: network of single protein conformations \cite{Li} is

796: straightforward. It should be noted, however, that generating

797: sufficient experimental data to distinguish minute details of

798: competing kinetic schemes is not easy. Our approach simplifies the

799: problem somewhat since it does not require single-molecule kinetic

800: data, as in \cite{English,Li}, but it is based on measuring a

801: mesoscopic, fluctuating flux. Still, ideally qualitative differences

802: would dominate the disambiguation task, as emphasized with the toy

803: models considered here.

804:

805: \section*{Acknowledgments}

806: The authors would like to thank the organizers, the lecturers, the

807: participants, and the sponsors of the $2^{\rm nd}$ q-bio Summer School

808: on Cellular Information Processing in Los Alamos, NM. We are also

809: thankful to Michael Wall for careful reading of the manuscript. WdR

810: was supported by the research program of the ``Stichting voor

811: Fundamenteel Onderzoek der Materie,'' which is financially supported

812: by The Netherlands Organization for Scientific Research.  BCD was

813: supported by NSF Grant DMR-0705167. AM was supported by NSF Grant

814: DGE-0742450.  NAS and IN were supported by DOE under Contract No.\

815: DE-AC52-06NA25396. IN was further supported by NSF Grant No.\

816: ECS-0425850.

817:

818: \bibliographystyle{apsrev}

819: \bibliography{mm}

820:

821: %\section{Supplementary information}

822: %\input{SI.tex}

823: \end{document}

824:

825: