
1: \documentclass[prl,twocolumn,floatfix]{revtex4}

2: \usepackage{graphicx,amsmath,amssymb,multirow}

3: \begin{document}

4: \title{Comparative study of the transcriptional regulatory networks

5: of E. coli and yeast:

6: Structural characteristics leading to marginal dynamic stability}

7: \author{Deok-Sun Lee}

8: \altaffiliation[Present address: ]{Department of Physics, University of Notre Dame, Notre Dame, Indiana 46556, USA}

9: \affiliation{Theoretische Physik, Universit\"{a}t des Saarlandes,

10:   66041 Saarbr\"{u}cken, Germany}

11: \author{Heiko Rieger}

12: \affiliation{Theoretische Physik, Universit\"{a}t des Saarlandes,

13:   66041 Saarbr\"{u}cken, Germany}

14: \date{\today}

15: \begin{abstract}

16: Dynamical properties of the transcriptional regulatory network of {\it

17: Escherichia coli} and {\it Saccharomyces cerevisiae} are studied

18: within the framework of random Boolean functions.  The dynamical

19: response of these networks to a single point mutation is characterized

20: by the number of mutated elements as a function of time and the

21: distribution of the relaxation time to a new stationary state, which

22: turn out to differ between the two networks. Comparison with the

23: behavior of randomized networks reveals relevant structural

24: characteristics other than the mean connectivity, namely the

25: organization of circuits and the functional form of the in-degree

26: distribution. The abundance of single-element circuits in {\it

27: E. coli} and the power-law in-degree distribution of {\it

28: S. cerevisiae} shift their dynamics towards marginal stability

29: overcoming the restrictions imposed by their mean connectivities,

30: which is argued to be related to the simultaneous presence of

31: robustness and adaptivity in living organisms.

32: \end{abstract}

33: \maketitle

34:

35: \section{Introduction}

36: Living organisms depend simultaneously on a stable internal

37: environment and a capability to adapt to a fluctuating external

38: environment~\cite{causton01}. Since the biological characteristics of

39: an organism are determined by the interplay between its gene

40: repertoire and the regulatory apparatus~\cite{babu04}, robustness and

41: adaptiveness should be generic features of the molecular

42: interactions composing the gene regulation machinery.  The

43: organization of the gene transcriptional regulatory network has been

44: analyzed for numerous organisms, in particular for the prokaryote {\it

45: Escherichia coli} ({\it E. coli})

46: ~\cite{thieffry98,dobrin04,shenorr02} and the eukaryote {\it

47: Saccharomyces cerevisiae} ({\it S. cerevisiae})

48: ~\cite{guelzim02,tilee02,luscombe04}.

49:

50: Adaptivity of an organism implies the production of different cell

51: types with different functions from the same genome. This begins with

52: transcription regulated by certain proteins, the transcription factors

53: (TFs)~\cite{orphanides02}. The identification of the target genes for

54: each TF allows the construction of a gene transcriptional regulatory

55: network, where the nodes are the genes or operons that produce TF's or

56: are regulated by TF's, and the directed edges indicate a regulatory

57: dependence: A directed edge from node $A$ to node $B$ implies that a

58: TF encoded by gene $A$ is involved in the regulation of the expression

59: of gene $B$. The expression level of each gene defines the dynamical

60: state of the network.  To achieve robustness and adaptiveness at the

61: same time one expects the regulatory network dynamics to be neither

62: chaotic nor fully insensitive to perturbations, but marginally

63: stable. Structural characteristics of the network must support these

64: dynamical features.

65:

66: Our study reveals specific topological features in the transcriptional

67: regulatory network architecture of {\it E. coli} and {\it S. cerevisiae} that

68: shift the dynamics towards marginal stability. {\it E. coli}'s network has

69: a very low mean connectivity, the number of edges per node, which would lead

70: in random networks to a high stability thus deteriorating adaptiveness.

71: But we find that single-element circuits, which are anomalously abundant

72: in {\it E. coli}'s network, help mutations triggered by random perturbations

73: to persist, favoring an unstable dynamical behavior.

74: {\it S. cerevisiae} on the other hand has a

75: sufficiently high mean connectivity which favors chaotic dynamics in random

76: networks deteriorating stability. Here we find that {\it S. cerevisiae}'s

77: network has a broad (algebraic) node degree distribution and

78: we demonstrate the stabilizing effect of this feature upon the dynamics.

79:

80: Practically, the information about the transcriptional regulatory

81: network structure - which TF binds to which gene - is available via

82: the chromatin-immunoprecipitation microarray experiments

83: ~\cite{tilee02}. The question, whether a specific TF enforces or

84: inhibits the expression of a specific target gene, has to be studied

85: separately. However, those individual interactions do not necessarily

86: occur independently and these regulatory interactions are often

87: combinatorial~\cite{hwa03} and time-, cell cycle-, or

88: environment-dependent, limiting the available information on the

89: complete regulation profile. Generic dynamical features then have to

90: be extracted using model interactions as suggested by

91: Kauffman~\cite{kauffman}: One digitizes the continuous expression

92: level to a Boolean variable, $0$ (inactive) and $1$ (active), and

93: assumes a static regulation rule for each gene in the form of a

94: random Boolean function that determines the gene's state at the

95: next time step from the current states of its regulators. Here {\it

96: random} means that the output value of these Boolean functions is $0$

97: or $1$ with equal probabilities.

98:

99: Based on considerations of random Boolean networks with a fixed number

100: of regulators $k$ for every element, Kauffman \cite{kauffman}

101: hypothesized that distinct stationary states - limit cycles -

102: correspond to different types of cells. This idea got some support

103: from the agreement of the scaling behavior of the number of

104: limit-cycles for $k=2$-random Boolean networks and the number of cell

105: types with respect to the genome size, but was also

106: debated~\cite{samuelsson03,klemm05}. Among networks with fixed

107: in-degree, $k=2$ is a critical point distinguishing two different

108: dynamical phases: stable and unstable against perturbations,

109: suggesting that the regulatory network dynamics of living organisms is

110: ``on the edge'' between order and chaos~\cite{kauffman}.

111:

112: However, real regulatory networks do not have a fixed in-degree but a

113: heterogeneous connectivity, and even their average in-degree $\langle

114: k\rangle$ is usually different from $2$. Nevertheless the Boolean

115: model itself is useful, and recently the effects of the nature of the

116: regulating rules on the dynamical stability were studied within its

117: framework~\cite{harris02,kauffman0304}. We propose that the network

118: structure itself is also relevant for the stability/instability aspect

119: mentioned before. Therefore we construct a network from the data for

120: the transcriptional regulatory interactions for {\it E. coli} and {\it

121: S. cerevisiae}, and study how a point mutation, i.e., an altered

122: dynamical state of a single element, spreads over the whole network by

123: inducing another mutation through regulatory interactions.

124:

125: \begin{figure}

126: \includegraphics[width=0.8\columnwidth]{f1.eps}

127: \caption{An example of Boolean dynamics. (a) A Boolean network of four nodes and three

128:   directed edges. Each node has a Boolean variable $\sigma_i$ ($i=A,B,C,D$).

129:   (b) Regulating rules $f_i$'s determining the node $i$'s state at time $t+1$ with

130:   its regulators' states at time $t$ as input.

131:   The nodes $A$ and $B$ have no regulator

132:   and their Boolean variables take constant values, respectively, at time $t+1$

133:   regardless of their values at time $t$.

134:   (c) An example of the time evolution of those Boolean

135:   variables under the regulating rules in (b).}

136: \label{fig:model}

137: \end{figure}

138:

139: \section{Method}

140: {\it Datasets} ---

141: For the transcriptional regulatory network in {\it E. coli}, we used

142: the data of Ref.~\cite{shenorr02}, which are based on an existing

143: database, RegulonDB, and enhanced by literature search. The resultant

144: network consists of $418$ operons and $519$ interactions with $111$

145: nodes having at least one outward edge. The data for {\it

146: S. cerevisiae} are taken from Ref.~\cite{tilee02} and were obtained

147: from the combination of Chromatin Immunoprecipitation and DNA

148: microarray analysis.  We chose the P value threshold $0.01$, yielding

149: a network of $4555$ nodes and $12455$ directed edges with $112$ nodes

150: having at least one outward edge.  Isolated nodes and those possessing

151: only self-regulation have been excluded in both networks since they

152: have no interaction with other elements.

153:

154: {\it Random Boolean functions} ---

155: These experimental data establish a directed network $G$ of $N$ nodes,

156: and we assign a dynamic Boolean variable $\sigma_i$ (that can take on

157: the values $0$ or $1$ only, corresponding to an inactive or active

158: state, respectively) to each node $i$.  These dynamical variables

159: evolve synchronously via $\sigma_i(t+1)=f_i(\sigma_{i_1}(t),

160: \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t))$, with the nodes

161: $i_1, i_2, \ldots, i_{k_i}$ having the outward edges incident on the

162: node $i$. The output value of $f_i$ for each input configuration

163: $\{\sigma_{i_1}(t), \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t)\}$

164: is $0$ with probability $p$ or $1$ with probability $1-p$, which is

165: determined at the beginning and not changed with time. If $k_i=0$,

166: $\sigma_i$ is fixed at $f_i$; $\sigma_i(t+1)=f_i$ regardless of the

167: value of $\sigma_i(t)$.  The parameter $p$ characterizes the

168: randomness of the regulating rules: If $p=0$ or $1$, the dynamics is

169: frozen while the system tends to be disordered with $p=1/2$. An

170: example network with this Boolean dynamics is given in

171: Fig.~\ref{fig:model}.

172:

173: {\it Stability measure} ---

174: The stability of a time-trajectory $\Sigma(t)$ is assessed by the

175: effects of a point mutation $\sigma_i \to 1-\sigma_i$ on the dynamical

176: evolution of the subsequent states. For this, we choose a

177: configuration $\Sigma = \{\sigma_1,\sigma_2,\ldots,\sigma_N\}$, and

178: prepare its mutant,

179: $\hat{\Sigma}=\{\hat{\sigma}_1,\hat{\sigma}_2,\ldots,\hat{\sigma}_N\}$,

180: where $\hat{\sigma}_i = \sigma_i$ for all $i$ except $j$ with $j$

181: chosen arbitrarily. Evolving $\Sigma$ and $\hat{\Sigma}$ on the same

182: network with the same regulating rules, we count $n_{\rm m} (t)$, the

183: number of elements $i$'s with

184: $\sigma_i(t)\ne \hat{\sigma}_i(t)$, at

185: each time step $t$.

186: A node with $\Delta \sigma_i(t) \equiv |\sigma_i(t)-\hat{\sigma}_i(t)|>0$

187: is considered as mutated. We average $n_{\rm m}(t)$ over different realizations of

188: the regulating rules and different initial pairs of configurations to get the

189: average, $N_{\rm m}(t)=\langle n_{\rm m} (t)\rangle$, which converges

190: to its stationary value $N_{\rm m}$.

191: For each individual normal-mutant pair $(\Sigma,\hat{\Sigma})$, one can measure

192: the relaxation time $t_{\rm r}$ after which $n_{\rm m}(t)$ reaches

193: its stationary value. Its distribution $P(t_{\rm r})$ is investigated as well.

194:
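The counting described above can be written compactly in terms of the per-node difference $\Delta\sigma_i(t)$. The following is only a restatement of the definitions in the text; the symbol $R$ and the superscript $(r)$ are labels introduced here for illustration, denoting the number of sampled combinations of regulating rules and initial pairs and the index of one such sample:
\[
n_{\rm m}(t)=\sum_{i=1}^{N}\Delta\sigma_i(t),
\qquad
N_{\rm m}(t)=\frac{1}{R}\sum_{r=1}^{R} n_{\rm m}^{(r)}(t).
\]
By construction $n_{\rm m}(0)=1$ for every pair, because $\Sigma$ and $\hat{\Sigma}$ differ only at the node $j$.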

195: \begin{figure}

196: \includegraphics[width=\columnwidth]{f2a.eps}

197: \includegraphics[width=\columnwidth]{f2b.eps}

198: \includegraphics[width=\columnwidth]{f2c.eps}

199: \caption{Number of mutated elements

200:   $N_{\rm m}(t)$ and $N_{\rm m}=\lim_{t\to\infty} N_{\rm m}(t)$ and distribution of the

201:     relaxation time $P(t_{\rm r})$.

202:   (a) Plot of the stationary value $N_{\rm m}$ versus $\lambda=2p(1-p)$

203:   in the original network and two types of randomized graphs (see the

204:   text for the definition) for {\it E. coli}. The data are

205:   averages over $10^2$ initial pairs of configurations for each of more than

206:   $10^3$ realizations of regulating rules. The approximation given in

207:   Eq.~(\ref{eq:ecoliapprox}) is drawn together. The inset shows the time developments

208:   $N_{\rm m}(t)$ for selected values of $\lambda$ in the original {\it E. coli}

209:   network. (b) The same data as (a) for {\it S. cerevisiae}.

210:   (c) Plots of $P(t_{\rm r})$ with $p=1/2$ ($\lambda=1/2$) on the original networks and the

211:   randomized graphs for {\it E. coli} and {\it S. cerevisiae}.}

212: \label{fig:NmP}

213: \end{figure}

214:

215: \section{Results}

216: \subsection{Time evolution of the number of mutated elements}

217: Figure~\ref{fig:NmP} (a) and (b) present

218: the results for the number of mutated elements

219: $N_{\rm m}(t)$ and $N_{\rm m}$.

220: $N_{\rm m}(t)$ decreases very rapidly

221: from $N_{\rm m}(0)=1$ to a much smaller value  for all $p$'s

222: in {\it E. coli}. On the other hand, $N_{\rm m}(t)$ for {\it S. cerevisiae}

223: increases with time up to a value larger than $1$ for $\lambda \equiv 2p(1-p)

224: \gtrsim 0.42$ ($0.3\lesssim p \lesssim 0.7$)  indicating the occurrence of

225: a mutation cascade. Both in {\it E. coli} and {\it S. cerevisiae},

226: $N_{\rm m}$ increases with increasing $p$ from $0$ to $1/2$ (or decreasing

227: $p$ from $1$ to $1/2$) since the probability that a regulating rule

228: yields different output values from different input configurations is

229: $2p (1-p)$, which has a maximum at $p=1/2$ and will be denoted by $\lambda$.

230: In {\it E. coli}, $N_{\rm m}$ stays smaller than $0.3$,

231: indicating that  system-wide mutations are suppressed.

232: Figure~\ref{fig:NmP} also shows that in {\it S. cerevisiae} $N_{\rm m}$ is smaller than in

233: {\it E. coli} for $\lambda\lesssim 0.2$ but increases with $\lambda$ more rapidly and is

234: larger for $\lambda\gtrsim 0.2$.

235:
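The origin of $\lambda=2p(1-p)$ can be made explicit. As a sketch, assume that the outputs assigned to two distinct input configurations are drawn independently, as in the definition of the regulating rules in the Method section. The two outputs then differ when one is $0$ and the other is $1$:
\[
\lambda = p(1-p)+(1-p)p = 2p(1-p)\leq \frac{1}{2},
\]
with equality at $p=1/2$. Inverting this relation gives $p=\frac{1}{2}\left(1\pm\sqrt{1-2\lambda}\right)$, so that $\lambda=0.42$ corresponds to $p=\frac{1}{2}(1\pm 0.4)$, i.e., $p=0.3$ or $p=0.7$, which is the range of $p$ quoted above.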

236: The functional form of $P(t_{\rm r})$ for $p=1/2$ in Fig.~\ref{fig:NmP} (c)

237: is strikingly different between

238: {\it E. coli} and {\it S. cerevisiae}: it is exponential for {\it E. coli} and

239: a power-law, $P(t_{\rm r})\sim t_{\rm r}^{-1.5(2)}$, for {\it S. cerevisiae}.

240: This long tail of $P(t_{\rm r})$ implies that in the case of {\it S. cerevisiae}

241: an element can be mutated and recover even at very late times in the dynamics.

242:

243: \subsection{Mean connectivity}

244: These differences in the mutation spread dynamics may be

245: primarily attributed to a difference in the  mean connectivity and

246: can be understood by a mean-field approach~\cite{derrida86,aldana03}:

247: The probability $H(t)=\lim_{N\to\infty} N_{\rm m}(t)/N$ that a randomly chosen node

248: $i$ is mutated at time $t$, also called the Hamming distance,

249: is given in terms of the probability that a  regulator of the node $i$ is mutated,

250: which we denote by $\bar{H}(t)$, and the

251: probability that the regulating rule $f_i$ yields different output values

252: from different input configurations, $\lambda$, as

253: \begin{eqnarray}

254: H(t+1)&=& \sum_{k} \lambda (1 - (1 - \bar{H}(t))^k) P_d(k),

255:   \nonumber\\

256: \bar{H}(t+1)&=& \sum_{k,q} \lambda (1 - (1 - \bar{H}(t))^k) \frac{q P_d(k,q)}{\langle q\rangle}.

257: \label{eq:sc}

258: \end{eqnarray}

259: Here $P_d(k,q)$ is the joint probability that a node has in-degree $k$ and

260: out-degree $q$ and is related to the in-degree distribution $P_d(k) = \sum_q

261: P_d(k,q)$. $H(t)$ and $\bar{H}(t)$ evolve towards their stationary values

262: $H$ and $\bar{H}$. Setting $\bar{H}(t+1)=\bar{H}(t)=\bar{H}$ and expanding

263: the second line of Eq.~(\ref{eq:sc}) for small $\bar{H}$, one finds

264: $\bar{H}\simeq \bar{H}\lambda \langle kq\rangle/\langle q\rangle  -

265: \bar{H}^2\lambda \langle k^2q\rangle/(2\langle q \rangle) +

266: \mathcal{O}(\bar{H}^3)$

267: provided $\langle q\rangle$, $\langle kq\rangle $, and $\langle k^2

268: q\rangle$ are all finite. Therefore $\bar{H}$ and $H$ are zero for

269: $\lambda$ smaller than a critical value $\lambda_c$ with

270: $\lambda_c=1/K$ and $K\equiv \langle kq\rangle/\langle q\rangle$ and

271: non-zero otherwise. The expression $\lambda_c=K^{-1}$ for the critical

272: point holds true as long as $K$ is finite. Because $\lambda=2p(1-p)\leq 1/2$, the Hamming distance

273: $H$ can be positive only if $K>2$; hence $N_{\rm m}\simeq HN$ for finite $N$

274: should be small in {\it E. coli} that has the value $K\simeq 1.08$ and

275: can be large, of order $N$, for $\lambda\gtrsim 0.42$ in {\it

276: S. cerevisiae} that has $K\simeq 2.35$.

277: Although the Hamming distance is not necessarily of order $N^{-1}$

278: at $\lambda_c$, one finds the

279: value of $\lambda$ for which $N_{\rm m}=1$ very close to the value

280: $K^{-1}\simeq 0.42$ in the latter.

281: The in-degree $k$ and the out-degree $q$ show no significant correlation

282: in the two networks according to our analysis not presented here,

283: that is, $P_d(k,q)\simeq P_d(k)P_d(q)$ , which yields $\langle kq \rangle

284: \simeq \langle k\rangle \langle q\rangle$ and $K\simeq \langle k\rangle$.

285:
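The step from the small-$\bar{H}$ expansion to the critical point can be spelled out as follows. This is only a sketch based on the truncated expansion quoted above, with the coefficient of the quadratic term taken as written there. Dividing the stationary condition by $\bar{H}\neq 0$ and solving for $\bar{H}$ gives
\[
\bar{H}\simeq \frac{2\langle q\rangle}{\langle k^2 q\rangle}\,
\frac{\lambda K-1}{\lambda}
=\frac{2\langle kq\rangle}{\langle k^2 q\rangle}
\left(1-\frac{\lambda_c}{\lambda}\right),
\qquad \lambda_c=\frac{1}{K},
\]
which is positive only for $\lambda>\lambda_c$. For {\it E. coli}, $K\simeq 1.08$ gives $\lambda_c\simeq 0.93$, which lies outside the accessible range $\lambda\leq 1/2$, whereas for {\it S. cerevisiae} the value $\lambda_c=K^{-1}\simeq 0.42$ quoted in the text lies inside it.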

286: \subsection{Comparison with randomized networks}

287: Next we studied the same dynamics in two kinds of randomized networks

288: derived from the regulatory networks of {\it E. coli} and {\it

289: S. cerevisiae}. The first type of randomized graphs (type I) are

290: constructed by the repetition of removing an edge connecting nodes

291: $v_1$ and $w_1$ and creating a new one between $v_2$ and $w_2$, where

292: both $v_1$ and $v_2$ had at least one outward edge and the node pair

293: $v_2$ and $w_2$ were not connected before this change. Thus these

294: type-I randomized networks have the same number of nodes, edges, and

295: TF's as the original networks, but the edges connect randomly-chosen

296: pairs of TF and target gene. Our results for $N_{\rm m}$ and $P(t_{\rm

297: r})$ are shown in Fig.~\ref{fig:NmP}. For the type-I randomized graphs

298: derived from {\it E. coli}, $N_{\rm m}$ is substantially suppressed as

299: compared with the original network. In the type-I random graphs

300: derived from {\it S. cerevisiae}, $N_{\rm m}$ increases much more

301: rapidly passing $\lambda\simeq 0.3$. The relaxation time distribution

302: for the random graphs from {\it E. coli} is broader than for the

303: original network but still decays faster than that for {\it

304: S. cerevisiae}. The type-I randomization does not change significantly

305: the relaxation time distribution for {\it S. cerevisiae}.

306:

307: The type-II randomized graphs we considered are constructed by

308: exchanging the end points of two edges: Two randomly chosen edges $e_1

309: = (v_1, w_1)$ and $e_2 = (v_2, w_2)$ are replaced by $e_1' = (v_1,

310: w_2)$ and $e_2' = (v_2, w_1)$, respectively. These graphs preserve the

311: joint degree distribution $P_d(k,q)$, but their local connectivity

312: patterns may be different from that in the original network.  We

313: present the plots of $N_{\rm m}$ and $P(t_{\rm r})$ in

314: Fig.~\ref{fig:NmP}. This type-II randomization does not change the

315: relaxation time distribution for {\it S. cerevisiae} either. Thus

316: the much faster decay of the relaxation time in the original and

317: randomized networks for {\it E. coli} than in those for {\it

318: S. cerevisiae} can be ascribed to the much lower mean connectivity,

319: $\langle k\rangle \simeq 1.24$, of the former than that of the latter,

320: $\langle k\rangle \simeq 2.73$. Interestingly the quantities $N_{\rm

321: m}$ and $P(t_{\rm r})$ for these randomized graphs agree well with

322: those for the original network of {\it S. cerevisiae}, but not for

323: {\it E. coli}: This implies that it is the degree distribution that is

324: mainly responsible for the spread of mutation in {\it S. cerevisiae}

325: while other (local) structural factors must be important in {\it

326: E. coli}.

327:
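That the type-II exchange preserves $P_d(k,q)$ follows from a simple count. Writing $k_v$ and $q_v$ for the in-degree and out-degree of a node $v$ (notation introduced here only for illustration), the replacement of $(v_1,w_1)$ and $(v_2,w_2)$ by $(v_1,w_2)$ and $(v_2,w_1)$ removes and adds exactly one outward edge at each of $v_1$ and $v_2$, and exactly one inward edge at each of $w_1$ and $w_2$:
\begin{eqnarray*}
q_{v_1}&\to& q_{v_1}-1+1=q_{v_1},\qquad q_{v_2}\to q_{v_2}-1+1=q_{v_2},\\
k_{w_1}&\to& k_{w_1}-1+1=k_{w_1},\qquad k_{w_2}\to k_{w_2}-1+1=k_{w_2}.
\end{eqnarray*}
All other degrees are untouched, so every node keeps its pair $(k,q)$. Circuits, by contrast, are not conserved. Assuming that self-regulating edges take part in the exchange like any other edge, a self-regulation $(v_1,v_1)$ exchanged with $(v_2,w_2)$, where $v_2\neq v_1$ and $w_2\neq v_1$, becomes $(v_1,w_2)$ and $(v_2,v_1)$, and the single-element circuit is lost.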

328: \begin{figure}

329: \includegraphics[width=\columnwidth]{f3.eps}

330: \caption{Network structure dependence of mutation spread.

331:   The regulating rules are given by

332:     $f_i(\sigma)= \sigma$ or $1-\sigma$ for nodes $i$'s with one input and

333:     $f_i = 1$ or $0$ for nodes $i$'s with no input. Thus a mutated regulator

334:     necessarily makes its target node mutated at the next time step. Time evolution of

335:     $\Delta \sigma_i = |\sigma_i - \hat{\sigma}_i|$ for each node is shown in

336:     tables.

337:     (a) No circuit (tree structure).  All nodes recover at $t=3$ and thus the Hamming

338:     distance $H$ is zero. (b) A circuit

339: of length $3$. The point mutation circulates with period $3$, resulting in $H=1/3$.

340: (c) A single-element circuit together with tree structure. All

341: nodes are mutated at $t=2$ and thus

342: $H=1$.}

343: \label{fig:tree-circuit}

344: \end{figure}

345:

346: \begin{figure}

347: \includegraphics[width=0.9\columnwidth]{f4.eps}

348: \caption{Organization of the core in {\it E. coli} and {\it S. cerevisiae}.

349: (a) Core of {\it E. coli}. It consists of $57$ nodes and $84$ edges. (b)

350:   Core of {\it S. cerevisiae}. It has $63$ nodes and $167$ edges. (c)

351: Histogram of the shortest circuit lengths.

352: In {\it E. coli}, no circuit longer than $1$

353: is observed; all $54$ circuits are single-element ones.

354: In {\it S. cerevisiae}, $836$ pairs of nodes

355: among all possible $1953$ pairs in the core are connected by circuits

356: and the shortest circuit length ranges from $0$ to $19$.}

357: \label{fig:core}

358: \end{figure}

359:

360: \subsection{Abundance of single-element circuits in {\it E. coli}}

361: One might expect that circuits (directed closed paths) in the

362: regulatory network play an important role for the spread of mutations,

363: because in networks with a tree-structure, i.e., without circuits,

364: point mutations spread without circulation and a node that is mutated

365: will recover at the next time step and never become mutated again as

366: indicated in Fig.~\ref{fig:tree-circuit} (a). The nodes on a circuit,

367: on the other hand, can return to a mutated state even after recovery

368: [Fig.~\ref{fig:tree-circuit} (b)]. The nodes lying on circuits or

369: those on bridges connecting distinct circuits can in principle switch

370: their status permanently and thus they can be considered as comprising

371: a core in the dynamics of mutation spread.  As a subnetwork including

372: all such circuits and the bridges connecting them, we define the core

373: of a network as the maximal subgraph in which each node has at least

374: one inward edge coming from and at least one outward edge incident to

375: an element of the core.

376:

377: By deleting the edges having at either end a node that does not meet

378: the requirement for the core elements, we found the core subnetwork in

379: the regulatory networks of {\it E. coli} and {\it S. cerevisiae}. Note

380: that if an edge has the same node at both ends, the node, which

381: regulates itself, becomes the element of the core. The relevance of

382: the core to the mutation spread dynamics can be understood e.g., by

383: investigating the relaxation time distribution $P(t_{\rm r})$ in {\it

384: S. cerevisiae} depending on the location of the initial point

385: mutation. Our analysis shows that initial mutations in the core lead

386: to a qualitatively equal (power-law with the same exponent)

387: distribution of the relaxation time. On the other hand, initial

388: mutations in the output module, consisting of all nodes that have

389: inward edges coming from the nodes in the core and their edges, decay

390: very fast since the output module has a tree structure and cannot

391: cause mutations in the core.

392:

393: The organization of the core turns out to be very different in {\it

394: E. coli} and {\it S. cerevisiae} as shown in Fig.~\ref{fig:core} (a)

395: and (b), respectively. Most of all, the nodes are much more densely

396: connected in {\it S. cerevisiae} than in {\it E. coli}. This

397: difference can be first ascribed to different mean connectivities of

398: the nodes in the core: it is about $1.47$ in {\it E. coli} and $2.65$

399: in {\it S. cerevisiae}.  However, a more striking difference exists in

400: their core organization. In {\it E. coli}, $54$ circuits are

401: identified, all of which are single-element circuits representing

402: self-regulation. There are no circuits whose length (i.e., the number of

403: edges on the circuit) is larger than $1$~\cite{thieffry98}. In

404: contrast, only one or two single-element circuits are formed in its

405: randomized graphs. This organization of circuits in {\it

406: E. coli} is also contrasted with the one in {\it S. cerevisiae}. We

407: computed the shortest circuit for each pair of nodes in the core and

408: counted the numbers of node pairs for each given shortest-circuit

409: length.  The distribution of shortest-circuit length obtained for {\it

410: S. cerevisiae} is broad as shown in Fig.~\ref{fig:core} (c).  We

411: propose that the presence of single-element circuits in {\it

412: E. coli} is the main reason for the enhancement of $N_{\rm m}$ of {\it

413: E. coli} compared with both of its randomized graphs.  Once a node $i$

414: regulating itself is mutated, the input configurations to the

415: regulating rule $f_i$ are necessarily different between the

416: normal-mutant pair $(\Sigma,\hat{\Sigma})$ since it is guaranteed that

417: at least one of its regulators, the node $i$ itself, is

418: mutated. Recalling that a node can be mutated at the next time step

419: only if the input configurations from the normal-mutant pair are

420: different, one can see that single-element circuits have a higher

421: probability to be mutated than nodes which do not regulate themselves

422: [See Fig.~\ref{fig:tree-circuit} (c)]. Therefore networks with more

423: single-element circuits can be more adaptive.

424:

425: In the core of the {\it E. coli} network, $54$ edges are used for

426: single-element circuits and the remaining $30$ edges connect pairs of

427: distinct nodes. As a result, the network has many isolated nodes and

428: few small connected components, resulting in the rapid decay of the

429: relaxation time. In Fig.~\ref{fig:NmP} (c), we find that the

430: relaxation times observed in {\it E. coli} are mostly $1$ or $2$. From

431: this, we can analytically predict the value of $N_{\rm m}$ as a function of

432: $\lambda$. Suppose $N_{\rm m}(t)$ saturates no later than time

433: $2$. From Eq.~(\ref{eq:sc}), $\bar{H}(1)=\lambda K N^{-1} + {\cal

434: O}(N^{-2})$ since $\bar{H}(0)=N^{-1}$ and

435: \begin{equation}

436: N_{\rm m}\simeq N H(2) \simeq N \lambda K \bar{H}(1)\simeq

437: \lambda^2 K^2.

438: \label{eq:ecoliapprox}

439: \end{equation}

440: This is in good agreement with the true value as shown in

441: Fig.~\ref{fig:NmP} (a).

442:

443: \begin{figure}

444: \includegraphics[width=\columnwidth]{f5a.eps}

445: \includegraphics[width=\columnwidth]{f5b.eps}

446: \caption{Connectivity pattern and its effect on the critical behavior of the

447: Hamming distance. (a) In-degree distributions $P_d(k)$ for {\it E. coli} and {\it S. cerevisiae}.

448: For {\it S. cerevisiae}, its asymptotic behavior is a power-law, $P_d(k)\sim k^{-\gamma}$

449: with $\gamma\simeq 2.7(2)$. On the other hand, the observed values of $k$ are only up to $6$

450: and so it is hard to discern the functional form of $P_d(k)$ in {\it E. coli}. (b)

451: Hamming distance $H$ as a function of $\lambda$ numerically obtained from

452: Eq.~(\ref{eq:sc_simple}) with $P_d(k)$ of the static model~\cite{lee04}, which

453: has a power-law tail as $P_d(k)\sim k^{-\gamma}$ with the exponent $\gamma$ tunable.

454: The inset shows that

455: $H\sim \Delta$ commonly for  $\gamma\to\infty$ and $\gamma=3.5$, and that $H\sim \Delta^2$

456: for $\gamma=2.5$, in agreement with Eq.~(\ref{eq:beta}).}

457: \label{fig:critical}

458: \end{figure}

459:

460: \subsection{Power-law in-degree distribution in {\it S. cerevisiae}}

461: In {\it S. cerevisiae}, the most significant dynamical feature that we

462: found and that we need to explain is the slower increase of $N_{\rm

463: m}$ with $\lambda$ as compared with the type-I randomized graph, shown

464: in Fig.~\ref{fig:NmP} (b). Contrary to the type-II randomized graphs,

465: those of type-I do not preserve the degree distribution of the

466: original network. From this, we can conjecture that the degree

467: distribution of {\it S. cerevisiae} causes the slow increase of

468: $N_{\rm m}$. To check this, we analyze in detail the dependence of the

469: Hamming distance on the degree distributions.

470:

471: With uncorrelated in- and out-degree as is the case in the regulatory networks

472: considered here, Eq.~(\ref{eq:sc}) is reduced to $H(t)=\bar{H}(t)$ and

473: \begin{equation}

474: H(t+1) = \lambda \sum_k [1-(1-H(t))^k] P_d(k).

475: \label{eq:sc_simple}

476: \end{equation}

477: Thus the in-degree distribution $P_d(k)$ determines the behavior of

478: the Hamming distance $H(t)$. The in-degree distributions of {\it

479: E. coli} and {\it S. cerevisiae} shown in Fig.~\ref{fig:critical} (a)

480: are quite different from each other.  The maximum degree is $31$ in

481: {\it S. cerevisiae} while it is only $6$ in {\it E. coli}.

482: Furthermore, the log-log plot of $P_d(k)$ in {\it S. cerevisiae}

483: indicates that $P_d(k)\sim k^{-\gamma}$ with $\gamma\simeq

484: 2.7(2)$. The functional form of $P_d(k)$ for {\it E. coli} is hard to

485: determine because of the small range for observable $k$ values.  Note

486: that the in-degree distribution of the type-I randomized graphs obeys a

487: Poisson distribution, $P_d(k)=\langle k\rangle^k e^{-\langle

488: k\rangle}/k!$. Let us consider an in-degree distribution which has a

489: power-law tail, i.e., $P_d(k)\sim k^{-\gamma}$. Then, we find from

490: Eq.~(\ref{eq:sc_simple}) that the Hamming distance in the stationary

491: state behaves as $H\sim \Delta^\beta$ for $\lambda$ larger than the

492: critical value $\lambda_c$ with $\Delta\equiv \lambda/\lambda_c-1$ and

493: the critical exponent $\beta$ given by

494: \begin{equation}

495: \beta = \left \{

496: \begin{array}{ll}

497: 1 & (\gamma>3),\\

498: 1/(\gamma-2) & (2<\gamma<3).

499: \end{array}

500: \right.

501: \label{eq:beta}

502: \end{equation}

503: The derivation of Eq.~(\ref{eq:beta}) is given in the Appendix.

504: We restricted the range of $\gamma$ to $\gamma>2$ because the mean

505: connectivity diverges for $\gamma\leq 2$. When the in-degree is subject

506: to a Poisson distribution or an exponentially-decaying distribution, it

507: corresponds to $\gamma\to\infty$ and the critical behavior is

508: the same as that for $\gamma>3$. We present the numerical solution to

509: Eq.~(\ref{eq:sc_simple}) in Fig.~\ref{fig:critical} (b) for

510: $\gamma\to\infty$ (Poisson distribution), $\gamma=3.5$, and $\gamma=2.5$.

511:
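The two regimes of Eq.~(\ref{eq:beta}) can be understood heuristically from a small-$H$ expansion of Eq.~(\ref{eq:sc_simple}). The following is only a sketch, assuming a pure power-law tail and uncorrelated degrees so that $\lambda_c=1/\langle k\rangle$; the positive constants $a$ and $b$ are introduced here for illustration, and the Appendix remains the reference derivation. For small $H$ the sum behaves as
\[
\sum_k\left[1-(1-H)^k\right]P_d(k)\simeq
\left\{
\begin{array}{ll}
\langle k\rangle H - b H^2 & (\gamma>3),\\
\langle k\rangle H - a H^{\gamma-1} & (2<\gamma<3),
\end{array}
\right.
\]
where $b=\langle k(k-1)\rangle/2$ is finite for $\gamma>3$, while for $2<\gamma<3$ the second moment diverges and the tail produces a non-analytic term that dominates over $H^2$. Inserting this into the stationary condition $H=\lambda\sum_k[1-(1-H)^k]P_d(k)$ and using $\lambda\langle k\rangle-1=\Delta$ gives
\[
H\simeq \frac{\Delta}{\lambda b}\sim \Delta \quad (\gamma>3),
\qquad
H^{\gamma-2}\simeq \frac{\Delta}{\lambda a}
\;\Rightarrow\; H\sim \Delta^{1/(\gamma-2)} \quad (2<\gamma<3).
\]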

512: The increase of $\beta$ with decreasing $\gamma$ below $\gamma=3$

513: indicates a difference in the behavior of the Hamming distance near

514: the critical point between networks with $\gamma>3$ and those with

515: $2<\gamma<3$. Suppose we have two networks with a power-law in-degree

516: distribution $P_d(k)\sim k^{-\gamma}$: One has $\gamma=3.5$ and the

517: other has $\gamma=2.5$, and both have $\langle k\rangle=4$. Then, in

518: the region $0<\Delta =\lambda/\lambda_c-1\ll 1$, the Hamming distance

519: behaves as $H\sim \Delta$ for $\gamma=3.5$ and $H\sim \Delta^2$ for

520: $\gamma=2.5$: the former increases more rapidly than the latter in the

521: region $\Delta\ll 1$. Also the region where the Hamming distance

522: remains non-zero but small, e.g., $H\leq 0.05$, is larger with

523: $\gamma=2.5$ than with $\gamma=3.5$: it is given by $\lambda\in

524: (0.25,0.29]$ with $\gamma=3.5$ and $\lambda\in (0.25,0.35]$ with

525: $\gamma=2.5$.  Such dependence of the Hamming distance on the

526: in-degree exponent $\gamma$ can thus explain the different network

527: responses of {\it S. cerevisiae} and its type-I randomized

528: graphs. It is the broad in-degree distribution with $\gamma=2.7(2)$

529: that makes the number of mutated elements increase with $\lambda$ more

530: slowly than in the corresponding type-I randomized graphs that have

531: $\gamma\to\infty$. Owing to this slow increase of the Hamming

532: distance, {\it S. cerevisiae} can keep the size of mutation, which

533: would be much larger with random structures, small over a wider range

534: of the parameter $p$ or $\lambda$.
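For orientation, the two windows quoted above can be restated in terms of the reduced distance $\Delta=\lambda/\lambda_c-1$. This is only arithmetic on the numbers already given, with $\lambda_c=\langle k\rangle^{-1}=1/4$ for $\langle k\rangle=4$:
\[
\Delta_{\max}=\frac{0.29}{0.25}-1=0.16\quad(\gamma=3.5),\qquad
\Delta_{\max}=\frac{0.35}{0.25}-1=0.40\quad(\gamma=2.5).
\]
The range of $\Delta$ over which $H\leq 0.05$ is thus $2.5$ times wider for $\gamma=2.5$ than for $\gamma=3.5$.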

535:

536: \section{Conclusion}

537: We performed numerical experiments on the spread of mutation

538: to probe the dynamic stability of the recently unveiled networks of

539: gene transcriptional regulation of {\it E. coli} and {\it

540: S. cerevisiae} and provided analytical confirmation for the results by

541: analyzing their structural features. While the small number of edges

542: per node in {\it E. coli} fundamentally prohibits a global spread of

543: mutation, a relatively large number of edges in {\it S. cerevisiae}

544: enables a global mutation conditionally depending on the regulating

545: rules. We further identified the relevant structural features that

546: distinguish these networks from random graphs: all circuits of the

547: regulatory network of {\it E. coli} are single-element circuits and

548: the in-degree distribution of {\it S. cerevisiae} takes a power-law

549: form. Single-element circuits in {\it E. coli} have a higher probability

550: of being mutated than nodes without self-regulation. The broad in-degree

551: distribution in {\it S. cerevisiae} smooths the increase of the

552: number of mutated elements. This increase would be sharper for an

553: exponential distribution, as is the case in the random graphs.

554:

555: These biological networks appear to follow design principles that tend

556: to balance the size of mutation. The small mean connectivity of the

557: regulatory network of {\it E. coli} would restrict the size of

558: mutations drastically, which is compensated by the abundance of

559: single-element circuits that lead to the required enhancement of the

560: mutation size. In the case of {\it S. cerevisiae}, the global

561: characteristic of its regulatory network, a mean connectivity larger

562: than 2, would lead to a very large mutation size, but a very

563: heterogeneous interconnectivity pattern suppresses it. These local

564: structural features demonstrate that both genetic networks have

565: evolved, in spite of the restrictions imposed by the global

566: characteristics, in such a direction that they can stay dynamically

567: between stable (i.e., rarely mutated on a global scale) and unstable

568: (easily mutated).  Being neither stable nor unstable appears to be

569: necessary for living organisms to maintain a stable internal state

570: and to adapt themselves to a fluctuating external environment

571: simultaneously. Therefore our finding suggests that such a marginal

572: dynamic stability of the whole system is supported by a selected

573: structural organization of the internal systems on smaller scales, such as

574: the transcriptional regulatory network studied in this work. While we

575: have concentrated only on the average in-degree, the organization of

576: circuits, and the in-degree distribution of the network, further

577: structural analysis will be helpful to illuminate how structure

578: supports function.

579:

580: \acknowledgements

581: We thank Uri Alon and Richard A. Young for allowing us to use their data.

582: This work was supported by Deutsche Forschungsgemeinschaft (DFG).

583:

584: \appendix

585:

586: \section{Derivation of Eq.~(\ref{eq:beta}) from Eq.~(\ref{eq:sc_simple})}

587:

588: To find the behavior of $H=\lim_{t\to\infty} H(t)$ as a function of

589: $\lambda$ near the critical point $\lambda_c=\langle k\rangle^{-1}$,

590: we set $H(t+1)=H(t)=H$ and expand Eq.~(\ref{eq:sc_simple})

591: for small $H$, which leads to

592: \begin{equation}

593: H=\lambda \sum_{n=1}^\infty \frac{(-1)^{n+1}\langle k^n\rangle}{n!} H^n.

594: \label{eq:expand}

595: \end{equation}

596: Here $\langle k^n\rangle$ is the $n$th moment of the in-degree

597: distribution $P_d(k)$, i.e., $\langle k^n\rangle\equiv\sum_k k^nP_d(k)$.

598: It is finite for all $n$ only if $P_d(k)$ decays exponentially.

599: In this case, all the terms on the right-hand side of Eq.~(\ref{eq:expand})

600: are analytic and keeping the first two leading terms, one finds

601: that Eq.~(\ref{eq:expand}) is expressed as

602: $H\simeq \lambda \langle k\rangle H - \lambda\langle k^2\rangle H^2/2$.

603: This allows us to see that $H=0$ for $\lambda<\lambda_c=\langle k\rangle^{-1}$

604: and $H\sim \Delta$ with $\Delta \equiv (\lambda-\lambda_c)/\lambda_c$

605: for $\lambda>\lambda_c$.
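The linear dependence can be made explicit with a short sketch that keeps only the two terms retained above. Dividing the truncated equation by $H\neq 0$ and using $\lambda\langle k\rangle=1+\Delta$ gives, to leading order in $\Delta$,
\[
H\simeq\frac{2(\lambda\langle k\rangle-1)}{\lambda\langle k^2\rangle}
=\frac{2\langle k\rangle}{(1+\Delta)\langle k^2\rangle}\,\Delta
\simeq\frac{2\langle k\rangle}{\langle k^2\rangle}\,\Delta .
\]
As an illustration only, for a Poisson in-degree distribution, where $\langle k^2\rangle=\langle k\rangle^2+\langle k\rangle$, this reduces to $H\simeq 2\Delta/(\langle k\rangle+1)$.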

606:

607: When the in-degree distribution is asymptotically a power law,

608: $P_d(k)\sim k^{-\gamma}$, not all the moments $\langle k^n\rangle$ are

609: finite: $\langle k^n\rangle$ for $n>n_*$ with

610: $n_*= \lceil\gamma-2\rceil$ diverges as

611: $k_{\rm max}^{n-\gamma+1}/(n-\gamma+1)$, where $\lceil x\rceil$

612: is the smallest integer not smaller than $x$ and $k_{\rm max}$ is

613: the (average) largest in-degree. The largest in-degree diverges

614: as $N^{1/(\gamma-1)}$, which is derived from the relation

615: $\sum_{k>k_{\rm max}} P_d(k) \sim N^{-1}$. Thus

616: $\langle k^n\rangle \sim N^{(n-\gamma+1)/(\gamma-1)}$.

617: Such diverging terms are arranged as

618: $H^{\gamma-1} \sum_{n>n_*} (-1)^{n+1} [k_{\rm max} H]^{n-\gamma+1}/

619: [n!(n-\gamma+1)]$ on the right-hand side of Eq.~(\ref{eq:expand}).

620: Here the summation converges to a constant in the limit

621: $k_{\rm max}H\to\infty$ due to alternating signs and

622: fast decay of the coefficients~\cite{lee05}. Thus the small-$H$

623: expansion  of Eq.~(\ref{eq:expand}) reads as

624: $H = \lambda \sum_{n=1}^{n_*} (-1)^{n+1} \langle k^n\rangle H^n/n!

625: + \lambda ({\rm constant}) H^{\gamma-1} + \cdots$.

626: The $H^{\gamma-1}$ term is relevant to the critical behavior of $H$

627: for  $\gamma<3$ since it holds for $\gamma<3$ that

628: $H\simeq \lambda \langle k\rangle H + \lambda ({\rm const.}) H^{\gamma-1}$,

629: yielding $H\sim \Delta^{1/(\gamma-2)}$. On the other hand, the linear

630: and quadratic terms are relevant for $\gamma>3$ as for exponentially-decaying

631: in-degree distributions. In summary, the Hamming distance $H$

632: with a power-law in-degree distribution $P_d(k)\sim k^{-\gamma}$

633: behaves near the critical point as

634: \begin{equation}

635: H \sim \left\{

636: \begin{array}{cc}

637: \Delta & (\gamma>3),\\

638: \Delta^{1/(\gamma-2)} & (2<\gamma<3).

639: \end{array}

640: \right.

641: \label{eq:critical}

642: \end{equation}
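The exponent for $2<\gamma<3$ can be made explicit as follows. This is a sketch in which the unspecified constant is written as $-c$, with $c>0$ assumed so that a positive solution exists for $\Delta>0$. Dividing $H\simeq\lambda\langle k\rangle H-\lambda c H^{\gamma-1}$ by $H\neq 0$ and using $\lambda\langle k\rangle=1+\Delta$, one finds
\[
H^{\gamma-2}\simeq\frac{\lambda\langle k\rangle-1}{\lambda c}
=\frac{\langle k\rangle}{(1+\Delta)\,c}\,\Delta
\quad\Longrightarrow\quad
H\simeq\left(\frac{\langle k\rangle}{c}\,\Delta\right)^{1/(\gamma-2)} .
\]
For $\gamma=2.5$ this gives $H\sim\Delta^{2}$, whereas for $\gamma>3$ the $H^{\gamma-1}$ term is subdominant to the quadratic one at small $H$ and $H\sim\Delta$ is recovered.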

643:

644: \begin{thebibliography}{99}

645: \bibitem{causton01}

646: H.C. Causton {\it et al.}, Mol. Biol. Cell {\bf 12}, 323 (2001).

647: \bibitem{babu04}

648: M.M. Babu {\it et al.}, Curr. Opin. Struct. Biol. {\bf 14}, 283 (2004).

649: \bibitem{thieffry98}

650: D. Thieffry, A.M. Huerta, E. P\'{e}rez-Rueda, and J. Collado-Vides,

651:   Bioessays {\bf 20}, 433 (1998).

652: \bibitem{dobrin04}

653: R. Dobrin, Q.K. Beg, A.-L. Barab\'{a}si, and Z.N. Oltvai, BMC Bioinformatics

654: {\bf 5}, 10 (2004).

655: \bibitem{shenorr02}

656: S.~Shen-Orr, R.~Milo, S.~Mangan, and U.~Alon, Nature Genetics {\bf 31}, 64 (2002).

657: \bibitem{guelzim02}

658: N. Guelzim, S. Bottani, and F. K\'{e}p\`{e}s,

659:   Nature Genetics {\bf 31}, 60 (2002).

660: \bibitem{tilee02}

661: T.~I.~Lee {\it et al.}, Science {\bf 298}, 799 (2002).

662: \bibitem{luscombe04}

663: N.M. Luscombe {\it et al.}, Nature {\bf 431}, 308 (2004).

664: \bibitem{orphanides02}

665: G. Orphanides and D. Reinberg, Cell {\bf 108}, 439 (2002).

666: \bibitem{hwa03}

667: N. Buchler, U. Gerland, and T. Hwa, Proc. Natl. Acad. Sci. U.S.A. {\bf 100}, 5136 (2003).

668: \bibitem{kauffman}

669: S.~Kauffman, J.~Theor.~Biol. {\bf 22}, 437 (1969);

670: {\it The Origins of Order: Self-organization and Selection in Evolution}

671: (Oxford Univ. Press, Oxford, 1993).

672: \bibitem{samuelsson03}

673: B. Samuelsson and C. Troein, Phys. Rev. Lett. {\bf 90}, 098701 (2003).

674: \bibitem{klemm05}

675: K. Klemm and S. Bornholdt, Phys. Rev. E {\bf 72}, 055101 (2005).

676: \bibitem{harris02}

677: S.E. Harris, B.K. Sawhill, A. Wuensche, and S. Kauffman, Complexity {\bf 7}, 23 (2002).

678: \bibitem{kauffman0304}

679: S.~Kauffman, C.~Peterson, B.~Samuelsson, and C.~Troein,

680:   Proc.~Natl.~Acad.~Sci.~U.S.A. {\bf 100}, 14796 (2003);

681: {\it ibid.} {\bf 101}, 17102 (2004).

682: \bibitem{derrida86}

683: B.~Derrida and Y.~Pomeau, Europhys.~Lett. {\bf 1}, 45 (1986).

684: \bibitem{aldana03}

685: M.~Aldana and P.~Cluzel, Proc.~Natl.~Acad.~Sci.~U.S.A. {\bf 100}, 8713 (2003).

686: \bibitem{lee04}

687: D.-S. Lee, K.-I.~Goh, B.~Kahng, and D.~Kim, Nucl. Phys. B {\bf 696}, 351 (2004).

688: \bibitem{lee05}

689: D.-S. Lee, Phys. Rev. E {\bf 72}, 026208 (2005).

690: \end{thebibliography}

691:

692: \end{document}

693:

694: