1: \documentclass[prl,twocolumn,floatfix]{revtex4}
2: \usepackage{graphicx,amsmath,amssymb,multirow}
3: \begin{document}
4: \title{Comparative study of the transcriptional regulatory networks
5: of E. coli and yeast:
6: Structural characteristics leading to marginal dynamic stability}
7: \author{Deok-Sun Lee}
8: \altaffiliation[Present address: ]{Department of Physics, University of Notre Dame, Notre Dame, Indiana 46556, USA}
9: \affiliation{Theoretische Physik, Universit\"{a}t des Saarlandes,
10: 66041 Saarbr\"{u}cken, Germany}
11: \author{Heiko Rieger}
12: \affiliation{Theoretische Physik, Universit\"{a}t des Saarlandes,
13: 66041 Saarbr\"{u}cken, Germany}
14: \date{\today}
15: \begin{abstract}
16: Dynamical properties of the transcriptional regulatory network of {\it
17: Escherichia coli} and {\it Saccharomyces cerevisiae} are studied
18: within the framework of random Boolean functions. The dynamical
19: response of these networks to a single point mutation is characterized
20: by the number of mutated elements as a function of time and the
distribution of the relaxation time to a new stationary state, which
turn out to differ between the two networks. Comparison with the
23: behavior of randomized networks reveals relevant structural
24: characteristics other than the mean connectivity, namely the
25: organization of circuits and the functional form of the in-degree
26: distribution. The abundance of single-element circuits in {\it
27: E. coli} and the power-law in-degree distribution of {\it
28: S. cerevisiae} shift their dynamics towards marginal stability
29: overcoming the restrictions imposed by their mean connectivities,
30: which is argued to be related to the simultaneous presence of
31: robustness and adaptivity in living organisms.
32: \end{abstract}
33: \maketitle
34:
35: \section{Introduction}
36: Living organisms depend simultaneously on a stable internal
37: environment and a capability to adapt to a fluctuating external
38: environment~\cite{causton01}. Since the biological characteristics of
39: an organism are determined by the interplay between its gene
40: repertoire and the regulatory apparatus~\cite{babu04}, robustness and
41: adaptiveness should be generic features of the molecular
42: interactions composing the gene regulation machinery. The
43: organization of the gene transcriptional regulatory network has been
44: analyzed for numerous organisms, in particular for the prokaryote {\it
45: Escherichia coli} ({\it E. coli})
46: ~\cite{thieffry98,dobrin04,shenorr02} and the eukaryote {\it
47: Saccharomyces cerevisiae} ({\it S. cerevisiae})
48: ~\cite{guelzim02,tilee02,luscombe04}.
49:
50: Adaptivity of an organism implies the production of different cell
51: types with different functions from the same genome. This begins with
transcription regulated by certain proteins, the transcription factors
(TF's)~\cite{orphanides02}. The identification of the target genes for
54: each TF allows the construction of a gene transcriptional regulatory
55: network, where the nodes are the genes or operons that produce TF's or
56: are regulated by TF's, and the directed edges indicate a regulatory
57: dependence: A directed edge from node $A$ to node $B$ implies that a
TF encoded by gene $A$ is involved in the regulation of the expression
59: of gene $B$. The expression level of each gene defines the dynamical
60: state of the network. To achieve robustness and adaptiveness at the
61: same time one expects the regulatory network dynamics to be neither
62: chaotic nor fully insensitive to perturbations, but marginally
63: stable. Structural characteristics of the network must support these
64: dynamical features.
65:
66: Our study reveals specific topological features in the transcriptional
67: regulatory network architecture of {\it E. coli} and {\it S. cerevisiae} that
68: shift the dynamics towards marginal stability. {\it E. coli}'s network has
a very low mean connectivity (the number of edges per node), which in random
networks would lead to high stability and thus impair adaptiveness.
However, we find that single-element circuits, which are anomalously abundant
in {\it E. coli}'s network, help mutations triggered by random perturbations
to persist, favoring unstable dynamical behavior.
{\it S. cerevisiae}, on the other hand, has a
mean connectivity high enough to favor chaotic dynamics in random
networks, which impairs stability. Here we find that {\it S. cerevisiae}'s
77: network has a broad (algebraic) node degree distribution and
78: we demonstrate the stabilizing effect of this feature upon the dynamics.
79:
80: Practically, the information about the transcriptional regulatory
81: network structure - which TF binds to which gene - is available via
82: the chromatin-immunoprecipitation microarray experiments
~\cite{tilee02}. Whether a specific TF activates or
inhibits the expression of a specific target gene has to be studied
separately. Moreover, individual interactions do not necessarily
occur independently: regulatory interactions are often
combinatorial~\cite{hwa03} and time-, cell cycle-, or
environment-dependent, limiting the available information on the
complete regulation profile. Generic dynamical features then have to
90: be extracted using model interactions as suggested by
91: Kauffman~\cite{kauffman}: One digitizes the continuous expression
level to a Boolean variable, $0$ (inactive) or $1$ (active), and
assumes a static regulation rule for each gene in the form of a
random Boolean function that determines the gene's state at the
next time step from the current states of its regulators. Here {\it
96: random} means that the output value of these Boolean functions is $0$
97: or $1$ with equal probabilities.
98:
99: Based on considerations of random Boolean networks with a fixed number
100: of regulators $k$ for every element, Kauffman \cite{kauffman}
101: hypothesized that distinct stationary states - limit cycles -
102: correspond to different types of cells. This idea got some support
103: from the agreement of the scaling behavior of the number of
104: limit-cycles for $k=2$-random Boolean networks and the number of cell
105: types with respect to the genome size, but was also
106: debated~\cite{samuelsson03,klemm05}. Among networks with fixed
107: in-degree, $k=2$ is a critical point distinguishing two different
108: dynamical phases: stable and unstable against perturbations,
109: suggesting that the regulatory network dynamics of living organisms is
``on the edge'' between order and chaos~\cite{kauffman}.
111:
112: However, real regulatory networks do not have a fixed in-degree but a
heterogeneous connectivity, and even their average in-degree $\langle
114: k\rangle$ is usually different from $2$. Nevertheless the Boolean
115: model itself is useful, and recently the effects of the nature of the
116: regulating rules on the dynamical stability were studied within its
117: framework~\cite{harris02,kauffman0304}. We propose that the network
118: structure itself is also relevant for the stability/instability aspect
119: mentioned before. Therefore we construct a network from the data for
120: the transcriptional regulatory interactions for {\it E. coli} and {\it
121: S. cerevisiae}, and study how a point mutation, i.e., an altered
122: dynamical state of a single element, spreads over the whole network by
123: inducing another mutation through regulatory interactions.
124:
125: \begin{figure}
126: \includegraphics[width=0.8\columnwidth]{f1.eps}
127: \caption{An example of Boolean dynamics. (a) A Boolean network of four nodes and three
directed edges. Each node has a Boolean variable $\sigma_i$ ($i=A,B,C,D$).
129: (b) Regulating rules $f_i$'s determining the node $i$'s state at time $t+1$ with
130: its regulators' states at time $t$ as input.
131: The nodes $A$ and $B$ have no regulator
132: and their Boolean variables take constant values, respectively, at time $t+1$
133: regardless of their values at time $t$.
134: (c) An example of the time evolution of those Boolean
135: variables under the regulating rules in (b).}
136: \label{fig:model}
137: \end{figure}
138:
139: \section{Method}
140: {\it Datasets} ---
141: For the transcriptional regulatory network in {\it E. coli}, we used
142: the data of Ref.~\cite{shenorr02}, which are based on an existing
143: database, RegulonDB, and enhanced by literature search. The resultant
144: network consists of $418$ operons and $519$ interactions with $111$
145: nodes having at least one outward edge. The data for {\it
146: S. cerevisiae} are taken from Ref.~\cite{tilee02} and were obtained
147: from the combination of Chromatin Immunoprecipitation and DNA
148: microarray analysis. We chose the P value threshold $0.01$, yielding
149: a network of $4555$ nodes and $12455$ directed edges with $112$ nodes
150: having at least one outward edge. Isolated nodes and those possessing
151: only self-regulation have been excluded in both networks since they
152: have no interaction with other elements.
153:
154: {\it Random Boolean functions} ---
155: These experimental data establish a directed network $G$ of $N$ nodes,
156: and we assign a dynamic Boolean variable $\sigma_i$ (that can take on
157: the values $0$ or $1$ only, corresponding to an inactive or active
158: state, respectively) to each node $i$. These dynamical variables
159: evolve synchronously via $\sigma_i(t+1)=f_i(\sigma_{i_1}(t),
160: \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t))$, with the nodes
161: $i_1, i_2, \ldots, i_{k_i}$ having the outward edges incident on the
162: node $i$. The output value of $f_i$ for each input configuration
163: $\{\sigma_{i_1}(t), \sigma_{i_2}(t), \ldots, \sigma_{i_{k_i}}(t)\}$
164: is $0$ with probability $p$ or $1$ with probability $1-p$, which is
165: determined at the beginning and not changed with time. If $k_i=0$,
166: $\sigma_i$ is fixed at $f_i$; $\sigma_i(t+1)=f_i$ regardless of the
167: value of $\sigma_i(t)$. The parameter $p$ characterizes the
168: randomness of the regulating rules: If $p=0$ or $1$, the dynamics is
169: frozen while the system tends to be disordered with $p=1/2$. An
170: example network with this Boolean dynamics is given in
171: Fig.~\ref{fig:model}.
172:
173: {\it Stability measure} ---
174: The stability of a time-trajectory $\Sigma(t)$ is assessed by the
175: effects of a point mutation $\sigma_i \to 1-\sigma_i$ on the dynamical
176: evolution of the subsequent states. For this, we choose a
177: configuration $\Sigma = \{\sigma_1,\sigma_2,\ldots,\sigma_N\}$, and
178: prepare its mutant,
179: $\hat{\Sigma}=\{\hat{\sigma}_1,\hat{\sigma}_2,\ldots,\hat{\sigma}_N\}$,
where $\hat{\sigma}_i = \sigma_i$ for all $i$ except a single, arbitrarily
chosen node $j$, for which $\hat{\sigma}_j=1-\sigma_j$. Evolving $\Sigma$ and $\hat{\Sigma}$ on the same
182: network with the same regulating rules, we count $n_{\rm m} (t)$, the
number of elements $i$ with
$\sigma_i(t)\ne \hat{\sigma}_i(t)$, at
each time step $t$.
A node with $\Delta \sigma_i(t) \equiv |\sigma_i(t)-\hat{\sigma}_i(t)|>0$
is considered mutated. We average $n_{\rm m}(t)$ over different realizations of
188: the regulating rules and different initial pairs of configurations to get the
189: average, $N_{\rm m}(t)=\langle n_{\rm m} (t)\rangle$, which converges
190: to its stationary value $N_{\rm m}$.
191: For each individual normal-mutant pair $(\Sigma,\hat{\Sigma})$, one can measure
192: the relaxation time $t_{\rm r}$ after which $n_{\rm m}(t)$ reaches
193: its stationary value. Its distribution $P(t_{\rm r})$ is investigated as well.
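
For concreteness, the quantities defined in this paragraph can be written
out explicitly. The following merely restates the definitions above in
formulas and adds no new assumption; $\langle\cdot\rangle$ stands for the
average over rule realizations and initial pairs of configurations:
\[
n_{\rm m}(t)=\sum_{i=1}^{N}\Delta\sigma_i(t)
=\sum_{i=1}^{N}|\sigma_i(t)-\hat{\sigma}_i(t)|,
\qquad
N_{\rm m}=\lim_{t\to\infty}\langle n_{\rm m}(t)\rangle .
\]
Since the normal and the mutant configuration initially differ at a single
node only, $n_{\rm m}(0)=1$ for every pair and hence $N_{\rm m}(0)=1$.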
194:
195: \begin{figure}
196: \includegraphics[width=\columnwidth]{f2a.eps}
197: \includegraphics[width=\columnwidth]{f2b.eps}
198: \includegraphics[width=\columnwidth]{f2c.eps}
199: \caption{Number of mutated elements
200: $N_{\rm m}(t)$ and $N_{\rm m}=\lim_{t\to\infty} N_{\rm m}(t)$ and distribution of the
201: relaxation time $P(t_{\rm r})$.
202: (a) Plot of the stationary value $N_{\rm m}$ versus $\lambda=2p(1-p)$
203: in the original network and two types of randomized graphs (see the
204: text for the definition) for {\it E. coli}. The data are
205: averages over $10^2$ initial pairs of configurations for each of more than
206: $10^3$ realizations of regulating rules. The approximation given in
207: Eq.~(\ref{eq:ecoliapprox}) is drawn together. The inset shows the time developments
208: $N_{\rm m}(t)$ for selected values of $\lambda$ in the original {\it E. coli}
209: network. (b) The same data as (a) for {\it S. cerevisiae}.
210: (c) Plots of $P(t_{\rm r})$ with $p=1/2$ ($\lambda=1/2$) on the original networks and the
211: randomized graphs for {\it E. coli} and {\it S. cerevisiae}.}
212: \label{fig:NmP}
213: \end{figure}
214:
215: \section{Results}
216: \subsection{Time evolution of the number of mutated elements}
217: Figure~\ref{fig:NmP} (a) and (b) present
218: the results for the number of mutated elements
219: $N_{\rm m}(t)$ and $N_{\rm m}$.
220: $N_{\rm m}(t)$ decreases very rapidly
221: from $N_{\rm m}(0)=1$ to a much smaller value for all $p$'s
in {\it E. coli}. On the other hand, $N_{\rm m}(t)$ for {\it S. cerevisiae}
223: increases with time up to a value larger than $1$ for $\lambda \equiv 2p(1-p)
224: \gtrsim 0.42$ ($0.3\lesssim p \lesssim 0.7$) indicating the occurrence of
225: a mutation cascade. Both in {\it E. coli} and {\it S. cerevisiae},
226: $N_{\rm m}$ increases with increasing $p$ from $0$ to $1/2$ (or decreasing
227: $p$ from $1$ to $1/2$) since the probability that a regulating rule
228: yields different output values from different input configurations is
$\lambda=2p (1-p)$, which has its maximum at $p=1/2$.
230: In {\it E. coli}, $N_{\rm m}$ stays smaller than $0.3$,
231: indicating that system-wide mutations are suppressed.
232: Figure~\ref{fig:NmP} also shows that in {\it S. cerevisiae} $N_{\rm m}$ is smaller than in
233: {\it E. coli} for $\lambda\lesssim 0.2$ but increases with $\lambda$ more rapidly and is
234: larger for $\lambda\gtrsim 0.2$.
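
The form of $\lambda$ can be made explicit with a short calculation (a
sketch added for illustration; $o_1$ and $o_2$ are auxiliary symbols that
are not used elsewhere). Let $o_1$ and $o_2$ be the output values that a
regulating rule assigns to two distinct input configurations. They are
drawn independently, each being $0$ with probability $p$ and $1$ with
probability $1-p$, so that
\[
{\rm Prob}(o_1\neq o_2)=p(1-p)+(1-p)p=2p(1-p)=\lambda\leq \frac{1}{2},
\]
with equality only at $p=1/2$. The condition $\lambda\gtrsim 0.42$ is
equivalent to $p(1-p)\gtrsim 0.21$, that is, $0.3\lesssim p\lesssim 0.7$,
because $0.3\times 0.7=0.21$.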
235:
236: The functional form of $P(t_{\rm r})$ for $p=1/2$ in Fig.~\ref{fig:NmP} (c)
237: is strikingly different between
238: {\it E. coli} and {\it S. cerevisiae}: it is exponential for {\it E. coli} and
239: a power-law, $P(t_{\rm r})\sim t_{\rm r}^{-1.5(2)}$, for {\it S. cerevisiae}.
240: This long tail of $P(t_{\rm r})$ implies that in the case of {\it S. cerevisiae}
241: an element can be mutated and recover even at very late times in the dynamics.
242:
243: \subsection{Mean connectivity}
244: These differences in the mutation spread dynamics may be
245: primarily attributed to a difference in the mean connectivity and
246: can be understood by a mean-field approach~\cite{derrida86,aldana03}:
247: The probability $H(t)=\lim_{N\to\infty} N_{\rm m}(t)/N$ that a randomly chosen node
248: $i$ is mutated at time $t$, also called the Hamming distance,
249: is given in terms of the probability that a regulator of the node $i$ is mutated,
250: which we denote by $\bar{H}(t)$, and the
251: probability that the regulating rule $f_i$ yields different output values
252: from different input configurations, $\lambda$, as
253: \begin{eqnarray}
H(t+1)&=& \sum_{k} \lambda (1 - (1 - \bar{H}(t))^k) P_d(k),
255: \nonumber\\
256: \bar{H}(t+1)&=& \sum_{k,q} \lambda (1 - (1 - \bar{H}(t))^k) \frac{q P_d(k,q)}{\langle q\rangle}.
257: \label{eq:sc}
258: \end{eqnarray}
259: Here $P_d(k,q)$ is the joint probability that a node has in-degree $k$ and
260: out-degree $q$ and is related to the in-degree distribution $P_d(k) = \sum_q
261: P_d(k,q)$. $H(t)$ and $\bar{H}(t)$ evolve towards their stationary values
262: $H$ and $\bar{H}$. Setting $\bar{H}(t+1)=\bar{H}(t)=\bar{H}$ and expanding
263: the second line of Eq.~(\ref{eq:sc}) for small $\bar{H}$, one finds
264: $\bar{H}\simeq \bar{H}\lambda \langle kq\rangle/\langle q\rangle -
\bar{H}^2\lambda \langle k(k-1)q\rangle/(2\langle q \rangle) +
266: \mathcal{O}(\bar{H}^3)$
267: provided $\langle q\rangle$, $\langle kq\rangle $, and $\langle k^2
268: q\rangle$ are all finite. Therefore $\bar{H}$ and $H$ are zero for
269: $\lambda$ smaller than a critical value $\lambda_c$ with
270: $\lambda_c=1/K$ and $K\equiv \langle kq\rangle/\langle q\rangle$ and
271: non-zero otherwise. The expression $\lambda_c=K^{-1}$ for the critical
point holds true as long as $K$ is finite. Since $\lambda=2p(1-p)\leq 1/2$,
the Hamming distance $H$ can be positive only if $K>2$. Hence $N_{\rm m}\simeq HN$ for finite $N$
should be small in {\it E. coli}, which has $K\simeq 1.08$, and
can be large, of order $N$, for $\lambda\gtrsim 0.42$ in {\it
S. cerevisiae}, which has $K\simeq 2.35$.
277: Although the Hamming distance is not necessarily of order $N^{-1}$
278: at $\lambda_c$, one finds the
279: value of $\lambda$ for which $N_{\rm m}=1$ very close to the value
280: $K^{-1}\simeq 0.42$ in the latter.
281: The in-degree $k$ and the out-degree $q$ show no significant correlation
in the two networks according to our analysis (not presented here),
that is, $P_d(k,q)\simeq P_d(k)P_d(q)$, which yields $\langle kq \rangle
284: \simeq \langle k\rangle \langle q\rangle$ and $K\simeq \langle k\rangle$.
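
The small-$\bar{H}$ expansion used in this subsection can be spelled out as
follows (a sketch of the intermediate steps, keeping the exact binomial
coefficient $k(k-1)/2$ at second order, which approaches $k^2/2$ for large
$k$). Inserting
\[
1-(1-\bar{H})^k = k\bar{H}-\frac{k(k-1)}{2}\bar{H}^2+\mathcal{O}(\bar{H}^3)
\]
into the second line of Eq.~(\ref{eq:sc}) at stationarity and dividing by
$\bar{H}\neq 0$ gives
\[
1\simeq \lambda K-\bar{H}\lambda
\frac{\langle k(k-1)q\rangle}{2\langle q\rangle},
\qquad\mbox{so that}\qquad
\bar{H}\simeq \frac{2(\lambda K-1)\langle q\rangle}
{\lambda\langle k(k-1)q\rangle},
\]
which is positive only for $\lambda>1/K=\lambda_c$. With the values of $K$
quoted above, $\lambda_c\simeq 0.9$ for {\it E. coli}, which lies outside
the accessible range $\lambda\leq 1/2$, whereas
$\lambda_c\simeq 0.42$ for {\it S. cerevisiae} lies inside it.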
285:
286: \subsection{Comparison with randomized networks}
287: Next we studied the same dynamics in two kinds of randomized networks
288: derived from the regulatory networks of {\it E. coli} and {\it
289: S. cerevisiae}. The first type of randomized graphs (type I) are
290: constructed by the repetition of removing an edge connecting nodes
291: $v_1$ and $w_1$ and creating a new one between $v_2$ and $w_2$, where
292: both $v_1$ and $v_2$ had at least one outward edge and the node pair
293: $v_2$ and $w_2$ were not connected before this change. Thus these
294: type-I randomized networks have the same number of nodes, edges, and
295: TF's as the original networks, but the edges connect randomly-chosen
296: pairs of TF and target gene. Our results for $N_{\rm m}$ and $P(t_{\rm
297: r})$ are shown in Fig.~\ref{fig:NmP}. For the type-I randomized graphs
298: derived from {\it E. coli}, $N_{\rm m}$ is substantially suppressed as
299: compared with the original network. In the type-I random graphs
300: derived from {\it S. cerevisiae}, $N_{\rm m}$ increases much more
301: rapidly passing $\lambda\simeq 0.3$. The relaxation time distribution
302: for the random graphs from {\it E. coli} is broader than for the
303: original network but still decays faster than that for {\it
304: S. cerevisiae}. The type-I randomization does not change significantly
305: the relaxation time distribution for {\it S. cerevisiae}.
306:
307: The type-II randomized graphs we considered are constructed by
308: exchanging the end points of two edges: Two randomly chosen edges $e_1
309: = (v_1, w_1)$ and $e_2 = (v_2, w_2)$ are replaced by $e_1' = (v_1,
310: w_2)$ and $e_2' = (v_2, w_1)$, respectively. These graphs preserve the
311: joint degree distribution $P_d(k,q)$, but their local connectivity
312: patterns may be different from that in the original network. We
313: present the plots of $N_{\rm m}$ and $P(t_{\rm r})$ in
314: Fig.~\ref{fig:NmP}. This type-II randomization does not change the
relaxation time distribution for {\it S. cerevisiae} either. Thus the
much faster decay of the relaxation time distribution in the original and
317: randomized networks for {\it E. coli} than in those for {\it
318: S. cerevisiae} can be ascribed to the much lower mean connectivity,
319: $\langle k\rangle \simeq 1.24$, of the former than that of the latter,
320: $\langle k\rangle \simeq 2.73$. Interestingly the quantities $N_{\rm
321: m}$ and $P(t_{\rm r})$ for these randomized graphs agree well with
322: those for the original network of {\it S. cerevisiae}, but not for
323: {\it E. coli}: This implies that it is the degree distribution that is
324: mainly responsible for the spread of mutation in {\it S. cerevisiae}
325: while other (local) structural factors must be important in {\it
326: E. coli}.
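
The difference between the two randomization schemes can be made explicit
by tracking the degrees of the four nodes involved (an illustrative
bookkeeping sketch, assuming that the four nodes are distinct; $k_x$ and
$q_x$ denote the in- and out-degree of a node $x$). In a type-II exchange,
each of $v_1$ and $v_2$ loses one outward edge and gains one, and each of
$w_1$ and $w_2$ loses one inward edge and gains one, so that
\[
q_{v_1}\to q_{v_1},\quad q_{v_2}\to q_{v_2},\quad
k_{w_1}\to k_{w_1},\quad k_{w_2}\to k_{w_2},
\]
and $P_d(k,q)$ is unchanged. A type-I move instead yields
\[
q_{v_1}\to q_{v_1}-1,\quad q_{v_2}\to q_{v_2}+1,\quad
k_{w_1}\to k_{w_1}-1,\quad k_{w_2}\to k_{w_2}+1,
\]
which conserves only the total number of edges. The mean connectivities
quoted above follow directly from the dataset sizes given in the Method
section: $\langle k\rangle=519/418\simeq 1.24$ for {\it E. coli} and
$\langle k\rangle=12455/4555\simeq 2.73$ for {\it S. cerevisiae}.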
327:
328: \begin{figure}
329: \includegraphics[width=\columnwidth]{f3.eps}
330: \caption{Network structure dependence of mutation spread.
331: The regulating rules are given by
332: $f_i(\sigma)= \sigma$ or $1-\sigma$ for nodes $i$'s with one input and
333: $f_i = 1$ or $0$ for nodes $i$'s with no input. Thus a mutated regulator
334: necessarily makes its target node mutated at the next time step. Time evolution of
335: $\Delta \sigma_i = |\sigma_i - \hat{\sigma}_i|$ for each node is shown in
336: tables.
337: (a) No circuit (tree structure). All nodes recover at $t=3$ and thus the Hamming
338: distance $H$ is zero. (b) A circuit
339: of length $3$. The point mutation circulates with period $3$, resulting in $H=1/3$.
340: (c) A single-element circuit together with tree structure. All
341: nodes are mutated at $t=2$ and thus
342: $H=1$.}
343: \label{fig:tree-circuit}
344: \end{figure}
345:
346: \begin{figure}
347: \includegraphics[width=0.9\columnwidth]{f4.eps}
348: \caption{Organization of the core in {\it E. coli} and {\it S. cerevisiae}.
349: (a) Core of {\it E. coli}. It consists of $57$ nodes and $84$ edges. (b)
350: Core of {\it S. cerevisiae}. It has $63$ nodes and $167$ edges. (c)
351: Histogram of the shortest circuit lengths.
In {\it E. coli}, no circuit longer than $1$
is observed; all $54$ circuits are single-element ones.
354: In {\it S. cerevisiae}, $836$ pairs of nodes
355: among all possible $1953$ pairs in the core are connected by circuits
356: and the shortest circuit length ranges from $0$ to $19$.}
357: \label{fig:core}
358: \end{figure}
359:
360: \subsection{Abundance of single-element circuits in {\it E. coli}}
361: One might expect that circuits (directed closed paths) in the
362: regulatory network play an important role for the spread of mutations,
363: because in networks with a tree-structure, i.e., without circuits,
364: point mutations spread without circulation and a node that is mutated
365: will recover at the next time step and never become mutated again as
366: indicated in Fig.~\ref{fig:tree-circuit} (a). The nodes on a circuit,
367: on the other hand, can return to a mutated state even after recovery
368: [Fig.~\ref{fig:tree-circuit} (b)]. The nodes lying on circuits or
369: those on bridges connecting distinct circuits can in principle switch
370: their status permanently and thus they can be considered as comprising
371: a core in the dynamics of mutation spread. As a subnetwork including
372: all such circuits and the bridges connecting them, we define the core
373: of a network as the maximal subgraph in which each node has at least
374: one inward edge coming from and at least one outward edge incident to
375: an element of the core.
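
This definition can be stated as a fixed-point condition, which also
suggests one way to compute the core (a sketch, assuming that the network
is given as a node set $V$ and a set $E$ of directed edges $(j,i)$ pointing
from $j$ to $i$; the symbols $V$, $E$, and $C_n$ are introduced here for
illustration only). Starting from $C_0=V$, one iterates
\[
C_{n+1}=\{\, i\in C_n : \exists\, j,l\in C_n \ \mbox{with}\ (j,i)\in E
\ \mbox{and}\ (i,l)\in E \,\}
\]
until $C_{n+1}=C_n$. Because the $C_n$ form a decreasing sequence of finite
sets, the iteration stops after at most $N$ steps, and the final set is the
node set of the core. The nodes $j$ and $l$ may coincide with $i$, so a
self-regulating node always satisfies the condition.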
376:
377: By deleting the edges having at either end a node that does not meet
378: the requirement for the core elements, we found the core subnetwork in
379: the regulatory networks of {\it E. coli} and {\it S. cerevisiae}. Note
380: that if an edge has the same node at both ends, the node, which
regulates itself, becomes an element of the core. The relevance of
382: the core to the mutation spread dynamics can be understood e.g., by
383: investigating the relaxation time distribution $P(t_{\rm r})$ in {\it
384: S. cerevisiae} depending on the location of the initial point
mutation. Our analysis shows that initial mutations in the core lead
to a relaxation time distribution of qualitatively the same form
(a power law with the same exponent). On the other hand, initial
388: mutations in the output module, consisting of all nodes that have
389: inward edges coming from the nodes in the core and their edges, decay
390: very fast since the output module has a tree structure and cannot
391: cause mutations in the core.
392:
393: The organization of the core turns out to be very different in {\it
394: E. coli} and {\it S. cerevisiae} as shown in Fig.~\ref{fig:core} (a)
and (b), respectively. Most notably, the nodes are much more densely
396: connected in {\it S. cerevisiae} than in {\it E. coli}. This
397: difference can be first ascribed to different mean connectivities of
398: the nodes in the core: it is about $1.47$ in {\it E. coli} and $2.65$
399: in {\it S. cerevisiae}. However, a more striking difference exists in
their core organization. In {\it E. coli}, $54$ circuits are
identified, all of which are single-element circuits representing
self-regulation. There are no circuits whose length (i.e., the number of
edges on the cycle) is larger than $1$~\cite{thieffry98}. In
contrast, only one or two single-element circuits are formed in its
randomized graphs. This organization of circuits in {\it
406: E. coli} is also contrasted with the one in {\it S. cerevisiae}. We
407: computed the shortest circuit for each pair of nodes in the core and
408: counted the numbers of node pairs for each given shortest-circuit
409: length. The distribution of shortest-circuit length obtained for {\it
410: S. cerevisiae} is broad as shown in Fig.~\ref{fig:core} (c). We
411: propose that the presence of single-element circuits in {\it
412: E. coli} is the main reason for the enhancement of $N_{\rm m}$ of {\it
413: E. coli} compared with both of its randomized graphs. Once a node $i$
414: regulating itself is mutated, the input configurations to the
415: regulating rule $f_i$ are necessarily different between the
416: normal-mutant pair $(\Sigma,\hat{\Sigma})$ since it is guaranteed that
417: at least one of its regulators, the node $i$ itself, is
418: mutated. Recalling that a node can be mutated at the next time step
419: only if the input configurations from the normal-mutant pair are
420: different, one can see that single-element circuits have a higher
421: probability to be mutated than nodes which do not regulate themselves
422: [See Fig.~\ref{fig:tree-circuit} (c)]. Therefore networks with more
423: single-element circuits can be more adaptive.
424:
425: In the core of {\it E. coli} network, $54$ edges are used for
426: single-element circuits and the remaining $30$ edges connect pairs of
distinct nodes. As a result, the network has many isolated nodes and
only a few small connected components, resulting in the rapid decay of the
relaxation time distribution. In Fig.~\ref{fig:NmP} (c), we find that the
relaxation times observed in {\it E. coli} are mostly $1$ or $2$. From
this, we can analytically predict the value of $N_{\rm m}$ as a function of
432: $\lambda$. Suppose $N_{\rm m}(t)$ saturates no later than time
433: $2$. From Eq.~(\ref{eq:sc}), $\bar{H}(1)=\lambda K N^{-1} + {\cal
434: O}(N^{-2})$ since $\bar{H}(0)=N^{-1}$ and
435: \begin{equation}
436: N_{\rm m}\simeq N H(2) \simeq N \lambda K \bar{H}(1)\simeq
437: \lambda^2 K^2.
438: \label{eq:ecoliapprox}
439: \end{equation}
440: This is in good agreement with the true value as shown in
441: Fig.~\ref{fig:NmP} (a).
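
As a quick numerical check of Eq.~(\ref{eq:ecoliapprox}) (an illustration
that uses only the value $K\simeq 1.08$ quoted above), the largest
accessible value $\lambda=1/2$ gives
\[
N_{\rm m}\simeq \left(\frac{1}{2}\right)^2 (1.08)^2\simeq 0.29,
\]
consistent with the observation that $N_{\rm m}$ stays smaller than $0.3$
in {\it E. coli}. For $\lambda=0.2$ the same formula gives
$N_{\rm m}\simeq 0.047$.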
442:
443: \begin{figure}
444: \includegraphics[width=\columnwidth]{f5a.eps}
445: \includegraphics[width=\columnwidth]{f5b.eps}
446: \caption{Connectivity pattern and its effect on the critical behavior of the
447: Hamming distance. (a) In-degree distributions $P_d(k)$ for {\it E. coli} and {\it S. cerevisiae}.
448: For {\it S. cerevisiae}, its asymptotic behavior is a power-law, $P_d(k)\sim k^{-\gamma}$
449: with $\gamma\simeq 2.7(2)$. On the other hand, the observed values of $k$ are only up to $6$
450: and so it is hard to discern the functional form of $P_d(k)$ in {\it E. coli}. (b)
451: Hamming distance $H$ as a function of $\lambda$ numerically obtained from
452: Eq.~(\ref{eq:sc_simple}) with $P_d(k)$ of the static model~\cite{lee04}, which
453: has a power-law tail as $P_d(k)\sim k^{-\gamma}$ with the exponent $\gamma$ tunable.
454: The inset shows that
455: $H\sim \Delta$ commonly for $\gamma\to\infty$ and $\gamma=3.5$, and that $H\sim \Delta^2$
456: for $\gamma=2.5$, in agreement with Eq.~(\ref{eq:beta}).}
457: \label{fig:critical}
458: \end{figure}
459:
460: \subsection{Power-law in-degree distribution in {\it S. cerevisiae}}
461: In {\it S. cerevisiae}, the most significant dynamical feature that we
462: found and that we need to explain is the slower increase of $N_{\rm
463: m}$ with $\lambda$ as compared with the type-I randomized graph, shown
464: in Fig.~\ref{fig:NmP} (b). Contrary to the type-II randomized graphs,
465: those of type-I do not preserve the degree distribution of the
466: original network. From this, we can conjecture that the degree
467: distribution of {\it S. cerevisiae} causes the slow increase of
468: $N_{\rm m}$. To check this, we analyze in detail the dependence of the
469: Hamming distance on the degree distributions.
470:
471: With uncorrelated in- and out-degree as is the case in the regulatory networks
472: considered here, Eq.~(\ref{eq:sc}) is reduced to $H(t)=\bar{H}(t)$ and
473: \begin{equation}
474: H(t+1) = \lambda \sum_k [1-(1-H(t))^k] P_d(k).
475: \label{eq:sc_simple}
476: \end{equation}
477: Thus the in-degree distribution $P_d(k)$ determines the behavior of
478: the Hamming distance $H(t)$. The in-degree distributions of {\it
479: E. coli} and {\it S. cerevisiae} shown in Fig.~\ref{fig:critical} (a)
480: are quite different from each other. The maximum degree is $31$ in
481: {\it S. cerevisiae} while it is only $6$ in {\it E. coli}.
482: Furthermore, the log-log plot of $P_d(k)$ in {\it S. cerevisiae}
483: indicates that $P_d(k)\sim k^{-\gamma}$ with $\gamma\simeq
484: 2.7(2)$. The functional form of $P_d(k)$ for {\it E. coli} is hard to
485: determine because of the small range for observable $k$ values. Note
that the in-degree distribution of the type-I randomized graphs obeys a
487: Poisson distribution, $P_d(k)=\langle k\rangle^k e^{-\langle
488: k\rangle}/k!$. Let us consider an in-degree distribution which has a
489: power-law tail, i.e., $P_d(k)\sim k^{-\gamma}$. Then, we find from
490: Eq.~(\ref{eq:sc_simple}) that the Hamming distance in the stationary
491: state behaves as $H\sim \Delta^\beta$ for $\lambda$ larger than the
492: critical value $\lambda_c$ with $\Delta\equiv \lambda/\lambda_c-1$ and
493: the critical exponent $\beta$ given by
494: \begin{equation}
495: \beta = \left \{
496: \begin{array}{ll}
497: 1 & (\gamma>3),\\
498: 1/(\gamma-2) & (2<\gamma<3).
499: \end{array}
500: \right.
501: \label{eq:beta}
502: \end{equation}
The derivation of Eq.~(\ref{eq:beta}) is given in the Appendix.
We restricted the range of $\gamma$ to $\gamma>2$ because the mean
connectivity diverges for $\gamma\leq 2$. When the in-degree is subject
506: to a Poisson distribution or an exponentially-decaying distribution, it
507: corresponds to $\gamma\to\infty$ and the critical behavior is
508: the same as that for $\gamma>3$. We present the numerical solution to
509: Eq.~(\ref{eq:sc_simple}) in Fig.~\ref{fig:critical} (b) for
510: $\gamma\to\infty$ (Poisson distribution), $\gamma=3.5$, and $\gamma=2.5$.
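
The origin of the two regimes in Eq.~(\ref{eq:beta}) can be sketched
heuristically (a simplified outline added for orientation, not a
replacement for the full derivation; $c$ denotes a positive constant that
depends on the details of $P_d(k)$). With $\lambda_c=1/\langle k\rangle$,
the stationary form of Eq.~(\ref{eq:sc_simple}) reads
$H=\lambda\sum_k[1-(1-H)^k]P_d(k)$. For $\gamma>3$ the second moment of
$P_d(k)$ is finite and the sum expands as
$\langle k\rangle H-\langle k(k-1)\rangle H^2/2+\ldots$, whereas for
$2<\gamma<3$ the leading correction is singular,
\[
\sum_k [1-(1-H)^k]P_d(k)\simeq \langle k\rangle H-c\,H^{\gamma-1}.
\]
Dividing the stationary equation by $H$ then yields
\[
\Delta=\lambda\langle k\rangle-1\simeq \left \{
\begin{array}{ll}
\lambda \langle k(k-1)\rangle H/2 & (\gamma>3),\\
\lambda c\, H^{\gamma-2} & (2<\gamma<3),
\end{array}
\right.
\]
so that $H\sim\Delta$ in the first case and $H\sim\Delta^{1/(\gamma-2)}$ in
the second, for example $\beta=2$ for $\gamma=2.5$.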
511:
512: The increase of $\beta$ with decreasing $\gamma$ below $\gamma=3$
513: indicates a difference in the behavior of the Hamming distance near
514: the critical point between networks with $\gamma>3$ and those with
515: $2<\gamma<3$. Suppose we have two networks with a power-law in-degree
516: distribution $P_d(k)\sim k^{-\gamma}$: One has $\gamma=3.5$ and the
517: other has $\gamma=2.5$, and both have $\langle k\rangle=4$. Then, in
518: the region $0<\Delta =\lambda/\lambda_c-1\ll 1$, the Hamming distance
519: behaves as $H\sim \Delta$ for $\gamma=3.5$ and $H\sim \Delta^2$ for
520: $\gamma=2.5$: the former increases more rapidly than the latter in the
521: region $\Delta\ll 1$. Also the region where the Hamming distance
522: remains non-zero but small, e.g., $H\leq 0.05$ is larger with
$\gamma=2.5$ than with $\gamma=3.5$: it is given by $\lambda\in
(0.25,0.29]$ with $\gamma=3.5$ and $\lambda\in (0.25,0.35]$ with
$\gamma=2.5$. Such dependence of the Hamming distance on the
526: in-degree exponent $\gamma$ can thus explain different network
527: responses between {\it S. cerevisiae} and its type-I randomized
528: graphs. It is the broad in-degree distribution with $\gamma=2.7(2)$
529: that makes the number of mutated elements increase with $\lambda$ more
530: slowly than in the corresponding type-I randomized graphs that have
531: $\gamma\to\infty$. Due to such a slow increase of the Hamming
532: distance, {\it S. cerevisiae} can keep the size of mutation small for
533: a wider range of the parameter $p$ or $\lambda$, which would be much
534: larger with random structures.
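
As a quick consistency check on the numbers quoted above, and assuming
only Eq.~(\ref{eq:beta}) together with the relation
$\lambda_c=\langle k\rangle^{-1}$ used in the Appendix, the two example
networks with $\langle k\rangle=4$ share the critical value
$\lambda_c=1/4=0.25$, while the exponent in the range $2<\gamma<3$
evaluates to
\[
\beta=\frac{1}{\gamma-2}=\left\{
\begin{array}{ll}
2 & (\gamma=2.5),\\
1/0.7\simeq 1.43 & (\gamma=2.7),\\
1/0.9\simeq 1.11 & (\gamma=2.9).
\end{array}
\right.
\]
Taking the quoted uncertainty $\gamma=2.7(2)$ at face value, this would
place the exponent for {\it S. cerevisiae} roughly between $1.1$ and
$2$, i.e., above the value $\beta=1$ expected for its type-I randomized
graphs. This range is only an illustrative estimate from the formula,
not a fitted result.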
535:
536: \section{Conclusion}
We performed numerical experiments on the spread of mutation
to probe the dynamic stability of the recently unveiled networks of
gene transcriptional regulation of {\it E. coli} and {\it
S. cerevisiae} and provided analytical confirmation of the results by
analyzing their structural features. While the small number of edges
per node in {\it E. coli} fundamentally prohibits a global spread of
mutation, the relatively large number of edges in {\it S. cerevisiae}
allows a global spread of mutation, depending on the regulating
rules. We further identified the relevant structural features that
distinguish these networks from random graphs: all circuits of the
regulatory network of {\it E. coli} are single-element circuits, and
the in-degree distribution of {\it S. cerevisiae} takes a power-law
form. Single-element circuits in {\it E. coli} have a higher probability
of being mutated than nodes without self-regulation. The broad in-degree
distribution in {\it S. cerevisiae} smooths the increase of the
number of mutated elements. This increase would be sharper for an
exponential distribution, as is the case in the random graphs.
554:
555: These biological networks appear to follow design principles that tend
556: to balance the size of mutation. The small mean connectivity of the
557: regulatory network of {\it E. coli} would restrict the size of
558: mutations drastically, which is compensated by the abundance of
559: single-element circuits that lead to the required enhancement of the
mutation size. In the case of {\it S. cerevisiae}, the global
characteristic of its regulatory network, a mean connectivity larger
than 2, would lead to a very large mutation size, but a very
heterogeneous connectivity pattern suppresses it. These local
structural features demonstrate that both genetic networks have
evolved, in spite of the restrictions imposed by their global
characteristics, in such a direction that they stay dynamically
between stable (i.e., rarely mutated on a global scale) and unstable
(i.e., easily mutated). Being neither stable nor unstable appears to be
necessary for living organisms to maintain a stable internal state
and adapt themselves to a fluctuating external environment
simultaneously. Our finding therefore suggests that such a marginal
dynamic stability of the whole system is supported by a selected
structural organization of the internal systems on smaller scales, such as
the transcriptional regulatory network studied in this work. While we
575: have concentrated only on the average in-degree, the organization of
576: circuits, and the in-degree distribution of the network, further
577: structural analysis will be helpful to illuminate how structure
578: supports function.
579:
580: \acknowledgements
581: We thank Uri Alon and Richard A. Young for allowing us to use their data.
This work was supported by the Deutsche Forschungsgemeinschaft (DFG).
583:
584: \appendix
585:
586: \section{Derivation of Eq.~(\ref{eq:beta}) from Eq.~(\ref{eq:sc_simple})}
587:
588: To find the behavior of $H=\lim_{t\to\infty} H(t)$ as a function of
589: $\lambda$ near the critical point $\lambda_c=\langle k\rangle^{-1}$,
590: we set $H(t+1)=H(t)=H$ and expand Eq.~(\ref{eq:sc_simple})
591: for small $H$, which leads to
592: \begin{equation}
593: H=\lambda \sum_{n=1}^\infty \frac{(-1)^{n+1}\langle k^n\rangle}{n!} H^n.
594: \label{eq:expand}
595: \end{equation}
596: Here $\langle k^n\rangle$ is the $n$th moment of the in-degree
597: distribution $P_d(k)$, i.e., $\langle k^n\rangle\equiv\sum_k k^nP_d(k)$.
It is finite for all $n$ if $P_d(k)$ decays exponentially or faster.
In this case, all the terms on the right-hand side of Eq.~(\ref{eq:expand})
are analytic, and keeping the two leading terms, one finds
that Eq.~(\ref{eq:expand}) reduces to
602: $H\simeq \lambda \langle k\rangle H - \lambda\langle k^2\rangle H^2/2$.
603: This allows us to see that $H=0$ for $\lambda<\lambda_c=\langle k\rangle^{-1}$
604: and $H\sim \Delta$ with $\Delta \equiv (\lambda-\lambda_c)/\lambda_c$
605: for $\lambda>\lambda_c$.
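
The last step can be made explicit. As a sketch under the two-term
truncation above, dividing by $H\neq 0$ and using
$\lambda\langle k\rangle=\lambda/\lambda_c=1+\Delta$ gives
\[
H\simeq \frac{2(\lambda\langle k\rangle-1)}{\lambda\langle k^2\rangle}
=\frac{2\Delta}{\lambda\langle k^2\rangle}
\simeq \frac{2\langle k\rangle}{\langle k^2\rangle}\,\Delta ,
\]
where the final form replaces $\lambda$ by $\lambda_c$ in the prefactor,
which is valid to leading order in $\Delta$. This nonzero solution is
positive only for $\Delta>0$, so that $H=0$ is the only physical
solution for $\lambda<\lambda_c$.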
606:
When the in-degree distribution is asymptotically a power law,
$P_d(k)\sim k^{-\gamma}$, not all the moments $\langle k^n\rangle$ are
609: not finite: $\langle k^n\rangle$ for $n>n_*$ with
610: $n_*= \lceil\gamma-2\rceil$ diverges as
611: $k_{\rm max}^{n-\gamma+1}/(n-\gamma+1)$, where $\lceil x\rceil$
612: is the smallest integer not smaller than $x$ and $k_{\rm max}$ is
613: the (average) largest in-degree. The largest in-degree diverges
614: as $N^{1/(\gamma-1)}$, which is derived from the relation
615: $\sum_{k>k_{\rm max}} P_d(k) \sim N^{-1}$. Thus
616: $\langle k^n\rangle \sim N^{(n-\gamma+1)/(\gamma-1)}$.
617: Such diverging terms are arranged as
618: $H^{\gamma-1} \sum_{n>n_*} (-1)^{n+1} [k_{\rm max} H]^{n-\gamma+1}/
[n!(n-\gamma+1)]$ on the right-hand side of Eq.~(\ref{eq:expand}).
Here the summation converges to a constant in the limit
$k_{\rm max}H\to\infty$ owing to the alternating signs and the
fast decay of the coefficients~\cite{lee05}. Thus the small-$H$
expansion of Eq.~(\ref{eq:expand}) reads
$H = \lambda \sum_{n=1}^{n_*} (-1)^{n+1} \langle k^n\rangle H^n/n!
+ \lambda ({\rm const.}) H^{\gamma-1} + \cdots$.
The $H^{\gamma-1}$ term governs the critical behavior of $H$
for $2<\gamma<3$: in this range $n_*=1$ and $\gamma-1<2$, so that
$H\simeq \lambda \langle k\rangle H + \lambda ({\rm const.}) H^{\gamma-1}$,
yielding $H\sim \Delta^{1/(\gamma-2)}$. On the other hand, the linear
and quadratic terms are relevant for $\gamma>3$, as for exponentially decaying
in-degree distributions. In summary, the Hamming distance $H$
632: with a power-law in-degree distribution $P_d(k)\sim k^{-\gamma}$
633: behaves near the critical point as
634: \begin{equation}
635: H \sim \left\{
\begin{array}{ll}
637: \Delta & (\gamma>3),\\
638: \Delta^{1/(\gamma-2)} & (2<\gamma<3).
639: \end{array}
640: \right.
641: \label{eq:critical}
642: \end{equation}
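
The exponent for $2<\gamma<3$ follows from a short calculation. As a
sketch, write the constant multiplying $H^{\gamma-1}$ as $-A$; the
symbol $A$ and the assumption $A>0$ are introduced here for
illustration, the sign being what permits a nonzero solution for
$\lambda>\lambda_c$. Dividing
$H\simeq \lambda\langle k\rangle H-\lambda A H^{\gamma-1}$ by $H\neq 0$
and using $\lambda\langle k\rangle-1=\Delta$ gives
\[
H^{\gamma-2}\simeq \frac{\Delta}{\lambda A},
\qquad
H\simeq \left(\frac{\Delta}{\lambda_c A}\right)^{1/(\gamma-2)} ,
\]
where $\lambda$ is replaced by $\lambda_c$ in the prefactor to leading
order in $\Delta$. For instance, $\gamma=2.5$ gives $H\propto\Delta^2$,
in agreement with Eq.~(\ref{eq:critical}).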
643:
644: \begin{thebibliography}{99}
645: \bibitem{causton01}
H.C. Causton {\it et al.}, Mol. Biol. Cell {\bf 12}, 323 (2001).
\bibitem{babu04}
M.M. Babu {\it et al.}, Curr. Opin. Struct. Biol. {\bf 14}, 283 (2004).
\bibitem{thieffry98}
D. Thieffry, A.M. Huerta, E. P\'{e}rez-Rueda, and J. Collado-Vides,
651: Bioessays {\bf 20}, 433 (1998).
652: \bibitem{dobrin04}
653: R. Dobrin, Q.K. Beg, A.-L. Barab\'{a}si, and Z.N. Oltvai, BMC Bioinformatics
654: {\bf 5}, 10 (2004).
655: \bibitem{shenorr02}
S.~Shen-Orr, R.~Milo, S.~Mangan, and U.~Alon, Nat. Genet. {\bf 31}, 64 (2002).
\bibitem{guelzim02}
N. Guelzim, S. Bottani, and F. K\'{e}p\`{e}s,
Nat. Genet. {\bf 31}, 60 (2002).
660: \bibitem{tilee02}
661: T.~I.~Lee {\it et al.}, Science {\bf 298}, 799 (2002).
662: \bibitem{luscombe04}
663: N.M. Luscombe {\it et al.}, Nature {\bf 431}, 308 (2004).
664: \bibitem{orphanides02}
665: G. Orphanides and D. Reinberg, Cell {\bf 108}, 439 (2002).
666: \bibitem{hwa03}
667: N. Buchler, U. Gerland, and T. Hwa, Proc. Natl. Acad. Sci. U.S.A. {\bf 100}, 5136 (2003).
668: \bibitem{kauffman}
669: S.~Kauffman, J.~Theor.~Biol. {\bf 22}, 437 (1969);
670: {\it The Origins of Order: Self-organization and Selection in Evolution}
671: (Oxford Univ. Press, Oxford, 1993).
672: \bibitem{samuelsson03}
673: B. Samuelsson and C. Troein, Phys. Rev. Lett. {\bf 90}, 098701 (2003).
674: \bibitem{klemm05}
675: K. Klemm and S. Bornholdt, Phys. Rev. E {\bf 72}, 055101 (2005).
676: \bibitem{harris02}
677: S.E. Harris, B.K. Sawhill, A. Wuensche, and S. Kauffman, Complexity {\bf 7}, 23 (2002).
678: \bibitem{kauffman0304}
679: S.~Kauffman, C.~Peterson, B.~Samuelsson, and C.~Troein,
680: Proc.~Natl.~Acad.~Sci.~U.S.A. {\bf 100}, 14796 (2003);
681: {\it ibid.} {\bf 101}, 17102 (2004).
682: \bibitem{derrida86}
683: B.~Derrida and Y.~Pomeau, Europhys.~Lett. {\bf 1}, 45 (1986).
684: \bibitem{aldana03}
M.~Aldana and P.~Cluzel, Proc.~Natl.~Acad.~Sci.~U.S.A. {\bf 100}, 8713 (2003).
686: \bibitem{lee04}
687: D.-S. Lee, K.-I.~Goh, B.~Kahng, and D.~Kim, Nucl. Phys. B {\bf 696}, 351 (2004).
688: \bibitem{lee05}
D.-S. Lee, Phys. Rev. E {\bf 72}, 026208 (2005).
690: \end{thebibliography}
691:
692: \end{document}
693:
694: