1: \documentstyle[11pt,fleqn,epsf,epsfig]{article}
2: %\def\baselinestretch{1.5}
3: %\documentstyle[12pt,fleqn,epsf,epsfig]{article}
4: %\baselineskip 2pc
5: \oddsidemargin 0mm
6: \textwidth 160mm
7: \topmargin -10mm
8: \headheight 0mm \headsep 0mm
9: \textheight 240mm
10: \footheight 5mm \footskip 10mm
11:
12: \begin{document}
13:
\title{On Recursive Production and Evolvability of Cells:
15: Catalytic Reaction Network Approach}
16:
17: \author{Kunihiko Kaneko\\
18: {\small \sl Department of Basic Science,
19: College of Arts and Sciences,}\\
20: {\small \sl University of Tokyo,}\\
21: {\small \sl Komaba, Meguro-ku, Tokyo 153, Japan}\\
22: }
23:
24:
25: \date{}
26:
27: \maketitle
28:
29: \tableofcontents
30:
31: \begin{abstract}
To unveil the logic of a cell at the level
of chemical reaction dynamics, we need to clarify how an ensemble of
chemicals can autonomously reproduce the same set of chemicals, without assuming a
specific external control mechanism.
A cell consists of a huge number of chemical
species that catalyze each other. Often the number of molecules of each
species is not large, and accordingly the number fluctuations in each molecule species
can be large. Amid such diversity and large fluctuations, how can a cell
sustain recursive production? On the other hand, a cell can
change its state and evolve to a different type over a longer time span. How are reproduction
and evolution compatible? We address these questions on the basis of several
model studies of catalytic reaction networks.
44: %\\
45:
In the present survey paper, we first formulate basic questions on the recursiveness and
evolvability of a cell, and then state the standpoint of our research in
answering these questions, which is termed `constructive biology'. Based on this standpoint,
we present a general strategy for modeling a cell as a chemical reaction network.
50: %\\
51:
In the first part we investigate the origin of heredity in a cell, by
noting that the molecules carrying heredity must be well preserved and must control
the behavior of a cell. We take a simple model consisting of two mutually
catalyzing molecule species, each of which has catalytically active and
inactive types. One of the molecule species is synthesized slowly, and thus
is a minority in the population. Through the growth and division of this cell,
it is shown to reach and remain in a state in which the
active minority molecules are preserved over generations and
control the cell behavior. This minority-controlled state is achieved
by preserving rare number fluctuations of molecules.
The state gives rise to a selection pressure for mechanisms
that ensure the transmission of the minority molecule. The minority
molecule thus carries heredity, and is a candidate for ``genetic
information''. Experimental confirmation of this minority control is also
presented.
67: %\\
68:
Next, a protocell model consisting of a large number of mutually catalyzing
molecule species is studied, in order to investigate how chemical
compositions are transferred recursively under replication errors.
Depending on the numbers of molecules and species in a cell, and the
path rate in the reaction network, three phases are found: a fast
switching state without recursive production, recursive production, and
itinerancy between the above two states. In a recursive production state,
the chemicals are found to form an intermingled hypercycle network that consists of
a core hypercycle and a peripheral network, which influence each other. We elucidate how
this intermingled network supports the recursive production, and how
minority molecules in the core hypercycle give rise to a switch to other recursive states
in the itinerancy phase. Evolution of this hypercycle
network is also studied, to show the approach to recursive production of cells and
the switch to more efficient reproduction states. Finally, statistics of
the number distribution of each molecule species are studied,
to show (i) a power-law distribution for fast switching
molecules, (ii) suppression of fluctuations in the core-network molecule
species, and (iii) ubiquity of the log-normal distribution for most other
molecule species. The origin of these statistics is discussed. We also clarify the
suppression of the number fluctuations of a minority molecule that has
high catalytic connectivity with others, which reinforces the
minority control in the replication network.
91:
(Key Words: Minority Control, Heredity, Origin of Life, Constructive Biology,
Hypercycle, Chemical Reaction Network, Log-normal Distribution,
Self-reproduction, Evolution)
95:
96: \end{abstract}
97:
98: \pagebreak
99:
100: \section{Basic Question for Recursive Production of a Cell as Reaction Dynamics of Catalytic Network}
101:
102: {\bf Question: A cell consists of several replicating molecules that
103: mutually help the synthesis and keep some synchronization for replication.
104: At least a membrane that partly separates a cell from the outside
105: has to be synthesized, keeping some degree of synchronization with
106: the replication of other internal chemicals. How is such recursive
107: production maintained, while keeping diversity of chemicals?
108: Furthermore this recursive production is not complete, and there appears
109: a slow `mutational' change over generations, which leads to evolution.
110: How is evolvability compatible with recursive production?\cite{whatlife}}
111:
112: \subsection{Q1: Origin of Heredity}
113:
In a cell, among many chemicals, only some (e.g., DNA) are
regarded as carrying genetic information. Why do only some specific
molecules play the role of carrying the genetic information? How has
such a separation of roles between genetic information and
metabolism progressed? Is it a necessary course for a system with
internal degrees of freedom and reproduction?
120:
In a cell, however, a variety of chemicals form a complex reaction
network to synthesize themselves. How, then, can such a cell, with a huge
number of components and a complex reaction network, sustain
reproduction while keeping similar chemical compositions?
125:
To consider this problem, we start from a simple prototype cell that
consists of mutually catalyzing molecule species whose growth in
number leads to division of the protocell\cite{minority}. In this
protocell, the molecules that carry the genetic information are not
initially specified. The first question we discuss here is how
heredity that maintains production of the protocell emerges. In relation
to this question, we ask whether some specific molecules appear that
carry information for heredity, so as to realize continual reproduction of
such a protocell. We note that in present-day cells, it is generally
believed that information is encoded in DNA, which controls the
behavior of a cell.
137:
Here, we do not necessarily take a ``geno-centric'' standpoint, in the sense
that genes determine the course of a cell. In fact, even in these
cells, proteins and DNA influence each other's replication
process. Still, it cannot be denied that there exists a difference
between DNA and protein molecules with regard to their role as
information carriers. In spite of this mutual dependence, why is the DNA
molecule usually regarded as the carrier of heredity?
Is there any general rule by which some specific molecules play the role of carrier
of genetic information so that the recursive production of cells continues?
147:
Now, the origin of genetic information in a replicating system is an
important theoretical topic that should be studied, not necessarily as
a property of certain molecules, but as a general property of
replicating systems.
To investigate this problem we need to clarify what ``information'' really
means. In considering information, one often tends to be interested
in how several messages are encoded on a molecule. In fact, one might
point out that a hetero-polymer such as DNA is suited
to encode many bits of information, and hence would be selected as an
information carrier. Although this `combinatorial' capacity of an
information carrier is important, what we are interested in here is a
basic property that has to be satisfied prior to that, i.e., the origin of
just ``1 bit'' of information.
162:
As Shannon beautifully demonstrated, information means the selection of
one branch from several possibilities \cite{Shannon, Brillouin}.
Assume that there are two possibilities in an event, each of which can
occur with probability $1/2$. In this case, when one of these
possibilities turns out to be true, this choice of a branch is
regarded as carrying 1 bit of information. In this sense, if a specific cell
state is selected from several possible states, this selection process
has information, and a molecule that controls such a process carries
information.
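
As a worked illustration of this statement (the symbols $I$, $p$ and $M$
are introduced here only for this illustration and are not used elsewhere
in the text), the information gained when an event of probability $p$
turns out to occur can be written as
\[
I = -\log_2 p ,
\]
so that for two equally likely branches, $p=1/2$, one obtains
$I=-\log_2(1/2)=1$ bit. In the same way, selecting one state out of $M$
equally likely cell states would correspond to $\log_2 M$ bits.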
172:
Now, a molecule that carries the information is postulated to
control the choice of cellular state. Furthermore, to
carry the information for heredity, the molecules must be
transmitted to the next generations relatively faithfully. These two
features, i.e., control and preservation, are nothing but the problem
of heredity.
179:
Let us reconsider what `heredity' really means. Heredity causes a
high correlation in phenotype between ancestor and offspring. Then,
for a molecule to carry heredity, we identify the following two
features as necessary.
184:
185: (1) If this molecule is removed or replaced by a mutant, there is a
186: strong influence on the behavior of the cell. We refer to this as the
187: {\bf `control property'}.
188:
189: (2) Such molecules are preserved well over generations. The number of
190: such molecules exhibits smaller fluctuations than that of other
191: molecules, and their chemical structure (such as polymer sequence) is
192: preserved over a long time span, even under potential changes by
193: fluctuations through the synthesis of these molecules. We refer to
194: this as the {\bf `preservation property'}.
195:
These two conditions are regarded as fundamental for a
molecule to establish heredity. Now, the problem of `information'
at a minimal level, i.e., 1-bit information, is nothing but the problem
of the origin of heredity. As the origin of heredity, we study how a
molecule starts to have the above two properties in a protocell. In
other words, we study how 1-bit information starts to be encoded on a
single molecule in a replicating cell system. After we answer this
basic question, we will then discuss how a protocell with heredity
in the above sense gains an incentive to evolve genetic information in
today's sense.
206:
To sum up, the first question we address here is restated as follows.
Consider a protocell with mutually catalyzing molecules. Under
what conditions does recursive production continue while maintaining catalytic
activities? How are recursiveness and diversity in chemicals
compatible? How is evolvability of such protocells possible? To
answer these questions, are molecules carrying heredity necessary?
Under what conditions does one molecule species begin to satisfy
conditions (1) and (2), so that the molecule carries heredity? We
show, under rather general conditions in our model of a mutually
catalyzing system, that a symmetry breaking between the two kinds of
molecules takes place, and that, through replication and selection, one kind
of molecule comes to satisfy conditions (1) and (2).
219:
220: \subsection{Q2: Recursiveness and Evolvability with Diverse Chemicals}
221:
222:
In a cell, the total number of molecules is limited. If there are a huge number
of chemical species that catalyze each other, the number of molecules of some species
may go to zero. Then the molecules that are catalyzed by them are no longer
synthesized, and, in turn, other molecules that are catalyzed by the latter
cannot be synthesized, either. In this manner, the chemical compositions
may vary drastically, and the cell may lose its reproduction activity.
229:
Of course, a cell state is not constant, and a cell may not keep on dividing
forever. Still, a cell state is sustained to some degree, so as to keep
producing similar offspring cells. We call such a condition for the
reproduction of cells `recursive production' or `recursiveness'.
The question we address here is whether there are some conditions on the
distribution of chemicals or the structure of the reaction network for recursive production.
236:
There are two directions of study. One concerns the static
aspect of the reaction network structure (e.g., topology). The other concerns
the number distribution of chemical species and their dynamics. Of
course, one needs to combine the two aspects to fully understand the
condition for recursive production of a cell.
242:
Currently there is much interest in
the reaction network structure.
For example, Jeong et al.\cite{Barabasi} studied the metabolic reaction
network, without going into details of the topology. Write down all
(known) metabolic reaction equations. Here, the rate of each reaction is
disregarded, and only whether such a reaction equation exists in a cell or
not is of concern. Then compute how many times a specific molecule
species appears in such reaction equations. If this number is large,
the molecule species is related to many biochemical reactions. For
example, ${\rm H_2O}$ has a large number of connections, since in many reactions it appears
on either the left-hand or the right-hand side of the equation.
254: %$CO_2$ must have a high number also. Among more complex molecules
ATP has a relatively high number of connections, too. From these data the
histogram $P(n)$ is obtained, as the number of molecule species that appear $n$
times in the equations.
258: %Of course this distribution gradually decreases with the increase of $n$.
259: From the data, it is shown that
260: $P(n)$ decays with some power of $n$ as $n^{-\alpha}$\cite{Barabasi}.
261:
So far, the discussion has been limited to the topological structure of the
network. In the reaction network dynamics, the numbers of molecules
are distributed over the species. To each `node' of the network, the abundance of
the corresponding molecule species is assigned. Accordingly,
some paths are `thick', where the corresponding reactions occur frequently. Such
abundances, as well as their fluctuations and dynamics, have to be
investigated.
269:
In a cell, the number of each molecule changes in time through
reactions, and the number increases on average for the cell
to replicate. For this growth to progress effectively, some positive feedback process
underlying the replication process should exist, which, in turn, may lead to
amplification of the number fluctuations of molecules. With such large
fluctuations and complexity in
the reaction network, how is recursive production of cells sustained?
Are there any universal statistics in the number distribution of
molecules?
279:
280:
281: \section{Brief Historical Survey}
282:
283: \subsection{Eigen's Hypercycle}
284:
Of course, the problem raised in the last section has been
addressed in the study of the origin of life,
or the origin of replicating systems. Here we are not necessarily interested in
`what happened in the past'; rather, we intend to unveil the universal logic of
a cell. Still, it is relevant to review the earlier studies.
290:
To consider the origin of a replicating system, one needs to discuss how genetic information
is faithfully transferred to the next generation.
293: %A typical standpoint is seen in the 'RNA world', but the approach has been taken long time.
Mills et al.\cite{Spiegelman} set up an experiment on
RNA replication, using a solution of RNA and enzyme.
In this experiment, some enzymes are supplied from outside,
and in this sense it is not an autonomous replication system.
Still, they found that RNA molecules with proper sequences are
reproduced, albeit with some errors.
300:
301:
Following this experimental study by Spiegelman on the replication of RNA,
Eigen's group started a theoretical study of the replication of
molecules\cite{Eigen}. The replication process of a polymer in a
biochemical reaction is generally carried out with the aid of enzymes.
The enzyme is itself a polymer, and its catalytic activity
strongly depends on its sequence. For most sequences
the catalytic activity is very small, but a few of them may have a much higher
catalytic activity, and accordingly the replication rate of the polymers depends on the
sequence. As a theoretical argument, consider the replication of
polymers whose replication rate depends on their sequence. Now, assume
that a `good' sequence has a replication rate $\alpha$ times larger than
that of its mutant with a substitution of one monomer from the original
sequence. Here, the replication progresses with some errors. Without
fine machinery for error correction, this error is not negligible.
Assume that in each replication process, a monomer is substituted by
another monomer with the rate $\mu$. Then the probability that a
polymer consisting of $N$ monomers can produce itself is given by
$(1-\mu)^N \approx \exp(-N\mu)$, assuming that $\mu$ is small.
321:
Now, let us examine whether the good polymer can continue replication
while maintaining its sequence, so that the information in this
sequence is transferred. The condition for the good sequence
to dominate the population in the ensemble of polymers is given by
326:
327: \begin{equation}
N<\ln(\alpha)/\mu
329: \end{equation}
330:
Here, $\ln(\alpha)$ is typically $O(1)$, while the error rate in the
replication of a monomer is estimated to be around $0.01 \sim 0.1$ in
the usual polymer replication process. Then the above condition gives
$N<100$ or so. In other words, information carried by a polymer with a
sequence longer than this threshold $N$ can hardly be sustained. This
problem was first posed by Eigen, and is called the `error
catastrophe'\cite{Eigen}. On the other hand,
replication of a minimal life system must require a much larger
amount of information. Of course, the error rate could be reduced once some
machinery for faithful replication, as in present-day life, emerges.
However, such machinery requires much more information to be
transmitted by the polymer.
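
The threshold above can be recovered by a rough estimate, sketched here
under simplifying assumptions that are ours rather than part of the
original argument: the replication rate of the mutants is normalized to
$1$, and back-mutation from the mutants to the good sequence is neglected.
Since only the fraction $(1-\mu)^N$ of the replications of the good
sequence is error-free, its population can outgrow the mutants only if
\[
\alpha\,(1-\mu)^N > 1 .
\]
Taking the logarithm and using $\ln(1-\mu)\approx -\mu$ for small $\mu$
gives $\ln(\alpha)-N\mu>0$, i.e., $N<\ln(\alpha)/\mu$. For instance, with
$\ln(\alpha)=1$ the bound is $N<100$ for $\mu=0.01$ and $N<10$ for
$\mu=0.1$.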
343:
Summing up: for replication to progress, catalysts are necessary, and the
information on a polymer needed to replicate itself must be preserved. However,
the error rate in replication must have been high at a primitive stage of
life, and accordingly, it is recognized that the information carrying
catalytic activity will be lost within a few generations. In other
words, a faithful replication system requires a larger amount of information, while
a larger amount of information requires a faithful replication system. Thus there
appears a catch-22 type paradox.
352:
To resolve this problem of the inevitable loss of catalytic activities
through replication errors, Eigen and Schuster proposed the
hypercycle\cite{Eigen}, in which replicating chemicals catalyze each
other, forming a cycle, as in ``A catalyzes the synthesis of B, B catalyzes
the synthesis of C, C catalyzes the synthesis of A''. In this case,
each chemical amplifies the synthesis of the next
chemical species in the cycle. A variety of mutations occur
in each species, but a mutant is generally not catalyzed by the
other species in the cycle; for example, a mutant of A is not catalyzed by
C. This is also understood by writing out the rate equation for the
increase of the population. In this hypercycle the population
increase is given by the product of the populations of molecules, such
as $N_A\times N_B$, $N_B\times N_C$, $N_C \times N_A$, while the
growth of the population of the mutants is linear in each population
$N_A$, $N_B$, $N_C$. In the previous estimate for the error catastrophe, the
populations of the good and mutant sequences both increase linearly with their numbers, and then
the large variety of mutants dominates. In the present case,
once the populations of the good sequences in the hypercycle become
dominant, they can sustain themselves against the possible emergence
of mutants. With this hypercycle, the original problem of error
accumulation is avoided.
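
The comparison of growth laws in the preceding paragraph can be made
explicit in a minimal sketch. The rate constants $k$ and $k'$ and the
mutant population $N_{A'}$ are notation introduced here for illustration
only, and dilution or resource limitation is ignored. For the three-member
cycle one may write
\[
\frac{dN_A}{dt}=k\,N_C N_A ,\qquad
\frac{dN_B}{dt}=k\,N_A N_B ,\qquad
\frac{dN_C}{dt}=k\,N_B N_C ,
\]
whereas a mutant of A that is not catalyzed by the cycle would replicate as
\[
\frac{dN_{A'}}{dt}=k'\,N_{A'} .
\]
The per-molecule growth rate of A is then $kN_C$, which increases with the
population of its catalyst, while that of the mutant stays at the constant
$k'$. Under these assumptions the cycle members outgrow the mutant whenever
$kN_C>k'$, which is the sense in which a hypercycle that is already well
populated resists the emergence of mutants.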
374:
Since the proposal of the hypercycle, the population dynamics of molecules in
such catalytic networks has been developed. However, the hypercycle
itself turned out to be weak against parasitic molecules, i.e., those
that replicate, catalyzed by a molecule in the cycle,
but do not catalyze those in the cycle.
In contrast to the previous mutants, the growth rate of the
population of these molecules is again given by the product of the populations
of two species, and such parasitic molecules can invade.
383:
Although the hypercycle itself may be weak against parasitic
molecules,
it has been argued that compartmentalization by a cell structure may
suppress the invasion of parasitic molecules, or that a
reaction-diffusion system in a spatially extended setting resolves this
parasite problem\cite{Hogeweg}. Given the chemistry of lipids, it is not so surprising
that a compartment structure is formed. Still, with regard to the origin of life,
this means that more complexity and diversity in chemicals are
required, beyond a set of information-carrying molecules (e.g.,
RNA).
394:
395: \subsection{Dyson's Loose Reproduction System}
396:
If initially there is a variety of chemicals that form a complex network of
mutual catalysis, this system may be robust against the invasion of
parasitic molecules. Such an idea resembles the stability of an ecosystem,
where a complex network of several species may resist invasion by
external species. Hence we need to study whether replication of a
complex reaction network can be sustained. In this case, from the beginning, there
are many molecule species that mutually catalyze, allowing for the existence of
many parasitic molecules. Here,
complete replication of the system is probably difficult.
Then the question we have to
address is whether such a complex network can maintain molecules that catalyze the synthesis
of the network species. This question was addressed by Dyson\cite{Dyson},
as the possibility of a loose reproduction system.
410:
Dyson, noting the experiment of Oparin on the formation of cell-like
structures, considered a collection of molecules consisting of proteins and
others. These molecules cannot replicate themselves like DNA or RNA.
They can, on the other hand, have enzyme activities and catalyze the
synthesis of other molecules, albeit without faithful reproduction.
Still, they may keep similar compositions. Although
accurate replication of such a variety of chemicals is not possible,
the chemicals, as a set, may continue reproducing themselves loosely
while keeping catalytic activity. Indeed, accurate replication
must have been difficult at the early stage of life, but loose reproduction
could have been easier. However, whether this collection of molecules can keep
catalytic activity through reproduction is not evident.
423:
Dyson obtained a condition for the sustainment of catalytic activities
in this collection of molecules, by taking an abstract model. For
simplicity he classified molecules into two states according to whether
they have catalytic activity or not. Furthermore, he assumed that the
ratio of synthesis of catalytic molecules is amplified as the
fraction of catalytic molecules becomes larger, i.e., a positive feedback
process is assumed. This model is mapped to a kind of Ising model.
With the aid of mean-field analysis in statistical physics, he showed
that the catalytic activities can be sustained, depending on the number
of molecules and the number of their species. Although his model is abstract, the
result he obtained can probably be applied to any system with a set
of catalytic molecules, be they proteins, lipids, or other polymers.
436:
It is important to study whether such loose reproduction as a set is
possible in a mutually catalytic reaction network (see also,
e.g., \cite{Kauffman,Bagley}). If this is possible, and if these
chemicals also include molecules forming a membrane for
compartmentalization, reproduction of a primitive cell will become
possible. In fact, from the chemical nature of lipid molecules, it is not
so surprising that a compartment structure is formed.
444:
Still, in this reproduction system, there are no particular molecules carrying
information for reproduction, in contrast to the present
cell, which has specific molecules (DNA) for it. As for a transition
from early loose reproduction to later accurate replication with
genetic information, Dyson did not give an explicit answer.
He only referred to the `genetic take-over'
originally proposed by Cairns-Smith\cite{Cairns-Smith}, who argued
that a precise replication system based on nucleic acids took over the
original loose reproduction system based on clay. Indeed, Dyson wrote that
his idea is based on `Cairns-Smith theory minus clay'. However, the
logic of this ``take-over'' has not been unveiled.
456:
457: Considering these theoretical studies so far, it is important to study
458: how recursive production of a cell is possible, with the appearance of
459: some molecules to play a specific role for heredity.
460:
461: \section{Constructive Biology}
462:
463: \subsection{Standpoint of constructive biology}
464:
Before describing our theoretical model and explaining the numerical
results, it is relevant to briefly summarize our basic standpoint in
the study of biology, termed ``constructive biology''
\cite{whatlife,mtb}\footnote{One can skip this subsection, if one is
not much interested in the general standpoint in the study of biology.}.
Here we are interested not in the details of specific biological functions but
in universal features of a biological
system. Accordingly we need to study features that are not
influenced by the details of complicated biological processes.
Present-day organisms, however, include detailed, elaborate processes that
have been acquired through the history of evolution. Then, for our purpose,
it is desirable to set up a minimal biological system, to understand the
universal logic that organisms should necessarily obey. Hence, the
approach that should be taken will be `constructive' in nature. This
constructive approach is carried out both experimentally and
theoretically.
481:
Our `constructive biology' consists of the following steps:
483: (i) construct a model system by combining procedures;
484: (ii) clarify universal class of phenomena through the constructed model(s);
485: (iii) reveal the universal logic underlying the class of phenomena
486: and extract logic that the life process should obey;
487: (iv) provide a new look at data on the present organisms
488: from our discovered logic.
489:
There are three levels at which to perform these steps:
(1) gedanken experiment (logic), (2) computer model, and (3) real
experiment. The first one is a theoretical study, revealing a logic
underlying universal features in life processes, essential to
understanding the logic of `what is life'.
495:
Still, a life system has complex relationships among many parts,
which constitute the characteristic feature of the whole, which in turn influences
the process of each part.
We have not yet gained sufficient theoretical intuition about
such complex systems. It is therefore also relevant to make computer experiments and
heuristically find some logic that cannot be easily reached by logical
reasoning only. This is the second approach mentioned above, i.e.,
construction of an artificial world in a computer. Here we combine
well-defined simple procedures, to extract a general logic
therein \cite{minority,Complexity,KKTY,Furusawa,speciation}.
506:
Still, in a system with potentially huge degrees of freedom like life,
the construction in a computer may miss some essential factors.
Hence, we need the third, experimental approach, i.e., construction in
a laboratory. In this case again, one constructs a possible biological
world in the laboratory, by combining several procedures. For example,
this experimental constructive biology has been pursued by Yomo and
his collaborators at the levels of biochemical reaction, cell, and
ensembles of cells (see, e.g., \cite{Matsuura,Ko,Kashiwagi1,Kashiwagi2}).
515:
Taking this standpoint of constructive biology,
we have been working on the problems listed in Table I, both theoretically and experimentally.
The first two items in the table are related to the construction of
a replicating system with a compartment, raised in the questions in \S
1. Of course, this problem is essential in considering the origin of
cellular life. However, we do not intend to reproduce what
occurred on the earth. We do not try to guess the environmental
conditions of the past earth. Rather we try to construct such a
replicating system from a complex reaction network under conditions set
up by us. For example, by constructing a protocell, in the present
paper, we ask about the conditions for heredity, or the universal features
of the reaction dynamics that support the recursive production of cells.
528:
The third to sixth items are related to the construction of
multicellular organisms with a developmental process. When cells are
aggregated, they start to differentiate in their roles, and then, from
a single cell, a robust developmental process that forms an organized
structure of differentiated cells is generated. This developmental
process to form a cell aggregate is transferred to the next
generation. An experimental construction of multicellular organisms
(with cell differentiation) from bacteria is one target. Here again,
we do not try to imitate the process of present-day multicellular
organisms. For example, by putting bacterial cells into some
artificial condition, we study whether the cells can differentiate into
distinct types or form some robust distribution of cells. Also,
in-vitro construction of morphogenesis from undifferentiated cells has
become possible by putting cells into some given
conditions\cite{Asashima}. With these studies, we can establish a
viewpoint of universal dynamics underlying development, rather than
the conventional picture of development as a finely tuned process\cite{KKTY,Furusawa}.
546:
The seventh item is the construction of evolution, in particular the
speciation process, that is, how a species splits into two distinct
groups that differ in both phenotype and genotype\cite{speciation}.
550:
To carry out this plan experimentally, we need a way to design a
living system that can be controlled as we like. Such controlled
experiments are now possible thanks to recent advances in technology,
such as flow cytometry, imaging techniques, and microarrays for
measuring gene expression, while advances in nanotechnology provide a
powerful tool for constructing systems to regulate and observe the
behavior of a single cell or multiple cells in a well-controlled situation.
558:
This construction is interesting in itself, but the construction is not
our goal. Rather, we try to extract general features
that a living system should satisfy, and to set up general questions. For
example, as posed in \S 1, we ask whether there are some `information molecules'
that control the replicating system, and then answer the question by
setting up a theory. For each item, we set up general questions,
carry out model simulations, and construct a general theory to answer the
questions. This theoretical part is carried out in tight collaboration
with experiment.
568:
569:
570: Table I: examples of constructive biology under current investigation:
571:
572: \hspace{-.3in}\begin{tabular}{|c||c|c|c|} \hline
573:
574: construction of &experiment & theory & question to be addressed \\ \hline
575: replicating & in-vitro replicating & minority control & origin of\\
576: system & system with several & & information \\
577: & enzymes & & \\ \hline
578: cell system & replicating liposome & dynamic bottleneck& evolvability \\
579: & with internal & in autocatalytic & and recursiveness \\
580: & reaction network & reaction system & for growth \\ \hline
581: multicellular & interaction-induced & isologous diversi-& robustness in \\
582: system & differentiation of & fication in inter-& development \\
583: & an ensemble of cells & intra dynamics & \\ \hline
584: developmental & controlled & emergence of & irreversibility \\
585: process (I) & differentiation from & differentiation & in development \\
& undifferentiated cells & rule & \\ \hline
587: developmental & activin-controlled & self-consistency & origin of \\
588: process (II) & construction of & between pattern & positional \\
589: & tissues formation & and dynamics & information \\ \hline
590: generation & germ-line segregation & higher-level & origin of recursive \\
591: & from ensemble of cells& recursiveness & individuality \\ \hline
592: evolution & interaction-dependent & symbiotic & genetic fixation of \\
593: & evolution & sympatric & phenotypic \\
& of {\it E. coli} & speciation & differentiation \\ \hline
595: \end{tabular}
596:
597:
To close this subsection, we remark briefly on the study of
so-called Artificial Life (AL). Indeed, our approach may have
something in common with AL\cite{AL}. AL studies
aimed to construct life-as-it-could-be, not restricted to
present organisms. Originally, AL researchers were
interested in the logic of life that all possible biological systems should
obey, be it on this earth or under other conditions in the universe.
605:
Indeed, there are some important studies on the origin of replicating
structures from the computational side (e.g., \cite{Fontana}).
However, conventional AL studies often tended to imitate life, and could not
propose basic concepts for understanding `what is life'.
610: %, and the study often falls on superficial imitation.
611: %Even though they sometimes succeeded in
612: %making something similar to life, the success did not contribute in
613: %understanding the logic of life. AS for the evolution, they usually
614: %adopt the genetic algorithm as a simplified version of Darwinian
615: %evolution, but the AL study has not contributed in proposing novel
616: %concepts in evolution.
Also, conventional AL studies were often biased toward work within
a computer. They often assume a combination of logical processes with
manipulation of symbols, as in the study of artificial intelligence.
620:
Our approach differs from conventional artificial
life studies in two respects. First, we do not take such a symbol-based
approach, but rather a dynamical systems approach. Second, tight
collaboration between experiment and theory is essential. Note,
however, that this collaboration is not of the type that `fits the data' with
some theoretical expression, but rather takes place at a conceptual level. We
will see an example of such collaboration in \S 4.
628:
629: \subsection{Modeling strategy for the chemical reaction networks}
630:
631: \begin{figure}
632: \noindent
633: \hspace{-.3in}
634: \epsfig{file=schemmodel.ps,width=.6\textwidth}
635: \caption{Schematic representation of our modeling strategy of a cell}
636: \end{figure}
637:
Now we discuss how to model a cell, based on the standpoint
of the previous subsection. What type of model is best suited to
answering the questions in \S 1? With all the current biochemical
knowledge, one could write down several types of
candidate models. Because of the complexity of a cell, there is a tendency
to build a complicated model in trying to capture the essence of a
cell. However, doing so only makes it difficult to extract new
concepts, even though simulation of the model may produce
phenomena similar to those in living cells. To avoid such
failures, it may be more appropriate to start with a simple model that
encompasses only the essential factors of living cells. Simple models
may not reproduce all the observed natural phenomena, but they are
comprehensive enough to bring us new insights into the course of events
taken in nature.
652:
In setting up a theoretical model here, we do not impose many conditions
to imitate the life process. Rather, we keep the postulates as
minimal as possible, and study universal properties of such a system.
For example, as a minimal condition for a cell, we consider a system
consisting of chemicals enclosed by a membrane. The chemicals are
synthesized through catalytic reactions, and accordingly the amounts of
the chemicals, including the membrane component, increase. As the volume
of this system grows, the surface tension of the membrane can no
longer sustain the system, and it divides. After the division,
the resulting protocells interact with each other, since
they share resource chemicals. Under such a minimal setup, to be
discussed later, we study the condition for the recursive growth of a
cell, as well as for cell differentiation.
666:
Let us start from a simple argument about the biochemical processes that a
growing cell must at least satisfy. In a cell, there are a huge
number of chemicals that catalyze each other and form a complex
network. These molecules are spatially arranged in a cell, and for
some problems this spatial arrangement is very important, while for
others, the composition of chemicals in a
cell is sufficient to determine the state of the cell. Hence, as a
starting point we disregard the spatial structure within a cell, and
consider just its chemical composition. If there
are $k$ chemical species in a cell, the cell state is then characterized by
the number of molecules of each species, $N_1,N_2,\ldots,N_k$. These
numbers change through reactions among the molecules.
Since most reactions are catalyzed by some other molecules, the
reaction dynamics are those of a catalytic reaction network.
681:
Some chemicals may flow in through the membrane, and these are successively
transformed into other chemicals through the catalytic reaction
network. For a cell to grow recursively, a set of chemicals has to be
synthesized for the next generation. When the number of molecules becomes
large enough, the membrane can no longer be sustained, if only because of
the constraint of surface tension. Thus, when the number of molecules
exceeds some value, the cell is expected to divide. The
basic picture of the simple toy cell that we adopt is given in Fig.1.
690:
Of course, it is impossible to include all possible chemicals in a
model. Since our constructive biology aims neither at making a
complicated realistic model of a cell nor at imitating a specific
cellular function, we set up a minimal model with a reaction network to
answer the questions raised in \S 1. There are several levels
of modeling, depending on the question we try to answer.
697:
(0) Reversible two-body reactions, including all levels of
reactions, from metabolites to proteins, nucleic acids, and so
forth. For example, to answer the general question of how a
non-equilibrium condition is sustained in a cell, a model at this level
is desirable\cite{Awazu}.
703:
(1) Assuming that some reaction processes are fast, they can be
adiabatically eliminated. Also, most fast reversible reactions can
be eliminated by assuming that they are already balanced.
Then we need to discuss only the
concentrations (numbers) of the molecule species that change relatively
slowly. For example, by assuming that enzymes are synthesized and
decomposed fast, the corresponding concentrations can be eliminated, giving
catalytic reaction network dynamics consisting of reactions of the form
712:
713: \begin{equation}
714: X_i+X_j \rightarrow X_{\ell}+X_j
715: \end{equation}
716:
\noindent
where $X_j$ catalyzes the reaction\cite{KKTY,Zipf}. If the catalysis
progresses through several steps, this process is replaced by
721: \begin{equation}
722: X_i+mX_j \rightarrow X_{\ell}+mX_j
723: \end{equation}
724: leading to higher order catalysis\cite{Furusawa}.
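
As a hedged illustration, suppose that the two reaction schemes above
obey mass-action kinetics with a rate constant $k$ (neither the rate law
nor $k$ is specified in the text; both are assumptions made here).
Writing $x_i$, $x_j$, and $x_\ell$ for the concentrations of $X_i$, $X_j$,
and $X_\ell$, a single such reaction would then contribute
\[
\frac{dx_\ell}{dt} = -\frac{dx_i}{dt} = k\, x_i\, x_j^{m} ,
\]
with $m=1$ for the first scheme. The catalyst concentration $x_j$ is not
changed by the reaction itself, and in this sketch the exponent $m>1$ is
what distinguishes higher order catalysis from simple catalysis.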
725:
For a cell to grow, some resource chemicals must be supplied through
the membrane. Through the above catalytic reaction network, the resource
chemicals are transformed into others, and as a result the cell grows.
Indeed, this class of models has been adopted to study the condition for cell
growth, to unveil universal statistics of such cells, and also as a
model for cell differentiation.
732:
(2) Models focusing on the dynamics of replicating units
(e.g., the hypercycle): For a cell to grow effectively, there should be
some positive feedback process that amplifies the number of each molecule
species. Such positive feedback leads to an autocatalytic
process that synthesizes each molecule species. For a cell to reproduce,
(almost) all molecule species must somehow be synthesized. It is then possible to take
a replication reaction as the starting point of a model. For example, consider the reactions
740:
$S+X+Y \rightarrow X'+Y$; \quad $S'+X' \rightarrow 2X$.

\noindent
In total, the reaction is then represented as

$S+S'+X+Y \rightarrow 2X+Y$.
\noindent
Assuming that the resources $S$ and $S'$ are constantly supplied, we can
consider the replication reaction
750: \begin{equation}
751: X+Y \rightarrow 2X+Y,
752: \end{equation}
catalyzed by $Y$.
At this level, we can take a replicator as the unit, and consider a
replication reaction network. This type of model was first discussed as the
hypercycle by Eigen and Schuster, reviewed in \S 2.1.
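
To make the positive feedback explicit, one may write a rate equation
for this replication reaction. The following is only a sketch, assuming
mass-action kinetics with a hypothetical rate constant $\gamma$ and
writing $x$ and $y$ for the amounts of $X$ and $Y$ (notation introduced
here for illustration):
\[
\frac{dx}{dt} = \gamma\, x\, y .
\]
If $y$ is held constant, the solution is $x(t) = x(0)\, e^{\gamma y t}$,
so that the replicator $X$ is amplified exponentially, at a rate
proportional to the amount of its catalyst $Y$.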
757:
(3) Coarse-grained (phenomenological) level: Other reduced models
are adopted for the study of gene expression or signal transduction
networks. Modeling at this level is relevant for understanding
specific functions of a cell.
762:
In the present paper we mainly use level-(2) modeling.
This class of models can be obtained by reduction from the level-(1)
model, by restricting our attention to
replicating units only. In this sense, the model is somewhat simpler than the
level-(1) model. On the other hand, it may not be suitable for discussing
the condition for cell growth, since in the level-(2) model the
supply of resource chemicals is assumed automatically, and one cannot
discuss how transported chemicals are transformed into others. In the
present paper, we refer briefly to the level-(1) model only at the end of
\S 5.4, to demonstrate the universality of our result; for
details, see the original papers \cite{KKTY,Zipf} on level-(1)
modeling.
775:
776: To sum up, we envision a (proto)cell containing molecules. With a
777: supply of chemicals available to the cell, these molecules replicate
778: through catalytic reactions, so that their numbers within a cell
779: increase. When the total number of molecules exceeds a given
780: threshold, the cell divides into two, with each daughter cell
781: inheriting half of the molecules of the mother, chosen randomly.
The choice of chemical species and reactions is discussed
later for specific models (see Fig.1 for a schematic representation).
784:
785: \section{Minority Control Hypothesis for the Origin of Genetic Information}
786:
In the present section we propose an answer to the question raised in
\S 1.1, by taking a simple model of a cell with replicating molecules,
proposing a novel concept of minority control, and providing
corresponding experimental results.
791:
792: \subsection{Model}
793:
As discussed in \S 3.2,
we start by considering a prototype of a cell, consisting of molecules
that catalyze each other. As the reaction progresses, the number of
molecules in this protocell increases. The cell then divides
when its volume (the total number of molecules)
exceeds some threshold, and the molecules are split between two ``daughter cells''.
Our question in \S 1 is then restated as follows:
How are the chemical compositions transferred to the offspring cells?
Do some specific molecules start to carry heredity in the
sense of control and preservation, so that the reproduction continues?
804:
Before considering the specific model,
it may be relevant to recall the difference in roles between DNA (or RNA) and protein.
According to the present understanding of molecular biology\cite{Cell},
changes undergone by DNA molecules are believed to exert a stronger influence
on the behavior of cells than changes in other chemicals.
Also, a DNA molecule is transferred to offspring cells
relatively accurately, compared with other constituents of the cell.
Hence a DNA molecule satisfies (at least) the ``preservation'' and ``control''
properties (1) and (2) in \S 1.1.
814:
In addition, a DNA molecule is stable, and the time scale for
changes in DNA, e.g., its replication as well as its
decomposition, is much longer. Because of this relatively
slow replication, the number of DNA molecules is smaller than the
number of protein molecules. In each cell generation, each DNA
molecule typically replicates once, while other
molecules undergo more replications (and decompositions).
822:
With these properties of DNA in mind, but without assuming its detailed
biochemical properties, we seek a general condition for the
differentiation of the roles of molecules in a cell, and study the
origin of the control and preservation of some specific molecules.
827:
Now, we consider a very simple protocell
system\cite{minority}, consisting of two species of replicating
molecules that catalyze each other (see Fig.2). We assume that only
two kinds of molecules, $X$ and $Y$, exist in this protocell, and that
each catalyzes the synthesis of the other:

\begin{equation}
X + Y \rightarrow 2X + Y ; \quad Y + X \rightarrow 2Y + X .
\end{equation}
837:
Here, this ``catalytic reaction'' is not necessarily a single reaction.
In general there can be several intermediate processes for each
``reaction''. The model simply states that there are two kinds of molecules,
each of which helps the synthesis of the other, directly or indirectly. In
general, the catalytic activities as well as the synthesis speeds
differ between types of molecules. Without loss of generality, one can
assume that $X$ is synthesized faster than $Y$.
845:
With this synthesis of molecules, the total number of molecules in the
protocell increases until it divides into two. As long as the
molecules catalyze each other, this synthesis continues, as does
the division (reproduction) of the protocell. However, some structural
changes in molecules can occur through replication (`replication
error'). These structural changes in each kind of molecule may
result in the loss of catalytic activity. Indeed, molecules with
catalytic activity are not so common. On the other hand, molecules
without catalytic activity can still increase in number if their
replication is catalyzed by other, catalytic molecules. Then, as discussed in \S 2.1,
the maintenance of reproduction is not so easy.
857:
Following the above discussion, we consider the following model
as a first step in answering the question posed in \S 1.1\cite{minority}.
860:
861: \begin{figure}
862: \noindent
863: \hspace{-.3in}
864: \epsfig{file=figj4-3.ps,width=.4\textwidth}
865: \caption{Schematic representation of our model}
866: \end{figure}
867:
868:
(i) There are two species of molecules, $X$ and $Y$, which are mutually catalyzing.

(ii) For each species, there are active and inactive (``I'') types.
%There are thus four types,
Since the active molecule type is rather rare, we assume that there are
$F$ types of inactive molecules per active type. For most
simulations, we consider the case in which there is only one type of
active molecule for each species.
877:
878: Active types are denoted as $X^0$ and $Y^0$, while there are inactive
879: types $X^I$ and $Y^I$ with $I=1,2,...,F$. The active type has the
880: ability to catalyze the replication of both types of the other species
881: of molecules. The catalytic reactions for replication are assumed to
882: take the form
883:
\begin{math}
X^J + Y^0 \rightarrow 2 X^J +Y^0\end{math} (for $J=0,1,\ldots,F$)

and
\begin{math}
Y^J + X^0 \rightarrow 2 Y^J +X^0\end{math} (for $J=0,1,\ldots,F$).
890:
891: (iii) The rates of synthesis (or catalytic activity) of
892: the molecules $X$ and $Y$ differ. We stipulate that the rate of the above replication process
893: for $Y$,
894: $\gamma_y$, is much smaller than that for $X$, $\gamma_x$.
895: This difference in the rates may also be caused by a difference in
896: catalytic activities between the two molecule species.
897:
898: (iv) In the replication process, there may occur structural changes
899: that alter the activity of molecules. Therefore the type (active or
900: inactive) of a daughter molecule can differ from that of the mother.
901: The rate of such structural change is given by $\mu$, which is not
necessarily small, due to thermodynamic fluctuations. This change can
consist of the alteration of a sequence in a polymer or some other
conformational change, and may be regarded as a replication `error'.
905: Note that the probability for the loss of activity is $F$ times
906: greater than for its gain, since there are $F$ times more types of
907: inactive molecules than active molecules. Hence, there are processes
908: described by
909:
\begin{math}
X^I \rightarrow X^0\end{math} and \begin{math}Y^I \rightarrow Y^0\end{math} (with rate $\mu$);

\begin{math}
X^0 \rightarrow X^I\end{math} and \begin{math}Y^0 \rightarrow Y^I\end{math} (with rate $\mu$ for each),
915:
916: resulting from structural change.
917:
918: (v) When the total number of molecules in a protocell exceeds a given
919: value $2N$, it divides into two, and the chemicals therein are
920: distributed into the two daughter cells randomly, with $N$ molecules
921: going to each. Subsequently, the total number of molecules in each
922: daughter cell increases from $N$ to $2N$, at which point these divide.
923:
(vi) To include competition, we assume that there is a constant total
925: number $M_{tot}$ of protocells, so that one protocell, randomly chosen,
926: is removed whenever a (different) protocell divides into two.
927:
928: With the above described process, we have basically four sets of
929: parameters: the ratio of synthesis rates $\gamma_y/\gamma_x$, the
930: error rate $\mu$, the fraction of active molecules $1/F$, and the
931: number of molecules $N$. (The number $M_{tot}$ is not important, as
932: long as it is not too small).
933:
We carried out simulations of this model according to the following procedure.
First, a pair of molecules is chosen randomly.
If these molecules are of different species, then if the
$X$ molecule is active, a new $Y$ molecule is produced with probability
$\gamma_y$, and if the $Y$ molecule is active, a new $X$
molecule is produced with probability $\gamma_x$.
Such replications occur with the error rates given above.
All the simulations were carried out
stochastically in this manner.
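
The procedure above fixes the probability of each replication event per
step, which can be written out as a short sketch. Assume that the pair
is drawn uniformly from the $N^t(N^t-1)/2$ unordered pairs of distinct
molecules in the cell, where $N^t$ is the current total number of
molecules (the sampling scheme is an assumption made here), and ignore
structural changes for the moment. Writing $N_x$ and $N_y$ for the total
numbers of $X$ and $Y$ molecules, active plus inactive (notation
introduced for this illustration), the probabilities that a single step
produces a new $X$ molecule and a new $Y$ molecule are
\[
P_x = \gamma_x \frac{2 N_x N_y^0}{N^t (N^t-1)}, \qquad
P_y = \gamma_y \frac{2 N_x^0 N_y}{N^t (N^t-1)} .
\]
For $N^t \gg 1$, and if roughly $N^t/2$ steps are counted as one unit of
time (again an assumption), the expected growth per unit time is
$\gamma_x N_x N_y^0/N^t$ for $X$ and $\gamma_y N_x^0 N_y/N^t$ for $Y$,
which is the form expected of a rate-equation description.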
943:
944: We consider a stochastic model rather than the corresponding rate
945: equation, which is valid for large $N$, since we are interested in the
946: case with relatively small $N$. This follows from the fact that in a
947: cell, often the number of molecules of a given species is not large,
948: and thus the continuum limit implied in the rate equation approach is
949: not necessarily justified \cite{Mikhailov}.
950:
Furthermore, it has recently been found that the discrete nature of a
molecule population leads to behavior qualitatively different from
the continuum case in a simple autocatalytic reaction network
\cite{Togashi}. When the number of molecules in such a system is
small, a novel steady state appears that is not described by a continuum
rate equation for the chemical concentrations. This novel state was first
found in stochastic particle simulations, and its mechanism is now
understood in terms of fluctuations and discreteness in molecule
numbers. Indeed, a state in which a specific molecule
species is extinct behaves qualitatively differently from one in which
that species is present at very low concentration. This difference leads to
a transition to a novel
state, termed the discreteness-induced transition. The
transition appears as the system size or the flow into the
system is decreased, and it has been analyzed as a stochastic process in which a
single-molecule switch changes the distribution of molecules drastically.
967:
In \cite{Togashi}, examples are given in which discreteness in molecule
numbers leads to a novel phase that is not observed in a continuous
rate equation for the chemical reaction. Since the number of
molecules of some species in a cell is very small, we need to consider seriously
the possibility that discreteness in molecule numbers leads
to novel behavior distinct from the continuum description.
974:
975: \subsection{Result}
976:
If $N$ is very large, the above stochastic model can be replaced by a
continuous model given by rate equations.
Let us represent the total number of inactive molecules for each of $X$ and $Y$ as

$ N_x^I =\sum_{j=1}^F N_x^j$; $ N_y^I =\sum_{j=1}^F N_y^j$.

Then the growth dynamics of the numbers of molecules
$N_x^j$ and $N_y^j$
985: % (for $J=A$ or $I$)
986: is described by the rate equations, using the total number of molecules $N^t$,
987:
988: \begin{equation}
989: dN_x^j/dt=\gamma_x N_x^j N_y^0/N^t;
990: dN_y^j/dt=\gamma_y N_x^0 N_y^j/N^t.
991: \end{equation}
992:
993: From these equations, under repeated divisions,
994: it is expected that the relations $\frac{N_x^0}{N_y^0}=\frac{\gamma_x}{\gamma_y}$,
995: $\frac{N_x^0}{N_x^I}= \frac{1}{F}$, and $\frac{N_y^0}{N_y^I} = \frac{1}{F}$ are eventually satisfied.
996: Indeed, even with our stochastic simulation,
997: this number distribution is approached as $N$ is increased.
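
The reasoning behind these relations can be sketched as follows.
Summing the rate equations over all types $j=0,1,\ldots,F$, and writing
$N_x$ and $N_y$ for the total numbers of $X$ and $Y$ molecules (notation
introduced here for illustration), gives
\[
\frac{d}{dt} \ln \frac{N_x}{N_y}
= \frac{\gamma_x N_y^0 - \gamma_y N_x^0}{N^t} .
\]
The composition of the cell can therefore remain stationary under growth
and division only if $\gamma_x N_y^0 = \gamma_y N_x^0$, which is the
first relation. For the other two relations, note that all types of one
species replicate at the same rate per molecule, so that replication
alone does not change their relative abundances. Assuming, as in item
(iv), that structural change carries the active type into each of the
$F$ inactive types, and back, at the same rate $\mu$, the $F+1$ types of
a species approach equal abundance, and hence
\[
\frac{N_x^0}{N_x^I} = \frac{N_y^0}{N_y^I} = \frac{1}{F} .
\]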
998:
999: However, when $N$ is small, and with the selection process, there
1000: appears a significant deviation from the above
1001: distribution\cite{minority}. In Fig.3, we have plotted the average
1002: numbers $\langle N_x^0 \rangle$, $\langle N_x^I \rangle$, $\langle
1003: N_y^0 \rangle$, and $\langle N_y^I \rangle$. Here, each molecule
1004: number is computed for a cell just prior to the division, when the
1005: total number of molecules is $2N$, while the average $\langle
1006: ... \rangle$ is taken over all cells that divided throughout the
1007: simulation. (Accordingly, a cell removed without division does not
contribute to the average). As shown in the figure, there appears a
state satisfying $\langle N_y^0 \rangle \approx 2$--$10$, $\langle
N_y^I \rangle \approx 0$. Since $F \gg 1$, such a state with
$\frac{\langle N_y^0 \rangle}{\langle N_y^I \rangle}>1$ is not
expected from the rate equation (6). Indeed, for the $X$ species,
the number of inactive molecules is much larger than the number of
active ones. Hence, we have found a novel state that can be realized
due to the smallness of the number of molecules and the selection
process.
1017:
1018: %In Fig.?, $\gamma_y/\gamma_x$ and $F$ are fixed to 0.01 and64, respectively.
1019:
For the dependence of \{$\langle N_x^0 \rangle$, $\langle N_x^I \rangle$, $\langle N_y^0 \rangle$, $\langle N_y^I \rangle$\}
on the parameters, see also the figures in \cite{minority}.
These numerical results show that
the above-mentioned state with $\langle N_y^0 \rangle \approx 2$--$10$, $\langle N_y^I \rangle < 1$
is reached and sustained when $\gamma_y/\gamma_x$ is small and $F$ is sufficiently large.
In fact, for most dividing cells, $N_y^I$ is exactly 0, while a few cells
with $N_y^I>1$ appear from time to time.
It should be noted that the state with almost no inactive $Y$
molecules appears in the case of larger $F$, i.e., in the case of
a larger possible variety of inactive molecules. This suppression of
$Y^I$ for large $F$ contrasts with the behavior found in the continuum limit (the rate equation).
In Fig.4, we have plotted $\frac{\langle N_y^0 \rangle}{\langle N_y^I \rangle}$ as a
function of $F$.
Up to some value of $F$, the proportion of active $Y$ molecules decreases,
in agreement with the naive expectation provided by Eq. (6),
but this proportion increases with further increase of $F$, in the case that
$\gamma_y/\gamma_x$ is small ($\stackrel{<}{\sim}0.02$) and $N$ is small.
1037:
1038:
1039: This behavior of the molecular populations can be understood from the
1040: viewpoint of selection: In a system with mutual catalysis, both $X^0$
1041: and $Y^0$ are necessary for the replication of protocells to continue.
1042: The number of $Y$ molecules is rather small, since their synthesis
1043: speed is much slower than that of $X$ molecules. Indeed, the fixed
1044: point distribution given by the continuum limit equations possesses a
1045: rather small $N_y^0$.
1046: %In fact, when the total number of molecules is sufficiently small, the value
1047: %of $\langle N_y^0 \rangle$ given by these equations is less than 1.
However, as noted above, both $X^0$ and $Y^0$ must
be present for the replication of protocells to continue. In particular, for the
replication of $X$ molecules to continue, at least one active $Y$
molecule is necessary. Hence, if $N_y^0$ vanishes, only the
replication of inactive $Y$ molecules occurs, and divisions of this
cell cannot proceed indefinitely, because the number of $X^0$
molecules is cut in half at each division. Furthermore, for a cell with
$N_y^0=1$, only one of its daughter cells can have an active $Y$
molecule. In sum, in the presence of selection, protocells
with $N_y^0>1$ are selected.
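
The effect of division on the minority molecules can be quantified with
a short calculation. Following rule (v), assume that exactly $N$ of the
$2N$ molecules of the mother, chosen uniformly at random, go to each
daughter. If the mother contains $n$ active $Y$ molecules, the
probability that a given daughter receives none of them is
\[
P_0(n) = {2N-n \choose N} \Big/ {2N \choose N}
= \prod_{i=0}^{n-1} \frac{N-i}{2N-i} .
\]
For $n=1$ this gives $P_0 = 1/2$ for each daughter, and exactly one of
the two daughters is left without an active $Y$ molecule. For $n=2$ it
gives $P_0 = (N-1)/(2(2N-1)) \approx 1/4$, and more generally
$P_0(n) \approx 2^{-n}$ for $n \ll N$. Here $n$ and $P_0$ are notation
introduced only for this illustration.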
1058:
1059: %Hence a cell with $N_y^0=1$ has no potentiality to multiple through division, and for this reason,
1060:
On the other hand, the total number of $Y$ molecules is limited to
small values, due to their slow synthesis speed. This implies that a
cell that keeps the number of $Y^I$ molecules as small as
possible is favored under selection, so that there is room for
$Y^0$ molecules. Hence, a state with almost no $Y^I$ molecules and a
few $Y^0$ molecules, once realized through fluctuations, is expected
to be selected through competition for survival (see Fig.5 for a
schematic representation).
1069:
Of course, the probability of such rare fluctuations decreases quite
rapidly as the total molecule number increases, and for sufficiently
large numbers, the continuum description of the rate equation is
valid. Clearly then, a state of the type described above is selected
only when the total number of molecules within a protocell is not too
large. In fact, a state with very small $N_y^I$ appears only if the
total number $N$ is smaller than some threshold value depending on $F$
and $\gamma_y$. In other words, too large a cell is not favorable, because
the fluctuations are too small to produce such a rare state.
1079:
1080: \begin{figure}
1081: \noindent
1082: \hspace{-.3in}
1083: \epsfig{file=fig8-2-90.eps,width=.6\textwidth}
1084: \caption{
1085: Dependence of $\langle N_x^0 \rangle (\times)$, $\langle N_x^I \rangle (+)$,
1086: $\langle N_y^0 \rangle (\Box)$, and $\langle N_y^I \rangle(*)$
1087: on $N$.
1088: The parameters were fixed as $\gamma_x=1$, $\gamma_y=0.01$, and $\mu =.05$.
1089: Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$
1090: at the division event, and thus their sum is
1091: $2N$.
We use $M_{tot}=100$, and
the sampling for the averages was taken over $10^5$--$3\times 10^5$ steps,
where the number of divisions ranges from $10^4$ to $10^5$,
depending on the parameters. Reproduced from \cite{minority}.}
1096: \end{figure}
1097:
1098: \begin{figure}
1099: \noindent
1100: \hspace{-.3in}
1101: \epsfig{file=figmin.ps,width=.7\textwidth}
1102: \caption{
1103: Dependence of the active-to-inactive ratio,
1104: $\frac{\langle N_y^0 \rangle }{\langle N_y^I \rangle }$,
1105: on $F$.
1106: The parameters were fixed as $\gamma_x=1$, $\gamma_y=.01$, $\mu =.05$, and $F=128$.
1107: Plots for $\gamma_y=.005$ ($\Diamond$), .01 (+), .015 ($\Box$), 0.02 ($\times$),
1108: 0.025 ($\triangle$),
1109: and 0.03 (*) are overlaid.
1110: Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$
1111: at the division event. Reproduced from \cite{minority}.}
1112: \end{figure}
1113:
1114:
1115: %\begin{figure}
1116: %\noindent
1117: %\hspace{-.3in}
1118: %\epsfig{file=fig8-3-90.eps,width=.6\textwidth}
1119: %\caption{
1120: %Dependence of $\langle N_x^0 \rangle (\times)$, $\langle N_x^I \rangle (+)$,
1121: %$\langle N_y^0 \rangle (\Box)$, and $\langle N_y^I \rangle(*)$
1122: %on $F$. The parameters were fixed as
1123: %$\gamma_x=1$, $\gamma_y=.01$, $\mu =.05$, and $N=1000$.
1124: %Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$
1125: %at the division event, and thus their sum is $2N=2000$.
1126: %Reproduced from \cite{KKTY02}}
1127: %\end{figure}
1128:
1129: \begin{figure}
1130: \noindent
1131: \hspace{-.3in}
1132: \epsfig{file=figj4-25.ps,width=.7\textwidth}
\caption{Schematic representation of our logic.
1134: Once an active molecule of each molecule species is lost, the
1135: reproduction does not continue.
1136: }
1137: \end{figure}
1138:
1139: \subsection{Minority Controlled State}
1140:
1141: We showed that in a mutually catalyzing replication system, the
1142: selected state is one in which the number of inactive molecules of the
1143: slower replicating species, $Y$, is drastically suppressed. In this
section, we first show that the fluctuations of the number of active
$Y$ molecules are smaller than those of active $X$ molecules in this
1146: state. Next, we show that the molecule species $Y$ (the minority
1147: species) becomes dominant in determining the growth speed of the
1148: protocell system. Then, considering a model with several active
1149: molecule types, the control of chemical composition through
1150: specificity symmetry breaking is discussed.
1151:
1152:
1153: \subsubsection{Preservation of minority molecule}
1154:
1155: First, we computed the time evolution of the number of active $X$ and
1156: $Y$ molecules, to see if the selection process acts more strongly to
1157: control the number of one or the other. We computed $N_x^0$ and
1158: $N_y^0$ at every division to obtain the histograms of cells with given
1159: numbers of active molecules.
1160:
1161: The fluctuations in the value of $N_y^0$ are found to be much smaller
1162: than those of $N_x^0$.
1163: The selection process
1164: discriminates more strongly between different concentrations of active
1165: $Y$ molecules than between those of active $X$ molecules. Hence the
1166: active $Y$ molecules are well preserved with relatively smaller
1167: fluctuations in the number.
1168:
1169:
1170: %The numbers $N_y^A$ and $N_y^I$ are more nearly conserved than $N_x^A$ and $N_x^I$, and
1171:
1172:
1173: \subsubsection{Control of the growth speed}
1174:
1175: Now, it is expected that the growth speed of our protocell has a
1176: stronger dependence on the number of active $Y$ molecules than the
1177: number of active $X$ molecules. We have found that the division time
1178: is a much more rapidly decreasing function of $N_y^0$ than of $N_x^0$.
1179: Even a slight change in the number of active $Y$ molecules has a
1180: strong influence on the division time of the cell. Of course, the
1181: growth rate also depends on $N_x^0$, but this dependence is much
1182: weaker. Hence, the growth speed is controlled mainly by the number of
1183: active $Y$ molecules.
1184:
1185: \subsubsection{Control of chemical composition by the minority molecule}
1186:
1187: As another demonstration of control, we study a model in which there
1188: is more specific catalysis of molecule synthesis. Here, instead of
1189: single active molecule types for $X$ and $Y$, we consider a system
with $k$ types of active $X$ and $Y$ molecules, $X^{0i}$ and
$Y^{0i}$ ($i=1,2,\cdots,k$). In this model, each active molecule
type catalyzes the synthesis of only a few types ($m<k$) of the other
species of molecules. Here we assume that both $X$ and $Y$ molecules
have the same ``specificity'' (i.e., the same value of $m$) and study
how this symmetry is broken.
1196:
1197: %Graphically representing the ability for such catalysis using arrows as
1198: %$i_x \rightarrow j_y$ for $X \rightarrow Y$ and $i_y \rightarrow j_x$ for
1199: %$Y \rightarrow X$, the network of arrows defining the catalyzing relations for
1200: %the entire system is chosen randomly, and is fixed throughout each simulation.
1201:
1202: As already shown, when $N$, $\gamma_y$ and $F$ satisfy the conditions
1203: necessary for realization of a state in which $N_y^I$ is sufficiently
1204: small, the surviving cell type contains only a few active $Y$
1205: molecules, while the number of inactive ones vanishes or is very
1206: small. Our simulations show that in the present model with several
1207: active molecule types, only a single type of active $Y$ molecule
remains after a sufficiently long time. We call this the ``surviving
type'', $i_r$ ($1 \leq i_r \leq k$). In contrast, at least $m$ types
of $X^0$ species, namely those whose synthesis is catalyzed by the remaining
$Y^{0i_r}$ molecule species, remain. Accordingly, in a cell that has survived for
a sufficiently long time, a single type of $Y^{0i_r}$ molecule catalyzes
the synthesis of (at least) $m$ kinds of $X$ molecule species, while
the multiple types of $X$ molecules catalyze the synthesis of this single type of
$Y^{0i_r}$ molecule. Thus, the original symmetry regarding the
catalytic specificity is broken as a result of the difference between
the synthesis speeds.
1218:
1219: Due to autocatalytic reactions, there is a tendency for further
1220: increase of the molecules that are in the majority. This leads to
1221: competition for replication between molecule types of the same
1222: species. Since the total number of $Y$ molecules is small, this
1223: competition leads to all-or-none behavior for the survival of
1224: molecules. As a result, only a single type of species $Y$ remains,
1225: while for species $X$, the numbers of molecules of different types are
1226: statistically distributed as guaranteed by the uniform replication
1227: error rate.
1228:
1229: Although $X$ and $Y$ molecules catalyze each other, a change in the type of
1230: the remaining active $Y$ molecule has a much stronger influence on $X$
1231: than a change in the types of the active $X$ molecules on $Y$,
1232: since the number of $Y$ molecules is much smaller.
1233:
1234: With the results so far, we can conclude that the $Y$ molecules, i.e.,
1235: the minority species, control the behavior of the system, and are
1236: preserved well over many generations. We therefore call this state
1237: the minority-controlled (MC) state.
1238:
1239: \subsubsection{Evolvability}
1240:
1241: An important characteristic of the MC state is evolvability.
1242: Consider a variety of active molecules $0i$, with different catalytic activities.
1243: Then the synthesis rates $\gamma_x$ and $\gamma_y$ depend on the activities of
1244: the catalyzing molecules. Thus, $\gamma_x$ can be written in terms of
1245: the molecule's inherent growth rate, $g_x$, and the activity, $e_y(i)$, of
1246: the corresponding catalyzing molecule $Y^{0i}$:
1247:
\[
\gamma_x = g_x \times e_y(i); \qquad
\gamma_y = g_y \times e_x(i).
\]
1252:
1253: \noindent
Since such a biochemical reaction is entirely facilitated by catalytic
activity, a change of $e_y$ or $e_x$, for example by a structural
change of polymers, is more important. Given the occurrence of
such a change to molecules, those with greater catalytic activities
will be selected through competition, leading to the
selection of larger $e_y$ and $e_x$. As an example to demonstrate
this point, we have extended the model to include $k$ kinds of active
molecules with different catalytic activities. In this extended model,
molecules with greater catalytic activities are indeed selected through competition.
1263:
1264: Since only a few molecules of the $Y$ species exist in the MC state, a
1265: structural change to them strongly influences the catalytic activity
1266: of the protocell. On the other hand, a change to $X$ molecules has a
1267: weaker influence, on the average, since the deviation of the {\sl
1268: average} catalytic activity caused by such a change is smaller, as can
1269: be deduced from the law of large numbers. Hence the MC state is
1270: important for a protocell to realize evolvability.
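
As a rough illustration of this argument (a sketch under simplifying
assumptions, not a result taken from the simulations), suppose that the
catalytic activity of a protocell is governed by the average activity
$\bar{e}=\frac{1}{n}\sum_{l=1}^{n} e_l$ of the $n$ active molecules of a
given species. The symbols $n$, $e_l$, and $\delta$ are introduced here only
for this estimate. If a single molecule changes its activity by $\delta$,
the average shifts by
\[
\Delta \bar{e} = \frac{\delta}{n} .
\]
With the purely illustrative values $n=2$ for the minority species and
$n=1000$ for the majority species, the shift is $\delta/2$ in the former case
but only $\delta/1000$ in the latter. A single structural change in $Y$ is
thus felt by the whole protocell, while the same change in $X$ is diluted by
the many unchanged molecules.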
1271:
1272: \subsection{Experiment}
1273:
Recently, there have been several experiments to construct minimal
replicating systems in vitro.
As an experiment corresponding to this problem, we describe an in-vitro
replication system constructed by Yomo's group \cite{Matsuura}.

In general, proteins are synthesized from the information on DNA
through RNA, while DNA is synthesized through the action of proteins.
As a set of chemicals, they autonomously replicate themselves.
Simplifying this replication process, Matsuura et al.\ \cite{Matsuura} constructed a
replication system consisting of DNA, DNA polymerase (i.e., an
enzyme for the synthesis of DNA), and so forth. This DNA polymerase is
synthesized from the corresponding gene in the DNA, while it works as
the catalyst for the replication of that DNA. Through this mutual catalytic
process the chemicals replicate themselves.
1288:
1289: \begin{figure}
1290: \noindent
1291: \hspace{-.3in}
1292: %\epsfig{file=chap4fig/yomo1a.eps,width=.8\textwidth}
1293: \epsfig{file=yomo1.ps,width=.9\textwidth}
1294: \caption{
Illustration of the in-vitro autonomous replication system
consisting of DNA and DNA polymerase.
See text and \cite{Matsuura} for details.
Provided by courtesy of Yomo, Matsuura et al.
1299: }
1300: \end{figure}
1301:
1302:
1303: %\begin{figure}
1304: %\noindent
1305: %\hspace{-.3in}
1306: %\epsfig{file=chap4fig/yomo-repl2.eps,width=.4\textwidth}
1307: %\caption{Procedure of experimentF In each of 10 test tubes containing a single DNA molecule,
1308: %autonomous replication progresses. The components of the tubes are mixed
1309: %in a pool, from which a single DNA is chosen to a tube, to repeat the
1310: %procedure.
1311: %See text and [Matsuura et al. 2002] for details.
1312: %Supplied with the courtesy of Yomo, Matsuura et al.
1313: %}
1314: %\end{figure}
1315:
1316:
1317: %\begin{figure}
1318: %\noindent
1319: %\hspace{-.3in}
1320: %\epsfig{file=chap4fig/yomo-repl.eps,width=.5\textwidth}
1321: %\caption{
1322: %Change of self-replication activity from a system with single DNA.
1323:
1324: %The activities for 10 tubes are shown, The next generation is
1325: %produced mostly from the top DNA. Although activities vary by each tube,
1326: %higher ones are selected, so that the activities are maintained.
1327: %See text and [Matsuura et al. 2002] for details.
1328: %Supplied with the courtesy of Yomo, Matsuura et al.
1329: %}
1330: %\end{figure}
1331:
1332: \begin{figure}
1333: \noindent
1334: \hspace{-.3in}
1335: %\epsfig{file=chap4fig/yomo-repl0.eps,width=.5\textwidth}
1336: \epsfig{file=yomo2a.ps,width=.7\textwidth}
1337: \epsfig{file=yomo2b.ps,width=.7\textwidth}
1338: \caption{
Self-replication activities for each generation, measured as described in the
text. The activities for 10 tubes are shown. Upper: result from a single DNA, where the next generation is
produced mostly from the top DNA. Although activities vary from tube to tube,
higher ones are selected, so that the activities are maintained. Lower: result from 100 DNA molecules.
Provided by courtesy of
Yomo, Matsuura et al.\ \cite{Matsuura}.
1345: }
1346: \end{figure}
1347:
As for the amplification of DNA, PCR is widely used and is a
standard tool in molecular biology. In PCR, however, the enzymes
that are necessary for the replication of DNA must be supplied
externally. In this sense, it is not a self-contained autonomous
replication system. In the experiment by Yomo's group, while PCR is used
as one step of the experimental procedure, the enzyme (DNA
polymerase) for DNA synthesis is also replicated in vitro within the
system. Of course, some raw materials, such as amino acids and ATP,
have to be supplied, but otherwise the chemicals are replicated by
themselves (see Fig. 6 for the experimental procedure).

In this experiment, there is a mutual synthesis process between the gene and the enzyme.
Roughly speaking, the polymerase in the experiment corresponds to
$X$ in our model, while the polymerase gene corresponds to $Y$.
1362:
1363:
Now, at each step of replication, about $2^{30}\sim 2^{40}$ DNA molecules are replicated.
Of course, some errors occur in this process. These errors can occur in the synthesis of
the enzyme, and also in the synthesis of DNA. With these errors, DNA molecules
with different sequences appear. Thus a pool of DNA molecules with a variety of sequences
is obtained as the first generation.
1369:
From this pool, the DNA and enzymes are split into several tubes.
Then, materials including ATP and amino acids are supplied, and the replication process
is repeated (see Fig. 6). In other words, the test tube here plays the role
of ``cell compartmentalization''. Instead of autonomous cell division, the split into several tubes is carried out externally.

In this experiment, instead of changing the synthesis speed $\gamma_y$ or $N$ as in the model,
one can control the number of genes by changing how
the pool is split into the test tubes.

Indeed, they studied two distinct cases:
a split into tubes each containing a single DNA molecule, and a split
into tubes each containing 100 DNA molecules.
Recall that the theory predicts evolvability through minority control.
Hence, the behavior in the two cases may be drastically different.
1384:
First, we describe the case with a single DNA molecule in each tube.
Here, the pool of chemicals
is split into 10 tubes, each of which has a single DNA molecule,
and the replication process described above
progresses in each tube. The sequence of the DNA molecule can differ from tube to tube,
since there are replication errors. Then the activity of the DNA polymerase in each
tube also differs, and so does the number of DNA molecules synthesized in each tube.
In other words, some DNA molecules can produce more offspring, while others
cannot. The variation of the self-replication activity among tubes is shown in the upper panel of Fig. 7.
Then the contents of the tubes are mixed.
This soup of chemicals is used for the next generation. In this soup,
the DNA molecules that have a higher replication rate, as well as the mutants generated
from them, are included in a larger fraction. Now a single DNA molecule is selected from
the soup for each of 10 tubes, and the same procedure is repeated.
Hence, there is a larger probability that a DNA molecule with a
higher reproduction activity is selected for the next generation. In other words,
Darwinian selection acts at this stage.
The self-replication activity
from this soup is plotted in the third generation. Successive plots of the
self-replication activity are given in the upper panel of Fig. 7.
As shown, the self-replication activity is not lost (or can even evolve in some cases),
although it varies from tube to tube in each generation.
1407:
One might say that the maintenance of replication is not surprising at all,
since a gene for the DNA polymerase is included from the beginning.
However, an enzyme with such catalytic activity is rare. Indeed, with mutations,
proteins that have lost this catalytic activity but are still synthesized in the
present system could appear and might take over the system.
Then the self-replication activity would be lost. In fact, this is nothing but
the error catastrophe of Eigen, discussed in \S 2.1.
Why, then, is the self-replication activity maintained in the present experiment?
1416:
The answer is clear according to the theory in \S 4.2-4.3.
In the model of \S 4.1, mutants that have lost the catalytic
activity are much more common (i.e., $F$ times more common in the model).
Still, the number of such molecules is suppressed. This was possible
first because the molecules are in a cell. In the experiment also,
they are in a test tube, i.e., in a compartment. Now the selection works
on this compartment, not on each molecule. Hence the tube (cell) that
includes a gene giving rise to lower enzyme activity produces fewer offspring.
In this sense, compartmentalization is one essential factor for
the maintenance of catalytic activity (see also \cite{Hogeweg,Szathmary,Eigen-book}).
Another important factor is that in each compartment (cell) there is a single (or
very few) DNA molecule (as for the $Y$ molecule in the model of \S 4.1-3). In the theory,
if the number of $Y$ molecules is larger, inactive $Y$ molecules outnumber
the active ones.
1431:
To confirm the validity of our theory, Matsuura et al.\ \cite{Matsuura} carried out a comparison
experiment. This time, they split the chemicals in the soup so that each tube
has 100 DNA molecules instead of a single one. Otherwise, they adopted
the same procedure. In other words, this corresponds to a cell with 100
copies of the genome. The change of self-replication activity in this experiment
is plotted in the lower panel of Fig. 7. As shown, the
self-replication activity decreases with each generation, and after the
fourth generation, the capability of autonomous replication is totally lost.
This result shows that the number of molecules that carry genetic
information should be small, which is consistent with the theory.
1442:
When there are many DNA molecules, a mutation can occur in each DNA
molecule. In each tube, the self-replication activity is given by the
average of the enzyme activities from these 100 DNA molecules.
Although the catalytic activity varies from molecule to molecule, the variance
of the average among tubes should be reduced drastically. Recall that
the variance of the average of $n$ independent variables, each with variance $\sigma^2$,
is reduced to $\sigma^2/n$, as in the central limit theorem of
probability theory. Hence the average catalytic activity does not
differ much from tube to tube. Here, a mutant with a higher catalytic
activity is rare. Most changes in the gene lead to smaller or null
catalytic activity. Hence, on the average, the catalytic activity
after mutations to the original gene gets smaller, and the variance among
tubes around this mean is rather small (see Fig. 7).
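
The reduction of the variance can be made explicit by a short estimate.
As a sketch, assume that the enzyme activities $e_1,\ldots,e_n$ coded by the
$n$ DNA molecules in a tube are independent with a common variance
$\sigma^2$. The independence assumption and the symbols $e_l$, $n$, and
$\sigma^2$ are used here only for illustration and are not taken from the
experiment. Then the tube average $\bar{e}=\frac{1}{n}\sum_{l=1}^{n}e_l$
satisfies
\[
{\rm Var}(\bar{e}) = \frac{1}{n^2}\sum_{l=1}^{n}{\rm Var}(e_l)=\frac{\sigma^2}{n},
\qquad
\sqrt{{\rm Var}(\bar{e})}=\frac{\sigma}{\sqrt{n}} .
\]
For $n=1$ the spread among tubes is the full $\sigma$, while for $n=100$ it
is $\sigma/10$, so that the tube-to-tube differences on which selection can
act are reduced tenfold.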
1456:
Through selection, DNA from a tube with a higher catalytic activity
could be selected, but the variation among tubes is so small that the
selection does not work. Hence deleterious mutations remain in the
soup, and the self-replication activity is lost over generations.
In other words, the selection works only when the number of information
carriers in a replication unit (cell) is very small, so that the unit is free from
the statistical law of large numbers.
1464:
Summing up: in the experiment, it was found that replication is
maintained even under deleterious mutations (that correspond to
structural changes from active to inactive molecules in the model),
only when the population of DNA polymerase genes is small and
competition among replicating systems is applied. When the number of
genes (corresponding to $Y$) is small, the information contained in
the DNA polymerase genes is preserved. This is made possible by the
maintenance of rare fluctuations, as found in our theory. The system
has evolvability only if the number of DNA molecules in the system is small.
Otherwise, the system gradually loses its activity to replicate
itself. These experimental results are consistent with the minority
control theory described above.
1477:
1478: \subsection{Discussion}
1479:
1480: \subsubsection{Heredity from a kinetic viewpoint}
1481:
In this section, we have shown that in a mutually catalyzing system,
the molecules $Y$, with the slower synthesis speed and in the minority in number,
tend to act as the carrier of heredity. Through the selection under
reproduction, a state in which there are a few active $Y$ molecules and almost no
inactive $Y$ molecules is selected. This state is termed the
``minority controlled state''. Between the two molecule species, there
appears a separation of roles, between that with a larger number and that with
a greater catalytic activity. The former has a variety of chemicals
and reaction paths, while the latter works as a basis for
heredity, in the sense of the two properties mentioned in \S 1.1 and
\S 4.3, `preservation' and `control'. We now discuss these properties
in more detail.
1494:
1495: [Preservation property]: A state that can be reached only through
1496: very rare fluctuations is selected, and
1497: it is preserved over many generations, even though
1498: the realization of such a state is very rare
1499: when we consider the rate equation obtained in the continuum limit.
1500:
1501: [Control property]: A change in the number of $Y$ molecules
1502: has a stronger influence on the growth rate of a cell than a change
1503: in the number of $X$ molecules.
1504: Also, a change in the catalytic activity of the $Y$ molecules has a strong
1505: influence on the growth of the cell. The catalytic activity of the $Y$
1506: molecules acts as a control parameter of the system.
1507:
Once this minority controlled state is established, the following
scenario for the evolution of genetic information is expected. First,
a new selection pressure can now emerge, to evolve a
machinery that ensures that the minority molecule makes it into the
offspring cells, since otherwise the reproduction of the cell is
severely damaged. Hence a machinery to guarantee the faithful
transmission of the minority molecule should evolve. Now, the origin
of heredity is established. For this heredity, it is not necessary that any specific
metabolic or genetic contents be transmitted faithfully.
It can appear from the loose reproduction system that Dyson considered
(as in \S 2.2). This heredity evolves just as a result of a kinetic
phenomenon and is a rather general phenomenon in a reproducing
protocell consisting of mutually catalytic molecules.
1521:
This faithful transmission of the minority molecule provides a basis for
information critical to the reproduction of the protocells. Since this
minority molecule is protected so as to be transmitted, other chemicals that
are synthesized in connection with it are also likely to be transmitted,
albeit not always faithfully. Hence there appears a further
evolutionary incentive to package life-critical information into the
minority molecule. Now more information (`many bits' of information)
is encoded on the minority molecule. Then, the molecules work as a
carrier of genetic information in today's sense. With this
evolution, in which more molecules are catalyzed by the minority molecule, it
is then easier to further develop the machinery to take better care of
the minority molecules, since this minority molecule is essential to many
reactions for the synthesis of many other molecules.
1535:
1536: Hence the evolution of faithful transmission of minority molecules and
1537: of coding of more information reinforce each other. At this point one
1538: can expect a separation of metabolism and genetic information.
1539:
To sum up, how a single molecule comes to govern heredity is
understood from a kinetic viewpoint. We first showed the minority
controlled state to be a rather general consequence of the kinetic process of
mutually catalytic molecules. This provides a basis for heredity.
Taking advantage of the evolvability of the minority controlled state,
a preservation mechanism of the minority molecule then evolves, which
allows more information to be encoded on it, leading to the separation of
genetic information and metabolism. In this sense, the minority
1548: molecule species with slower synthesis speed, leading to the
1549: preservation of rare states and control of the behavior of the system,
1550: acts as an information carrier. The important point of our theory is
1551: that heredity arises prior to any metabolic information that needs to
1552: be inherited.
1553:
1554:
1555: %\subsubsection{Accessibility to Minority Controlled state}
1556:
1557: %One important consequence of the existence of the MC state is
1558: %evolvability. Mutations introduced to the majority species tend to be
1559: %canceled out on the average, in accordance with the law of large
1560: %numbers. Hence, the catalytic activity of the minority species ($Y$ in
1561: %our model) is not only sustained, but has a greater potentiality to
1562: %increase through evolution.
1563:
1564: %The evolution and stability of the MC state with respect to mutation
1565: %was discussed in \S 4.4.3.
1566: %If the initial difference between the catalytic abilities $e_x$ and $e_y$
1567: %(and other parameters) satisfies the conditions stated in \S 4.4, it is shown that
1568: %the MC state once realized is stable over generations against mutations.
1569:
1570: %1. Higher Order Catalysis
1571:
1572: %2. Spatial structure of a cell:
1573:
1574:
\subsubsection{Some remarks}
1576:
In \S 2, we described two standpoints on the origin of life, i.e.,
genetic information first or complex metabolism first. We pointed out
some difficulties with each standpoint. In the former picture, there was
a problem of stability against parasites, while the latter cannot
explain how genetic information took over the original loose
reproduction system. Minority control sheds new light on these
problems.
1584:
The first problem in \S 2.1 was the appearance of parasitic molecules that destroy
the hypercycle, i.e., the mutually catalytic reaction cycle. If only the
replication process of molecules is considered, it is not so easy to
resolve the problem. Here we consider the dual level of replication,
i.e., molecular and cellular replication.

In the present theory for the origin of information, the existence of a
cell unit that reproduces itself is required. Two levels of
reproduction, of both molecules and cells, are assumed here. Hence a cell
with parasitic molecules cannot grow, and is selected out. The relevance
of this type of two-level reproduction to avoiding molecular parasites
has been discussed \cite{Hogeweg,Szathmary,Eigen-book}. Here,
the relevance of the cellular compartment to the {\em origin of genetic
information} is more important.
1599:
1600: This two-level selection works effectively, with the aid of minority
1601: control of specific molecules for a cell. Indeed, surviving cells
1602: satisfy the minority control. With the selection pressure for
1603: reproduction of cells, there appears a state that is not expected by
1604: the rate equation for reaction of molecules, where the number of
1605: inactive $Y$ molecules that are parasitic to the catalytic reaction is
1606: suppressed. Furthermore, resistance against parasitic (inactive) $Y$
1607: molecules is established by this minority controlled state.
1608:
This minority control also resolves the question of the genetic
take-over, the problem in the ``metabolism first'' standpoint (in \S 2.2). Among
several molecule species, a specific species that is in the minority in
population controls the behavior of a cell and is well preserved. The
possible scenario mentioned at the beginning of this section gives one
plausible answer to how genetic take-over progresses.
1615:
1616: %from this minority controlled state.
1617:
The differentiation of roles between the molecules looks like
``symmetry breaking''. When initially two states are equally
possible, and later only one of them is selected, it is said that the
symmetry is broken. In the differentiation of roles of molecules
studied here, however, the molecules have different characters as to
the replication speed from the beginning. Here a difference in one
character (i.e., the replication speed) is ``transformed'' into the
difference in the control behavior, and in the role as a carrier
of heredity. In other words, a characteristic with already broken
symmetry is transformed into a different type of symmetry breaking.
This kind of transformation of one character's difference into another
is often seen in biology, as we have already discussed in the study of
morphogenesis and sympatric speciation \cite{Furusawa,speciation}.
1631:
1632: \section{Recursive Production in an Autocatalytic Network}
1633:
1634: Now we come to the second question raised in \S 1. In the model of
1635: the last section, we considered a system consisting of two kinds of
molecules. In a cell, however, a variety of chemicals form a complex
reaction network to synthesize themselves. Here we study a model with
a large number of chemical species, to discuss how a cell with such
a large number of components and a complex reaction network can sustain
reproduction while keeping similar chemical compositions
\cite{KK-net,KK-PRE} (see also \cite{Lancet}).
1642:
1643: \subsection{Model}
1644:
1645: To unveil general features of a system with mutually catalyzing
1646: molecules, we study a system with a variety of chemicals ($k$ molecule
1647: species), forming a mutually catalyzing network. The molecules
replicate through catalytic reactions, so that their numbers within a
cell increase (see Fig. 1 again for a schematic representation of the
model).
1651:
1652: We envision a (proto)cell containing $k$ molecular species with some
1653: of the species possibly having a zero population. A chemical species
1654: can catalyze the synthesis of some other chemical species as
1655:
1656: \begin{equation}
1657: [i] + [j] \rightarrow [i] + 2[j],
1658: \label{reaction}
1659: \end{equation}
1660:
1661: \noindent
with $i,j=1,\cdots,k$ according to a randomly chosen reaction network,
where the reaction is set far from equilibrium. In eq.(7), the
molecule $i$ works as a catalyst for the synthesis of the molecule
$j$, while the reverse reaction is neglected, as discussed in the
hypercycle model. For each chemical, the rate for the path of
catalytic reaction in eq.(7) is given by $\rho$, i.e., each species has
about $k\rho$ possible reactions. The rate is kept fixed throughout
each simulation. Considering catalytic reaction dynamics, the reverse
reaction process is neglected, and reactions $i \leftrightarrow j$ are
not included. (Here we investigated the case without direct mutual
connections, i.e., $i\rightarrow j$ was excluded as a possibility when
there was a path $j \rightarrow i$, although this condition is not
essential for the results to be discussed.) Furthermore, each
molecular species $i$ has a randomly chosen catalytic ability $c_i \in
[0,1]$ (i.e., the above reaction occurs with the
rate $c_i$). Assuming an environment with an ample
supply of chemicals available to the cell, the molecules then
replicate, leading to an increase in their numbers within a cell.
1680:
1681: Again, when the total number of molecules exceeds a given threshold
1682: (here we used 2$N$), the cell is assumed to divide into two, with each
1683: daughter cell inheriting half of the molecules of the mother cell,
1684: chosen randomly.
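
The effect of this random partition on each species can be quantified by a
short estimate. As a sketch, assume that exactly $N$ of the $2N$ molecules
are drawn uniformly without replacement for one daughter cell (one possible
reading of the rule above). Let $n_i$ denote the number of molecules of
species $i$ in the mother cell and $n_i'$ the number inherited by that
daughter; both symbols, and the index $r$ below, are introduced here only
for illustration. Then $n_i'$ follows a hypergeometric distribution,
\[
P(n_i'=r)=\frac{{n_i \choose r}{2N-n_i \choose N-r}}{{2N \choose N}},
\]
with mean and variance
\[
\langle n_i' \rangle = \frac{n_i}{2}, \qquad
\langle (n_i'-\langle n_i' \rangle)^2\rangle = \frac{n_i(2N-n_i)}{4(2N-1)} .
\]
For a species with $n_i \ll N$ the variance is close to $n_i/4$, so the
relative fluctuation is of order $1/\sqrt{n_i}$. In particular, a species
present as a single molecule ($n_i=1$) is missing from a given daughter cell
with probability $1/2$.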
1685:
During the replication process, structural changes, e.g., the
alteration of a sequence in a polymer, may occur that alter the
catalytic activities of the molecules. Therefore, the activities of
the replicated molecule species can differ from those of the mother
species. The rate of such structural changes is given by the
replication `error rate' $\mu$. As the simplest case, we assume that
this `error' leads to all other molecule species with equal
probability (i.e., with the rate $\mu /(k-1)$), and can thus
regard it as a background fluctuation. In reality, of course, even
1695: after a structural change, the replicated molecule will keep some
1696: similarity with the original molecule, and a replicated species with
1697: the `error' would be within a limited class of molecule species.
1698: Hence, this equal rate of transition to other molecule species is a
1699: drastic simplification. Some simulations where the errors in
1700: replication only lead to a limited range of molecule species, however,
1701: show that the simplification does not affect the basic conclusions
1702: presented here. Hence we use the simplest case for most simulations.
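
In the limit of large molecule numbers, the replication with error
described above corresponds to a rate equation, which can be written
down for reference. The following form is only a sketch under
assumptions not stated in the text: $x_j$ denotes the concentration
$N_j/\sum_l N_l$, $W_{ij}$ is set to 1 if the network contains the path
by which $i$ catalyzes the replication of $j$ and to 0 otherwise, and
$\gamma_j$ denotes the resulting replication rate per molecule of the
species $j$. All three symbols are introduced here for illustration.

\[
\gamma_j = \sum_{i} W_{ij}\, c_i\, x_i ,
\]

\[
\frac{dx_j}{dt} = (1-\mu)\,\gamma_j x_j
+ \frac{\mu}{k-1}\sum_{l \neq j} \gamma_l x_l
- x_j \sum_{l} \gamma_l x_l .
\]

\noindent
The first term is the faithful replication, the second is the inflow by
replication errors from all the other species with the equal rate
$\mu/(k-1)$, and the last is the dilution that keeps $\sum_j x_j=1$.
Indeed, summed over $j$, the first two terms give $\sum_l \gamma_l
x_l$, which is cancelled exactly by the dilution term.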
1703:
In statistical physics, one mostly studies the case in which the total
number of molecules $N$ is very large, at least much larger than the
number of molecule species $k$. In this case, the continuum
description is relevant. When $N/k$ is rather small, the numbers of
some molecule species often fluctuate around 0, where the discreteness
0,1,2,... becomes important, as already discussed. In order to take
this discreteness in the molecule numbers into account, we adopted a
stochastic approach rather than the usual differential equations, and
considered a variety of possible chemicals such that $N$ and $k$ are
of a comparable order.
1714:
The model is simulated as follows: At each step, a pair of molecules,
say, $i$ and $j$, is chosen randomly. If there is a reaction path
between species $i$ and $j$, and $i$ ($j$) catalyzes $j$ ($i$), one
molecule of the species $j$ ($i$) is added with probability $c_i$
($c_j$), respectively. The added molecule is then changed to another
randomly chosen species with probability $\mu$, the replication error
rate. When the total number of molecules exceeds the
threshold $2N$, the cell divides into two such that each
daughter cell inherits half ($N$) of the molecules of the mother
cell, chosen randomly\cite{minority}.
1725:
Again, to include competition, we assume that there is a constant
total number $M_{tot}$ of protocells, so that one protocell, randomly
chosen, is removed whenever a (different) protocell divides into two.
However, the results here do not depend strongly on $M_{tot}$. In the
results below we mostly choose $M_{tot}=1$, but the simulation
with $M_{tot}=100$ gives essentially the same behavior.
1732:
1733: %\noindent
1734: %with $i,j=1,\cdots,k$. The connection rate of the catalytic paths is given by $p$ per each chemical.
1735:
1736: %Again, replication is accompanied by some 'error', and instead of the replication of the molecule
1737: %$i$, one of other $k$ molecule species is synthesized with an error
1738: %rate $\mu$.
1739: %(see Fig.2, for schematic representation).
1740:
1741: %\begin{figure}
1742: %\noindent
1743: %\vspace{-.1in}
1744: %\hspace{-.3in}
1745: %\epsfig{file=alaska20.ps,width=.6\textwidth}
1746: %\caption{Schematic representation of the model.}
1747: %\end{figure}
1748:
1749:
1750: \subsection{Result}
1751:
1752: \subsubsection{Phases}
1753:
Our main concern here is the dynamics of the molecule numbers $N_i$
of the species $i$ in relation to the condition for the recursive
growth of the (proto)cell. In our model there are four basic
parameters: the total number of molecules $N$, the total number of
molecule species $k$, the mutation rate $\mu$, and the reaction path
rate $\rho$. By carrying out simulations of this model for a
variety of parameter values $N,k,\mu,\rho$ and for various
random networks, we have found that the behaviors are classified into
the following three phases\cite{KK-net,KK-PRE}:
1763:
1764: (1) Fast switching states without recursiveness
1765:
1766: (2) Achievement of recursive production with similar chemical compositions
1767:
1768: (3) Switch over several quasi-recursive states
1769:
1770: \begin{figure}
1771: \noindent
1772: \epsfig{file=alaska3aC.ps,width=.5\textwidth}
1773: \caption{The number of molecules $N_n(i)$ for the species $i$ is
1774: plotted as a function of generation $n$ of cells, i.e., at each
successive division event $n$. A random network with $k=500$ and
$\rho=.2$ was adopted. The dominant species change successively over
generations.}
1777: \end{figure}
1778:
1779: \begin{figure}
1780: \noindent
1781: (a)\epsfig{file=fig1bcomp.ps,width=.53\textwidth}
1782: (b)\epsfig{file=alaska3b.ps,width=.53\textwidth}
1783: (c)\epsfig{file=figswitch2.ps,width=.53\textwidth}
1784: \caption{The number of molecules $N_n(i)$ for the species $i$ is
1785: plotted as a function of generation $n$ of cells, i.e., at each
successive division event $n$. A random network with
$k=200$ and $\rho=.1$ was adopted, with $N=64000$ and $\mu=0.01$ (a),
and $\mu=0.1$ (b). Only some species (whose populations get large at
some generation) are plotted. In (a), a recursive production state is
established, while in (b), a few quasi-recursive states are
visited successively. (c): Expansion of (b) around the time step 100000.}
1792: \end{figure}
1793:
1794: \begin{figure}
1795: \noindent
1796: (a)\epsfig{file=net1.ps,width=.5\textwidth}
1797: (b)\epsfig{file=net2.ps,width=.4\textwidth}
1798: \caption{The catalytic network of the dominant species that constitute
1799: the recursive state. The catalytic reaction is plotted by an arrow $i
1800: \rightarrow j$, as the replication of the species $j$ with the
1801: catalytic species $i$. The numbers in () denote $c_i$ of the species.
Only the species that continue to exist with a population larger
than 10 are plotted. (Note that many other species can exist at each
generation, through the replication error.) (a): The network corresponding to the
recursive state of Fig.9a, where the three species connected by
thick arrows are the top 3 species in Fig.9a. The network (b) is
1807: another example observed in a different set of simulations with
1808: $k=200$ and $\rho =.1$, but with a different reaction network from
1809: Fig.9.}
1810: \end{figure}
1811:
In the phase (1), there is no clear recursive production, and the
dominant molecule species changes frequently over generations. Even
though each generation has some species that dominate in
molecule number, the dominating species change every few
generations: some chemical species are dominant at one generation,
but only a few generations later the information regarding the
previously dominating species is totally lost, often to the point that
its population drops to zero (see Fig.8). Here no stable mutual catalytic
relationships are formed among molecules. Hence, the time required
for reproduction of a cell is quite large, much larger than in the
phase (2).
1823:
In the phase (2), a recursive state is established, and the chemical
composition is stabilized such that it is not altered much by the
division process (see Fig.9). Generally, all the observed recursive states
consist of 5--12 species, except for those species with only one or two
molecules, which exist only as a result of replication errors.
These 5--12 chemicals catalyze each other, forming a catalytic network
as in Fig.10, which will be discussed later. The members of these 5--12
species do not change over generations, and the chemical compositions
are transferred to the offspring cells. Once reached, this state is
preserved throughout the whole simulation, lasting for more than 10000
generations.
1835:
The recursive state observed here is not necessarily a fixed point
with regard to the population dynamics of the chemical
concentrations. In some cases, the chemical concentrations oscillate
in time, but the nature of the oscillation is not altered by the
process of cell division.
1841:
1842: %In all of these cases, the number of each molecule shows relatively large
1843: %fluctuations, since the total number of molecules $N$
1844: %is not large (typically we choose $N \sim
1845: %(10^2 \sim 10^5)$ in our simulations.).
1846:
For example, in the recursive state depicted in Fig.9a, 11 species
remain in existence throughout the simulation. As shown, three species
have much higher populations than the others and form a hypercycle
$109\rightarrow 11 \rightarrow 13 \rightarrow 109$. (The numbers
11,13,.. are indices of chemical species, initially assigned
arbitrarily.) This hypercycle sustains the replication of the
molecules, and is called the `core hypercycle'. The catalytic activities
of the species satisfy $c_{13}>c_{109}>c_{11}$, and accordingly the
respective populations satisfy $N_{11} > N_{109} > N_{13}$.
1856:
In the phase (3), after one recursive state lasts over many
generations (typically a thousand generations), a fast switching state
appears until a new (quasi-)recursive state appears. As shown in
Fig.9b, for example, each (quasi-)recursive state is similar to that
in the phase (2), but in this case, its lifetime is finite, and it is
replaced by a fast switching state as in the phase (1). Then the
same or a different (quasi-)recursive state is reached again, which
lasts until the next switching occurs. In the example of Fig.9b
(see also Fig.9c for its expansion), around the 12000th generation,
the core network is taken over by parasites and enters a fast
switching state like that of the phase (1), which in turn gives way to
a new quasi-recursive state around the 14000th generation.
1869:
In the example of Fig.9b, there is another type of switching
around the 85000th generation, as shown in Fig.9c with
magnification. Here, the quasi-recursive state is still stable, but
1873: the core hypercycle consisting of dominant species changes. As in
1874: Fig.9c, a switch occurs from an initial core hypercycle
1875: ($109$,$11$,$13$), to the next core hypercycle $(11,13,195,155)$
1876: around the 8500th generation.
1877:
1878: This latter switching is the competition among core networks, while
1879: the former drastic switch is due to the invasion of parasitic
1880: molecules, which is most commonly observed. The mechanism of this
1881: switching is discussed again in \S 5.2.4.
1882:
1883: \subsubsection{Dependence of Phases on the Basic Parameters}
1884:
Although the behavior of the system depends on the choice of the
network, there is a general trend with regard to the phase change,
from (1), to (3), and then to (2) with the increase of $N$, or with
the decrease of $k$, as schematically shown in Fig.11. By examining a
variety of networks, we find a clear dependence of the
fraction of the networks in each phase on the parameters, leading to a
rough sketch of the phase diagram. Generally, the fraction of (2) increases and
the fraction of (1) decreases also with the decrease of $\rho$ or $\mu$.
1893: For example, the fraction of (1) (or (3)) gets
1894: larger as $k$ is decreased from $k\stackrel{<}{\sim} 300$ for
1895: $N=50000$ (with $\rho=.1$ and $\mu=.01$), while dependence on $\rho$
1896: will be discussed below.
1897:
1898:
1899: \begin{figure}
1900: \noindent
1901: \hspace{-.3in}
1902: \epsfig{file=schem.eps,width=.5\textwidth}
1903: \caption{Schematic representation of the phase diagram of the three
1904: phases, plotted as a function of the total number of molecules $N$,
1905: and the total possible number of molecule species $k$.}
1906: \end{figure}
1907:
For a quantitative investigation, it is useful to classify the phases by
the similarity of the chemical compositions between two cell division
events\cite{Lancet}. To check the similarity, we first define a
$k$-dimensional
vector $\stackrel{\rightarrow}{V_n}$=$(p_n(1),..,p_n(k))$ with $p_n(i)
=N_n(i)/N$. Then, we measure the similarity between two generations
separated by $\ell$ division events with the help of the normalized
inner product as
1915:
1916: \begin{equation}
1917: H_{\ell}=\stackrel{\rightarrow}{V_n} \cdot
1918: \stackrel{\rightarrow}{V_{n+\ell}}/(|V_n||V_{n+\ell}|)
1919: \end{equation}
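
As a purely illustrative example, with made-up numbers that are not
taken from the simulations, suppose that only three species have
non-negligible populations, with
$\stackrel{\rightarrow}{V_n}=(0.5,0.3,0.2)$ and
$\stackrel{\rightarrow}{V_{n+\ell}}=(0.4,0.4,0.2)$, all the other
components being zero. Then

\[
H_{\ell}=\frac{0.5\times 0.4+0.3\times 0.4+0.2\times 0.2}
{\sqrt{0.38}\,\sqrt{0.36}}
=\frac{0.36}{\sqrt{0.38}\times 0.6}\approx 0.97 ,
\]

\noindent
whereas $H_{\ell}=0$ if the two generations are dominated by disjoint
sets of species. Since $H_{\ell}$ is the cosine of the angle between
the two composition vectors, it does not depend on the normalization of
$p_n(i)$, and it lies between 0 and 1 because all the components are
non-negative.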
1920: %(see Fig.??).
1921:
1922: In Fig.12, the average similarity $\overline{H_{20}}$ and the average
1923: division time are plotted for 50 randomly chosen reaction networks as
a function of the path probability $\rho$. Empirically, the
networks with $\overline{H_{20}}>.9$ belong roughly to $(2)$, and those with
$\overline{H_{20}}<.4$ to $(1)$. Hence, for $\rho >0.2$,
1927: the phase (1) is observed for nearly all the networks (e.g. $48/50$),
1928: while for lower path rates, the fraction of (2) or (3) increases. The
1929: value $\rho \sim .2$ gives the phase boundary in this case.
1930:
Generally speaking, there is a positive correlation between the growth
speed of a cell and the similarity $H$. In Fig.12, the division time is
also plotted; each point with a high similarity $H$ corresponds to a
shorter division time. A network with higher similarity (i.e.,
in the phase (2)) gives a higher growth speed. Indeed, the recursive
states maintain higher growth speeds since they effectively suppress
parasitic molecules. In Fig.12, as the path rate is decreased, the
variations in the division speeds of the networks become larger, and
some networks that reach recursive states have higher division speeds
than networks with larger $\rho$. On the other hand, when the path
rate is too low, the protocells generally cannot grow since the
probability of having mutually catalytic connections in the network is
nearly zero. Indeed, there seems to exist an optimal path rate (e.g.,
around $.05$ for $k=200$, $N=12800$ as in Fig.12) for having a network
with high growth speeds. Consequently, under competition for growth,
protocells having such optimal networks are expected to evolve, as will be
discussed in \S 5.3.
1948:
Besides the correlation between the growth speed and similarity, there
is also a correlation with the diversity of the molecules.
Protocells with higher growth speed and similarity in the phase (2)
also have higher chemical diversity. In the phase (1), one (or a very
few) molecule species is dominant in the population, while in the
phase (2), with higher growth speed, about 10 species have high
populations and the chemical diversity is maintained.
1956:
1957: \begin{figure}
1958: \noindent
1959: %\hspace{-.3in}
1960: \epsfig{file=fig3.ps,width=.95\textwidth}
1961: %\includegraphics[width=63mm]{fig4.ps}
1962: %\includegraphics[width=65mm]{fig3.ps}
1963: \caption{The average similarity $\overline{H_{20}}$ ($+$), and the
1964: average division time ($\times$) are plotted as a function of the path
1965: rate $\rho$. For each $\rho$, data from 50 randomly chosen networks
1966: are plotted. The average is taken over 600 division events. The
1967: dotted line indicates the average of $\overline{H_{20}}$ over the 50
1968: networks for each $\rho$.
For $\rho>.2$, over 98\% of the networks have $H<.4$, and they show fast switching,
while for $\rho=.08$, about 95\% belong to the phase (2) or (3).
At $\rho=0.02$, 25 out of 50 networks cannot support cell growth, and
4 cannot at $\rho=0.04$. (Adapted from \cite{KK-PRE}).}
1973: \end{figure}
1974:
1975: \subsubsection{Maintenance of Recursive Production}
1976:
1977: How is the recursive production sustained in the phase (2)? We have
1978: discussed already the danger of parasitic molecules that have lower
1979: catalytic activities and are catalyzed by molecules with higher
1980: catalytic activities. As discussed in \S 2.1, such parasitic molecules
1981: can invade the hypercycle. Indeed, under the structural changes and
1982: fluctuations, the recursive production state could be destabilized.
To answer the question of the itinerancy and stability of
recursive states, we have examined several reaction networks. The
logic thus unveiled for the maintenance of the recursive state is
summarized as follows.
1987:
1988: (a) {\bf Stabilization by intermingled hypercycle network}:
1989:
The 5--12 species in the recursive state form a mutually catalytic
network, for example, as in Fig.10. This network has a {\sl core
hypercycle network}, as shown by the thick arrows in Fig.10a. As shown in
Fig.13, such a core hypercycle has a mutually catalytic relationship,
as ``$A$ catalyzes $B$, $B$ catalyzes $C$, and $C$ catalyzes
$A$''. However, its members are connected with other hypercycle networks such
as $G\rightarrow D \rightarrow B \rightarrow G$, and $D\rightarrow C
\rightarrow E \rightarrow D$, and so forth. The hypercycles are
intermingled to form a network. Coexistence of the core hypercycle and
other attached hypercycles is common to the recursive states we have
found in our model.
2001:
This intermingled hypercycle network (IHN) leads to stability against
parasites and fluctuations. Assume that there appears a molecule
species parasitic on one member of the IHN (say $X$ as a parasite to
$C$ in Fig.13). The species
$X$ may decrease the number of the species $C$. If there were only a
single hypercycle $A\rightarrow B \rightarrow C \rightarrow A$, the
populations of all the members $A,B,C$ would be easily decreased by
this invasion of parasitic molecules, resulting in the collapse of the
hypercycle. In the present case, however, other parts of the network
(say, that consisting of $A,B$,$G,D$ in Fig.13) compensate for the
decrease of the population of $C$ by the parasite, so that the
populations of $A$ and $B$ are not decreased so much. Then, through
the catalysis by the species $B$, the replication of the molecule $C$
progresses, so that the population of $C$ recovers. Hence the
complexity in the hypercycle network leads to stability against the
attack of parasitic molecules.
2017:
Next, the IHN is also relevant to the stability against fluctuations. It
is known that the population dynamics of a simple hypercycle often
leads to a heteroclinic cycle\cite{Sigmund}, where the population of one
(or a few) member approaches 0, and then recovers. In a
continuum model, such a heteroclinic cycle can continue forever, but in
a stochastic model, due to fluctuations, the corresponding molecule
species sometimes goes totally extinct. Once this molecule species goes
extinct completely, its recovery by replication error would require a
very long time. Hence, to achieve stability against fluctuations, a
state with the heteroclinic cycle dynamics or any oscillation in which
some of the populations go very low should be avoided. Indeed, by forming an IHN,
2029: such oscillatory instability is often avoided or reduced. Due to
2030: coexistence of several hypercycle processes, instability in each
2031: hypercycle cancels out, leading to fixed-point dynamics or oscillation
2032: with a smaller amplitude. Thus the danger that the population of some
2033: molecules in the hypercycle goes to zero by fluctuations
2034: %due to finiteness of molecule numbers
2035: is reduced.
2036:
Stability of coexistence of many species is discussed as `homeochaos'
\cite{homeochaos}, while stable reproduction in a reaction network is
also seen in \cite{Ikegami}.
2040:
(b) {\bf Minority in the core hypercycle}:

Now we study more closely the population dynamics in a core hypercycle.
Here, the numbers of molecules $N_j$ of the molecule species $j$ are in the
inverse order of their catalytic activities $c_j$, i.e., $N_A>N_B>N_C$
for $c_A<c_B<c_C$. Because a molecule with higher catalytic activity
2047: helps the synthesis of others more, this inverse relationship is
2048: expected. Indeed, the population sizes of just three species $A,B,C$,
2049: with the catalytic relationship $A\rightarrow B \rightarrow C
2050: \rightarrow A$ are estimated by taking the continuum limit $N
2051: \rightarrow \infty$ and obtaining a fixed point solution of the rate
2052: equation for the concentrations of the chemicals as discussed in
2053: \cite{Eigen}. From a straightforward calculation we have:
2054: $N_A:N_B:N_C= c_A^{-1}:c_B^{-1}:c_C^{-1}$.
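
The calculation can be sketched as follows, assuming the standard
hypercycle rate equation with a dilution flux and ignoring the
replication errors and all the species outside the core. The
concentrations $x_A,x_B,x_C$ (with $x_A+x_B+x_C=1$) and the flux $\phi$
are symbols introduced here for illustration. With $A$ catalyzing $B$,
$B$ catalyzing $C$, and $C$ catalyzing $A$,

\begin{eqnarray*}
\frac{dx_A}{dt} &=& x_A\,(c_C x_C-\phi), \\
\frac{dx_B}{dt} &=& x_B\,(c_A x_A-\phi), \\
\frac{dx_C}{dt} &=& x_C\,(c_B x_B-\phi), \qquad
\phi = c_C x_C x_A + c_A x_A x_B + c_B x_B x_C .
\end{eqnarray*}

\noindent
At a fixed point with all three concentrations positive, each bracket
must vanish, so that $c_A x_A = c_B x_B = c_C x_C = \phi$, which gives
$x_A:x_B:x_C=c_A^{-1}:c_B^{-1}:c_C^{-1}$. For the core hypercycle
$109\rightarrow 11 \rightarrow 13 \rightarrow 109$ with
$c_{13}>c_{109}>c_{11}$, this reproduces the observed ordering
$N_{11} > N_{109} > N_{13}$.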
2055:
Here, the $C$ molecule is catalyzed by a molecule species with lower
activity but a larger population ($A$). Hence, a parasitic
molecule species cannot easily invade to disrupt this mutually
catalytic network. Since the minority molecule ($C$) is catalyzed by
the majority molecule ($A$) (with the aid of another molecule ($B$)),
a large fluctuation in molecule numbers is required to destroy this
network.
2063:
The stability of the minority molecule is further enhanced by the
complexity of the IHN. If the catalytic activity of $C$ is highest, the
recursive state here is mainly achieved by catalysis by the molecule
$C$. On the other hand, this also implies that $C$ is the minority in
the core network. (The population of the molecule $C$ is usually
larger than those of $D$,$E$, etc. in Fig.13, though.) Hence an attack
on the $C$ molecule is the most relevant to the destruction of this
recursive state. In the
IHN, this minority molecule species is involved in several hypercycles,
as $C$ is in Fig.13. This, on the one hand, demonstrates the
prediction in \S 4.5, that more species are catalyzed by the minority
molecules, while on the other hand, it leads to the suppression of the
fluctuation in the number of minority molecules, as will be discussed
in \S 5.4. With the decrease of the fluctuation, the probability that
the minority molecules go extinct is reduced, so that the recursive
state is hardly destroyed.
2079:
2080: (c) {\bf Localization in a Random Network}
2081:
The present system belongs to a class of systems with reaction and
diffusion, where the structural change by replication error plays the
role of diffusion within the network space. With random connections in the
catalytic network, the present system is nothing but a
reaction-diffusion system on a random network. Generally, such a problem is
related to Anderson localization, where concentrations are
localized within some part of the network, depending on the degree of
connectivity in the network and the strength of the diffusion
coupling. From this viewpoint, the formation of the IHN, localized
within only a limited number of species in the global network, may be
understood as an
example of such localization. It will be interesting to study the
stability of the recursive production in terms of the localization
transition in the reaction network\cite{Takagi}.
2095:
2096: \begin{figure}
2097: \noindent
2098: \epsfig{file=alaska4.ps,width=.5\textwidth}
2099: \caption{An example of mutually catalytic network in our model. The
2100: core network for the recursive state is shown by circles, while
2101: parasitic molecules ($X$,$Y$,..) connected by broken arrows, are
2102: suppressed at a (quasi-)recursive state.}
2103: \end{figure}
2104:
2105: \subsubsection{Switching}
2106:
Next, we discuss the mechanism of switching. In the phase (3), the
recursive production state is destabilized when the population of
parasitic molecules increases. For example, the number of the molecule
$C$ may be decreased due to fluctuations, while the number of some
parasitic molecules ($X$) that are not originally in the catalytic
network but are catalyzed by $C$ may increase. The frequency of such
fluctuations increases as the total population of molecules in a cell
becomes smaller. If such a fluctuation appears, the other molecule species
in the original network successively lose the main source of the
molecules that catalyze their synthesis. Then the new parasitic
molecule $X$ occupies a large portion of the population. However, since
its main catalyst ($C$) soon disappears, the synthesis of $X$
stops, and this species $X$ is taken over by some molecules $Y$
that are catalyzed by $X$ (see the broken arrows in Fig.13). Then,
within a few generations, the dominant species change, and recursive
production does not continue. Indeed, this is what occurs in the phase
(1). This take-over by parasites continues successively, until a new
(or the same) recursive state with a hypercycle network is formed. Hence
the fluctuation in the minority molecule in the core network is
relevant to the switching process.
2128:
2129: \subsection{Evolution}
2130:
2131: {\bf Model A}
2132:
The next question we have to address is whether the recursive
production state is achieved through evolution. To examine this,
we have extended our model to further include a ``mutational'' change
of the network at each division event (model A).
To be specific, at each division
event we add or delete randomly (with equal probability) a few
reaction paths, whose connection $i \rightarrow j$ is again chosen
randomly. Here, to see the evolution of catalytic activity, the index
2141: of the species is ordered with the value of catalytic activity, i.e.,
2142: the index $j$ is ordered so that $c_j$ monotonically increases with
2143: $j$. Since the mutational change is assumed to be random, a new path
2144: is added or deleted independent of the catalytic activity. In the
2145: simulation displayed here, there are 5 mutations of the network path
2146: at every generation. We have carried out numerical experiments of this
2147: model, to see if the path rate of the network stays around the state
2148: supporting the recursive production.
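
As a reference point for reading the evolution of the path number, it
may help to estimate what the mutation rule alone would do in the
absence of selection. The following is only a sketch: it assumes that
each of the $m=5$ mutations per division independently adds or deletes
one path with probability $1/2$, and it ignores boundary effects such as
an attempt to add a path that already exists. The symbols $m$ and $L_g$
(the number of paths along a lineage after $g$ generations) are
introduced here for illustration. The path number then performs an
unbiased random walk,

\[
\langle L_g-L_0 \rangle = 0, \qquad
\langle (L_g-L_0)^2 \rangle = m\,g ,
\]

\noindent
so that the typical drift is $\sqrt{mg}$, e.g., about $2.2\times 10^3$
paths after $g=10^6$ generations. Under these assumptions, any
systematic trend of the path number beyond this scale would reflect the
selection among protocells rather than the mutation rule itself.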
2149:
2150: \begin{figure}
2151: \noindent
2152: (a)\epsfig{file=figev0c.ps,width=.55\textwidth}
2153: (b)\epsfig{file=figev1c.ps,width=.55\textwidth}
2154: (c)\epsfig{file=figev2c.ps,width=.55\textwidth}
2155: \caption{Evolution of path-rates, recursiveness, and division time,
2156: plotted versus generation. The total number of species $k$ is 500,
2157: where $c_i$ is chosen as $100^{-(k-i)/k}$, so that it ranges from
0.01 to 1.0 equally in logarithmic scale. The number of molecules $N$ in
a cell is set at 50,000, so that the cell divides when the total
molecule number is 100,000. The initial path rate is set at
$\rho=0.1$, i.e., 125,000 paths totally. At every division 5 paths
are ``mutated'', i.e., with equal probability 5 paths are added or
eliminated randomly. In total there are $M_{tot}=100$ cells, so that one of
the 100 cells is eliminated when one cell divides into two. (a) The
total path number. The path rate is obtained by dividing the number by
$k^2$. (b) The division time, i.e., the number of steps required for a cell
to divide. (c) The similarity $H^1(i)$, defined in \S 5.2.}
2168: \end{figure}
2169:
2170: %Chemical diversity is computed as $\sum_j p(j) log p(j) $ $p(i)=N(i)/N$ with }
2171:
2172: \begin{figure}
2173: \noindent
2174: %\epsfig{file=figev.ps,width=.8\textwidth}
2175: \epsfig{file=figevC.ps,width=.8\textwidth}
\caption{Evolution of a cell: Those species $i$ with $N(i)>100$ are
plotted with the vertical axis as the species index $i$, and the
horizontal axis as the generation. The data are from the result of
the simulation for Fig.14.}
2180: \end{figure}
2181:
2182: \begin{figure}
2183: \noindent
2184: \epsfig{file=evnet.ps,width=.5\textwidth}
2185: \caption{The catalytic network of the species
2186: that constitute the recursive state around $10^6$ th generation of Fig.14 or 15.
2187: }
2188: \end{figure}
2189:
2190: An example of the time series of path rates at each generation is
2191: shown in Fig.14, as well as the time series of the division time, and
2192: chemical diversity. Corresponding to this time series, the change of
2193: dominant species is plotted over generations in Fig.15.
2194:
2195: As shown, the recursive state is achieved, and is maintained over many
2196: generations, until it switches to other states. At each reproduction,
2197: there are changes in the reaction paths here. In spite of such
2198: mutations, the recursive production state is sustained over many
2199: generations. In each recursive production state, the path rate
remains rather low. Here, a network that supports the recursive
production is selected and maintained. Note that many molecules are
2202: catalyzed by the minority species in the core hypercycle network. In
2203: this sense, a prototype of the evolution to package the information
2204: into the minority molecule that is suggested in \S 4.5 is observed
2205: here.
2206:
An example of the network of dominant species is given in Fig.16.
Here intermingled hypercycle networks (IHN) are formed so that
recursive production is achieved. Again, there is a core hypercycle,
and other hypercycles are connected with it. The surviving molecule
species have a large connectivity in reaction paths, much larger than
expected from a random network with the reaction path rate here. As in
Fig.16, the IHN here forms a highly connected network, even though the
average path rate remains small (as shown in Fig.14, the path per
species is about 0.1 or lower). The paths forming the IHN are
preserved over many generations, while a few paths are sometimes
eliminated. Here, the coexistence of several parallel paths among species
is important for the robustness of the recursive state against
mutations that may delete one of the paths. As in the dynamics of the
phase (3), the recursive production state is finally destabilized by
the mutation of reaction paths, while after some generations, other
recursive networks are formed through the mutation of the network.
2223:
2224: To sum up, the phase (3) gives a basis for evolvability, since a
2225: novel, (quasi-)recursive state with different chemical compositions is
2226: visited successively.
2227:
2228: {\bf Model B}
2229:
So far, we have assumed that the structural change in the replication
can lead equally to any other molecule species. Of course, this is a
simplification, and in reality the replication error leads only to
limited types of molecule species that are similar to the original. To
examine this point, we have studied another model (model B) with some
modifications from the original model of \S 5.1.
2236:
2237: Here, the catalytic activity is set as $c_i=i/k$, i.e., the activity
2238: is monotonically increasing with the species index. Then, instead of
global change to any molecule species by replication error, we modify
the rule so that the change occurs only within a given range $i_0 (\ll
k)$; i.e., when the molecule species $j$ is synthesized, with the error
2242: rate $\mu$, the molecule $j+j'$ with $j'$ a random number over
2243: $[-i_0,i_0]$ is synthesized.
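
With $c_i=i/k$, the effect of this local rule on the catalytic activity
can be made explicit. The following is a simple consequence of the two
definitions above, and no further assumption on the distribution of
$j'$ is needed. A single replication error changes the activity by

\[
|c_{j+j'}-c_j| = \frac{|j'|}{k} \le \frac{i_0}{k} \ll 1 ,
\]

\noindent
so that a lineage starting from species with index below $i_{ini}$ can
reach a species with index $i$ only after at least
$(i-i_{ini})/i_0$ successive errors. A large increase of the catalytic
activity therefore has to proceed through many small steps.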
2244:
In this {\bf model B}, we have not included any change of the network.
The network is fixed in the beginning, and is not changed through the
simulation. Instead, through the local structural changes by
replication error, the range
of species evolves over generations. Here we take only species with
$i<i_{ini}$ in the initial condition, and examine whether the evolution to
a network with higher catalytic activities (i.e., with much larger
$i$) progresses or not. In other words, we examine whether the indices $i$
in the network increase successively or not. An example is shown in
Fig.17, where the catalytic activity increases through successive
switching from one (quasi-)recursive state (consisting of species
within a width of the order of $2i_0$) to another.
2256:
Here the switching occurs as in phase (3). Under the selection
pressure on the protocells, cells with a new (quasi-)recursive state
are selected that consist of molecules with higher catalytic
activities (i.e., with larger species indices). Again each
recursive state consists of an IHN, and the species with the highest
catalytic activity in the core hypercycle is a minority in population.
Once the population of such a species is decreased by fluctuations,
there occurs a switch to a new state that has higher catalytic
activities, and the species indices successively increase. Hence,
evolution from a rather primitive cell consisting of molecules with low catalytic
activities to one with higher activities is possible, by taking
advantage of minority molecules.
2269:
Note that this switching cannot occur if the total number of molecules
$N$ is small. When the number is too small, a mutation of paths that
destroys the recursive state hardly occurs. On the other hand, if the
total number of molecules is too large, it is harder to establish a
recursive state, due to a larger possibility of changing the network.
Hence, there is an optimal value of the number of molecules in a
protocell to realize both the recursive production and the
evolution.
2278:
2279: \begin{figure}
2280: \noindent
2281: %\epsfig{file=figevt.ps,width=.8\textwidth}
2282: \epsfig{file=figevtCc.ps,width=.8\textwidth}
\caption{Evolution of species in a cell. Those species $i$ with $N(i)>100$ are plotted, with the vertical axis
as the species index $i$ and the horizontal axis as the generation.
The total number of species $k$ is 5000, where $c_i$ is chosen as
$c_i=i/k$, so that it ranges from 0.0002 to 1.0, equally spaced.
The number of molecules in a cell is set at 8,000, so that the cell divides
when the total molecule number reaches 16,000. The path rate is set at $\rho= 0.1$.
The replication error for a species $i$ occurs within the range of species
$[i-100,i+100]$, instead of global selection from all species.
In total there are $M_{tot}=10$ cells,
so that one of the 10 cells is eliminated when a cell divides into two.}
2293: \end{figure}
2294:
2295: \subsection{Statistical Law}
2296:
2297: To close the present section, we investigate the fluctuations of the
2298: molecule numbers of each of the species, by coming back to the
2299: original model studied in \S 5.2, without evolution of reaction paths.
2300: The characteristics of the fluctuations of the number of each molecule
2301: species over the generations can have a significant impact on the
2302: recursive production of a cell, since the number of each molecule
2303: species is not very large. In order to quantitatively characterize
2304: the sizes of these fluctuations, we have measured the distribution
2305: $P(N_i)$ for each molecule species $i$, by sampling over division
2306: events.
2307:
2308: Our numerical results are summarized as follows:
2309:
2310: (I) For the fast switching states, the distribution $P(N_i)$ satisfies
2311: the power law
2312:
2313: \begin{equation}
2314: P(N_i) \approx N_i^{-\alpha},
2315: \end{equation}
2316:
2317: \noindent
with $1< \alpha \approx 2$, as shown in Fig. 18a. The exponent $\alpha$ depends
on the parameters, and approaches 2 as the alternation of dominant species becomes more frequent.
For example, as shown in Fig. 18b, the exponent $\alpha$ increases from 1 to 2 with the
increase of the error rate $\mu$.
2322:
2323:
(II) For recursive states, the fluctuations in the core network
(i.e., 13, 11, 109 in Fig.9a or 10a) are typically small (and are roughly
fit by a Gaussian distribution). On the other hand, for species that are
peripheral to but catalyzed by the core hypercycle, the number distribution is
closer to a log-normal distribution

\begin{equation}
P(N_i) \approx \exp\left(-\frac{(\log N_i-\overline{\log N_i})^2}{2\sigma}\right),
\end{equation}

\noindent
as shown in Fig.19.
2332:
Even though the distribution does not agree well with the log-normal distribution,
the distribution is at least roughly symmetric after taking the logarithm
(i.e., as the 0-th approximation the distribution is not normal but log-normal).
2336: The origin of the log-normal distributions here can be understood
2337: by the following rough argument: for a replicating system, the
2338: growth of the molecule number $N_m$ of the species $m$ is given by
2339:
2340: \begin{equation}
2341: dN_m/dt=AN_m,
2342: \end{equation}
2343:
2344: \noindent
2345: where $A$ is the average effect of all the molecules that catalyze $m$.
2346: We can then obtain the estimate
2347:
2348: \begin{equation}
2349: d\log N_m/dt =\overline{a} +\eta(t),
2350: \end{equation}
2351:
2352: \noindent
by replacing $A$ with its temporal average $\overline{a}$ plus
fluctuations $\eta(t)$ around it. If $\eta(t)$ is approximated by a
Gaussian noise, the log-normal distribution for $P(N_m)$ is suggested.
This argument is valid if $\overline{a}>0$. As such, the solution of this equation
diverges with time, but here, the cell divides into two before the
divergence becomes significant. Although the asymptotic distribution
as $N \rightarrow \infty$ is then not available, the argument on the
form of the distribution is valid as long as $N$ is sufficiently large.
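
The step from the Langevin equation for $\log N_m$ to the log-normal form
can be sketched explicitly. We assume, only for illustration, that
$\eta(t)$ is a Gaussian white noise with
$\langle \eta(t)\eta(t')\rangle =2D\delta(t-t')$, where the noise strength
$D$ is a symbol introduced here and not a parameter of the model.
Integrating the equation from $0$ to $t$ gives
\[
\log N_m(t)=\log N_m(0)+\overline{a}t+\int_0^t \eta(s)ds,
\]
so that $\log N_m(t)$ is Gaussian with mean $\log N_m(0)+\overline{a}t$
and variance $2Dt$. Changing the variable from $\log N_m$ to $N_m$, with
the Jacobian $1/N_m$, one obtains
\[
P(N_m,t)=\frac{1}{N_m\sqrt{4\pi Dt}}
\exp\left(-\frac{(\log N_m-\log N_m(0)-\overline{a}t)^2}{4Dt}\right),
\]
which is a log-normal distribution whose width in $\log N_m$ grows with
the time elapsed since the last division.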
2361:
For the fast switching state, the growth of each molecule species is
close to zero on the average. In this case $N_m$ in the Langevin equation (12) can
approach 0, and we need to treat the equation by seriously
taking into account the absorbing boundary condition at $N_m=0$.
Taking into account the normalization of the probability,
the stationary solution of the Fokker-Planck equation corresponding to eq.(12)
for $\overline{a} \leq 0$ is given by
2369: \begin{equation}
2370: P(N) \propto N^{-(1+\nu)},
2371: \end{equation}
2372: with
2373: \begin{equation}
2374: \nu =|\overline{a}|/(\overline{a^2}-\overline{a}^2).
2375: \end{equation}
(see e.g., \cite{Sornette,Mikhailov-book}). The change of the exponent $\alpha$ with the
error rate in Fig.18b can be understood as a change in the ratio of the variance
to the mean of $a$.
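
The form of this power law can be made plausible by the following sketch,
which rests on assumptions not stated above. We write $y=\log N$, take
$\eta(t)$ as a Gaussian white noise with
$\langle \eta(t)\eta(t')\rangle =2D\delta(t-t')$ (the noise strength $D$
is a symbol introduced here, playing the role of the variance of $a$), and
replace the boundary at small $N$ by a zero-flux condition at a lower
cutoff. The Fokker-Planck equation for the density $p(y,t)$ is then
\[
\frac{\partial p}{\partial t}=-\overline{a}\frac{\partial p}{\partial y}
+D\frac{\partial^2 p}{\partial y^2},
\]
and a stationary solution with zero flux satisfies
$\overline{a}p=D\,dp/dy$, i.e., $p(y)\propto \exp(-|\overline{a}|y/D)$ for
$\overline{a}<0$, which is normalizable above the cutoff. Transforming back
with $P(N)=p(\log N)/N$ gives
\[
P(N)\propto N^{-(1+|\overline{a}|/D)},
\]
which has the form of the power law quoted above, with the exponent
controlled by the ratio of the mean drift to the noise strength.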
2379:
2380: \begin{figure}
2381: \noindent
2382: %\epsfig{file=figh.ps,width=.9\textwidth}
2383: (a)\epsfig{file=fighC.ps,width=.9\textwidth}
2384: (b)\epsfig{file=fighCm.ps,width=.9\textwidth}
\caption{The number distribution of the molecules corresponding to the
network in Fig.7 (fast switching states). (a) The distribution is sampled
from 100000 division events, and is plotted for 4 molecule species among 500.
Log-log plot. (b) Change of the distribution with the change of the error rate $\mu$,
for a specific molecule species.}
2390: \end{figure}
2391:
2392: \begin{figure}
2393: \noindent
2394: \epsfig{file=fig4.ps,width=.9\textwidth}
2395: \caption{The number distribution of the molecules corresponding to the
2396: network in Fig.9a or 10a. The distribution is sampled from 1000 division
2397: events. From right to left, the plotted species are
2398: 11,109,13,155,176,181,195,196,23. Log-Log plot.}
2399: \end{figure}
2400:
If several molecules mutually catalyze each other, however, one would
expect that the fluctuations will not increase as in the Brownian
motion of eq. (12). For example, suppose that the number of molecules of one
species in the core cycle increases due to a fluctuation. This
relatively decreases the number of molecules of the other species in
the core network, i.e., of the catalysts for the increased species,
so that the catalytic reaction replicating the increased species is
suppressed. Hence the
fluctuations in the core hypercycle are reduced.
2410:
Another reason for the reduction of the fluctuation of the species in the
core cycle is the high connectivity in the IHN. The chemicals of the core
part have catalytic paths with a large number of molecule species.
Hence many processes work in parallel for the synthesis of the core
species. Then, fluctuations due to other chemical concentrations are
added in parallel. Thus, the distribution can come close to a Gaussian
distribution (recall the central limit theorem).
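
The size of this effect can be illustrated by a simple estimate, under the
idealization that the $n$ parallel processes contribute independent
fluctuation terms $\xi_1,\ldots,\xi_n$ with zero mean and a common variance
$s^2$ (these symbols are introduced here only for illustration). If the net
synthesis is governed by their average
$\overline{\xi}=(1/n)\sum_{l=1}^{n}\xi_l$, then
\[
\langle \overline{\xi}^2\rangle =\frac{1}{n^2}\sum_{l=1}^{n}
\langle \xi_l^2\rangle =\frac{s^2}{n},
\]
so that the typical size of the fluctuation decreases as $1/\sqrt{n}$ with
the number of parallel paths. Correlations among the paths, which are
neglected in this estimate, would weaken the reduction.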
2418:
Note also that for some networks, the distributions of the molecule
numbers in the recursive states may sometimes be intermediate between
log-normal and Gaussian, and occasionally even have double peaks.
2422:
2423: By studying a variety of networks, the observed distributions of the
2424: molecule numbers can be summarized as:
2425:
2426: \begin{itemize}
2427:
\item
(1) Distribution close to Gaussian form, with relatively small
variances, in the core (hypercycle) of the network.

\item
(2) Distribution close to log-normal, with larger fluctuations,
for a peripheral part of the network.

\item
(3) Power-law distributions
for parasitic molecules that appear intermittently.
2439:
2440: \end{itemize}
2441:
To study quantitatively the magnitude of the variance in the IHN for the
recursive production, we have also plotted the variance
$\overline{(N_i-\overline{N_i})^2}$ ($\overline{..}$ is the average over
the distribution $P(N_i)$). As can be seen in Fig.20, the variance in
the core network is small, especially for the minority species (i.e.,
13). For molecule species that do not belong to the core hypercycle,
2448: the variance scaled by the average increases as the average decreases.
2449: Suppression of the relative fluctuation in the core hypercycle comes
2450: from the direct feedback of the population change of the molecule
2451: species in the core, as well as multiple parallel reaction paths, as
2452: already mentioned.
2453:
2454: \begin{figure}
2455: \noindent
2456: %\hspace{-.3in}
2457: \epsfig{file=fig50.ps,width=.5\textwidth}
2458: %\includegraphics[width=57mm]{fig5.ps}
\caption{Scaled variance, i.e., the variance of the molecule number
divided by its average, plotted against the average. From the
largest to the smallest average, the species 11 (the largest $\overline{N_i}$),
109 (the second largest), 13, 155, 194, 176, 195, 181, 196, 23, 34 (the smallest
$\overline{N_i}$) are plotted. Computed from the data in Fig.19. The
asterisk denotes the species 13, which has the largest catalytic activity here
and is the minority in the hypercycle core. Adapted from \cite{KK-PRE}.}
2466: \end{figure}
2467:
2468: {\bf Remark: Universal Statistics}
2469:
Quite recently Furusawa and the author\cite{Zipf,log}
have studied several models of a minimal cell consisting
of catalytic reaction networks, without assuming the replication
process itself. In other words, the molecules are successively
synthesized from nutrient chemicals transported through the membrane,
where the level-(1) model of \S 3.2 is adopted. They have found
universal statistical laws of chemicals for a cell that grows
recursively.
2478:
(i) The number of molecules of each chemical species over all cells
generally obeys a log-normal distribution. This distribution is
universally observed for a state with recursive production. The existence
of such log-normal distributions is also experimentally
verified\cite{Zipf}. The ubiquity of the log-normal distribution in
the level-(2) model described in this section is thus supported in the
level-(1) model.
2486:
(ii) A power law in the average abundances of chemicals. This is a
statistical law over a huge number of molecule species. When the
abundances of all chemical species are ordered according to their
magnitude, the abundance of a chemical is inversely proportional to
its rank. Such a law was originally found in
linguistics by Zipf\cite{Zipf-book}. This Zipf's law for chemical abundances
\cite{Zipf} is found to be universal when a cell optimizes
the efficiency and faithfulness of self-reproduction. It is a
universal statistical law that holds when the cell model shows recursive growth
under fluctuations in the molecule numbers. Furthermore, according to data
from gene expression databases on various organisms and tissues,
the abundances of expressed genes exhibit this
law. Thus, the universal statistics are also supported
experimentally. It is shown that this power law of gene expression is
maintained by a hierarchical organization of catalytic reactions.
Major chemical species are synthesized with catalysis by
slightly less abundant chemicals. The latter chemicals are synthesized
with catalysis by chemicals of still lower abundance, and this hierarchy of
catalytic reactions continues until it reaches the minor chemical
species.
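
The rank-ordered statement in (ii) can be related to a distribution over
species by a short standard calculation. The notation is introduced here
only for illustration: $n_j$ denotes the average abundance of the species
of rank $j$, and $C$ is a constant. If $n_j=C/j$, the number of species
with abundance larger than $n$ is $j(n)=C/n$, so that the density $h(n)$ of
species per unit abundance is
\[
h(n)=-\frac{dj}{dn}=\frac{C}{n^2}.
\]
Thus the rank-abundance relation with exponent $-1$ corresponds to a
distribution of abundances over species that decays with exponent $-2$.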
2507:
2508: {\bf Remark: Search for the deviation from universal statistics}
2509:
So far we have observed the ubiquity of log-normal distributions in
several models. The fluctuations in such distributions are generally
very large. This is in contrast to our naive impression that a
process in a cell system must be well controlled.
2514:
Then, is there some relevance of such large fluctuations to biology? Quite
recently, we have extended the idea of the fluctuation-dissipation theorem
in statistical physics to evolution, and proposed a linear relationship
(or high correlation) between (genetic) evolution speed and
(phenotypic) fluctuations. This proposition turns out to be supported
by experimental data on the evolution of {\sl E. coli} to enhance the
fluorescence in its proteins\cite{Sato}. Hence the fluctuations are
quite important biologically.
2523:
The log-normal distribution is also rather universal in present-day
cells, as demonstrated in the distribution of some proteins, measured
by the degree of fluorescence\cite{log}. Now, is this universality the final
statement for ``cell statistical mechanics''? We have to be cautious
here, since too universal laws may not be so relevant to biological
function. In fact, chemicals that obey the log-normal distribution
may have fluctuations too large to control some function. Some other
mechanism to suppress the fluctuation may work in a cell.
2532:
Indeed, the minority control suggests the
possibility of such control to suppress the fluctuation, as discussed
in \S 4.5. For a recursive production system, some mechanism to
decrease the fluctuation in the minority molecules may have evolved.

There are at least two possible mechanisms that decrease the fluctuation,
leading to a deviation from the log-normal distribution.
2540:
2541: The first one is some negative feedback process. In general, the
2542: negative feedback can suppress the response as well
2543: as the fluctuation. Still, it is not a trivial question how chemical
2544: reaction can give rise to suppression of fluctuation, since to realize
2545: the negative feedback in chemical reaction, production of some
2546: molecules is necessary, which may further add fluctuations.
2547:
The second possible mechanism is the use of multiple parallel reaction
paths. If several processes work sequentially, the fluctuations would
generally be increased. When reaction processes work in parallel for
some species, the population change of such a molecule is influenced by
several fluctuation terms added in parallel. If the synthesis (or
decomposition) of some chemical species is a result of the average of
these processes working in parallel, the fluctuation around this
average can be decreased by the law of large numbers. Indeed, the
minority species in the core network, which has more reaction paths, has
relatively lower fluctuation, as in Fig.20. Suppression of fluctuation
by multiple parallel paths may be a strategy adopted in a cell. Note
that this is also consistent with the scenario that more and more
molecules become related with the minority species, as discussed in \S
4.5. With the increase of the paths connected with the minority
molecules, the fluctuation of the minority molecules is reduced, which
further reinforces the minority control mechanism. Hence the increase
of the reaction paths connected with the minority molecule species
through evolution, the decrease of the fluctuation in the population of
minority molecules, and the enhancement of minority control reinforce
each other. In this regard, a search for molecules that deviate from
the log-normal distribution should be important in future.
2569:
In physics, we are often interested in quantities that deviate
from the Gaussian (normal) distribution, since the deviation is exceptional. Indeed, in physics, the search for
power-law or log-normal distributions has been popular
over a few decades. On the other hand, a biological unit can grow and
reproduce, to increase its number. For such a system, the components
within have to be synthesized, so that an amplification process is
common. Then, the fluctuation is also amplified. In such a system,
power-law or log-normal distributions are quite common, as already
discussed here, and as is also shown in several models and experiments
\cite{Zipf,log}. In this case, the Gaussian (normal)
distribution is not so common (normal). Then exceptional molecules
that obey the normal distribution with regard to their concentration
may be more important.
2583:
2584: Also, the ubiquity of log-normal distribution we found is true for a
2585: state with recursive production. If a cell is not in a stationary
2586: growth state but in a transient process switching from one steady
2587: state to another, the universal statistics can be violated. Search
2588: for such violation will be important both experimentally and
2589: theoretically.
2590:
2591: \section{Summary}
2592:
2593: We have studied a problem of recursive production and evolution of a
2594: cell, by adopting a simple protocell system. This protocell consists of
2595: catalytic reaction network with replicating molecules. The basic
2596: concepts we have proposed through several simulations are as follows:
2597:
(1) {\bf Minority control}
2599:
In a cell system with mutually catalytic molecules, replicating
molecules with a smaller population size are shown to control the
behavior of the total cell system. This minority controlled state is
achieved by preserving rare fluctuations with regard to the molecule
number. The molecule species that is a minority in number works as a
carrier of heredity, in the sense that it is preserved well with
suppressed number fluctuations and that it controls the behavior of a
cell relatively strongly. Since molecules that are replicated by
this minority species are also preserved, more molecules will be synthesized
with its help. In addition,
reaction paths that stabilize the replication of this
minority molecule are expected to evolve. Hence, the replication of more and
more molecule species is packaged into the synthesis of this minority
molecule, which also ensures the transmission of the minority molecule.
The minority molecule species thus gives a basis for ``genetic
information''. Hence the evolution from a loose reproduction system to a
faithful replication system with genes is understood from the kinetic viewpoint
of chemical reactions.
2618:
2619: (2) {\bf Recursiveness of production in an intermingled hypercycle network}
2620:
Next, a protocell model consisting of a variety of mutually catalyzing
molecule species is investigated. When the number of molecules in a
cell is not too small and the number of possible species in a cell is not too
large, recursive production of a cell is achieved. This
recursive production state consists of 5-12 dominant molecule species,
which form an intermingled hypercycle network (IHN). Within this IHN, there
is a core hypercycle, while multiple parallel reaction paths in the
IHN are important to ensure the stability of the state against invasion
of parasitic molecules and against fluctuations in the molecule number.
2630:
2631: (3) {\bf Itinerant dynamics over recursive production states}
2632:
When the fluctuation in the molecule number is not small enough, there
appear switches among (quasi-)recursive production states. A given
quasi-recursive state is destabilized by being taken over by some parasitic
molecules. Then, the dominant molecule species change frequently over
generations, and the growth speed of a cell is suppressed. After this
transient, the fast change of chemical compositions slows down so that
quasi-recursive production of a cell is sustained again. Each switching
occurs with a loss of chemical diversity. Note that in
2641: high-dimensional dynamical systems, such switching over quasi-stable
2642: states through unstable transient dynamics is studied as chaotic
2643: itinerancy\cite{CI1,CI2,CI3}, where the loss of degrees of freedom is also
2644: observed in the process of switching.
2645:
Destabilization of a recursive state in the present model occurs through
the decrease of the population of the minority molecules in the core
hypercycle. As this molecule species is taken over by parasitic
molecules, the switching starts to occur. In this sense, the process in
2650: the switching is not random, but is restricted to specific routes within
2651: the phase space of chemical composition, as in the chaotic itinerancy.
2652: It is interesting to study the present switching over recursive state as
2653: a stochastic version of chaotic itinerancy.
2654:
(4) {\bf Evolution through itinerant dynamics}
2656:
When changes in the available reaction paths are included in the model, this
hypercycle network evolves to recursive production states. Following the
itinerant dynamics above, each recursive state is eventually destabilized,
but another recursive state evolves later. Through these successive
visits of recursive states, a cell can evolve to have a chemical
network supporting a higher growth speed. Since the minority species in
the hypercycle network is relevant to this switch, minority molecules
are shown to be important to evolution.
2665:
(5) {\bf Universal statistics and control of fluctuations}

The statistics of the number fluctuations of each molecule species are
studied. We have found (i) a power-law distribution for fast switching
molecules, (ii) suppression of fluctuation in the core hypercycle species,
and (iii) ubiquity of the log-normal distribution for most other molecule
species. The origin of the log-normal distribution is generally a
multiplicative stochastic process in the catalytic reaction dynamics,
as is confirmed in several other reaction network models. On the other
hand, the suppression of the number fluctuations of the core hypercycle is
due to the high connectivity in reaction paths with other molecules. In
particular, the number fluctuations of the minority molecule
species, which has many catalytic connections with others, are reduced. This
suppression of fluctuation further reinforces the minority control for
the reproduction of a cell. A deviation from the ubiquitous log-normal
distribution thus appears, which may be important in the control of cell
function.
2683:
In the present paper, we have not discussed cell-cell interaction, and
have restricted our study to the production process of a single cell. Of
course, cells start to interact with each other as the cell density is
increased through cell division. Indeed, by including cell-cell
interaction in the present cell model with a reaction network, cell
differentiation and morphogenesis of a cell aggregate have been
studied\cite{KKTY,Furusawa}. Through instability of
intra-cellular dynamics with cell-cell interaction, cell
differentiation, irreversible loss of plasticity in cells, and a robust
pattern formation process appear as a general course of development with
the increase of the cell number. The relevance of minority control and
of the deviation from universal statistics to such a multicellular developmental
process will be an important issue to be studied in future.
2697:
{\sl Acknowledgments}

The author is grateful to T. Yomo, C. Furusawa, W. Fontana, Y. Togashi,
A. Awazu, and K. Fujimoto for discussions. The work is partially
supported by Grants-in-Aid for Scientific Research from the Ministry of
Education, Science, and Culture of Japan (11CE2006).
2704:
2705:
2706:
2707: \begin{thebibliography}{999}
2708:
2709: \bibitem{whatlife}
K. Kaneko, ``What is Life?: A complex systems approach'', in Japanese,
Univ. Tokyo Press, 2003
2712:
2713: \bibitem{minority}
2714: K. Kaneko, T. Yomo,
2715: % 2002a. On a kinetic origin of heredity :minority control in
2716: %replicating molecules.
2717: J. Theor. Biol. 214 (2002) 563-576
2718:
2719: \bibitem{Shannon}
C. Shannon and W. Weaver, ``The Mathematical Theory of Communication'',
Univ. of Illinois Press, 1949
2722:
2723: \bibitem{Brillouin}
2724: L. Brillouin,
2725: {\sl Science and Information Theory},
2726: Academic Press 1969
2727:
2728: \bibitem{Barabasi}
2729: H. Jeong, et al., {\it Nature} {\bf 407}, 651 (2000);
2730: H. Jeong, S. P. Mason, A.-L. Barab\'{a}si, {\it Nature} {\bf 411}, 41 (2001).
2731:
2732: \bibitem{Spiegelman}
2733: D.R. Mills, R.L. Peterson, and S. Spiegelman,
2734: %An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule,
2735: Proc. Nat. Acad. Sci. USA 58 (1967) 217;
2736: D.R. Mills, F.R. Kramer, and S. Spiegelman,
2737: %Complete nucleotide sequence of a replicating RNA molecule,
2738: Science 180 (1973) 916
2739:
2740: \bibitem{Eigen}
2741: M. Eigen and P. Schuster, {\sl The Hypercycle} (Springer, 1979).
2742:
2743: \bibitem{Hogeweg}
2744: M. Boerlijst and P. Hogeweg, Physica 48D (1991) 17;
2745: P.Hogeweg
2746: %``Multilevel evolution: replicators and the evolution of diversity",
2747: Physica 75 D (1994)275-291
2748:
2749: \bibitem{Dyson}
2750: F. Dyson, {\sl Origins of Life}, Cambridge Univ. Press., 1985
2751:
2752: \bibitem{Kauffman}
S.A. Kauffman, {\sl The Origins of Order}, Oxford Univ. Press, 1993
2754:
2755: \bibitem{Bagley}
R. Bagley, J.D. Farmer, S. Kauffman,
2757: pp 93-140, in {\sl Artificial Life} 1989, ed. C. Langton
2758:
2759: \bibitem{Cairns-Smith}
2760: A.G. Cairns-Smith,
2761: Clay Minerals and the Origin of Life(1982), Cambridge Univ. Press.
2762:
2763: \bibitem{mtb}
2764: K. Kaneko,
2765: in {\sl Function and Regulation of Cellular Systems} (2003)
2766: %Constructive and Dynamical Systems Approach to Life "
2767: Birkhauser (ed. A. Deutsch et al.)
2768:
2769: \bibitem{Complexity}
2770: K. Kaneko,
2771: %``Life as Complex Systems: Viewpoint from Intra-Inter Dynamics'',
2772: Complexity, 3 (1998c) 53-60
2773:
2774: \bibitem{KKTY}
2775: K. Kaneko and T. Yomo,
2776: %`` Cell Division, Differentiation, and Dynamic Clustering",
2777: Physica 75 D (1994), 89-102;
2778: B. Math.Biol. 59 (1997) 139;
2779: %``Isologous Diversification for Robust Development of Cell Society ",
2780: J. Theor. Biol., 199 243-256 (1999)
2781:
2782: \bibitem{Furusawa}
2783: Furusawa C. \& Kaneko K.,
2784: %``Emergence of Rules in Cell Society: Differentiation, Hierarchy, and Stability"
2785: Bull.Math.Biol. 60(1998) 659-687;
2786: %Furusawa C, Kaneko K. 2000a. Origin of complexity in multicellular organisms.
2787: Phys Rev Lett. 84:6130-6133
2788: %C. Furusawa and K. Kaneko;
2789: %Theory of Robustness of Irreversible Differentiation in a Stem Cell
2790: %System: Chaos Hypothesis;
2791: J. Theor. Biol. 209 (2001) 395-416;
2792: Anatomical Record, 268 (2002) 327-342;
2793: J. Theor. Biol. 224 (2003) 413-435.
2794:
2795: \bibitem{speciation}
2796: K. Kaneko and T. Yomo,
2797: % ``Sympatric Speciation: compliance with phenotype diversification from a single genotype ",
2798: Proc. Roy. Soc. B, 267 (2000) 2367-2373;
2799: K. Kaneko,
2800: %" Symbiotic Sympatric Speciation: Compliance with Interaction-driven Phenotype Differentiation from a Single Genotype "
2801: Population Ecology, 44 (2002) 71-85
2802:
2803: \bibitem{Matsuura}
2804: T. Matsuura, T. Yomo, M. Yamaguchi, N. Shibuya., E.P. Ko-Mitamura, Y. Shima, and
2805: I. Urabe
2806: %``Importance of compartment formation for a self-encoding system",
2807: Proc. Nat. Acad. Sci. USA 99 (2002) 7514-7517
2808:
2809: \bibitem{Ko}
2810: E. Ko, T.Yomo, and I. Urabe, Physica 75 D (1993)81-88
2811:
2812: \bibitem{Kashiwagi1}
2813: Kashiwagi A., Noumachi W., Katsuno M., Alam M.T., Urabe I., and Yomo T.
2814: %``Plasticity of Fitness and Diversification Process During an Experimental Molecular Evolution",
2815: J. Mol. Evol., (2001) {\bf 52} 502-509 .
2816:
2817: \bibitem{Kashiwagi2}
2818: A. Kashiwagi, I. Urabe, K. Kaneko, T. Yomo, submitted (2003)
2819:
2820: \bibitem{Asashima}
2821: T. Ariizumi and M. Asashima,
2822: Int. J. Devl Biol. 45 (2001) 273-279
2823:
2824: %\bibitem{McCaskill}
2825: %S.Altmeyer and J.S. McCaskill,
2826: %Phys. Rev. Lett. 86 (2001) 5819%-5822
2827:
2828: \bibitem{AL}
C. Langton ed., {\sl Artificial Life}, Addison-Wesley, 1989
2830:
2831: \bibitem{Fontana}
2832: W. Fontana and L.W. Buss, 1994.
2833: %The arrival of the fittest: Toward a theory of biological organization.
2834: Bull Math Biol 56:1-64
2835:
2836: \bibitem{Awazu}
2837: A. Awazu and K. Kaneko, preprint 2003.
2838:
2839: \bibitem{Zipf}
2840: C. Furusawa and K. Kaneko, Phys. Rev. Lett. 90 (2003) 088102.
2841:
2842: \bibitem{Cell}B. Alberts, D.Bray, J. Lewis, M. Raff, K. Roberts, and J.D. Watson,
2843: {\sl The Molecular Biology of the Cell}, 1983,1989,1994,2002
2844:
2845: \bibitem{Mikhailov}
2846: B. Hess and A. Mikhailov,
2847: %Self-Organization in Living Cells
2848: Science {\bf 264}, 223 (1994);
2849: A. Mikhailov and B. Hess,
2850: J. Theor. Biol. {\bf 176}, 185-192 (1995).
2851:
2852: \bibitem{Togashi}
2853: Y. Togashi and K. Kaneko,
2854: %`` Transitions Induced by the Discreteness of Molecules
2855: %in a Small Autocatalytic System''
2856: Phys. Rev. Lett. , 86 (2001) 2459;
2857: J.Phys.Soc.Japan 72 (2003)62-68;
2858: preprint 2003.
2859:
2860: \bibitem{Szathmary}
2861: E. Szathmary and J. Maynard Smith,
2862: %``From Replicators to Reproducers: the First Major Transitions Leading to Life",
2863: J. Theor. Biol. 187 (1997) 555-571
2864:
2865: \bibitem{Eigen-book}
2866: M. Eigen, Steps towards Life, Oxford Univ. Press., 1992
2867:
2868: \bibitem{KK-net}
K. Kaneko, J. Biol. Phys. 28 (2002) 781;%-792
Adv. in Complex Systems 6 (2003) 79-92.
2871:
2872: \bibitem{KK-PRE}
K. Kaneko, Phys. Rev. E 68 (2003) 031909.
2874:
2875: \bibitem{Lancet}
2876: %A recursive state in a mutually catalytic system was also discussed by as a `compositional genome',
2877: D. Segr\'{e}, D. Ben-Eli, D. Lancet,
2878: %``Compositional genomes: prebiotic information transfer in mutually catalytic noncovalent assemblies'',
Proc. Natl. Acad. Sci. USA 97 (2000) 4112;
D. Segr\'{e} et al., J. Theor. Biol. {\bf 213} (2001) 481;
D. Segr\'{e} and D. Lancet, EMBO Reports {\bf 1} (2000) 217.
2882:
2883: \bibitem{Sigmund}
2884: J. Hofbauer and K. Sigmund,
2885: {\sl Evolutionary Games and Population Dynamics},
Cambridge Univ. Press, 1998.
2887:
2888: \bibitem{homeochaos}
2889: K. Kaneko and T. Ikegami,
2890: %"Homeochaos: Dynamics Stability of a symbiotic network with population dynamics and evolving mutation rates",
2891: Physica D 56 (1992) 406-429
2892:
2893: \bibitem{Ikegami}
2894: T. Ikegami and T. Hashimoto,
2895: Artificial Life 2 (1996) 305-318
2896:
2897: \bibitem{Takagi}
2898: H. Takagi and K. Kaneko, preprint (2003)
2899:
2900: \bibitem{Mikhailov-book}
A. S. Mikhailov and V. Calenbuhr, {\sl From Cells to Societies},
Springer, 2002.
2903:
2904: \bibitem{Sornette}
D. Sornette, {\sl Critical Phenomena in Natural Sciences}, Springer, 2002.
2906:
2907: \bibitem{Zipf-book}
2908: G. K. Zipf, {\it Human Behavior and the Principle of Least Effort}
2909: (Addison-Wesley, Cambridge, 1949).
2910:
2911:
2912: \bibitem{log}
C. Furusawa, T. Suzuki, A. Kashiwagi, T. Yomo, and K. Kaneko,
``Ubiquity of Log-normal Distribution in Gene Expression'',
preprint.
2916:
2917: \bibitem{Sato}
K. Sato, Y. Ito, T. Yomo, and K. Kaneko,
%On the Relation between Fluctuation and Response in Biological Systems;
Proc. Natl. Acad. Sci. USA 100 (2003) 14086-14090.
2921:
2922: \bibitem{CI1}
2923: K. Kaneko, %``Clustering, Coding, Switching, Hierarchical Ordering,
2924: %and Control in Network of Chaotic Elements",
Physica D 41 (1990) 137-172.
2926:
2927: \bibitem{CI2}
I. Tsuda, Neural Networks 5 (1992) 313.
2929:
2930: \bibitem{CI3}
K. Kaneko and I. Tsuda, eds., Focus issue on ``Chaotic Itinerancy'',
Chaos 13 (2003) 926.
2933:
2934: \end{thebibliography}
2935: \end{document}
2936: