
1: \documentstyle[11pt,fleqn,epsf,epsfig]{article}

2: %\def\baselinestretch{1.5}

3: %\documentstyle[12pt,fleqn,epsf,epsfig]{article}

4: %\baselineskip 2pc

5: \oddsidemargin   0mm

6: \textwidth     160mm

7: \topmargin   -10mm

8: \headheight 0mm   \headsep  0mm

9: \textheight 240mm

10: \footheight 5mm   \footskip 10mm

11:

12: \begin{document}

13:

14: \title{On Recursive Production and Evolvability of Cells:

15: Catalytic Reaction Network Approach}

16:

17: \author{Kunihiko Kaneko\\

18: {\small \sl Department of Basic Science,

19: College of Arts and Sciences,}\\

20: {\small \sl University of Tokyo,}\\

21: {\small \sl Komaba, Meguro-ku, Tokyo 153, Japan}\\

22: }

23:

24:

25: \date{}

26:

27: \maketitle

28:

29: \tableofcontents

30:

31: \begin{abstract}

32: To unveil the logic of a cell at the level

33: of chemical reaction dynamics, we need to clarify how an ensemble of

34: chemicals can autonomously reproduce the same set of chemicals, without assuming a

35: specific external control mechanism.

36: A cell consists of a huge number of chemical

37: species that catalyze each other.  Often the number of each molecule

38: species is not so large, and accordingly the number fluctuations in each molecule species

39: can be large. Amidst such diversity and large fluctuations, how can a cell

40: sustain recursive production?  On the other hand, a cell can

41: change its state to evolve to a different type over a longer time span.  How are reproduction

42: and evolution compatible?  We address these questions, based on several

43: model studies with catalytic reaction networks.

44: %\\

45:

46: In the present survey paper, we first formulate basic questions on the recursiveness and

47: evolvability of a cell, and then state the standpoint of our research to

48: answer the questions, which is termed `constructive biology'.  Based on this standpoint,

49: we present a general strategy for modeling a cell as a chemical reaction network.

50: %\\

51:

52: In the first part we investigate the origin of heredity in a cell, by

53: noting that the molecules carrying heredity must be well preserved and must control

54: the behavior of a cell.  We take a simple model consisting of two mutually

55: catalyzing molecule species, each of which has catalytically active and

56: inactive types.  One of the molecule species is synthesized slowly, and thus

57: is a minority in population. Through the growth and division of this cell,

58: it is shown to reach and remain in a state in which the

59: active minority molecules are preserved over generations, and

60: control the cell behavior.  This minority-controlled state is achieved

61: by preserving rare number fluctuations of molecules.

62: The state gives rise to a selection pressure for mechanisms

63: that ensure the transmission of the minority molecule.  The minority

64: molecule, thus, carries heredity, and is a candidate for ``genetic

65: information''.  Experimental confirmation of this minority control is also

66: presented.

67: %\\

68:

69: Next, a protocell model consisting of a large number of mutually catalyzing

70: molecule species is studied, in order to investigate how chemical

71: compositions are transferred recursively under replication errors.

72: Depending on the numbers of molecules and species in a cell, and the

73: path rate in the reaction network, three phases are found: fast

74: switching state without recursive production, recursive production, and

75: itinerancy between the above two states. In a recursive production state,

76: chemicals are found to form an intermingled hypercycle network that consists of

77: a core hypercycle and a peripheral network that influence each other. How

78: this intermingled network supports the recursive production, and how

79: the minority in the core hypercycle gives rise to a switch to other recursive states

80: in the itinerancy phase, are elucidated.  Evolution of this hypercycle

81: network is also studied, to show the approach to recursive production of cells and

82: switch to more efficient reproduction states.  Finally, statistics of

83: the number distributions of each molecule species are studied,

84: to show (i) a power-law distribution of fast switching

85: molecules, (ii) suppression of fluctuations in the core-network molecule

86: species, and (iii) ubiquity of the log-normal distribution for most other

87: molecule species.  The origin of these statistics is discussed, while

88: suppression of the number fluctuations of a minority molecule that has

89: high catalytic connections with others is clarified, which reinforces the

90: minority control in the replication network.

91:

92: (Key Words: Minority Control, Heredity, Origin of Life, Constructive Biology,

93: Hypercycle, Chemical Reaction Network, Log-normal Distribution,

94: Self-reproduction, Evolution)

95:

96: \end{abstract}

97:

98: \pagebreak

99:

100: \section{Basic Question for Recursive Production of a Cell as Reaction Dynamics of Catalytic Network}

101:

102: {\bf Question: A cell consists of several replicating molecules that

103: mutually help their synthesis and keep some synchronization in replication.

104: At least a membrane that partly separates a cell from the outside

105: has to be synthesized, keeping some degree of synchronization with

106: the replication of other internal chemicals.  How is such recursive

107: production maintained, while keeping diversity of chemicals?

108: Furthermore, this recursive production is not perfect, and there appears

109: a slow `mutational' change over generations, which leads to evolution.

110: How is evolvability compatible with recursive production?\cite{whatlife}}

111:

112: \subsection{Q1: Origin of Heredity}

113:

114: In a cell, among many chemicals, only some chemicals (e.g., DNA) are

115: regarded as carrying genetic information.  Why do only some specific

116: molecules play the role of carrying the genetic information?  How has

117: such separation of roles in molecules between genetic information and

118: metabolism progressed? Is it a necessary course for a system with

119: internal degrees of freedom and reproduction?

120:

121: In a cell, however, a variety of chemicals form a complex reaction

122: network to synthesize themselves.  How, then, can such a cell with a huge

123: number of components and a complex reaction network sustain

124: reproduction, keeping similar chemical compositions?

125:

126: To consider this problem, we start from a simple prototype cell that

127: consists of mutually catalyzing molecule species whose growth in

128: number leads to division of the protocell\cite{minority}.  In this

129: protocell, the molecules that carry the genetic information are not

130: initially specified.  The first question we discuss here is how

131: heredity to maintain production of the protocell emerges.  Related

132: to this question, we ask whether some specific molecules appear that

133: carry information for heredity, to realize continual reproduction of

134: such a protocell.  We note that in the present cells, it is generally

135: believed that information is encoded in DNA, which controls the

136: behavior of a cell.

137:

138: Here, we do not necessarily take a ``geno-centric'' standpoint, in the sense

139: that genes determine the course of a cell.  In fact, even in these

140: cells, proteins and DNA influence each other's replication

141: processes.  Still, it cannot be denied that there exists a difference

142: between DNA and protein molecules with regard to their role as

143: information carriers.  In spite of this mutual dependence, why is the DNA

144: molecule usually regarded as the carrier of heredity?

145: Is there any general rule that some specific molecules play the role of carrier

146: of genetic information so that the recursive production of cells continues?

147:

148: Now, the origin of genetic information in a replicating system is an

149: important theoretical topic that should be studied, not necessarily as

150: a property of certain molecules, but as a general property of

151: replicating systems.

152: To investigate this problem we need to clarify what ``information'' really

153: means.  In considering information, one often tends to be interested

154: in how several messages are encoded on a molecule.  In fact, a

155: hetero-polymer such as DNA is suited to encode many bits of

156: information, and one might argue that DNA molecules

157: would hence be selected as an

158: information carrier.  Although this `combinatorial' capacity of an

159: information carrier is important, what we are interested in here is a

160: basic property that has to be satisfied prior to that, i.e., the origin of

161: just ``1 bit'' of information.

162:

163: As Shannon beautifully demonstrated, information means selection of

164: one branch from several possibilities \cite{Shannon, Brillouin}.

165: Assume that there are two possibilities in an event, each of which can

166: occur with the probability $1/2$.  In this case, when one of these

167: possibilities turns out to be true, then this choice of a branch is

168: regarded as carrying 1 bit of information.  In this sense, if a specific cell

169: state is selected from several possible states, this selection process

170: has information, and a molecule that controls such a process carries

171: information.
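
As a worked illustration of this definition (the standard textbook form, not specific to the present model), if a possibility that occurs with probability $p$ turns out to be true, the information gained is

\[
I = -\log_2 p ,
\]

so that $p=1/2$ gives $I=1$ bit.  More generally, selecting one out of $M$ equally probable cell states corresponds to $\log_2 M$ bits; the symbol $M$ is introduced here only for illustration.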

172:

173: Now, a molecule that carries the information is postulated to

174: control the choice of cellular state. Furthermore, to

175: carry the information for heredity, the molecules must be

176: transmitted to the next generations relatively faithfully.  These two

177: features, i.e., control and preservation, are nothing but the problem

178: of heredity.

179:

180: Let us reconsider what `heredity' really means.  Heredity causes a

181: high correlation in phenotype between ancestor and offspring.  Then,

182: for a molecule to carry heredity, we identify the following two

183: features as necessary.

184:

185: (1) If this molecule is removed or replaced by a mutant, there is a

186: strong influence on the behavior of the cell.  We refer to this as the

187: {\bf `control property'}.

188:

189: (2) Such molecules are preserved well over generations.  The number of

190: such molecules exhibits smaller fluctuations than that of other

191: molecules, and their chemical structure (such as polymer sequence) is

192: preserved over a long time span, even under potential changes by

193: fluctuations through the synthesis of these molecules.  We refer to

194: this as the {\bf `preservation property'}.

195:

196: These two conditions are regarded as fundamental conditions for a

197: molecule to establish heredity.  Now, the problem of `information'

198: at a minimal level, i.e., 1-bit information, is nothing but the problem

199: of the origin of heredity. As the origin of heredity, we study how a

200: molecule starts to have the above two properties in a protocell.  In

201: other words, we study how 1-bit information starts to be encoded on a

202: single molecule in a replicating cell system.  After we answer this

203: basic question, we will then discuss how a protocell with heredity

204: in the above sense acquires an incentive to evolve genetic information in

205: today's sense.

206:

207: To sum up, the first question we address here is restated as follows.

208: Consider a protocell with mutually catalyzing molecules.  Under

209: what conditions does recursive production continue while maintaining catalytic

210: activities?  How are recursiveness and diversity in chemicals

211: compatible?  How is evolvability of such protocells possible?  To

212: answer these questions, are molecules carrying heredity necessary?

213: Under what conditions does one molecule species begin to satisfy the

214: conditions (1) and (2) so that the molecule carries heredity?  We

215: show, under rather general conditions in our model of a mutually

216: catalyzing system, that a symmetry breaking between the two kinds of

217: molecules takes place, and through replication and selection, one kind

218: of molecule comes to satisfy the conditions (1) and (2).

219:

220: \subsection{Q2: Recursiveness and Evolvability with Diverse Chemicals}

221:

222:

223: In a cell, the total number of molecules is limited.  If there are a huge number

224: of chemical species that catalyze each other, the number of molecules of some species

225: may go to zero.  Then molecules that are catalyzed by them are no longer

226: synthesized, and other molecules that are catalyzed by those molecules

227: cannot be synthesized, either.  In this manner, the chemical compositions

228: may vary drastically, and the cell may lose reproduction activity.

229:

230: Of course, a cell state is not constant, and a cell may not keep on dividing

231: forever.  Still, a cell state is sustained to some degree to keep

232: producing similar offspring cells.  We call such a condition for

233: reproduction of a cell `recursive production' or `recursiveness'.

234: The question we address here is whether there are some conditions on

235: the distribution of chemicals or the structure of the reaction network for recursive production.

236:

237: There are two directions of study.  One is with regard to the static

238: aspect of reaction network structure (e.g., topology).  The other is

239: the number distribution of chemical species and their dynamics.  Of

240: course, one needs to combine the two aspects to fully understand the

241: condition for recursive production of a cell.

242:

243: Currently there is much interest in

244: the reaction network structure.

245: For example, Jeong et al.\cite{Barabasi} studied the metabolic reaction

246: network, without going into details of the topology.  Write down all

247: (known) metabolic reaction equations.  Here, the rate of reactions is

248: disregarded, and only whether such a reaction equation exists in a cell

249: is considered.  Then count how many times a specific molecule

250: species appears in such reaction equations.  If this number is large,

251: the molecule species is involved in many biochemical reactions.  For

252: example, $H_2O$ has a large number of connections, since in many reactions it appears

253: on either the left-hand or the right-hand side of the equation.

254: %$CO_2$ must have a high number also.  Among more complex molecules

255: $ATP$ has a relatively high number of connections, too.  From these data the

256: histogram $P(n)$ is obtained, as the number of molecule species that appear $n$

257: times in the equations.

258: %Of course this distribution gradually decreases with the increase of $n$.

259: From the data, it is shown that

260: $P(n)$ decays with some power of $n$ as $n^{-\alpha}$\cite{Barabasi}.
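
Equivalently, taking logarithms of this power law (with $C$ an unspecified normalization constant, introduced here only for illustration),

\[
\log P(n) = \log C - \alpha \log n ,
\]

so that the histogram falls on a straight line of slope $-\alpha$ in a log-log plot.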

261:

262: So far, the discussion is limited only to the topological structure of the

263: network.  In the reaction network dynamics, the numbers of molecules

264: are distributed.  On each `node' of the network, the abundance of

265: the corresponding molecule species is assigned.  Accordingly,

266: some paths are `thick', where such reactions occur frequently.  Such

267: abundances, as well as their fluctuations and dynamics, have to be

268: investigated.

269:

270: In a cell, the number of each molecule changes in time through

271: reactions, and the number, on average, increases toward cell

272: replication.  For this growth to progress effectively, some positive feedback process

273: underlying the replication process should exist, which, then, may lead to

274: amplification of the number fluctuations in molecules.  With such large

275: fluctuations and complexity in

276: the reaction network, how is recursive production of cells sustained?

277: Are there any universal statistics in the number distribution of

278: molecules?

279:

280:

281: \section{Brief Historical Survey}

282:

283: \subsection{Eigen's Hypercycle}

284:

285: Of course, the problem raised in the last section has been

286: addressed in the study of the origin of life,

287: or the origin of replicating systems.  Here we are not necessarily interested in

288: `what happened in the past', but rather, we intend to unveil the universal logic of

289: a cell.  Still, it is relevant to review the earlier studies.

290:

291: To consider the origin of a replication system, one needs to discuss how genetic information

292: is faithfully transferred to the next generation.

293: %A typical standpoint is seen in the 'RNA world', but the approach has been taken long time.

294: Mills et al.\cite{Spiegelman} set up an experiment on

295: RNA replication, by using a solution of RNA and enzyme.

296: In this experiment, some enzymes are supplied from outside,

297: and in this sense it is not an autonomous replication system.

298: Still, the group found that RNA molecules with proper sequences are

299: reproduced with some error.

300:

301:

302: Following this experimental study by Spiegelman's group on replication of RNA,

303: Eigen's group started theoretical study on the replication of

304: molecules\cite{Eigen}.  The replication process of a polymer in

305: a biochemical reaction is generally carried out with the aid of enzymes.

306: The enzyme is itself a polymer, and its catalytic activity

307: strongly depends on its sequence.  For most sequences of the polymers,

308: the catalytic activity is very small, but a few of them may have high

309: catalytic activity.  Depending on the sequence, some polymers have a much higher

310: catalytic activity, and the replication rate of polymers depends on the

311: sequence.  As a theoretical argument, consider replication of

312: polymers whose replication rate depends on their sequence.  Now, assume

313: that a `good' sequence has a replication rate $\alpha$ times larger than

314: that of its mutant with a substitution of a monomer from the original

315: sequence.  Here, the replication progresses under some error.  Without

316: fine machinery for error correction, this error is not negligible.

317: Assume that in each replication process, a monomer is substituted by

318: another monomer with the rate $\mu$.  Then the probability that a

319: polymer consisting of $N$ monomers can produce itself is given by

320: $(1-\mu)^N \approx \exp(-N\mu)$, assuming that $\mu$ is small.

321:

322: Now, let us examine if the good polymer can continue replication,

323: maintaining its sequence, so that the information of this

324: sequence is transferred.  The condition that the good sequence

325: dominates in populations in the ensemble of polymers is given by

326:

327: \begin{equation}

328: N<\ln(\alpha)/\mu

329: \end{equation}

330:

331: Here, $\ln(\alpha)$ is typically $O(1)$, while the error rate in the

332: replication of a monomer is estimated to be around $0.01\sim0.1$, in

333: the usual polymer replication process.  Then the above condition gives

334: $N<100$ or so.  In other words, information using a polymer with a

335: sequence longer than this threshold $N$ can hardly be sustained.  This

336: problem was first posed by Eigen, and is called the `error

337: catastrophe'\cite{Eigen}.  On the other hand, the

338: replication of a minimal life system must require a much larger amount of

339: information.  Of course, the error rate could be reduced once some

340: machinery for faithful replication, as in the present life, emerges.

341: However, such machinery requires much more information to be

342: transmitted by the polymer.
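
The threshold condition can be recovered by a short estimate, sketched here under the simplifying assumptions (introduced for illustration) that back mutations to the good sequence are neglected and that all mutants replicate at a common rate, taken as unity.  The good sequence is copied without error with probability $(1-\mu)^N$, so that its effective growth rate is $\alpha(1-\mu)^N$.  It dominates only if this rate exceeds that of the mutants:

\[
\alpha (1-\mu)^N \approx \alpha \exp(-N\mu) > 1
\quad \Longleftrightarrow \quad
N < \frac{\ln \alpha}{\mu} .
\]

For instance, $\ln\alpha = 1$ with $\mu = 0.01$ gives $N<100$, while $\mu = 0.1$ gives $N<10$.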

343:

344: Summing up: For replication to progress, catalysts are necessary, and

345: information on a polymer to replicate itself must be preserved.  However,

346: the error rate in replication must have been high at a primitive stage of

347: life, and accordingly, it is recognized that the information to carry

348: catalytic activity will be lost within a few generations.  In other

349: words, a faithful replication system requires more information, while

350: more information requires a faithful replication system.  Thus there

351: appears a catch-22 type paradox.

352:

353: To resolve this problem of inevitable loss of catalytic activities

354: through replication errors, Eigen and Schuster proposed the

355: hypercycle\cite{Eigen}, where replicating chemicals catalyze each

356: other forming a cycle, as ``A catalyzes the synthesis of B, B catalyzes

357: the synthesis of C, C catalyzes the synthesis of A''.  In this case,

358: each chemical amplifies the synthesis of the corresponding

359: chemical species in this cycle.  A variety of mutations may occur

360: in each species, but a mutant is generally not catalyzed by the

361: other species in the cycle.  For example, a mutant of A is not catalyzed by

362: C.  This is also understood by writing out the rate equation for the

363: increase of the population.  In this hypercycle the population

364: increase is given by the product of the populations of molecules such

365: as $N_A\times N_B$, $N_B\times N_C$, $N_C \times N_A$, while the

366: growth of the population of the mutants is linear in each population

367: $N_A$, $N_B$, $N_C$.  In the previous estimate for the error catastrophe, the

368: good and mutant sequences both increase linearly with their numbers.  Then

369: the variety of mutants dominates.  In the present case,

370: once the populations of the good sequences in the hypercycle

371: dominate, they can sustain the population against possible emergence

372: of mutants.  With this hypercycle, the original problem of error

373: accumulation is avoided.
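
The comparison between quadratic and linear growth can be written out as a minimal sketch.  The rate constants $k$ and $k'$, the mutant population $N_{A'}$, and the omission of any dilution or competition term are simplifying assumptions introduced here for illustration, and are not necessarily the form used in \cite{Eigen}.  For the cycle in which A catalyzes B, B catalyzes C, and C catalyzes A,

\[
\frac{dN_A}{dt} = k N_C N_A , \qquad
\frac{dN_B}{dt} = k N_A N_B , \qquad
\frac{dN_C}{dt} = k N_B N_C ,
\]

whereas a mutant $A'$ of A that receives no catalytic support from C grows only as

\[
\frac{dN_{A'}}{dt} = k' N_{A'} .
\]

The growth rate per molecule of A is $k N_C$, which increases with the abundance of the cycle members, while that of the mutant stays at $k'$.  Once $k N_C > k'$, the members of the cycle outgrow their mutants.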

374:

375: Since the proposal of the hypercycle, population dynamics of molecules for

376: such catalytic networks has been developed.  However, the hypercycle

377: itself turned out to be weak against parasitic molecules, i.e., those

378: which replicate, catalyzed by a molecule in the cycle,

379: but do not catalyze those in the cycle.

380: In contrast to the previous mutant, the growth rate of the

381: population of these molecules is again the product of the populations

382: of two species, and such parasitic molecules can invade.
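
As a rate equation, this can be sketched as follows, where the parasite population $N_P$, the population $N_X$ of the cycle member that catalyzes the parasite, and the rate constant $k_P$ are symbols introduced here only for illustration:

\[
\frac{dN_P}{dt} = k_P N_X N_P .
\]

This growth is of second order in the populations, just like that of the cycle members themselves, so the hypercycle has no built-in advantage over the parasite of the kind it has over a mutant that grows only linearly.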

383:

384: Although the hypercycle itself may be weak against parasitic

385: molecules, i.e., those which are catalyzed but do not catalyze others,

386: it has been argued that compartmentalization by a cell structure may

387: suppress the invasion of parasitic molecules, or that a

388: reaction-diffusion process in a spatially extended system resolves this

389: parasite problem\cite{Hogeweg}.  Given the chemistry of lipids, it is not so surprising

390: that a compartment structure is formed.  Still, as the origin of life,

391: this means that more complexity and diversity in chemicals are

392: required other than a set of information carrying molecules (e.g.,

393: RNA).

394:

395: \subsection{Dyson's Loose Reproduction System}

396:

397: If initially there is a variety of chemicals that form a complex network of

398: mutual catalysis, this system may be robust against the invasion of

399: parasitic molecules.  Such an idea resembles the stability of an ecosystem,

400: where a complex network of several species may resist invasion by

401: external species.  Hence we need to study whether replication of

402: a complex reaction network can be sustained.  In this case, from the beginning, there

403: are many molecule species that mutually catalyze, allowing for the existence of

404: many parasitic molecules.  Here,

405: complete replication of the system is probably difficult.

406: Then the question we have to

407: address is whether such a complex network can maintain molecules that catalyze the synthesis

408: of the network species.  This question was addressed by Dyson\cite{Dyson},

409: as a possibility of a loose reproduction system.

410:

411: Dyson, noting the experiment of Oparin on the formation of cell-like

412: structures, considered a collection of molecules with proteins and

413: others.  These molecules cannot replicate themselves like DNA or RNA.

414: They, on the other hand, can have enzyme activities, and catalyze the

415: synthesis of other molecules, although the reproduction may not be

416: faithful.  Still, they may keep similar compositions.  Although

417: accurate replication of such variety of chemicals is not possible,

418: chemicals, as a set, may continue reproducing themselves loosely,

419: while keeping catalytic activity.  Indeed, the accurate replication

420: must be difficult at the early stage of life, but loose reproduction

421: could be easier.  However, whether this collection of molecules can keep

422: catalytic activity through reproduction is not evident.

423:

424: Dyson obtained a condition for the sustainment of catalytic activities

425: in this collection of molecules, by taking an abstract model.  For

426: simplicity he classified molecules into two states depending on whether

427: they have catalytic activity or not.  Furthermore, he assumed that the

428: ratio of the synthesis of catalytic molecules is amplified as the

429: fraction of catalytic molecules becomes larger, i.e., a positive feedback

430: process is assumed.  This model is mapped to a kind of Ising model.

431: With the aid of mean-field analysis in statistical physics, he showed

432: that the catalytic activities can be sustained depending on the number

433: of molecules and their species.  Although his model is abstract, the

434: result he obtained probably can be applied to any system with a set

435: of catalytic molecules, be it protein, lipids, or other polymers.

436:

437: It is important to study if such loose reproduction as a set is

438: possible in a mutually catalytic reaction network (also see

439: e.g., \cite{Kauffman,Bagley}).  If this is possible, and if these

440: chemicals also include molecules forming a membrane for

441: compartmentalization, reproduction of a primitive cell will become

442: possible.  In fact, from the chemical nature of lipid molecules, it is not

443: so surprising that a compartment structure is formed.

444:

445: Still, in this reproduction system, no particular molecules carrying

446: information for reproduction exist, in contrast to the present

447: cell, which has specific molecules (DNA) for it.  As for a transition

448: from early loose reproduction to later accurate replication with

449: genetic information, Dyson did not give an explicit answer.

450: He only referred to `genetic take-over' that was

451: originally proposed by Cairns-Smith\cite{Cairns-Smith}, who discussed

452: that a precise replication system by nucleic acids took over the

453: original loose reproduction system by clay.  Indeed, Dyson wrote that

454: his idea is based on `Cairns-Smith theory minus clay'.  However, the

455: logic for this ``take-over'' is not unveiled.

456:

457: Considering these theoretical studies so far, it is important to study

458: how recursive production of a cell is possible, with the appearance of

459: some molecules to play a specific role for heredity.

460:

461: \section{Constructive Biology}

462:

463: \subsection{Standpoint of constructive biology}

464:

465: Before describing our theoretical model and explaining the numerical

466: results, it is relevant to briefly summarize our basic standpoint in

467: the study of biology, termed ``constructive biology''

468: \cite{whatlife,mtb}\footnote{One can skip this subsection, if one is

469: not much interested in the general standpoint in the study of biology.}.

470: Here we are interested not in details of specific biological function but

471: in universal features of a biological

472: system.  Accordingly we need to study some features that are not

473: influenced by the details of complicated biological processes.  The

474: present organisms, however, include detailed elaborated processes that

475: have been acquired through the history of evolution.  Then, for our purpose,

476: it is desirable to set up a minimal biological system, to understand

477: universal logic that organisms necessarily should obey. Hence, the

478: approach that should be taken will be `constructive' in nature.  This

479: constructive approach is carried out both experimentally and

480: theoretically.

481:

482: Our `constructive biology' consists of the following steps of studies.

483: (i) construct a model system by combining procedures;

484: (ii) clarify universal class of phenomena through the constructed model(s);

485: (iii) reveal the universal logic underlying the class of phenomena

486: and extract logic that the life process should obey;

487: (iv) provide a new look at data on the present organisms

488: from our discovered logic.

489:

490: There are three levels at which to perform these steps:

491: (1) gedanken experiment (logic), (2) computer model, and (3) real

492: experiment.  The first one is theoretical study, revealing a logic

493: underlying universal features in life processes, essential to

494: understand the logic of `what is life'.

495:

496: Still, a living system has a complex relationship among many parts,

497: which constitute the characteristic feature as a whole, which then influences

498: the process of each part.

499: We have not gained sufficient theoretical intuition for

500: such complex systems.  Then it is also relevant to make computer experiments and

501: heuristically find some logic that cannot be easily reached by logical

502: reasoning only.  This is the second approach mentioned above, i.e.,

503: construction of an artificial world in a computer.  Here we combine

504: well-defined simple procedures, to extract a general logic

505: therein \cite{minority,Complexity,KKTY,Furusawa,speciation}.

506:

507: Still, in a system with potentially huge degrees of freedom like life,

508: the construction in a computer may miss some essential factors.

509: Hence, we need the third experimental approach, i.e., construction in

510: a laboratory.  In this case again, one constructs a possible biological

511: world in the laboratory, by combining several procedures.  For example,

512: this experimental constructive biology has been pursued by Yomo and

513: his collaborators (see, e.g., \cite{Matsuura,Ko,Kashiwagi1,Kashiwagi2})

514: at the levels of biochemical reactions, cells, and ensembles of cells.

515:

516: Taking this standpoint of constructive biology,

517: we have been working on the problems listed in the table both theoretically and experimentally.

518: The first two items in the table are related to the construction of

519: a replicating system with a compartment, raised in the questions in \S

520: 1.  Of course, this problem is essential to consider the origin of a

521: cellular life.  However, we do not intend to reproduce what has

522: occurred on the earth.  We do not try to guess the environmental

523: conditions of the past earth.  Rather we try to construct such a

524: replication system from a complex reaction network under a condition set

525: up by us in advance.  For example, by constructing a protocell, in the present

526: paper, we ask the condition for heredity, or universal features

527: of the reaction dynamics to support the recursive production of cells.

528:

529: The third to sixth items are related to the construction of

530: multicellular organisms with a developmental process.  When cells are

531: aggregated, they start to differentiate in their roles, and then from

532: a single cell, a robust developmental process to form an organized

533: structure of differentiated cells is generated.  This developmental

534: process to form a cell aggregate is transferred to the next

535: generation.  An experimental construction of multi-cellular organisms

536: (with cell differentiation) from bacteria is one target.  Here again,

537: we do not try to imitate the process of the present multi-cellular

538: organisms.  For example, by putting bacteria cells into some

539: artificial condition, we study if the cells can differentiate into

540: distinct types or form some robust distribution of cells.  Also,

541: in-vitro construction of morphogenesis from undifferentiated cells has

542: been possible by putting cells into some given

543: conditions\cite{Asashima}. With these studies, we can establish a

544: viewpoint of universal dynamics underlying development rather than

545: the conventional picture as finely tuned-up process for it\cite{KKTY,Furusawa}.

546:

The seventh item is the construction of evolution, in particular of the
speciation process, that is, how a species splits into two distinct
groups that differ in both phenotype and genotype\cite{speciation}.

550:

To carry out this plan experimentally, we need a living system that can
be designed and controlled as we like.  Such controlled experiments are
now possible thanks to recent advances in technology, such as flow cytometry,
imaging techniques, and microarrays to measure gene expression, while
advances in nanotechnology provide powerful tools for constructing a system
to regulate and observe the behavior of a single cell or multiple cells
in a well-controlled situation.

558:

This construction is interesting by itself, but our goal is not
the construction itself.  Rather, we try to extract general features
that a living system should satisfy, and to set up general questions.  For
example, as posed in \S 1, we ask whether there are some `information molecules'
that control the replicating system, and then answer the question by
setting up a theory.  For each item, we set up general questions,
carry out model simulations, and construct a general theory to answer the
questions.  This theoretical part is carried out in tight collaboration
with the experiments.

568:

569:

Table I: Examples of constructive biology under current investigation:

\hspace{-.3in}\begin{tabular}{|c||c|c|c|} \hline
construction of & experiment             & theory               & question to be addressed \\ \hline
replicating     & in-vitro replicating   & minority control     & origin of \\
system          & system with several    &                      & information \\
                & enzymes                &                      & \\ \hline
cell system     & replicating liposome   & dynamic bottleneck   & evolvability \\
                & with internal          & in autocatalytic     & and recursiveness \\
                & reaction network       & reaction system      & for growth \\ \hline
multicellular   & interaction-induced    & isologous            & robustness in \\
system          & differentiation of     & diversification in   & development \\
                & an ensemble of cells   & inter-intra dynamics & \\ \hline
developmental   & controlled             & emergence of         & irreversibility \\
process (I)     & differentiation from   & differentiation      & in development \\
                & undifferentiated cells & rule                 & \\ \hline
developmental   & activin-controlled     & self-consistency     & origin of \\
process (II)    & construction of        & between pattern      & positional \\
                & tissues                & and dynamics         & information \\ \hline
generation      & germ-line segregation  & higher-level         & origin of recursive \\
                & from ensemble of cells & recursiveness        & individuality \\ \hline
evolution       & interaction-dependent  & symbiotic            & genetic fixation of \\
                & evolution              & sympatric            & phenotypic \\
                & of {\it E.~coli}       & speciation           & differentiation \\ \hline
\end{tabular}

596:

597:

To close this subsection, we give a brief remark on the study of
so-called Artificial Life (AL).  Indeed, our approach has
something in common with AL\cite{AL}.  In AL studies,
researchers intended to construct life-as-it-could-be, not restricted to
present organisms.  Originally, AL studies were
concerned with the logic of life that all possible biological systems should
obey, be it on this earth or under other conditions in the universe.

605:

Indeed, there are some important studies on the origin of replicating
structures from the side of computation (e.g., \cite{Fontana}).
However, conventional AL studies have often tended to imitate life, and have not
proposed basic concepts for understanding `what is life'.

610: %, and the study often falls on superficial imitation.

611: %Even though they sometimes succeeded in

612: %making something similar to life, the success did not contribute in

613: %understanding the logic of life.  AS for the evolution, they usually

614: %adopt the genetic algorithm as a simplified version of Darwinian

615: %evolution, but the AL study has not contributed in proposing novel

616: %concepts in evolution.

Also, conventional AL studies have often been biased toward studies in
a computer.  They often assume a combination of logical processes with
manipulation of symbols, as in the study of artificial intelligence.

620:

Our approach is distinct from conventional artificial
life studies in two points.  First, we do not take such a symbol-based
approach, but rather use a dynamical systems approach.  Second, tight
collaboration between experiment and theory is essential.  Note,
however, that this collaboration is not of the type that `fits the data' by
some theoretical expression, but rather takes place at a conceptual level.  We
will see an example of such collaboration in \S 4.

628:

629: \subsection{Modeling strategy for the chemical reaction networks}

630:

631: \begin{figure}

632: \noindent

633: \hspace{-.3in}

634: \epsfig{file=schemmodel.ps,width=.6\textwidth}

635: \caption{Schematic representation of our modeling strategy of a cell}

636: \end{figure}

637:

Now we discuss our standpoint in modeling a cell, based on the standpoint
of the previous subsection.  What type of model is best suited
to answer the questions in \S 1?  With all the current biochemical
knowledge, one could write down several types of
models for this purpose.  Due to the complexity of a cell, there is a tendency
to build a complicated model in trying to capture the essence of a
cell.  However, doing so only makes it difficult to extract new
concepts, although simulations of the model may produce
phenomena similar to those in living cells.  To avoid such
failures, it may be more appropriate to start with a simple model that
encompasses only the essential factors of living cells.  Simple models
may not reproduce all the observed natural phenomena, but they are
comprehensive enough to give us new insights into the course of events
taken in nature.

652:

In setting up a theoretical model here, we do not impose many conditions
in order to imitate the life process.  Rather, we impose as few postulates
as possible, and study universal properties of such a system.
For example, as a minimal condition for a cell, we consider a system
consisting of chemicals enclosed by a membrane.  The chemicals are
synthesized through catalytic reactions, and accordingly the amount of
chemicals increases, including the membrane component.  As the volume
of this system becomes larger, the surface tension of the membrane can no
longer sustain the system, and it divides.  After the division of
this protocell, the daughter protocells interact with each other, since
they share resource chemicals.  Under such a minimal setup, as will be
discussed later, we study the condition for the recursive growth of a
cell, as well as the differentiation of cells.

666:

Let us start from a simple argument on the biochemical processes that a
growing cell must at least satisfy.  In a cell, there are a huge
number of chemicals that catalyze each other and form a complex
network.  These molecules are spatially arranged in a cell, and for
some problems such spatial arrangement is very important, while for
others the composition of chemicals in a
cell is sufficient to determine the state of the cell.  Hence, as a
starting point, we disregard the spatial structure within a cell and
consider just the composition of chemicals.  If there
are $k$ chemical species in a cell, the cell state is then characterized by
the number of molecules of each species, $N_1,N_2,\ldots,N_k$.  These
numbers change through reactions among the molecules.
Since most reactions are catalyzed by some other molecules, the
reaction dynamics are those of a catalytic reaction network.

681:

Through the membrane, some chemicals may flow in, which are successively
transformed into other chemicals through this catalytic reaction
network.  For a cell to grow recursively, a set of chemicals has to be
synthesized for the next generation.  When the number of molecules becomes
large enough, the membrane can no longer be sustained, even just because of
the constraint of surface tension.  Hence, when the number of molecules
exceeds some value, the cell is expected to divide.  The
basic picture of the simple toy cell we take is given in Fig.1.

690:

Of course, it is impossible to include all possible chemicals in a
model.  As our constructive biology aims neither at making a
complicated realistic model of a cell nor at imitating a specific
cellular function, we set up a minimal model with a reaction network to
answer the questions raised in \S 1.  There are several levels
of modeling, depending on the question we try to answer.

697:

(0) A model taking reversible two-body reactions, including all levels of
reactions, ranging over metabolites, proteins, nucleic acids, and so
forth.  For example, to answer the general question of how a
non-equilibrium condition is sustained in a cell, a model at this level
is desirable\cite{Awazu}.

703:

(1) Assuming that some reaction processes are fast, they can be
adiabatically eliminated.  Also, most fast reversible reactions can
be eliminated by assuming that they are already balanced.
Then we need to discuss only the
concentrations (numbers) of the molecule species that change relatively
slowly.  For example, by assuming that enzymes are synthesized and
decomposed fast, their concentrations can be eliminated, to give
catalytic reaction network dynamics consisting of reactions of the form

712:

713: \begin{equation}

714: X_i+X_j \rightarrow  X_{\ell}+X_j

715: \end{equation}

716:

717: \noindent

where $X_j$ catalyzes the reaction\cite{KKTY,Zipf}.  If the catalysis
progresses through several steps, this process is replaced by

720:

721: \begin{equation}

722: X_i+mX_j \rightarrow  X_{\ell}+mX_j

723: \end{equation}

724: leading to higher order catalysis\cite{Furusawa}.

725:

For a cell to grow, some resource chemicals must be supplied through
the membrane.  Through the above catalytic reaction network, the resource
chemicals are transformed into others, and as a result the cell grows.
Indeed, this class of models has been adopted to study the condition for cell
growth, to unveil universal statistics of such cells, and also as a
model for cell differentiation.

732:

(2) Model focusing on the dynamics of replicating units
(e.g., the hypercycle): For a cell to grow effectively, there should be
some positive feedback process to amplify the number of each molecule
species.  Such a positive feedback process leads to an autocatalytic
process that synthesizes each molecule species.  For the reproduction of a
cell, (almost) all molecule species must somehow be synthesized.  It is then possible to take
a replication reaction as the starting point of a model.  For example, consider the reactions

$S+X+Y \rightarrow X'+Y$; \quad $S'+X' \rightarrow 2X$.

\noindent
Then, in total, the reaction is represented as

$S+S'+X+Y \rightarrow 2X+Y$.

\noindent

Assuming that the resources $S$ and $S'$ are constantly supplied, we can
consider the replication reaction

750: \begin{equation}

751: X+Y \rightarrow 2X+Y,

752: \end{equation}

753: catalyzed by $Y$.

At this level, we can take a replicator as a unit and consider a
replication reaction network.  This type of model was first studied as the
hypercycle by Eigen and Schuster, as discussed in \S 2.1.

757:

(3) Coarse-grained (phenomenological) level: Some other reduced models
are adopted for the study of gene expression or signal transduction
networks.  Modeling at this level is relevant for understanding
specific functions of a cell.
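
To make the reactions at levels (1) and (2) concrete, they can be written
as rate equations.  The following is only a sketch under the assumption of
mass-action kinetics; the concentrations $x_i$, the rate constant
$\epsilon$, and the rate $\gamma$ are notation introduced here for
illustration and are not taken from the cited models.  For the level-(1)
reaction $X_i+X_j \rightarrow X_{\ell}+X_j$, this single reaction contributes
\[
dx_{\ell}/dt = \epsilon x_i x_j, \quad
dx_i/dt = -\epsilon x_i x_j, \quad
dx_j/dt = 0,
\]
so that the catalyst $X_j$ sets the speed of the conversion without being
consumed.  For the higher-order catalysis $X_i+mX_j \rightarrow
X_{\ell}+mX_j$, the factor $x_j$ would be replaced by $x_j^m$ under the same
assumption.  For the level-(2) replication reaction $X+Y \rightarrow 2X+Y$,
the corresponding contribution is
\[
dx/dt = \gamma x y, \quad dy/dt = 0,
\]
i.e., $X$ grows autocatalytically at a per-molecule rate $\gamma y$, which
vanishes when the catalyst $Y$ is absent.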

762:

In the present paper we mainly use modeling at level (2).
This class of models can be obtained by reduction from the level-(1)
model, by restricting our interest to
replicating units only.  In this sense, the model is somewhat simpler than the
level-(1) model.  On the other hand, it may not be suitable for discussing
the condition for cell growth, since in the level-(2) model the
supply of resource chemicals is implicitly assumed, and one cannot
discuss how transported chemicals are transformed into others.  In the
present paper, we briefly refer to the level-(1) model only at the end of
\S 5.4, to demonstrate the universality of our result; for
details, see the original papers \cite{KKTY,Zipf} on level-(1)
modeling.

775:

776: To sum up, we envision a (proto)cell containing molecules.  With a

777: supply of chemicals available to the cell, these molecules replicate

778: through catalytic reactions, so that their numbers within a cell

779: increase.  When the total number of molecules exceeds a given

780: threshold, the cell divides into two, with each daughter cell

781: inheriting half of the molecules of the mother, chosen randomly.

Regarding the choice of chemical species and reactions, we discuss
specific models later (see Fig.1 for a schematic representation).

784:

785: \section{Minority Control Hypothesis for the Origin of Genetic Information}

786:

In the present section we propose an answer to the question raised in
\S 1.1, by taking a simple model of a cell with replicating molecules,
proposing a novel concept of minority control, and providing
corresponding experimental results.

791:

792: \subsection{Model}

793:

As discussed in \S 3.2,
we start by considering a prototype of a cell, consisting of molecules
that catalyze each other.  As the reactions progress, the number of
molecules in this protocell increases.  The
cell then divides when its volume (the total number of molecules)
exceeds some threshold, and the molecules are split between two ``daughter cells''.
Our question in \S 1 is then restated as follows:
How are the chemical compositions transferred to the offspring cells?
Do some specific molecules start to carry heredity, in the
sense of control and preservation, so that the reproduction continues?

804:

Before considering the specific model,
it may be relevant to recall the difference in roles between DNA (or RNA) and protein.
According to the present understanding of molecular biology\cite{Cell},
changes undergone by DNA molecules are believed to exert stronger influences
on the behavior of cells than changes in other chemicals.
Also, a DNA molecule is transferred to offspring cells
relatively accurately, compared with other constituents of the cell.
Hence a DNA molecule satisfies (at least) the ``preservation'' and ``control''
properties (1) and (2) in \S 1.1.

814:

In addition, a DNA molecule is stable, and the time scale for
changes of DNA, e.g., its replication as well as its
decomposition, is much longer than for other molecules.  Because of this relatively
slow replication, the number of DNA molecules is smaller than the
number of protein molecules.  In each cell generation, each DNA
molecule typically replicates only once, while other
molecules undergo more replications (and decompositions).

822:

With these properties of DNA in mind, but without assuming the detailed
biochemical properties of DNA, we seek a general condition for the
differentiation of the roles of molecules in a cell and study the
origin of the control and preservation of some specific molecules.

827:

Now, we consider a very simple protocell
system\cite{minority}, consisting of two species of replicating
molecules that catalyze each other (see Fig.2).  We assume that only
two kinds of molecules, $X$ and $Y$, exist in this protocell, and that they
catalyze each other's synthesis:

833:

834: \begin{equation}

X + Y \rightarrow 2X + Y; \quad Y + X \rightarrow 2Y + X.

836: \end{equation}

837:

Here, this ``catalytic reaction'' is not necessarily a single reaction.
In general there can be several intermediate processes for each
``reaction''.  The model simply states that there are two kinds of molecules
that help the synthesis of each other, directly or indirectly.  In
general, the catalytic activities as well as the synthesis speeds
differ between types of molecules.  Without loss of generality, one can
assume that $X$ is synthesized faster than $Y$.

845:

With this synthesis of molecules, the total number of molecules in the
protocell increases until it divides into two.  As long as the
molecules catalyze each other, this synthesis continues, as does
the division (reproduction) of the protocell.  However, some structural
changes in molecules can occur through replication (`replication
error').  These structural changes in each kind of molecule may
result in the loss of catalytic activity.  Indeed, molecules with
catalytic activity are not so common.  On the other hand, molecules
without catalytic activity can still increase in number, if their
replication is catalyzed by other, catalytically active molecules.  Then, as discussed in \S 2.1,
the maintenance of reproduction is not so easy.

857:

Following the above discussion, we consider the following model
as a first step in answering the question posed in \S 1.1\cite{minority}.

860:

861: \begin{figure}

862: \noindent

863: \hspace{-.3in}

864: \epsfig{file=figj4-3.ps,width=.4\textwidth}

865: \caption{Schematic representation of our model}

866: \end{figure}

867:

868:

869: (i) There are two species of molecules, X and Y, which are mutually catalyzing.

870:

(ii) For each species, there are active and inactive (``I'') types.
Since active molecule types are rather rare, we assume that there are
$F$ types of inactive molecules per active type.  For most
simulations, we consider the case in which there is only one type of
active molecule for each species.

877:

Active types are denoted as $X^0$ and $Y^0$, while the inactive
types are denoted as $X^I$ and $Y^I$ with $I=1,2,\ldots,F$.  An active molecule has the
ability to catalyze the replication of all types (active and inactive) of the other species
of molecules.  The catalytic reactions for replication are assumed to
take the form

\begin{math}
X^J + Y^0 \rightarrow 2 X^J +Y^0\end{math} (for $J=0,1,\ldots,F$)

and

\begin{math}
Y^J + X^0 \rightarrow 2 Y^J +X^0\end{math} (for $J=0,1,\ldots,F$).

890:

891: (iii) The rates of synthesis (or catalytic activity) of

892: the molecules $X$ and $Y$ differ.  We stipulate that the rate of the above replication process

893: for $Y$,

894: $\gamma_y$, is much smaller than that for $X$, $\gamma_x$.

895: This difference in the rates may also be caused by a difference in

896: catalytic activities between the two molecule species.

897:

898: (iv) In the replication process, there may occur structural changes

899: that alter the activity of molecules. Therefore the type (active or

900: inactive) of a daughter molecule can differ from that of the mother.

901: The rate of such structural change is given by $\mu$, which is not

necessarily small, due to thermodynamic fluctuations.  This change can
consist of the alteration of a sequence in a polymer or other
conformational change, and may be regarded as a replication `error'.
Note that the probability for the loss of activity is $F$ times
greater than for its gain, since there are $F$ times more types of
inactive molecules than active molecules.  Hence, there are processes
described by

\begin{math}X^I \rightarrow X^0\end{math} and \begin{math}Y^I \rightarrow Y^0\end{math} (with rate $\mu$),

\begin{math}X^0 \rightarrow X^I\end{math} and \begin{math}Y^0 \rightarrow Y^I\end{math} (with rate $\mu$ for each $I$),

resulting from structural change.

917:

918: (v) When the total number of molecules in a protocell exceeds a given

919: value $2N$, it divides into two, and the chemicals therein are

920: distributed into the two daughter cells randomly, with $N$ molecules

921: going to each.  Subsequently, the total number of molecules in each

922: daughter cell increases from $N$ to $2N$, at which point these divide.

923:

(vi) To include competition, we assume that there is a constant total

925: number $M_{tot}$ of protocells, so that one protocell, randomly chosen,

926: is removed whenever a (different) protocell divides into two.

927:

928: With the above described process, we have basically four sets of

929: parameters: the ratio of synthesis rates $\gamma_y/\gamma_x$, the

930: error rate $\mu$, the fraction of active molecules $1/F$, and the

931: number of molecules $N$.  (The number $M_{tot}$ is not important, as

932: long as it is not too small).

933:

We carried out simulations of this model according to the following procedure.
First, a pair of molecules is chosen randomly.
If these molecules are of different species, then if the
$X$ molecule is active, a new $Y$ molecule is produced with probability
$\gamma_y$, and if the $Y$ molecule is active, a new $X$
molecule is produced with probability $\gamma_x$.
Such replications occur with the error rates given above.
All the simulations were thus carried out stochastically.
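
The procedure above can be sketched in code as follows.  This is a minimal
illustration, not the program used for the results reported here, and
several details are assumptions made only for the sketch: the $F$ inactive
types of each species are lumped into a single count, a cell is picked at
random at each step, an active template yields an inactive daughter with
probability $F\mu$ while an inactive template yields an active daughter with
probability $\mu$ (so that $F\mu \leq 1$ is required), and a cell divides as
soon as its total reaches $2N$.  All function names and parameter values
are hypothetical.

```python
import random

# Molecule types; the F inactive types of each species are lumped together.
X0, XI, Y0, YI = 0, 1, 2, 3


def daughter_type(template, mu, F, rng):
    """Type of a new molecule copied from `template`, with replication error.

    Assumed error model: active -> inactive with probability F*mu,
    inactive -> active with probability mu.
    """
    is_active = template in (X0, Y0)
    p_flip = F * mu if is_active else mu
    if rng.random() < p_flip:
        return template ^ 1  # toggles X0<->XI and Y0<->YI
    return template


def pick_pair(cell, rng):
    """Draw two distinct molecules from the cell (without replacement)."""
    first = rng.choices(range(4), weights=cell)[0]
    rest = list(cell)
    rest[first] -= 1
    second = rng.choices(range(4), weights=rest)[0]
    return first, second


def react(cell, p, rng):
    """One reaction attempt inside a cell; modifies `cell` in place."""
    a, b = pick_pair(cell, rng)
    if (a < 2) == (b < 2):  # same species: nothing happens
        return
    x, y = (a, b) if a < 2 else (b, a)
    born = []
    if x == X0 and rng.random() < p["gamma_y"]:  # active X catalyzes Y
        born.append(daughter_type(y, p["mu"], p["F"], rng))
    if y == Y0 and rng.random() < p["gamma_x"]:  # active Y catalyzes X
        born.append(daughter_type(x, p["mu"], p["F"], rng))
    for t in born:
        cell[t] += 1


def divide(cell, rng):
    """Split the molecules at random into two daughters of (nearly) equal size."""
    pool = [t for t in range(4) for _ in range(cell[t])]
    rng.shuffle(pool)
    half = len(pool) // 2
    d1 = [pool[:half].count(t) for t in range(4)]
    d2 = [cell[t] - d1[t] for t in range(4)]
    return d1, d2


def simulate(p, steps, seed=0):
    rng = random.Random(seed)
    N, M = p["N"], p["M_tot"]
    cells = [[N // 2, 0, N - N // 2, 0] for _ in range(M)]  # all active at start
    divisions = []  # compositions recorded just before each division
    for _ in range(steps):
        i = rng.randrange(M)
        react(cells[i], p, rng)
        if sum(cells[i]) >= 2 * N:
            divisions.append(tuple(cells[i]))
            d1, d2 = divide(cells[i], rng)
            j = rng.choice([k for k in range(M) if k != i])  # cell to remove
            cells[i], cells[j] = d1, d2
    return cells, divisions


params = {"gamma_x": 1.0, "gamma_y": 0.05, "mu": 0.01, "F": 16,
          "N": 20, "M_tot": 10}
cells, divisions = simulate(params, steps=20000)
if divisions:
    mean = [sum(d[t] for d in divisions) / len(divisions) for t in range(4)]
    print("mean (X0, XI, Y0, YI) at division:", mean)
```

The list {\tt divisions} records the composition of each cell just before
it divides, which corresponds to the quantity averaged in the results below;
cells that are removed without dividing do not enter it.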

943:

944: We consider a stochastic model rather than the corresponding rate

945: equation, which is valid for large $N$, since we are interested in the

946: case with relatively small $N$.  This follows from the fact that in a

947: cell, often the number of molecules of a given species is not large,

948: and thus the continuum limit implied in the rate equation approach is

949: not necessarily justified \cite{Mikhailov}.
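
A rough estimate indicates the size of the effect.  Suppose, purely as an
illustrative assumption, that the copy number $n$ of some species
fluctuates with Poisson statistics around its mean $\bar{n}$ (the actual
distribution in an autocatalytic system can differ).  Then
\[
\frac{\sqrt{\langle (n-\bar{n})^2 \rangle}}{\bar{n}} = \frac{1}{\sqrt{\bar{n}}},
\]
which gives relative fluctuations of about $1\%$ for $\bar{n}=10^4$ but
$50\%$ for $\bar{n}=4$.  For a species present in only a few copies, the
fluctuations are thus comparable to the mean itself, and a description by
mean concentrations alone becomes questionable.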

950:

Furthermore, it has recently been found that the discrete nature of a
molecule population leads to behavior qualitatively different from that in
the continuum case, in a simple autocatalytic reaction network
\cite{Togashi}.  In such a system with a
small number of molecules, a novel steady state is found
that is not described by a continuum
rate equation for chemical concentrations.  This novel state was first
found by stochastic particle simulations, and its mechanism is now
understood in terms of fluctuations and discreteness in molecule
numbers.  Indeed, a state in which a specific molecule
species is extinct behaves qualitatively differently from a state with a very
low concentration of that molecule.  This difference leads to a transition to a novel
state, termed a discreteness-induced transition.  This phase
transition appears on decreasing the system size or the flow into the
system, and is analyzed as a stochastic process in which a
single-molecule switch changes the distribution of molecules drastically.

967:

In \cite{Togashi}, examples are given in which discreteness in molecule
number leads to a novel phase that is not observed in a continuous
rate equation for the chemical reactions.  In a cell, since the number of
molecules of some species is very small, we need to seriously consider
the possibility that the discreteness in molecule numbers may lead
to novel behavior distinct from the continuum description.

974:

975: \subsection{Result}

976:

977: If $N$ is very large, the above described stochastic model can be replaced by a

978: continuous model given by the rate equation.

Let us represent the total number of inactive molecules of each of $X$ and $Y$ as

$N_x^I =\sum_{j=1}^F N_x^j$; \quad $N_y^I =\sum_{j=1}^F N_y^j$.

Then the growth dynamics of the numbers of molecules
$N_x^J$ and $N_y^J$ ($J=0,1,\ldots,F$)
are described by the rate equations, using the total number of molecules $N^t$,

\begin{equation}
dN_x^J/dt=\gamma_x N_x^J N_y^0/N^t; \quad
dN_y^J/dt=\gamma_y N_x^0 N_y^J/N^t.
\end{equation}

992:

993: From these equations, under repeated divisions,

994: it is expected that the relations $\frac{N_x^0}{N_y^0}=\frac{\gamma_x}{\gamma_y}$,

995: $\frac{N_x^0}{N_x^I}= \frac{1}{F}$, and  $\frac{N_y^0}{N_y^I} = \frac{1}{F}$ are eventually satisfied.

996: Indeed, even with our stochastic simulation,

997: this number distribution is approached as $N$ is increased.
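
These relations can be checked by a short calculation.  The sketch below
assumes one particular reading of item (iv), namely that at each
replication a daughter of an active template becomes inactive with total
probability $F\mu$ and a daughter of an inactive template becomes active
with probability $\mu$ (which requires $F\mu \leq 1$); the per-molecule
growth rates $g_x$, $g_y$ and the ratio $r$ are notation introduced here.
Summing the rate equations over the types of each species, the totals
$N_x=N_x^0+N_x^I$ and $N_y=N_y^0+N_y^I$ obey
\[
dN_x/dt = g_x N_x, \quad dN_y/dt = g_y N_y, \quad
g_x=\gamma_x N_y^0/N^t, \quad g_y=\gamma_y N_x^0/N^t .
\]
The composition can stay unchanged over repeated divisions only if both
species grow at the same per-molecule rate, $g_x=g_y$, which gives
$N_x^0/N_y^0=\gamma_x/\gamma_y$.  Including the replication errors for the
$X$ species,
\[
dN_x^0/dt = g_x [(1-F\mu)N_x^0+\mu N_x^I], \quad
dN_x^I/dt = g_x [F\mu N_x^0+(1-\mu)N_x^I],
\]
so that the ratio $r=N_x^0/N_x^I$ obeys
\[
dr/dt = g_x \mu (1+r)(1-Fr).
\]
For $g_x>0$ the only non-negative fixed point is $r=1/F$, and it is stable,
since $dr/dt>0$ for $r<1/F$ and $dr/dt<0$ for $r>1/F$.  The same argument
applies to the $Y$ species.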

998:

999: However, when $N$ is small, and with the selection process, there

1000: appears a significant deviation from the above

1001: distribution\cite{minority}.  In Fig.3, we have plotted the average

1002: numbers $\langle N_x^0 \rangle$, $\langle N_x^I \rangle$, $\langle

1003: N_y^0 \rangle$, and $\langle N_y^I \rangle$.  Here, each molecule

1004: number is computed for a cell just prior to the division, when the

1005: total number of molecules is $2N$, while the average $\langle

1006: ... \rangle$ is taken over all cells that divided throughout the

1007: simulation.  (Accordingly, a cell removed without division does not

contribute to the average).  As shown in the figure, there appears a
state satisfying $\langle N_y^0 \rangle \approx 2$--$10$ and $\langle
N_y^I \rangle \approx 0$.  Since $F \gg 1$, such a state with
$\frac{\langle N_y^0 \rangle}{\langle N_y^I \rangle}>1$ is not
expected from the rate equation (6).  For the $X$ species, by contrast,
the number of inactive molecules is much larger than the number of
active ones.  Hence, we have found a novel state that can be realized
due to the smallness of the number of molecules and the selection
process.

1017:

1018: %In Fig.?, $\gamma_y/\gamma_x$ and $F$ are fixed to 0.01 and64, respectively.

1019:

For the dependence of $\{\langle N_x^0 \rangle, \langle N_x^I \rangle, \langle N_y^0 \rangle, \langle N_y^I \rangle\}$
on these parameters, see also the figures in \cite{minority}.
These numerical results show
that the above-mentioned state with $\langle N_y^0 \rangle \approx 2$--$10$ and $\langle N_y^I \rangle < 1$
is reached and sustained when $\gamma_y/\gamma_x$ is small and $F$ is sufficiently large.
In fact, for most dividing cells, $N_y^I$ is exactly 0, while a few cells
with $N_y^I>1$ appear from time to time.
It should be noted that the state with almost no inactive $Y$
molecules appears in the case of larger $F$, i.e., in the case of
a larger possible variety of inactive molecules.  This suppression of
$Y^I$ for large $F$ contrasts with the behavior found in the continuum limit (the rate equation).
In Fig.4, we have plotted $\frac{\langle N_y^0 \rangle}{\langle N_y^I \rangle}$ as a
function of $F$.
Up to some value of $F$, the proportion of active $Y$ molecules decreases,
in agreement with the naive expectation provided by Eq. (6),
but this proportion increases with a further increase of $F$, in the case that
$\gamma_y/\gamma_x$ is small ($\stackrel{<}{\sim} 0.02$) and $N$ is small.

1037:

1038:

This behavior of the molecular populations can be understood from the
viewpoint of selection.  In a system with mutual catalysis, both $X^0$
and $Y^0$ are necessary for the replication of protocells to continue.
The number of $Y$ molecules is rather small, since their synthesis
speed is much slower than that of $X$ molecules.  Indeed, the
fixed-point distribution given by the continuum-limit equations possesses a
rather small $N_y^0$.
Note, however, that for the
replication of $X$ molecules to continue, at least a single active $Y$
molecule is necessary.  Hence, if $N_y^0$ vanishes, only the
replication of inactive $Y$ molecules occurs, and divisions of this
cell cannot proceed indefinitely, because the number of $X^0$
molecules is halved at each division.  Furthermore, for a cell with
$N_y^0=1$, only one of its daughter cells can have an active $Y$
molecule.  Summing up, in the presence of selection, protocells
with $N_y^0>1$ are selected.
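
The effect of division on a small number of active $Y$ molecules can be
quantified with a short calculation.  The sketch below assumes, as in item
(v), that each daughter receives exactly half of the mother's molecules,
drawn uniformly at random without replacement, so that the number of active
$Y$ molecules a daughter inherits is hypergeometrically distributed.  The
function names are hypothetical, and the total is taken to be even.

```python
from math import comb


def prob_daughter_without(n_active, total):
    """Probability that one given daughter inherits none of the `n_active`
    molecules when it draws total//2 of the mother's `total` molecules
    uniformly at random without replacement (hypergeometric sampling)."""
    half = total // 2
    return comb(total - n_active, half) / comb(total, half)


def prob_both_daughters_with(n_active, total):
    """Probability that each daughter inherits at least one active molecule.

    For n_active >= 1 and even `total`, the events "daughter 1 has none" and
    "daughter 2 has none" exclude each other and are equally likely.
    """
    if n_active == 0:
        return 0.0
    return 1.0 - 2.0 * prob_daughter_without(n_active, total)


# Example: a mother with 2N = 200 molecules and 1 to 5 active Y molecules.
for n in range(1, 6):
    print(n, prob_both_daughters_with(n, 200))
```

For $N_y^0=1$ the probability that both daughters inherit an active $Y$
molecule is exactly zero, and it rises toward one as the number of active
copies increases, in line with the selection of protocells with $N_y^0>1$.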

1058:

1059: %Hence a cell with $N_y^0=1$ has no  potentiality to multiple through division, and for this reason,

1060:

On the other hand, the total number of $Y$ molecules is limited to
small values, due to their slow synthesis speed.  This implies that a
cell that keeps the number of $Y^I$ molecules as small as
possible is favored under selection, so that there is room for
$Y^0$ molecules.  Hence, a state with almost no $Y^I$ molecules and a
few $Y^0$ molecules, once realized through fluctuations, is expected
to be selected through competition for survival (see Fig.5 for a
schematic representation).

1069:

Of course, the probability of such rare fluctuations decreases quite
rapidly as the total molecule number increases, and for sufficiently
large numbers the continuum description by the rate equation is
valid.  Clearly then, a state of the type described above is selected
only when the total number of molecules within a protocell is not too
large.  In fact, a state with very small $N_y^I$ appears only if the
total number $N$ is smaller than some threshold value depending on $F$
and $\gamma_y$.  In other words, a cell that is too large is not favorable, because
the fluctuations are too small to produce such a rare state.

1079:

1080: \begin{figure}

1081: \noindent

1082: \hspace{-.3in}

1083: \epsfig{file=fig8-2-90.eps,width=.6\textwidth}

1084: \caption{

1085: Dependence of $\langle N_x^0 \rangle (\times)$, $\langle N_x^I \rangle (+)$,

1086: $\langle N_y^0 \rangle (\Box)$, and $\langle N_y^I \rangle(*)$

1087: on $N$.

1088: The parameters were fixed as $\gamma_x=1$, $\gamma_y=0.01$, and $\mu =.05$.

1089: Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$

1090: at the division event, and thus their sum is

1091: $2N$.

1092: We use $M_{tot}=100$, and

1093: the averages were sampled over $10^5$--$3\times 10^5$ steps,

1094: where the number of divisions ranges from $10^4$ to $10^5$,

1095: depending on the parameters. Reproduced from \cite{minority}.}

1096: \end{figure}

1097:

1098: \begin{figure}

1099: \noindent

1100: \hspace{-.3in}

1101: \epsfig{file=figmin.ps,width=.7\textwidth}

1102: \caption{

1103: Dependence of the active-to-inactive ratio,

1104: $\frac{\langle N_y^0 \rangle }{\langle N_y^I \rangle }$,

1105: on $F$.

1106: The parameters were fixed as $\gamma_x=1$, $\gamma_y=.01$, $\mu =.05$, and $F=128$.

1107: Plots for $\gamma_y=.005$ ($\Diamond$), .01 (+), .015 ($\Box$), 0.02 ($\times$),

1108: 0.025 ($\triangle$),

1109: and 0.03 (*) are overlaid.

1110: Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$

1111: at the division event. Reproduced from \cite{minority}.}

1112: \end{figure}

1113:

1114:

1115: %\begin{figure}

1116: %\noindent

1117: %\hspace{-.3in}

1118: %\epsfig{file=fig8-3-90.eps,width=.6\textwidth}

1119: %\caption{

1120: %Dependence of $\langle N_x^0 \rangle (\times)$, $\langle N_x^I \rangle (+)$,

1121: %$\langle N_y^0 \rangle (\Box)$, and $\langle N_y^I \rangle(*)$

1122: %on $F$. The parameters were fixed as

1123: %$\gamma_x=1$, $\gamma_y=.01$, $\mu =.05$, and $N=1000$.

1124: %Plotted are the averages of $N_x^0$, $N_x^I$, $N_y^0$, and $N_y^I$

1125: %at the division event, and thus their sum is $2N=2000$.

1126: %Reproduced from \cite{KKTY02}}

1127: %\end{figure}

1128:

1129: \begin{figure}

1130: \noindent

1131: \hspace{-.3in}

1132: \epsfig{file=figj4-25.ps,width=.7\textwidth}

1133: \caption{Schematic representation of our logic.

1134: Once an active molecule of each molecule species is lost, the

1135: reproduction does not continue.

1136: }

1137: \end{figure}

1138:

1139: \subsection{Minority Controlled State}

1140:

1141: We showed that in a mutually catalyzing replication system, the

1142: selected state is one in which the number of inactive molecules of the

1143: slower replicating species, $Y$, is drastically suppressed.  In this

1144: section, we first show that the fluctuations of the number of active

1145: $Y$ molecules are smaller than those of active $X$ molecules in this

1146: state.  Next, we show that the molecule species $Y$ (the minority

1147: species) becomes dominant in determining the growth speed of the

1148: protocell system.  Then, considering a model with several active

1149: molecule types, the control of chemical composition through

1150: specificity symmetry breaking is discussed.

1151:

1152:

1153: \subsubsection{Preservation of minority molecule}

1154:

1155: First, we computed the time evolution of the number of active $X$ and

1156: $Y$ molecules, to see if the selection process acts more strongly to

1157: control the number of one or the other.  We computed $N_x^0$ and

1158: $N_y^0$ at every division to obtain the histograms of cells with given

1159: numbers of active molecules.

1160:

1161: The fluctuations in the value of $N_y^0$ are found to be much smaller

1162: than those of $N_x^0$.

1163: The selection process

1164: discriminates more strongly between different concentrations of active

1165: $Y$ molecules than between those of active $X$ molecules.  Hence the

1166: active $Y$ molecules are well preserved with relatively smaller

1167: fluctuations in the number.

1168:

1169:

1170: %The numbers $N_y^A$ and $N_y^I$ are more nearly conserved than $N_x^A$ and $N_x^I$, and

1171:

1172:

1173: \subsubsection{Control of the growth speed}

1174:

1175: Now, it is expected that the growth speed of our protocell has a

1176: stronger dependence on the number of active $Y$ molecules than the

1177: number of active $X$ molecules.  We have found that the division time

1178: is a much more rapidly decreasing function of $N_y^0$ than of $N_x^0$.

1179: Even a slight change in the number of active $Y$ molecules has a

1180: strong influence on the division time of the cell.  Of course, the

1181: growth rate also depends on $N_x^0$, but this dependence is much

1182: weaker.  Hence, the growth speed is controlled mainly by the number of

1183: active $Y$ molecules.

1184:

1185: \subsubsection{Control of chemical composition by the minority molecule}

1186:

1187: As another demonstration of control, we study a model in which there

1188: is more specific catalysis of molecule synthesis.  Here, instead of

1189: single active molecule types for $X$ and $Y$, we consider a system

1190: with $k$ types of active $X$ and $Y$ molecules, $X^{0i}$ and

1191: $Y^{0i}$ ($i=1,2,\cdots k$).  In this model, each active molecule

1192: type catalyzes the synthesis of only a few types ($m<k$) of the other

1193: species of molecules.  Here we assume that both $X$ and $Y$ molecules

1194: have the same ``specificity'' (i.e., the same value of $m$) and study

1195: how this symmetry is broken.

1196:

1197: %Graphically representing the ability for such catalysis using arrows as

1198: %$i_x \rightarrow j_y$ for $X \rightarrow Y$ and $i_y \rightarrow j_x$ for

1199: %$Y \rightarrow X$, the network of arrows defining the catalyzing relations for

1200: %the entire system is chosen randomly, and is fixed throughout each simulation.

1201:

1202: As already shown, when $N$, $\gamma_y$ and $F$ satisfy the conditions

1203: necessary for realization of a state in which $N_y^I$ is sufficiently

1204: small, the surviving cell type contains only a few active $Y$

1205: molecules, while the number of inactive ones vanishes or is very

1206: small.  Our simulations show that in the present model with several

1207: active molecule types, only a single type of active $Y$ molecule

1208: remains after a sufficiently long time.  We call this the ``surviving

1209: type'', $i_r$ ($1 \leq i_r \leq k$).  In contrast, at least $m$ types

1210: of $X^0$ species, whose synthesis is catalyzed by the remaining $Y^{0i_r}$

1211: molecule species, remain.  Accordingly, for a cell that has survived for

1212: a sufficiently long time, a single type of $Y^{0i_r}$ molecule catalyzes

1213: the synthesis of (at least) $m$ kinds of $X$ molecule species, while

1214: the multiple types of $X$ molecules catalyze the synthesis of this single type of

1215: $Y^{0i_r}$ molecule.  Thus, the original symmetry regarding the

1216: catalytic specificity is broken as a result of the difference between

1217: the synthesis speeds.

1218:

1219: Due to autocatalytic reactions, there is a tendency for further

1220: increase of the molecules that are in the majority.  This leads to

1221: competition for replication between molecule types of the same

1222: species.  Since the total number of $Y$ molecules is small, this

1223: competition leads to all-or-none behavior for the survival of

1224: molecules. As a result, only a single type of species $Y$ remains,

1225: while for species $X$, the numbers of molecules of different types are

1226: statistically distributed as guaranteed by the uniform replication

1227: error rate.

1228:

1229: Although $X$ and $Y$ molecules catalyze each other, a change in the type of

1230: the remaining active $Y$ molecule has a much stronger influence on $X$

1231: than a change in the types of the active $X$ molecules on $Y$,

1232: since the number of $Y$ molecules is much smaller.

1233:

1234: With the results so far, we can conclude that the $Y$ molecules, i.e.,

1235: the minority species, control the behavior of the system, and are

1236: preserved well over many generations.  We therefore call this state

1237: the minority-controlled (MC) state.

1238:

1239: \subsubsection{Evolvability}

1240:

1241: An important characteristic  of the MC state is evolvability.

1242: Consider a variety of active molecules $0i$, with different catalytic activities.

1243: Then the synthesis rates $\gamma_x$ and $\gamma_y$ depend on the activities of

1244: the catalyzing molecules.  Thus, $\gamma_x$ can be written in terms of

1245: the molecule's inherent growth rate, $g_x$, and the activity, $e_y(i)$, of

1246: the corresponding catalyzing molecule $Y^{0i}$:

1247:

1248: \begin{displaymath}

1249: \gamma_x =g_x \times e_y(i), \qquad

1250: \gamma_y =g_y \times e_x(i).

1251: \end{displaymath}

1252:

1253: \noindent

1254: Since such a biochemical reaction is entirely facilitated by catalytic

1255: activity, a change of $e_y$ or $e_x$, for example by a structural

1256: change of polymers, is more important. Given the occurrence of

1257: such a change to molecules, those with greater catalytic activities

1258: will be selected through competition, leading to the

1259: selection of larger $e_y$ and $e_x$.  As an example to demonstrate

1260: this point, we have extended the model to include $k$ kinds of active

1261: molecules with different catalytic activities.  Then, molecules with

1262: greater catalytic activities are selected through competition.

1263:

1264: Since only a few molecules of the $Y$ species exist in the MC state, a

1265: structural change to them strongly influences the catalytic activity

1266: of the protocell.  On the other hand, a change to $X$ molecules has a

1267: weaker influence, on the average, since the deviation of the {\sl

1268: average} catalytic activity caused by such a change is smaller, as can

1269: be deduced from the law of large numbers.  Hence the MC state is

1270: important for a protocell to realize evolvability.

1271:
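The dilution argument can be made explicit with a simple estimate, assuming (for illustration only; this form is not part of the model definition) that the catalytic activity felt by the protocell is the arithmetic mean over the $n$ active molecules of a species, $\bar e = \frac{1}{n}\sum_{j=1}^{n} e_j$.  If a single molecule changes its activity by $\delta e$, the mean changes by
\begin{displaymath}
\delta \bar e = \frac{\delta e}{n}.
\end{displaymath}
For the minority species, $n=N_y^0$ is of order one, so that $\delta \bar e$ is comparable to $\delta e$ itself.  For the majority species, $n=N_x^0$ is large and the same structural change is diluted by the factor $1/N_x^0$.
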

1272: \subsection{Experiment}

1273:

1274: Recently, there have been some experiments to construct minimal

1275: replicating systems in vitro.

1276: As an experiment corresponding to this problem, we describe an in-vitro

1277: replication system, constructed by Yomo's group\cite{Matsuura}.

1278:

1279: In general, proteins are synthesized from the information on DNA

1280: through RNA, while DNA is synthesized through the action of proteins.

1281: As a set of chemicals, they autonomously replicate themselves.  Now

1282: simplifying this replication process, Matsuura et al.\cite{Matsuura} constructed a

1283: replication system consisting of DNA and DNA polymerase, i.e., an

1284: enzyme for the synthesis of DNA, and so forth.  This DNA polymerase is

1285: synthesized from the corresponding gene in the DNA, while it works as

1286: the catalyst for the replication of that DNA.  Through this mutual catalytic

1287: process the chemicals replicate themselves.

1288:

1289: \begin{figure}

1290: \noindent

1291: \hspace{-.3in}

1292: %\epsfig{file=chap4fig/yomo1a.eps,width=.8\textwidth}

1293: \epsfig{file=yomo1.ps,width=.9\textwidth}

1294: \caption{

1295: Illustration of in-vitro autonomous replication system

1296: consisting of DNA and DNA polymerase.

1297: See text and \cite{Matsuura} for details.

1298: Provided by courtesy of Yomo, Matsuura et al.

1299: }

1300: \end{figure}

1301:

1302:

1303: %\begin{figure}

1304: %\noindent

1305: %\hspace{-.3in}

1306: %\epsfig{file=chap4fig/yomo-repl2.eps,width=.4\textwidth}

1307: %\caption{Procedure of experiment: In each of 10 test tubes containing a single DNA molecule,

1308: %autonomous replication progresses.  The components of the tubes are mixed

1309: %in a pool, from which a single DNA is chosen to a tube, to repeat the

1310: %procedure.

1311: %See text and [Matsuura et al. 2002] for details.

1312: %Supplied with the courtesy of Yomo, Matsuura et al.

1313: %}

1314: %\end{figure}

1315:

1316:

1317: %\begin{figure}

1318: %\noindent

1319: %\hspace{-.3in}

1320: %\epsfig{file=chap4fig/yomo-repl.eps,width=.5\textwidth}

1321: %\caption{

1322: %Change of self-replication activity from a system with single DNA.

1323:

1324: %The activities for 10 tubes are shown,  The next generation is

1325: %produced mostly from the top DNA.  Although activities vary by each tube,

1326: %higher ones are  selected, so that the activities are maintained.

1327: %See text and [Matsuura et al. 2002] for details.

1328: %Supplied with the courtesy of Yomo, Matsuura et al.

1329: %}

1330: %\end{figure}

1331:

1332: \begin{figure}

1333: \noindent

1334: \hspace{-.3in}

1335: %\epsfig{file=chap4fig/yomo-repl0.eps,width=.5\textwidth}

1336: \epsfig{file=yomo2a.ps,width=.7\textwidth}

1337: \epsfig{file=yomo2b.ps,width=.7\textwidth}

1338: \caption{

1339: Self-replication activities for each generation, measured as described in the

1340: text. The activities for 10 tubes are shown.  Upper: result from a single DNA, where the next generation is

1341: produced mostly from the top DNA.  Although activities vary from tube to tube,

1342: higher ones are selected, so that the activities are maintained. Lower: result from 100 DNA molecules.

1343: Provided by courtesy of

1344: Yomo, Matsuura et al.~\cite{Matsuura}.

1345: }

1346: \end{figure}

1347:

1348: For the amplification of DNA, PCR is widely used, and is a

1349: standard tool in molecular biology.  In PCR, however, the enzymes

1350: that are necessary for the replication of DNA must be supplied

1351: externally.  In this sense, it is not a self-contained autonomous

1352: replication system.  In the experiment by Yomo's group, while they use

1353: PCR as one step of the experimental procedure, the enzyme (DNA

1354: polymerase) for DNA synthesis is also replicated in vitro within the

1355: system.  Of course, some raw materials, such as amino acids and ATP,

1356: have to be supplied, but otherwise the chemicals replicate

1357: themselves (see Fig. 6 for the experimental procedure).

1358:

1359: In this experiment, there is a mutual synthetic process between the gene and the enzyme.

1360: Roughly speaking, the polymerase in the experiment corresponds to

1361: $X$ in our model, while the polymerase gene corresponds to $Y$.

1362:

1363:

1364: Now, at each step of replication, about $2^{30}\sim2^{40}$ DNA molecules are replicated.

1365: Here, of course, there are some errors. These errors can occur in the synthesis of the

1366: enzyme, and also in the synthesis of DNA.  With these errors, DNA molecules

1367: with different sequences appear.  Thus a pool of DNA molecules with a variety of sequences

1368: is obtained as the first generation.

1369:

1370: From this pool, the DNA and enzymes are split into several tubes.

1371: Then, materials including ATP and amino acids are supplied, and the replication process

1372: is repeated (see Fig. 6).  In other words, the test tube here plays the role

1373: of ``cell compartmentalization''.  Instead of autonomous cell division, the splitting into several tubes is carried out externally.

1374:

1375: In this experiment, instead of changing the synthesis speed $\gamma_y$ or $N$ in the model,

1376: one can control the number of genes by changing how

1377: the pool is split into several test tubes.

1378:

1379: Indeed, they studied two distinct cases, i.e.,

1380: splitting into tubes each containing a single DNA molecule, and splitting

1381: into tubes each containing 100 DNA molecules.

1382: Recall that the theory predicts evolvability through minority control.

1383: Hence, the behavior in the two cases may be drastically different.

1384:

1385: First, we describe the case with a single DNA in each tube.

1386: Here, the pool of chemicals

1387: is split into 10 tubes, each of which has a single DNA molecule,

1388: and the replication process described above

1389: progresses in each tube.  The sequence of the DNA molecule can differ from tube to tube,

1390: since there are replication errors.  Then the activity of DNA polymerase in each

1391: tube also differs, and so does the number of DNA molecules synthesized in each tube.

1392: In other words, some DNA molecules can produce more offspring, while others

1393: cannot.  The variation of self-replication activity among tubes is shown in the upper panel of Fig. 7.

1394: Then the contents of the tubes are mixed.

1395: This soup of chemicals is used for the next generation.  In this soup,

1396: the DNA molecules that have a higher replication rate, as well as the mutants generated

1397: from them, are included in a larger fraction.  Now a single DNA is selected from

1398: the soup for each of 10 tubes, and the same procedure is repeated.

1399: Hence, there is a larger probability that a DNA molecule with a

1400: higher reproduction activity is selected for the next generation. In other words,

1401: Darwinian selection acts at this stage.

1402: The self-replication activity

1403: from this soup is plotted in the third generation.  Successive plots of the

1404: self-replication activity are given in the upper panel of Fig. 7.

1405: As shown, the self-replication activity is not lost (and can even evolve in some cases),

1406: although it varies from tube to tube in each generation.

1407:

1408: One might say that the maintenance of replication is not surprising at all,

1409: since a gene for the DNA polymerase is included in the beginning.

1410: However, an enzyme with such catalytic activity is rare. Indeed, through mutations,

1411: some proteins that have lost such catalytic activity but are still synthesized in the

1412: present system could appear, and these might take over the system.

1413: Then the self-replication activity would be lost.  In fact, this is nothing but

1414: the error catastrophe by Eigen, discussed in \S 2.1.

1415: Then, why is the self-replication activity maintained in the present experiment?

1416:

1417: The answer is clear according to the theory in \S 4.2-4.3.

1418: In the model of \S 4.1, mutants that have lost the catalytic

1419: activity are much more common (i.e., $F$ times more common in the model).

1420: Still, the number of such molecules is suppressed.  This was possible

1421: first because the molecules are in a cell.  In the experiment also

1422: they are in a test tube, i.e., in a compartment.  Now the selection works

1423: on this compartment, not on each molecule.  Hence the tube (cell) that

1424: includes a gene giving rise to lower enzyme activity produces fewer offspring.

1425: In this sense, compartmentalization is one essential factor for

1426: the maintenance of catalytic activity (see also \cite{Hogeweg,Szathmary,Eigen-book}).

1427: Here, another important factor is that in each compartment (cell) there is a single (or

1428: very few)  DNA molecule (as the $Y$ molecule in the model of \S 4.1-3).  In the theory,

1429: if the number of $Y$ molecules is larger, inactive $Y$ molecules outnumber

1430: the active ones.

1431:

1432: To confirm the validity of our theory, Matsuura et al.\cite{Matsuura} carried out a comparison

1433: experiment.  Here, they split the chemicals in the soup so that each tube

1434: has 100 DNA molecules instead of a single one.  Otherwise, they adopted

1435: the same procedure.  In other words, this corresponds to a cell with 100

1436: copies of the genome.  The change of self-replication activity in this experiment

1437: is plotted in the lower panel of Fig.7.  As shown, the

1438: self-replication activity decreases with each generation, and after the

1439: fourth generation, the capability of autonomous replication is totally lost.

1440: This result shows that the number of molecules carrying genetic

1441: information should be small, which is consistent with the theory.

1442:

1443: When there are many DNA molecules, mutations can occur in each DNA

1444: molecule.  In each tube, the self-replication activity is given by the

1445: average of the enzyme activities from these 100 DNA molecules.

1446: Although the catalytic activity varies from molecule to molecule, the variance

1447: of the average among tubes should be reduced drastically.  Recall that

1448: the variance of the average of $N$ independent variables, each with variance $\mu$,

1449: is reduced to $\mu/N$, in accordance with the central limit theorem of

1450: probability theory.  Hence the average catalytic activity does not

1451: differ much from tube to tube.  Here, a mutant with a higher catalytic

1452: activity is rare.  Most changes in the gene lead to smaller or null

1453: catalytic activity.  Hence, on the average, the catalytic activity

1454: after mutations to the original gene gets smaller, and the variance among

1455: tubes around this mean is rather small (see Fig. 7).

1456:
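The reduction of the tube-to-tube variance can be written out as follows, under the simplifying assumption (introduced here only for illustration) that the enzyme activities $e_1,\ldots,e_N$ encoded by the $N$ DNA molecules in a tube are independent with a common variance $\mu$.  For the tube average $\bar e = \frac{1}{N}\sum_{j=1}^{N} e_j$,
\begin{displaymath}
{\rm Var}(\bar e) = \frac{1}{N^2}\sum_{j=1}^{N} {\rm Var}(e_j) = \frac{\mu}{N}.
\end{displaymath}
With $N=100$, the variance among tubes is thus $1/100$ of that for $N=1$, i.e., the standard deviation is ten times smaller.  In this sketch, the differences among tubes on which selection could act are correspondingly reduced.
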

1457: Through selection, DNA from a tube with a higher catalytic activity

1458: could be selected, but the variation among tubes is so small that the

1459: selection does not work.  Hence deleterious mutations remain in the

1460: soup, and the self-replication activity is lost over generations.

1461: In other words, the selection works only when the number of information

1462: carriers in a replication unit (cell) is very small, and is free from

1463: the statistical law of large numbers.

1464:

1465: Summing up: In the experiment, it was found that replication is

1466: maintained even under deleterious mutations (that correspond to

1467: structural changes from active to inactive molecules in the model),

1468: only when the population of DNA polymerase genes is small and

1469: competition of replicating systems is applied.  When the number of

1470: genes (corresponding to $Y$) is small, the information contained in

1471: the DNA polymerase genes is preserved.  This is made possible by the

1472: maintenance of rare fluctuations, as found in our theory.  The system

1473: has evolvability only if the number of DNA in the system is small.

1474: Otherwise, the system gradually loses its activity to replicate

1475: itself.  These experimental results are consistent with the minority

1476: control theory described.

1477:

1478: \subsection{Discussion}

1479:

1480: \subsubsection{Heredity from a kinetic viewpoint}

1481:

1482: In this section, we have shown that in a mutually catalyzing system,

1483: molecules $Y$, with the slower synthesis speed and in the minority in number,

1484: tend to act as the carrier of heredity. Through the selection under

1485: reproduction, a state in which there are a few active $Y$ and almost zero

1486: inactive $Y$ molecules is selected.  This state is termed the

1487: ``minority controlled state''.  Between the two molecule species, there

1488: appears a separation of roles, between that with a larger number, and that with

1489: a greater catalytic activity.  The former has a variety of chemicals

1490: and reaction paths, while the latter works as a basis for the

1491: heredity, in the sense of the two properties mentioned in \S 1.1 and

1492: \S 4.3, `preservation' and `control'.  We now discuss these properties

1493: in more detail.

1494:

1495: [Preservation property]: A state that can be reached only through

1496: very rare fluctuations is selected, and

1497: it is preserved over many generations, even though

1498: the realization of such a state is very rare

1499: when we consider the rate equation obtained in the continuum limit.

1500:

1501: [Control property]: A change in the number of $Y$ molecules

1502: has a stronger influence on the growth rate of a cell than a change

1503: in the number of $X$ molecules.

1504: Also, a change in the catalytic activity of the $Y$ molecules has a strong

1505: influence on the growth of the cell.  The catalytic activity of the $Y$

1506: molecules acts as a control parameter of the system.

1507:

1508: Once this minority controlled state is established, the following

1509: scenario for the evolution of genetic information is expected.  First,

1510: a new selection pressure can now emerge, to evolve a

1511: machinery to ensure that the minority molecule makes it into the

1512: offspring cells, since otherwise the reproduction of the cell is

1513: highly damaged.  Hence a machinery to guarantee the faithful

1514: transmission of the minority molecule should evolve.  Now, the origin

1515: of heredity is established.  Here, for this heredity, it is not necessary that any specific

1516: metabolic or genetic contents be transmitted faithfully.

1517: It can appear from the loose reproduction system that Dyson considered

1518: (as in \S 2.2).  This heredity evolves just as a result of a kinetic

1519: phenomenon and is a rather general phenomenon in a reproducing

1520: protocell consisting of mutually catalytic molecules.

1521:

1522: This faithful transmission of the minority molecule provides a basis for

1523: critical information for reproduction of the protocells.  Since the transmission of this

1524: minority molecule is protected, other chemicals that

1525: are synthesized in connection with it are likely to be transmitted,

1526: albeit not always faithfully.  Hence there appears a further

1527: evolutionary incentive to package life-critical information into the

1528: minority molecule.  Now more information (`many bits' of information)

1529: is encoded on the minority molecule.  Then, the molecules work as a

1530: carrier of genetic information in today's sense.  With this

1531: evolution having more molecules catalyzed by the minority molecule, it

1532: is then easier to further develop the machinery to better take care of

1533: minority molecules, since this minority molecule is essential to many

1534: reactions for the synthesis of many other molecules.

1535:

1536: Hence the evolution of faithful transmission of minority molecules and

1537: of coding of more information reinforce each other.  At this point one

1538: can expect a separation of metabolism and genetic information.

1539:

1540: To sum up, how a single molecule comes to govern heredity is

1541: understood from a kinetic viewpoint.  We first showed the minority

1542: controlled state to be a rather general consequence of the kinetic process of

1543: mutually catalytic molecules. This provides a basis for heredity.

1544: Taking advantage of the evolvability of the minority controlled state,

1545: a preservation mechanism of the minority molecule then evolves, which

1546: allows for more information to be encoded on it, leading to the separation of

1547: genetic information and metabolism.  In this sense, the minority

1548: molecule species with slower synthesis speed, leading to the

1549: preservation of rare states and control of the behavior of the system,

1550: acts as an information carrier. The important point of our theory is

1551: that heredity arises prior to any metabolic information that needs to

1552: be inherited.

1553:

1554:

1555: %\subsubsection{Accessibility to Minority Controlled state}

1556:

1557: %One important consequence of the existence of the MC state is

1558: %evolvability.  Mutations introduced to the majority species tend to be

1559: %canceled out on the average, in accordance with the law of large

1560: %numbers.  Hence, the catalytic activity of the minority species ($Y$ in

1561: %our model) is not only sustained, but has a greater potentiality to

1562: %increase through evolution.

1563:

1564: %The evolution and stability  of the MC state with respect to mutation

1565: %was discussed in \S 4.4.3.

1566: %If the initial difference between the catalytic abilities $e_x$ and $e_y$

1567: %(and other parameters) satisfies the conditions stated in \S 4.4, it is shown that

1568: %the MC state once realized is stable over generations against mutations.

1569:

1570: %1. Higher Order Catalysis

1571:

1572: %2. Spatial structure of a cell:

1573:

1574:

1575: \subsubsection{Some remarks}

1576:

1577: In \S 2, we described two standpoints on the origin of life, i.e.,

1578: genetic information first or complex metabolism first.  We pointed out

1579: a difficulty with each standpoint.  In the former picture, there was

1580: a problem of stability against parasites, while the latter cannot

1581: explain how genetic information took over the original loose

1582: reproduction system.  Minority control sheds new light on these

1583: problems.

1584:

1585: The first problem, in \S 2.1, was the appearance of parasitic molecules that destroy

1586: the hypercycle, i.e., the mutually catalytic reaction cycle.  If only the

1587: replication process of molecules is considered, it is not so easy to

1588: resolve the problem.  Here we consider two levels of replication,

1589: i.e., molecular and cellular replication.

1590:

1591: In the present theory for the origin of information, the existence of a

1592: cell unit that reproduces itself is required.  Two levels of

1593: reproduction, of molecules and of cells, are assumed here.  Hence a cell

1594: with parasitic molecules cannot grow, and is selected out.  The relevance

1595: of this type of two-level reproduction to avoiding molecular parasites

1596: has been discussed \cite{Hogeweg,Szathmary,Eigen-book}.  Here, the

1597: relevance of the cellular compartment to the {\em origin of genetic

1598: information} is more important.

1599:

1600: This two-level selection works effectively, with the aid of minority

1601: control of specific molecules for a cell.  Indeed, surviving cells

1602: satisfy the minority control.  With the selection pressure for

1603: reproduction of cells, there appears a state that is not expected by

1604: the rate equation for reaction of molecules, where the number of

1605: inactive $Y$ molecules that are parasitic to the catalytic reaction is

1606: suppressed.  Furthermore, resistance against parasitic (inactive) $Y$

1607: molecules is established by this minority controlled state.

1608:

1609: This minority control also resolves the question of the genetic

1610: take-over, the problem in the ``metabolism first'' standpoint (in \S 2.2).  Among

1611: several molecules, a specific molecule species that is in the minority in

1612: population controls the behavior of a cell and is well preserved.  The

1613: possible scenario mentioned in the beginning of this section gives one

1614: plausible answer to how genetic take-over progresses.

1615:

1616: %from this minority controlled state.

1617:

1618: The differentiation of role between the molecules looks like

1619: ``symmetry breaking''.  When initially two states are equally

1620: possible, and later only one of them is selected, it is said that the

1621: symmetry is broken. In the differentiation of roles of molecules

1622: studied here, however, the molecules have different characters as to

1623: the replication speed from the beginning. Here a difference in one

1624: character (i.e., the replication speed) is ``transformed'' into the

1625: difference in the control behavior, and in the role as a carrier

1626: of heredity.  In other words, a characteristic with already broken

1627: symmetry is transformed into a different type of symmetry breaking.

1628: This kind of transformation of one character's difference to another

1629: is often seen in biology, as we have already discussed in the study of

1630: morphogenesis and sympatric speciation\cite{Furusawa,speciation}.

1631:

1632: \section{Recursive Production in an Autocatalytic Network}

1633:

1634: Now we come to the second question raised in \S 1.  In the model of

1635: the last section, we considered a system consisting of two kinds of

1636: molecules.  In a cell, however, a variety of chemicals form a complex

1637: reaction network to synthesize themselves.  Here we study a model with

1638: a large number of chemical species, to discuss how a cell with such

1639: a large number of components and a complex reaction network can sustain

1640: reproduction, keeping similar chemical compositions

1641: \cite{KK-net,KK-PRE} (see also \cite{Lancet}).

1642:

1643: \subsection{Model}

1644:

1645: To unveil general features of a system with mutually catalyzing

1646: molecules, we study a system with a variety of chemicals ($k$ molecule

1647: species), forming a mutually catalyzing network.  The molecules

1648: replicate through catalytic reactions, so that their numbers within a

1649: cell increase.  (See Fig.1 again for a schematic representation of the

1650: model.)

1651:

1652: We envision a (proto)cell containing $k$ molecular species with some

1653: of the species possibly having a zero population.  A chemical species

1654: can catalyze the synthesis of some other chemical species as

1655:

1656: \begin{equation}

1657: [i] + [j] \rightarrow [i] + 2[j],

1658: \label{reaction}

1659: \end{equation}

1660:

1661: \noindent

1662: with $i,j=1,\cdots,k$ according to a randomly chosen reaction network,

1663: where the reaction is set far from equilibrium.  In eq.(7), the

1664: molecule $i$ works as a catalyst for the synthesis of the molecule

1665: $j$, while the reverse reaction is neglected, as discussed in the

1666: hypercycle model.  For each chemical the rate for the path of

1667: catalytic reaction in eq.(7) is given by $\rho$, i.e., each species has

1668: about $k\rho$ possible reactions.  The rate is kept fixed throughout

1669: each simulation.  Considering catalytic reaction dynamics, the reverse

1670: reaction process is neglected, and reactions $i \leftrightarrow j$ are

1671: not included.  (Here we investigated the case without direct mutual

1672: connections, i.e., $i\rightarrow j$ was excluded as a possibility when

1673: there was a path $j \rightarrow i$, although this condition is not

1674: essential for the results to be discussed).  Furthermore, each

1675: molecular species $i$ has a randomly chosen catalytic ability $c_i \in

1676: [0,1]$ (i.e., the above reaction occurs with the

1677: rate $c_i$).   Assuming an environment with an ample

1678: supply of chemicals available to the cell, the molecules then

1679: replicate leading to an increase in their numbers within a cell.
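
For orientation, the continuum limit of this reaction scheme can be sketched as a rate equation.  This equation is not given in the text above; the symbols $x_j$ (the fraction of molecules belonging to species $j$) and $W_{ij}$ ($W_{ij}=1$ if a path $i\rightarrow j$ exists and $0$ otherwise) are notation assumed here for illustration only, and replication errors are left out:

\[
\frac{dx_j}{dt} = x_j \sum_{i=1}^{k} W_{ij} c_i x_i - x_j \sum_{l=1}^{k} x_l \sum_{i=1}^{k} W_{il} c_i x_i .
\]

The first term is the catalyzed replication of eq.(7), and the second term is the dilution caused by the growth of the total number, which keeps $\sum_j x_j =1$.  As a numerical illustration of the path rate, $k=200$ and $\rho=.1$ give about $k\rho=20$ paths per species.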

1680:

1681: Again, when the total number of molecules exceeds a given threshold

1682: (here we used 2$N$), the cell is assumed to divide into two, with each

1683: daughter cell inheriting half of the molecules of the mother cell,

1684: chosen randomly.
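
The size of the number fluctuation introduced by this random partition can be estimated as follows, under the assumption (made here for illustration) that exactly $N$ of the $2N$ molecules are drawn without replacement for one daughter.  If the mother cell holds $N_i$ molecules of species $i$, the number $n_i$ received by that daughter, a symbol introduced here, follows a hypergeometric distribution with

\[
\langle n_i \rangle = \frac{N_i}{2}, \qquad
\langle (n_i-\langle n_i \rangle)^2 \rangle = \frac{N_i}{4}\left(1-\frac{N_i}{2N}\right)\frac{2N}{2N-1}
\approx \frac{N_i}{4}\left(1-\frac{N_i}{2N}\right).
\]

For a rare species with $N_i=4 \ll N$, for example, the mean is 2 and the standard deviation is about 1, and a daughter receives no molecule of that species with a probability of about $(1/2)^4=1/16$.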

1685:

1686: During the replication process, structural changes, e.g., the

1687: alteration of a sequence in a polymer, may occur that alter the

1688: catalytic activities of the molecules.  Therefore, the activities of

1689: the replicated molecule species can differ from those of the mother

1690: species.  The rate of such structural changes is given by the

1691: replication `error rate' $\mu$.  As the simplest case, we assume that

1692: this `error' leads to all other molecule species with equal

1693: probability (i.e., with the rate $\mu /(k-1)$),  and can thus

1694: regard it as a background fluctuation.  In reality, of course, even

1695: after a structural change, the replicated molecule will keep some

1696: similarity with the original molecule, and a replicated species with

1697: the `error' would be within a limited class of molecule species.

1698: Hence, this equal rate of transition to other molecule species is a

1699: drastic simplification.  Some simulations where the errors in

1700: replication only lead to a limited range of molecule species, however,

1701: show that the simplification does not affect the basic conclusions

1702: presented here.  Hence we use the simplest case for most simulations.
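
Written out, the simplest case assumed above means that a replication of species $j$ yields species $l$ with probability

\[
P(j \rightarrow l) = \left\{ \begin{array}{ll} 1-\mu & (l=j) \\ \mu/(k-1) & (l \neq j), \end{array} \right.
\qquad \sum_{l=1}^{k} P(j \rightarrow l) = (1-\mu) + (k-1)\frac{\mu}{k-1} = 1 .
\]

The symbol $P(j\rightarrow l)$ is introduced here only for illustration.  As a rough estimate, a cell that grows from $N$ to $2N$ molecules undergoes about $N$ replications per generation, of which about $\mu N$ are erroneous.  Taking $N=64000$, $\mu=0.01$ and $k=200$ as example values, this gives about $640$ erroneous replications per generation, or about $640/199 \approx 3.2$ per species, so that species outside the dominant set can be expected to appear with a few molecules at any generation.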

1703:

1704: In statistical physics, one mostly studies the case where the total number

1705: of molecules $N$ is very large, at least much larger than the

1706: number of molecule species $k$.  In this case, the continuum

1707: description is relevant.  When $N/k$ is rather small, some molecule

1708: species can often fluctuate around 0, where the discreteness

1709: 0,1,2,... will be important, as already discussed.  In order to take

1710: the importance of the discreteness in the molecule numbers into

1711: account, we adopted a stochastic rather than the usual differential

1712: equations approach, by taking a variety of possible chemicals, where

1713: $N$ and $k$ are of a comparable order.

1714:

1715: The model is simulated as follows: At each step, a pair of molecules,

1716: say, $i$ and $j$, is chosen randomly.  If there is a reaction path

1717: between species $i$ and $j$, and $i$ ($j$) catalyzes $j$ ($i$), one

1718: molecule of the species $j$ ($i$) is added with probability $c_i$

1719: ($c_j$), respectively.  The molecule is then changed to another

1720: randomly chosen species with the probability of the replication error

1721: rate $\mu$.  When the total number of molecules exceeds a given

1722: threshold (denoted as $2N$), the cell divides into two such that each

1723: daughter cell inherits half ($N$) of the molecules of the mother

1724: cell, chosen randomly\cite{minority}.

1725:

1726: Again, to include competition, we assume that there is a constant

1727: total number $M_{tot}$ of protocells, so that one protocell, randomly

1728: chosen, is removed whenever a (different) protocell divides into two.

1729: However, the results here do not depend strongly on $M_{tot}$.  We

1730: mostly choose $M_{tot}=1$ in the results below, but the simulation

1731: with $M_{tot}=100$ gives essentially the same behavior.

1732:

1733: %\noindent

1734: %with $i,j=1,\cdots,k$.  The connection rate of the catalytic paths is given by $p$ per each chemical.

1735:

1736: %Again, replication is accompanied by some 'error', and instead of the replication of the molecule

1737: %$i$, one of other $k$ molecule species is synthesized with an error

1738: %rate $\mu$.

1739: %(see Fig.2, for schematic representation).

1740:

1741: %\begin{figure}

1742: %\noindent

1743: %\vspace{-.1in}

1744: %\hspace{-.3in}

1745: %\epsfig{file=alaska20.ps,width=.6\textwidth}

1746: %\caption{Schematic representation of the model.}

1747: %\end{figure}

1748:

1749:

1750: \subsection{Result}

1751:

1752: \subsubsection{Phases}

1753:

1754: Our main concern here is the dynamics of these molecule numbers $N_i$

1755: of the species $i$ in relationship with the condition of the recursive

1756: growth of the (proto)cell.  In our model there are four basic

1757: parameters: the total number of molecules $N$, the total number of

1758: molecule species $k$, the mutation rate $\mu$, and the reaction path

1759: rate $\rho$.  By carrying out simulations of this model, choosing a

1760: variety of parameter values $N,k,\mu,\rho$, also by taking various

1761: random networks, we have found that the behaviors are classified into

1762: the following three phases\cite{KK-net,KK-PRE}:

1763:

1764: (1) Fast switching states without recursiveness

1765:

1766: (2) Achievement of recursive production  with similar chemical compositions

1767:

1768: (3) Switch over several quasi-recursive states

1769:

1770: \begin{figure}

1771: \noindent

1772: \epsfig{file=alaska3aC.ps,width=.5\textwidth}

1773: \caption{The number of molecules $N_n(i)$ for the species $i$ is

1774: plotted as a function of generation $n$ of cells, i.e., at each

1775: successive division event $n$.  A random network with $k=500$ and

1776: $\rho=.2$.  Dominant species change successively in generation.}

1777: \end{figure}

1778:

1779: \begin{figure}

1780: \noindent

1781: (a)\epsfig{file=fig1bcomp.ps,width=.53\textwidth}

1782: (b)\epsfig{file=alaska3b.ps,width=.53\textwidth}

1783: (c)\epsfig{file=figswitch2.ps,width=.53\textwidth}

1784: \caption{The number of molecules $N_n(i)$ for the species $i$ is

1785: plotted as a function of generation $n$ of cells, i.e., at each

1786: successive division event $n$.  A random network with

1787: $k=200$ and $\rho=.1$ was adopted, with $N=64000$ and $\mu=0.01$ (a),

1788: and $\mu=0.1$ (b).  Only some species (whose population gets large at

1789: some generation) are plotted.  In (a), a recursive production state is

1790: established, while in (b), a few quasi-recursive states are

1791: visited successively. (c): Expansion of (b) around the time step 100000.}

1792: \end{figure}

1793:

1794: \begin{figure}

1795: \noindent

1796: (a)\epsfig{file=net1.ps,width=.5\textwidth}

1797: (b)\epsfig{file=net2.ps,width=.4\textwidth}

1798: \caption{The catalytic network of the dominant species that constitute

1799: the recursive state. The catalytic reaction is plotted by an arrow $i

1800: \rightarrow j$, as the replication of the species $j$ with the

1801: catalytic species $i$.  The numbers in () denote $c_i$ of the species.

1802: Only the species that continue to exist with a population larger

1803: than 10 are plotted.  (Note many other species can exist at each

1804: generation, through the replication error).  (a): corresponding to the

1805: recursive state of Fig.9 a, where the three species connected by

1806: thick arrows are the top 3 species in Fig.9 a.  The network (b) is

1807: another example observed in a different set of simulations with

1808: $k=200$ and $\rho =.1$, but with a different reaction network from

1809: Fig.9.}

1810: \end{figure}

1811:

1812: In the phase (1), there is no clear recursive production and the

1813: dominant molecule species changes frequently over generations. Even

1814: though each generation has some dominating species with regard to

1815: the molecule numbers, the dominating species change every few

1816: generations.  At one generation, some chemical species are dominant, but

1817: only a few generations later, information regarding the previously

1818: dominating species is totally lost, often to the point that its

1819: population drops to zero (see Fig.8).  Here no stable mutual catalytic

1820: relationships are formed among molecules.  Hence, the time required

1821: for reproduction of a cell is quite large, and much larger than in the

1822: phase (2).

1823:

1824: In the phase (2), a recursive state is established, and the chemical

1825: composition is stabilized such that it is not altered much by the

1826: division process (see Fig.9).  Generally, all the observed recursive states

1827: consist of 5-12 species, except for those species with one or two

1828: molecule numbers, which exist only as a result of replication errors.

1829: These 5-12 chemicals mutually catalyze, by forming a catalytic network

1830: as in Fig.10, which will be discussed later.  The members of these 5-12

1831: species do not change over generations, and the chemical compositions

1832: are transferred to the offspring cells.  Once reached, this state is

1833: preserved throughout the whole simulation, lasting over more than 10000

1834: generations.

1835:

1836: The recursive state observed here is not necessarily a fixed point

1837: with regards to the population dynamics of the chemical

1838: concentrations.  In some cases, the chemical concentrations oscillate

1839: in time, but the nature of the oscillation is not altered by the

1840: process of cell division.

1841:

1842: %In all of these cases, the number of each molecule shows relatively large

1843: %fluctuations, since the total number of molecules $N$

1844: %is not large (typically we choose $N \sim

1845: %(10^2 \sim 10^5)$ in our simulations.).

1846:

1847: For example, in the recursive state depicted in Fig.9a, 11 species

1848: remain in existence throughout the simulation. As shown, three species

1849: have much higher populations than others, which form a hypercycle as

1850: $109\rightarrow 11 \rightarrow 13 \rightarrow 109$. (The numbers

1851: 11,13,.. are indices of chemical species, initially assigned

1852: arbitrarily).  The hypercycle sustains the replication of the

1853: molecules, and is called the `core hypercycle'.  The catalytic activities

1854: of the species satisfy $c_{13}>c_{109}>c_{11}$, and accordingly the

1855: respective populations satisfy $N_{11} > N_{109} > N_{13}$.

1856:

1857: In the phase (3), after one recursive state lasts over many

1858: generations (typically a thousand generations), a fast switching state

1859: appears until a new (quasi-)recursive state appears.  As shown in

1860: Fig.9 b, for example, each (quasi-)recursive state is similar to that

1861: in the phase (2), but in this case, its lifetime is finite, and it is

1862: replaced by the fast switching state as in the phase (1).  Then the

1863: same or different (quasi-)recursive state is reached again, which

1864: lasts until the next switching occurs.  In the example of Fig.9b

1865: (see also Fig.9c for its expansion), around the 12000th generation,

1866: the core network is taken over by parasites to enter a fast switching

1867: state like that of the phase (1), which in turn gives way to a new

1868: quasi-recursive state around the 14000th generation.

1869:

1870: In the example of Fig. 9b, there is another type of switching

1871: around the 85000th generation, as shown in Fig.9c with

1872: magnification.  Here, the quasi-recursive state is still stable, but

1873: the core hypercycle consisting of dominant species changes.  As in

1874: Fig.9c, a switch occurs from an initial core hypercycle

1875: ($109$,$11$,$13$), to the next core hypercycle $(11,13,195,155)$

1876: around the 8500th generation.

1877:

1878: This latter switching is the competition among core networks, while

1879: the former drastic switch is due to the invasion of parasitic

1880: molecules, which is most commonly observed.  The mechanism of this

1881: switching is discussed again in \S 5.2.4.

1882:

1883: \subsubsection{Dependence of Phases on the Basic Parameters}

1884:

1885: Although the behavior of the system depends on the choice of the

1886: network, there is a general trend with regard to the phase change,

1887: from (1), to (3), and then to (2) with the increase of $N$, or with

1888: the decrease of $k$, as schematically shown in Fig.11.  By choosing a

1889: variety of networks, we find a clear dependence of the

1890: fraction of networks in each phase on the parameters, leading to a rough sketch

1891: of the phase diagram.  Generally, the fraction of (2) increases and

1892: the fraction of (1) decreases also with the decrease of $\rho$ or $\mu$.

1893: For example, the fraction of (2) (or (3)) gets

1894: larger as $k$ is decreased from $k\stackrel{<}{\sim} 300$ for

1895: $N=50000$ (with $\rho=.1$ and $\mu=.01$), while dependence on $\rho$

1896: will be discussed below.

1897:

1898:

1899: \begin{figure}

1900: \noindent

1901: \hspace{-.3in}

1902: \epsfig{file=schem.eps,width=.5\textwidth}

1903: \caption{Schematic representation of the phase diagram of the three

1904: phases, plotted as a function of the total number of molecules $N$,

1905: and the total possible number of molecule species $k$.}

1906: \end{figure}

1907:

1908: For a quantitative investigation, it is useful to classify the phases by

1909: the similarity of the chemical compositions between two cell division

1910: events\cite{Lancet}. To check the similarity, we first define a

1911: $k$-dimensional

1912: vector $\stackrel{\rightarrow}{V_n}=(p_n(1),\cdots,p_n(k))$ with $p_n(i)

1913: =N_n(i)/N$. Then, we measure the similarity between two generations separated by $\ell$

1914: division events with the help of the inner product as

1915:

1916: \begin{equation}

1917: H_{\ell}=\stackrel{\rightarrow}{V_n} \cdot

1918: \stackrel{\rightarrow}{V_{n+\ell}}/(|V_n||V_{n+\ell}|)

1919: \end{equation}
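
Since $p_n(i)\geq 0$, this quantity satisfies $0 \leq H_{\ell} \leq 1$: it equals 1 when the two compositions are proportional to each other, and 0 when the two generations share no species.  As a small worked example with values chosen only for illustration, take $k=3$, $\stackrel{\rightarrow}{V_n}=(1/2,1/2,0)$ and $\stackrel{\rightarrow}{V_{n+\ell}}=(1/2,0,1/2)$, so that one of the two dominant species has been replaced:

\[
H_{\ell} = \frac{1/4}{\sqrt{1/2}\,\sqrt{1/2}} = \frac{1}{2} .
\]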

1920: %(see Fig.??).

1921:

1922: In Fig.12, the average similarity $\overline{H_{20}}$ and the average

1923: division time are plotted for 50 randomly chosen reaction networks as

1924: a function of the path probability $\rho$.  Roughly speaking the

1925: networks with $\overline{H_{20}}>.9$ belong to $(2)$, and those with

1926: $\overline{H_{20}}<.4$ to $(1)$, empirically.  Hence, for $\rho >0.2$,

1927: the phase (1) is observed for nearly all the networks (e.g. $48/50$),

1928: while for lower path rates, the fraction of (2) or (3) increases. The

1929: value $\rho \sim .2$ gives the phase boundary in this case.

1930:

1931: Generally speaking, a positive correlation between the growth speed of

1932: a cell and the similarity $H$ exists.  In Fig.12, the division time is

1933: also plotted, where each point with a high similarity $H$ corresponds

1934: to a lower division time.  The network with higher similarity (i.e.,

1935: in the phase (2)) gives a higher growth speed.  Indeed, the recursive

1936: states maintain higher growth speeds since they effectively suppress

1937: parasitic molecules.  In Fig. 12, by decreasing path rates, the

1938: variations in the division speeds of the networks become larger, and

1939: some networks that reach recursive states have higher division speeds

1940: than networks with larger $\rho$. On the other hand, when the path

1941: rate is too low, the protocells generally cannot grow since the

1942: probability to have mutually catalytic connections in the network is

1943: nearly zero.  Indeed, there seems to exist an optimal path rate (e.g.,

1944: around $.05$ for $k=200$, $N=12800$ as in Fig.12) for having a network

1945: with high growth speeds. Consequently, under competition for growth,

1946: protocells having such optimal networks will evolve, as will be

1947: discussed in \S 5.3.

1948:

1949: Besides the correlation between the growth speed and similarity, the

1950: correlation with the diversity of the molecules also exists.

1951: Protocells with higher growth speed and similarity in the phase (2)

1952: have higher chemical diversity also.  In the phase (1), one (or a very

1953: few) molecule species is dominant in the population, while about 10

1954: species have higher population in the phase (2) with higher growth

1955: speed, where the chemical diversity is maintained.

1956:

1957: \begin{figure}

1958: \noindent

1959: %\hspace{-.3in}

1960: \epsfig{file=fig3.ps,width=.95\textwidth}

1961: %\includegraphics[width=63mm]{fig4.ps}

1962: %\includegraphics[width=65mm]{fig3.ps}

1963: \caption{The average similarity $\overline{H_{20}}$ ($+$), and the

1964: average division time ($\times$) are plotted as a function of the path

1965: rate $\rho$.  For each $\rho$, data from 50 randomly chosen networks

1966: are plotted.  The average is taken over 600 division events.  The

1967: dotted line indicates the average of $\overline{H_{20}}$ over the 50

1968: networks for each $\rho$.

1969: For $\rho>.2$, over 98\% of the networks have $H<.4$, and they show fast switching,

1970: while for $\rho=.08$, about 95\% belong to the phase (2) or (3).

1971: At $\rho=0.02$, 25 out of 50 networks cannot support cell growth, and

1972: 4 cannot at $\rho=0.04$. (Adapted from \cite{KK-PRE}).}

1973: \end{figure}

1974:

1975: \subsubsection{Maintenance of Recursive Production}

1976:

1977: How is the recursive production sustained in the phase (2)?  We have

1978: discussed already the danger of parasitic molecules that have lower

1979: catalytic activities and are catalyzed by molecules with higher

1980: catalytic activities. As discussed in \S 2.1, such parasitic molecules

1981: can invade the hypercycle.  Indeed, under the structural changes and

1982: fluctuations, the recursive production state could be destabilized.

1983: To answer the question on the itinerancy and stability of

1984: recursive states, we have examined several reaction networks.  The

1985: unveiled logic for the maintenance of recursive state is summarized as

1986: follows.

1987:

1988: (a) {\bf Stabilization by intermingled hypercycle network}:

1989:

1990: The 5-12 species in the recursive state form a mutually catalytic

1991: network, for example, as in Fig. 10.  This network has a {\sl core

1992: hypercycle network}, as shown in thick arrows in Fig.10a.  As shown in

1993: Fig.13, such a core hypercycle has a mutually catalytic relationship,

1994: as ``$A$ catalyzes $B$, $B$ catalyzes $C$, and $C$ catalyzes

1995: $A$''. However, they are connected with other hypercycle networks such

1996: as $G\rightarrow D \rightarrow B \rightarrow G$, and $D\rightarrow C

1997: \rightarrow E \rightarrow D$, and so forth.  The hypercycles are

1998: intermingled to form a network.  Coexistence of the core hypercycle and

1999: other attached hypercycles is common to the recursive states we have

2000: found in our model.

2001:

2002: This intermingled hypercycle network (IHN) leads to stability against

2003: parasites and fluctuations.  Assume that there appears a molecule parasitic on one

2004: member species of the IHN (say $X$ as a parasite to $C$ in Fig.13).  The species

2005: $X$ may decrease the number of the species $C$.  If there were only a

2006: single hypercycle $A\rightarrow B \rightarrow C \rightarrow A$, the

2007: population of all the members $A,B,C$ would be easily decreased by

2008: this invasion of parasitic molecules, resulting in the collapse of the

2009: hypercycle.  In the present case, however, other parts of the network

2010: (say, that consisting of $A,B$,$G,D$ in Fig.13), compensate for the

2011: decrease of the population of $C$ by the parasite, so that the

2012: populations of $A$ and $B$ do not decrease so much.  Then, through

2013: the catalysis of the species $B$, the replication of the molecule $C$

2014: progresses, so that the population of $C$ is recovered.  Hence the

2015: complexity in the hypercycle network leads to stability against the

2016: attack of parasite molecules.

2017:

2018: Next, IHN is also relevant to the stability against fluctuations.  It

2019: is known that the population dynamics of a simple hypercycle often

2020: leads to a heteroclinic cycle\cite{Sigmund}, where the population of one

2021: (or a few) member approaches 0, and then is recovered.  For a

2022: continuum model, such a heteroclinic cycle can continue forever, but in a stochastic

2023: model, due to fluctuations, the corresponding molecule

2024: species sometimes goes totally extinct.  Once this molecule species goes extinct

2025: completely, its recovery by replication error would require a

2026: very long time.  Hence, to achieve stability against fluctuations, a

2027: state with the heteroclinic cycle dynamics or any oscillation in which

2028: some of the population goes very low should be avoided.  Indeed, by forming IHN,

2029: such oscillatory instability is often avoided or reduced.  Due to

2030: coexistence of several hypercycle processes, instability in each

2031: hypercycle cancels out, leading to fixed-point dynamics or oscillation

2032: with a smaller amplitude.  Thus the danger that the population of some

2033: molecules in the hypercycle goes to zero by fluctuations

2034: %due to finiteness of molecule numbers

2035: is reduced.

2036:

2037: Stability of coexistence of many species is discussed as `homeochaos'

2038: \cite{homeochaos}, while stable reproduction in reaction network is

2039: also seen in \cite{Ikegami}.

2040:

2041: (b) {\bf Minority in the core hypercycle}:

2042:

2043: Now we study more closely the population dynamics in a core hypercycle.

2044: Here, the number of molecules $N_j$ of molecule species $j$ is in the

2045: inverse order of their catalytic activity $c_j$, i.e., $N_A>N_B>N_C$

2046: for $c_A<c_B<c_C$.  Because a molecule with higher catalytic activity

2047: helps the synthesis of others more, this inverse relationship is

2048: expected. Indeed, the population sizes of just three species $A,B,C$,

2049: with the catalytic relationship $A\rightarrow B \rightarrow C

2050: \rightarrow A$ are estimated by taking the continuum limit $N

2051: \rightarrow \infty$ and obtaining a fixed point solution of the rate

2052: equation for the concentrations of the chemicals as discussed in

2053: \cite{Eigen}.  From a straightforward calculation we have:

2054: $N_A:N_B:N_C= c_A^{-1}:c_B^{-1}:c_C^{-1}$.
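
The straightforward calculation can be sketched as follows.  The rate equations below are an assumed standard mass-action form with a dilution term $\phi$, and the symbols $x_A,x_B,x_C$ (concentrations with $x_A+x_B+x_C=1$) are notation introduced here; they are not written explicitly in the text.  For the cycle $A\rightarrow B \rightarrow C \rightarrow A$,

\[
\frac{dx_A}{dt}=x_A(c_C x_C-\phi), \qquad
\frac{dx_B}{dt}=x_B(c_A x_A-\phi), \qquad
\frac{dx_C}{dt}=x_C(c_B x_B-\phi),
\]

with $\phi = c_C x_C x_A + c_A x_A x_B + c_B x_B x_C$.  A fixed point with all three species present requires $c_A x_A = c_B x_B = c_C x_C = \phi$, hence

\[
x_j = \frac{c_j^{-1}}{c_A^{-1}+c_B^{-1}+c_C^{-1}} \qquad (j=A,B,C).
\]

With illustrative values $c_A=0.2$, $c_B=0.5$, $c_C=1$, the inverse activities are $5:2:1$, so that $x_A=5/8$, $x_B=1/4$, $x_C=1/8$, and the most active species $C$ is the minority.  Whether this fixed point is stable is a separate question not addressed by this sketch.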

2055:

2056: Here, the $C$ molecule is catalyzed by a molecule species with lower

2057: activities but larger populations ($A$).  Hence, the parasitic

2058: molecule species cannot easily invade to disrupt this mutually

2059: catalytic network.  Since the minority molecule ($C$) is catalyzed by

2060: the majority molecule ($A$) (with the aid of another molecule ($B$)),

2061: a large fluctuation in molecule numbers is required to destroy this

2062: network.

2063:

2064: The stability in the minority molecule is also enhanced by the

2065: complexity in IHN.  If the catalytic activity of $C$ is highest, the

2066: recursive state here is mainly achieved by catalysis of the molecule

2067: $C$.  On the other hand, this also implies that $C$ is the minority in

2068: the core network.  (The population of the molecule $C$ is usually

2069: larger than $D$,$E$, etc. in Fig.13, though.)  Hence an attack on the $C$

2070: molecule is the most effective way to destroy this recursive state.  In the

2071: IHN, this minority molecule species is involved in several hypercycles

2072: as in $C$ in Fig.13.  This, on the one hand, demonstrates the

2073: prediction in \S 4.5, that more species are catalyzed by the minority

2074: molecules, while on the other hand, leads to the suppression of the

2075: fluctuation in the number of minority molecules, as will be discussed

2076: in \S 5.4.  With the decrease of the fluctuation, the probability that

2077: the minority molecules go extinct is reduced, so that the recursive

2078: state is hardly destroyed.

2079:

2080: (c) {\bf Localization in a Random Network}

2081:

2082: The present system belongs to a class of systems with reaction and

2083: diffusion, where the structural change by replication error leads to

2084: the diffusion within the network space.  With random connection in the

2085: catalytic network, the present system is nothing but a

2086: reaction-diffusion system on a random network.  Generally, such a problem is

2087: related to Anderson localization, where concentrations are

2088: localized within some part of the network, depending on the degree of

2089: the connectivity in the network and the strength of the diffusion

2090: coupling. From this viewpoint, the formation of IHN, localized only

2091: within a limited set of species in the global network, may be understood as an

2092: example of such localization.  It will be interesting to study the

2093: stability of the recursive production, in terms of the localization

2094: transition in the reaction network\cite{Takagi}.

2095:

2096: \begin{figure}

2097: \noindent

2098: \epsfig{file=alaska4.ps,width=.5\textwidth}

2099: \caption{An example of mutually catalytic network in our model.  The

2100: core network for the recursive state is shown by circles, while

2101: parasitic molecules ($X$,$Y$,..) connected by broken arrows, are

2102: suppressed at a (quasi-)recursive state.}

2103: \end{figure}

2104:

2105: \subsubsection{Switching}

2106:

2107: Next, we discuss the mechanism of switching.  In the phase (3), the

2108: recursive production state is destabilized when the population of

2109: parasitic molecules increases.  For example, the number of the molecule

2110: $C$ may be decreased due to fluctuations, while the number of some

2111: parasitic molecules ($X$) that are not originally in the catalytic

2112: network but are catalyzed by $C$, may increase.  The frequency of such

2113: fluctuations increases as the total population of molecules in a cell

2114: becomes smaller.  If such a fluctuation appears, the other molecule species

2115: in the original network lose the main source of molecules that

2116: catalyze their synthesis, successively.  Then the new parasitic

2117: molecule $X$ occupies a large portion of the population. However, since the

2118: molecule's main catalyst ($C$) soon disappears, the synthesis of $X$

2119: stops, and this species $X$ is taken over by some molecules $Y$

2120: that are catalyzed by $X$ (see the broken arrows in Fig.13).  Then,

2121: within a few generations, the dominant species changes, and recursive

2122: production does not continue.  Indeed, this is what occurs in the phase

2123: (1).  Then the parasitic molecule $X$ is taken over by some other

2124: $Y$. This take-over by parasites continues successively, until a new

2125: (or same) recursive state with hypercycle network is formed.  Hence

2126: the fluctuation in the minority molecule in the core network is

2127: relevant to the switching process.

2128:

2129: \subsection{Evolution}

2130:

2131: {\bf Model A}

2132:

2133: The next question we have to address is whether the recursive

2134: production state is achieved through evolution.  To check this problem

2135: we have extended our model to further include a ``mutational'' change

2136: of the network at each division event (model A).

2137: To be specific, at each division

2138: event we add or delete randomly (with equal probability) a few

2139: reaction paths, whose connection $i \rightarrow j$ is again chosen

2140: randomly.  Here to see the evolution of catalytic activity, the index

2141: of the species is ordered with the value of catalytic activity, i.e.,

2142: the index $j$ is ordered so that $c_j$ monotonically increases with

2143: $j$.  Since the mutational change is assumed to be random, a new path

2144: is added or deleted independent of the catalytic activity.  In the

2145: simulation displayed here, there are 5 mutations of the network path

2146: at every generation.  We have carried out numerical experiments of this

2147: model, to see if the path rate of the network stays around the state

2148: supporting the recursive production.

2149:

2150: \begin{figure}

2151: \noindent

2152: (a)\epsfig{file=figev0c.ps,width=.55\textwidth}

2153: (b)\epsfig{file=figev1c.ps,width=.55\textwidth}

2154: (c)\epsfig{file=figev2c.ps,width=.55\textwidth}

2155: \caption{Evolution of path-rates, recursiveness, and division time,

2156: plotted versus generation.  The total number of species $k$ is 500,

2157: where $c_i$ is chosen as $100^{-(k-i)/k}$, so that it ranges from

2158: 0.01 to 1.0 equally in logarithmic scale.  The number of molecules $N$ in

2159: a cell is set at 50,000, so that the cell divides when the total

2160: molecule number is 100,000. The initial path rate is set at

2161: $\rho=0.1$, i.e., 125,000 paths totally.  At every division 5 paths

2162: are ``mutated'', i.e., with equal probability 5 paths are added or

2163: eliminated randomly.  There are $M_{tot}=100$ cells in total, so that one of

2164: the 100 cells is eliminated when one cell divides into two.  (a) The

2165: total path number.  The path rate is obtained by dividing the number by

2166: $k^2$.  (b) The division time, i.e., the number of steps required for a cell

2167: to divide.  (c) The similarity $H^1(i)$, defined in \S 5.2.}

2168: \end{figure}

2169:

2170: %Chemical diversity is computed as $\sum_j p(j) log p(j) $ $p(i)=N(i)/N$ with }

2171:

2172: \begin{figure}

2173: \noindent

2174: %\epsfig{file=figev.ps,width=.8\textwidth}

2175: \epsfig{file=figevC.ps,width=.8\textwidth}

2176: \caption{Evolution of cell: Those species $i$ with $N(i)>100$ are

2177: plotted with the vertical axis as the species index $i$, and the

2178: horizontal axis as the generation.  The data are from the result of

2179: the simulation for Fig.14.}

2180: \end{figure}

2181:

2182: \begin{figure}

2183: \noindent

2184: \epsfig{file=evnet.ps,width=.5\textwidth}

2185: \caption{The catalytic network of the species

2186: that constitute the recursive state around $10^6$ th generation of Fig.14 or 15.

2187: }

2188: \end{figure}

2189:

2190: An example of the time series of path rates at each generation is

2191: shown in Fig.14, as well as the time series of the division time, and

2192: chemical diversity.  Corresponding to this time series, the change of

2193: dominant species is plotted over generations in Fig.15.

2194:

As shown, a recursive state is achieved and is maintained over many
generations, until it switches to another state.  At each reproduction,
the reaction paths are changed by mutation.  In spite of such
mutations, the recursive production state is sustained over many
generations.  In each recursive production state, the path rate
remains rather low.  Here, a network that supports recursive
production is selected and maintained.  Note that many molecules are
catalyzed by the minority species in the core hypercycle network.  In
this sense, a prototype of the evolution suggested in \S 4.5, in which
information is packaged into the minority molecule, is observed
here.

2206:

An example of the network of dominant species is given in Fig.16.
Here intermingled hypercycle networks (IHN) are formed, which
sustain recursive production.  Again, there is a core hypercycle,
and other hypercycles are connected with it.  The surviving molecule
species have a large connectivity in reaction paths, much larger than
expected from a random network with the same path rate.  As seen in
Fig.16, the IHN here forms a highly connected network, even though the
average path rate remains small (as shown in Fig.14, the path rate
is about 0.1 or lower).  The paths forming the IHN are
preserved over many generations, although a few paths are occasionally
eliminated.  Here, the coexistence of several parallel paths among species
is important for the robustness of the recursive state against
mutations that may delete one of the paths.  As in the dynamics of
phase (3), the recursive production state is eventually destabilized by
the mutation of reaction paths, while after some generations another
recursive network is formed through further mutation of the network.

2223:

To sum up, phase (3) gives a basis for evolvability, since
novel (quasi-)recursive states with different chemical compositions are
visited successively.

2227:

2228: {\bf Model B}

2229:

So far, we have assumed that a structural change in replication
can produce any other molecule species with equal probability.  Of course, this is a
simplification; a replication error should lead only to a limited range
of molecule species that are similar to the original.  To examine
this point, we have studied another model (model B), obtained by
modifying the original model of \S 5.1.

2236:

Here, the catalytic activity is set as $c_i=i/k$, i.e., the activity
increases monotonically with the species index.  Then, instead of
allowing a replication error to produce any molecule species, we modify
the rule so that the change occurs only within a given range $i_0$ ($\ll
k$): when the molecule species $j$ is to be synthesized, with the error
rate $\mu$ the species $j+j'$ is synthesized instead, where $j'$ is a random number in
$[-i_0,i_0]$.
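
To give a feeling for the size of these local changes, the rule can be written out explicitly.  This is only a sketch: we assume here that $j'$ is drawn uniformly from the integers in $[-i_0,i_0]$, and the treatment of the case where $j+j'$ falls outside the range of species indices is left unspecified, as in the description above.
\[
j \rightarrow \left\{ \begin{array}{ll}
j & \mbox{with probability } 1-\mu, \\
j+j', \ \ j' \in \{-i_0,\ldots,i_0\} & \mbox{with probability } \mu .
\end{array} \right.
\]
Since $c_i=i/k$, a single error changes the catalytic activity by at most
\[
|c_{j+j'}-c_j| \leq \frac{i_0}{k} .
\]
For the parameters of Fig.17 ($k=5000$, $i_0=100$) this bound is $100/5000=0.02$, so that a large increase in activity can be reached only through many successive errors.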

2244:

In this {\bf model B}, we have not included any change of the network.
The network is fixed at the beginning and is not changed throughout the
simulation.  Instead, through the local structural errors, the range
of species present evolves over generations.  Here we take only species with
$i<i_{ini}$ in the initial condition, and examine whether evolution to
a network with higher catalytic activities (i.e., with much larger
$i$) progresses.  In other words, we examine whether the indices $i$
in the network increase successively.  An example is shown in
Fig.17, where the catalytic activity increases through successive
switching from one (quasi-)recursive state (consisting of species
within a width of the order of $2i_0$) to another.

2256:

Here the switching occurs as in phase (3).  Under the selection
pressure on the protocells, cells with a new (quasi-)recursive state
consisting of molecules with higher catalytic
activities (i.e., with larger species indices) are selected.  Again each
recursive state consists of an IHN, and the species with the highest
catalytic activity in the core hypercycle is a minority in population.
Once the population of this species is decreased by fluctuations,
a switch occurs to a new state with higher catalytic
activities, and the species indices increase successively.  Hence,
evolution from a rather primitive cell consisting of molecules with low catalytic
activities to one with higher activities is possible, by taking
advantage of minority molecules.

2269:

Note that this switching cannot occur if the total number of molecules
$N$ is small.  When the number is too small, the mutation of paths that
destroys the recursive state hardly occurs.  On the other hand, if the
total number of molecules is too large, it is harder to establish a
recursive state, due to the larger possibility of changing the network.
Hence, there is an optimal value of the number of molecules in a
protocell for realizing both recursive production and
evolution.

2278:

2279: \begin{figure}

2280: \noindent

2281: %\epsfig{file=figevt.ps,width=.8\textwidth}

2282: \epsfig{file=figevtCc.ps,width=.8\textwidth}

\caption{Evolution of species in a cell: those species $i$ with $N(i)>100$ are plotted, with the species index $i$
on the vertical axis and the generation on the horizontal axis.
The total number of species $k$ is 5000, where $c_i$ is chosen as
$c_i=i/k$, so that it is equally distributed from 0.0002 to 1.0.
The number of molecules in a cell is set at 8,000, so that the cell divides
when the total molecule number reaches 16,000.  The path rate is set at $\rho= 0.1$.
The replication error for the species $i$ occurs within the range of species
$[i-100,i+100]$, instead of global selection from all species.
There are $M_{tot}=10$ cells in total,
so that one of the 10 cells is eliminated when a cell divides into two.}

2293: \end{figure}

2294:

2295: \subsection{Statistical Law}

2296:

2297: To close the present section, we investigate the fluctuations of the

2298: molecule numbers of each of the species, by coming back to the

2299: original model studied in \S 5.2, without evolution of reaction paths.

2300: The characteristics of the fluctuations of the number of each molecule

2301: species over the generations can have a significant impact on the

2302: recursive production of a cell, since the number of each molecule

2303: species is not very large.  In order to quantitatively characterize

2304: the sizes of these fluctuations, we have measured the distribution

2305: $P(N_i)$ for each molecule species $i$, by sampling over division

2306: events.

2307:

2308: Our numerical results are summarized as follows:

2309:

2310: (I) For the fast switching states, the distribution $P(N_i)$ satisfies

2311: the power law

2312:

2313: \begin{equation}

2314: P(N_i) \approx N_i^{-\alpha},

2315: \end{equation}

2316:

\noindent
with an exponent $\alpha$ between 1 and 2, as shown in Fig.18a.  The exponent $\alpha$ depends
on the parameters, and approaches 2 as the alternation of dominant species becomes more frequent.
For example, as shown in Fig.18b, the exponent $\alpha$ increases from 1 to 2 with the
increase of the error rate $\mu$.

2322:

2323:

(II) For recursive states, the fluctuations in the core network
(i.e., species 13, 11, and 109 in Fig.9a or 10a) are typically small, and are roughly
fit by a Gaussian distribution.  On the other hand, for species that are
peripheral to but catalyzed by the core hypercycle, the number distribution is
closer to a log-normal distribution

\begin{equation}
P(N_i) \approx \exp\left(-\frac{(\log N_i-\overline{\log N_i})^2}{2\sigma}\right),
\end{equation}

\noindent
as shown in Fig.19.

2332:

Even though the distribution does not agree perfectly with the log-normal form,
it is at least roughly symmetric after taking the logarithm
(i.e., to the zeroth approximation the distribution is not normal but log-normal).
The origin of the log-normal distributions here can be understood
by the following rough argument: for a replicating system, the
growth of the molecule number $N_m$ of the species $m$ is given by

2339:

2340: \begin{equation}

2341: dN_m/dt=AN_m,

2342: \end{equation}

2343:

2344: \noindent

2345: where $A$ is the average effect of all the molecules that catalyze $m$.

2346: We can then obtain the estimate

2347:

2348: \begin{equation}

2349: d\log N_m/dt =\overline{a} +\eta(t),

2350: \end{equation}

2351:

\noindent
by replacing $A$ with its temporal average $\overline{a}$ plus
fluctuations $\eta(t)$ around it.  If $\eta(t)$ is approximated by
Gaussian noise, a log-normal distribution for $P(N_m)$ is suggested.
This argument is valid if $\overline{a}>0$.  Taken literally, $N_m$ in this equation
diverges with time, but here the cell divides into two before the
divergence becomes significant.  Although the asymptotic distribution
as $N \rightarrow \infty$ is therefore not available, the argument on the
form of the distribution is valid as long as $N$ is sufficiently large.
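
The step from the Langevin equation to the log-normal form can be made explicit as follows.  This is only a sketch: we assume that $\eta(t)$ is Gaussian white noise with $\langle \eta(t)\eta(t')\rangle = 2D\delta(t-t')$, where the noise strength $D$ is a symbol introduced here for illustration.  Integrating the equation for $\log N_m$ over a time $T$ gives
\[
\log N_m(T) = \log N_m(0) + \overline{a}T + \int_0^T \eta(t)\,dt ,
\]
so that $\log N_m(T)$ is Gaussian with mean $\log N_m(0)+\overline{a}T$ and variance $2DT$.  Changing the variable from $\log N_m$ to $N_m$ then gives
\[
P(N_m) = \frac{1}{N_m\sqrt{4\pi DT}}
\exp\left(-\frac{(\log N_m-\log N_m(0)-\overline{a}T)^2}{4DT}\right),
\]
which is log-normal in $N_m$.  In this sketch the variance $2DT$ plays the role of $\sigma$ in the log-normal form given in (II).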

2361:

For the fast switching state, the growth of each molecule species is
close to zero on the average.  In this case $N_m$ in the Langevin equation (12) can
approach 0, and we need to treat the equation by seriously
taking into account the absorbing boundary condition at $N_m=0$.
Taking into account the normalization of the probability,
the stationary solution of the Fokker-Planck equation corresponding to eq.(12)
for $\overline{a} \leq 0$ is given by
\begin{equation}
P(N) \propto N^{-(1+\nu)},
\end{equation}
with
\begin{equation}
\nu =|\overline{a}|/(\overline{a^2}-\overline{a}^2)
\end{equation}
(see e.g., \cite{Sornette,Mikhailov-book}).  The change of the exponent $\alpha$ with the
error rate in Fig.18b can be understood as a change in the ratio of the variance
to the mean of $a$.
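
The power-law form can be sketched in a few lines, following the standard treatment of multiplicative processes with negative drift.  The assumptions here are ours: $\eta(t)$ is Gaussian white noise with $\langle\eta(t)\eta(t')\rangle = 2D\delta(t-t')$ ($D$ being a symbol introduced for illustration), and a stationary state without net probability flux exists, which requires that the molecule number is kept away from zero by some lower cutoff.  Writing $x=\log N$ and $\overline{a}=-|\overline{a}|$, the Fokker-Planck equation for the distribution $Q(x,t)$ reads
\[
\frac{\partial Q}{\partial t} = |\overline{a}|\frac{\partial Q}{\partial x}
+ D\frac{\partial^2 Q}{\partial x^2} .
\]
The zero-flux stationary solution satisfies $D\,dQ/dx = -|\overline{a}|Q$, i.e.,
\[
Q(x) \propto \exp\left(-\frac{|\overline{a}|}{D}x\right) .
\]
Returning to the variable $N$ with $P(N)=Q(\log N)/N$ gives
\[
P(N) \propto N^{-(1+|\overline{a}|/D)} ,
\]
which has the form quoted above with $\nu=|\overline{a}|/D$, and agrees with the expression for $\nu$ when $D$ is identified with the variance $\overline{a^2}-\overline{a}^2$.  Comparing with (I), the exponent is $\alpha=1+\nu$, so that an exponent $\alpha$ between 1 and 2 corresponds to $0<\nu\leq 1$.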

2379:

2380: \begin{figure}

2381: \noindent

2382: %\epsfig{file=figh.ps,width=.9\textwidth}

2383: (a)\epsfig{file=fighC.ps,width=.9\textwidth}

2384: (b)\epsfig{file=fighCm.ps,width=.9\textwidth}

\caption{The number distribution of the molecules corresponding to the
network in Fig.7 (fast switching states).  (a) The distribution is sampled
from 100000 division events and plotted for 4 molecule species among 500,
in a log-log plot.  (b) Change of the distribution with the change of the error rate $\mu$,
for a specific molecule species.}

2390: \end{figure}

2391:

2392: \begin{figure}

2393: \noindent

2394: \epsfig{file=fig4.ps,width=.9\textwidth}

2395: \caption{The number distribution of the molecules corresponding to the

2396: network in Fig.9a or 10a.  The distribution is sampled from 1000 division

2397: events.  From right to left, the plotted species are

2398: 11,109,13,155,176,181,195,196,23.  Log-Log plot.}

2399: \end{figure}

2400:

If several molecules mutually catalyze each other, however, one would
expect that the fluctuations will not increase as in the Brownian
motion of eq. (12).  For example, suppose that the number of one
species in the core cycle increases due to a fluctuation.  This
relatively decreases the number of molecules of the other species in
the core network, resulting in suppression of the catalytic
reaction that replicates the increased species.  Then the
increased species is replicated less, and its population is pulled back.  Hence the
fluctuations in the core hypercycle are reduced.

2410:

Another reason for the reduction of the fluctuations of the species in the
core cycle is the high connectivity in the IHN.  The chemicals of the core
part have catalytic paths with a large number of molecule species.
Hence many processes work in parallel in the synthesis of the core
species, and the fluctuations due to the concentrations of the other
chemicals are added in parallel.  Thus, the distribution of the fluctuations
can come close to Gaussian (recall the central limit theorem).
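
The effect of parallel paths can be illustrated by a simple estimate.  This is a sketch assuming that the contributions are independent and have equal variance, which need not hold in the actual network; the symbols $n$, $\xi_j$, and $s^2$ are introduced here for illustration.  If the fluctuating part of the synthesis rate of a core species is the average of $n$ contributions $\xi_1,\ldots,\xi_n$, each with zero mean and variance $s^2$, then
\[
\mbox{Var}\left(\frac{1}{n}\sum_{j=1}^{n}\xi_j\right) = \frac{s^2}{n} ,
\]
so that the typical size of the fluctuation decreases as $s/\sqrt{n}$.  For large $n$ the distribution of the average approaches a Gaussian, irrespective of the distribution of each $\xi_j$.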

2418:

Note also that for some networks, the distributions of the molecule
numbers in the recursive states may sometimes be intermediate between
log-normal and Gaussian, and occasionally even have double peaks.

2422:

2423: By studying a variety of networks, the observed distributions of the

2424: molecule numbers can be summarized as:

2425:

2426: \begin{itemize}

2427:

\item
(1) Distributions close to the Gaussian form, with relatively small
variances, in the core (hypercycle) of the network.

\item
(2) Distributions close to log-normal, with larger fluctuations,
for the peripheral part of the network.

\item
(3) Power-law distributions
for parasitic molecules that appear intermittently.

2439:

2440: \end{itemize}

2441:

To quantitatively study the magnitude of the variance in the IHN for
recursive production, we have also plotted the variance
$\overline{(N_i-\overline{N_i})^2}$ (where $\overline{\cdots}$ denotes the average over
the distribution $P(N_i)$).  As can be seen in Fig.20, the variance in
the core network is small, especially for the minority species (i.e.,
13).  For molecule species that do not belong to the core hypercycle,
the variance scaled by the average increases as the average decreases.
The suppression of the relative fluctuation in the core hypercycle comes
from the direct feedback on population changes of the molecule
species in the core, as well as from the multiple parallel reaction paths, as
already mentioned.
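
For reference, the scaled variance plotted in Fig.20 is
\[
V_i = \frac{\overline{(N_i-\overline{N_i})^2}}{\overline{N_i}} ,
\]
where the symbol $V_i$ is introduced here only for convenience.  As a familiar benchmark, which is not part of the original analysis, a Poisson-distributed number has a variance equal to its mean, so that $V_i=1$ irrespective of $\overline{N_i}$.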

2453:

2454: \begin{figure}

2455: \noindent

2456: %\hspace{-.3in}

2457: \epsfig{file=fig50.ps,width=.5\textwidth}

2458: %\includegraphics[width=57mm]{fig5.ps}

\caption{Scaled variance, i.e., the variance of the molecule number
divided by its average, plotted against the average.  In order of
decreasing average, the species 11 (the largest $\overline{N_i}$),
109 (the second largest), 13, 155, 194, 176, 195, 181, 196, 23, and 34 (the smallest
$\overline{N_i}$) are plotted.  Computed from the data in Fig.19.  The
asterisk denotes the species 13, which has the largest catalytic activity here
and is the minority in the hypercycle core.  Adapted from \cite{KK-PRE}.}

2466: \end{figure}

2467:

2468: {\bf Remark: Universal Statistics}

2469:

Quite recently Furusawa and the author\cite{Zipf,log} have studied
several models of a minimal cell consisting
of catalytic reaction networks, without assuming the replication
process itself.  In other words, the molecules are successively
synthesized from nutrient chemicals transported through the membrane,
where the level-(1) model of \S 3.2 is adopted.  They have found
universal statistical laws of chemicals for a cell that grows
recursively.

2478:

(i) The number of molecules of each chemical species over all cells
generally obeys the log-normal distribution.  This distribution is
universally observed for a state with recursive production.  The existence
of such log-normal distributions is also experimentally
verified\cite{Zipf}.  The ubiquity of the log-normal distribution in
the level-(2) model described in this section is thus supported in the
level-(1) model.

2486:

(ii) A power law in the average abundances of chemicals.  This is
a statistical law over a huge number of molecule species.  When
all chemical species are ranked in order of abundance,
the abundance of a chemical is inversely proportional to
its rank.  Such a law was originally found in
linguistics by Zipf\cite{Zipf-book}.  This Zipf's law of chemical abundances\cite{Zipf}
is found to be universal when a cell optimizes
the efficiency and faithfulness of self-reproduction.  It is a
universal statistical law when the cell model shows recursive growth
under fluctuations in the molecule numbers.  Furthermore, according to data
from gene expression databases on various organisms and tissues,
the abundances of expressed genes exhibit this
law.  Thus, the universal statistics are also supported
experimentally.  It is shown that this power law of gene expression is
maintained by a hierarchical organization of catalytic reactions.
Major chemical species are synthesized with catalysis by
slightly less abundant chemicals.  The latter chemicals are in turn synthesized
with the help of chemicals of much lower abundance, and this hierarchy of
catalytic reactions continues until it reaches the minor chemical
species.
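
The rank-abundance form of this law can be translated into a distribution of abundances over species.  The following is a short sketch; the symbols $n$, $r$, and $p(n)$ are introduced here for illustration, and $p(n)$, a distribution over species, should not be confused with the distribution $P(N_i)$ of a single species over cells.  Zipf's law states that the abundance $n$ of the species of rank $r$ satisfies
\[
n(r) \propto r^{-1} .
\]
The rank of a species is the number of species whose abundance is not smaller than its own, $r(n) \propto \int_n^{\infty} p(n')\,dn'$, and inverting the law gives $r(n) \propto n^{-1}$.  Differentiating with respect to $n$,
\[
p(n) \propto -\frac{dr}{dn} \propto n^{-2} ,
\]
i.e., the number of species with abundance around $n$ decays as the inverse square of $n$.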

2507:

2508: {\bf Remark: Search for the deviation from universal statistics}

2509:

2510: So far we have observed ubiquity of log-normal distribution, in

2511: several models. The fluctuations in such distribution are generally

2512: very large.  This is in contrast to our naive impression that a

2513: process in a cell system must be well controlled.

2514:

Then, do such large fluctuations have some relevance to biology?  Quite
recently, we have extended the idea of the fluctuation-dissipation theorem
in statistical physics to evolution, and proposed a linear relationship
(or high correlation) between (genetic) evolution speed and
(phenotypic) fluctuations.  This proposition turns out to be supported
by experimental data on the evolution of {\sl E.~coli} to enhance the
fluorescence of its proteins\cite{Sato}.  Hence the fluctuations are
quite important biologically.

2523:

The log-normal distribution is also rather universal in present-day
cells, as demonstrated in the distribution of some proteins, measured
by the degree of fluorescence\cite{log}.  Now, is this universality the final
statement for ``cell statistical mechanics''?  We have to be cautious
here, since laws that are too universal may not be so relevant to biological
function.  In fact, chemicals that obey the log-normal distribution
may have fluctuations too large to control some function.  Some other
mechanism to suppress the fluctuation may work in a cell.

2532:

Indeed, the minority control suggests the
possibility of such control to suppress the fluctuation, as discussed
in \S 4.5.  For a recursive production system, some mechanism to
decrease the fluctuation in the minority molecule may have evolved.

There are at least two possible ways to decrease the fluctuation,
leading to deviation from the log-normal distribution.

2540:

The first one is a negative feedback process.  In general,
negative feedback can suppress the response as well
as the fluctuation.  Still, it is not a trivial question how a chemical
reaction can give rise to suppression of fluctuation, since realizing
negative feedback in a chemical reaction requires the production of some
molecules, which may further add fluctuations.

2547:

The second possible mechanism is the use of multiple parallel reaction
paths.  If several processes work sequentially, the fluctuations would
generally be increased.  When reaction processes work in parallel for
some species, the population change of that species is influenced by
several fluctuation terms added in parallel.  If the synthesis (or
decomposition) of some chemical species is the result of an average over
these processes working in parallel, the fluctuation around this
average can be decreased by the law of large numbers.  Indeed, the
minority species in the core network, which has more reaction paths, has
relatively lower fluctuation, as in Fig.20.  Suppression of fluctuation
by multiple parallel paths may be a strategy adopted in a cell.  Note
that this is also consistent with the scenario that more and more
molecules become related to the minority species, as discussed in \S
4.5.  With the increase of the paths connected with the minority
molecules, the fluctuation of the minority molecules is reduced, which
further reinforces the minority control mechanism.  Hence the increase
of the reaction paths connected with the minority molecule species
through evolution, the decrease of the fluctuation in the population of
minority molecules, and the enhancement of minority control reinforce
each other.  In this regard, the search for molecules that deviate from
the log-normal distribution should be important in future.
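
The statement that sequential processes increase the fluctuations can be made explicit by a simple estimate.  This is a sketch assuming independent steps with equal variance; the symbols $n$, $g_j$, $N_0$, and $s^2$ are introduced here for illustration.  If the amount $N$ of a product results from $n$ sequential steps, each multiplying it by a fluctuating factor $g_j$, then
\[
\log N = \log N_0 + \sum_{j=1}^{n} \log g_j , \qquad
\mbox{Var}(\log N) = n s^2 ,
\]
where $s^2$ is the variance of each $\log g_j$.  The fluctuation on the logarithmic scale thus grows with the number of steps, and the distribution of $N$ tends to the log-normal form.  In contrast, an average over $n$ parallel contributions of equal variance has a variance reduced by the factor $1/n$.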

2569:

In physics, we are often interested in quantities that deviate
from the Gaussian (normal) distribution, since such deviation is exceptional.  Indeed, in physics, the search for
power-law or log-normal distributions has been popular
over a few decades.  On the other hand, a biological unit can grow and
reproduce, to increase its number.  For such a system, the components
within have to be synthesized, so that amplification processes are
common.  Then, the fluctuations are also amplified.  In such a system,
power-law or log-normal distributions are quite common, as already
discussed here, and as is also shown in several models and experiments
\cite{Zipf,log}.  In this case, the Gaussian (normal)
distribution is not so common (normal).  Then exceptional molecules
whose concentrations obey the normal distribution
may be more important.

2583:

2584: Also, the ubiquity of log-normal distribution we found is true for a

2585: state with recursive production.  If a cell is not in a stationary

2586: growth state but in a transient process switching from one steady

2587: state to another, the universal statistics can be violated.  Search

2588: for such violation will be important both experimentally and

2589: theoretically.

2590:

2591: \section{Summary}

2592:

We have studied the problem of recursive production and evolution of a
cell by adopting a simple protocell system.  This protocell consists of
a catalytic reaction network with replicating molecules.  The basic
concepts we have proposed through several simulations are as follows:

2597:

(1) {\bf Minority control}

2599:

In a cell system with mutually catalytic molecules, replicating
molecules with a smaller population are shown to control the
behavior of the total cell system.  This minority-controlled state is
achieved by preserving rare fluctuations in the molecule
number.  The molecule species that is a minority in number works as a
carrier of heredity, in the sense that it is preserved well, with
suppressed number fluctuations, and that it controls the behavior of a
cell relatively strongly.  Since molecules that are replicated by
this minority species are also preserved, more molecules will be synthesized
with its help.  In addition,
reaction paths that stabilize the replication of these
minority molecules are expected to evolve.  Hence, the replication of more and
more molecule species is packaged into the synthesis of this minority
molecule, which also ensures the transmission of the minority molecule.
The minority molecule species thus gives a basis for ``genetic
information''.  Hence the evolution from a loose reproduction system to a
faithful replication system with genes is understood from the kinetic viewpoint
of chemical reactions.

2618:

2619: (2) {\bf Recursiveness of production in an intermingled hypercycle network}

2620:

Next, a protocell model consisting of a variety of mutually catalyzing
molecule species is investigated.  When the number of molecules in a
cell is not too small and the number of possible species in a cell is not too
large, recursive production of a cell is achieved.  This
recursive production state consists of 5-12 dominant molecule species,
which form an intermingled hypercycle network (IHN).  Within this IHN there
is a core hypercycle, while the multiple parallel reaction paths in the
IHN are important to ensure the stability of the state against invasion
by parasitic molecules and against fluctuations in the molecule number.

2630:

2631: (3) {\bf Itinerant dynamics over recursive production states}

2632:

When the fluctuation in the molecule number is not small enough,
switching among (quasi-)recursive production states appears.  A given
quasi-recursive state is destabilized by being taken over by some parasitic
molecules.  Then, the dominant molecule species change frequently over
generations, during which the growth speed of a cell is suppressed.  After this
transient, the fast change of chemical compositions subsides, so that
quasi-recursive production of a cell is sustained again.  Each switching
occurs with a loss of chemical diversity.  Note that in
high-dimensional dynamical systems, such switching among quasi-stable
states through unstable transient dynamics is studied as chaotic
itinerancy\cite{CI1,CI2,CI3}, where a loss of degrees of freedom is also
observed in the process of switching.

2645:

Destabilization of a recursive state in the present model occurs through
the decrease of the population of the minority molecules in the core
hypercycle.  As this molecule species is taken over by parasitic
molecules, the switching starts to occur.  In this sense, the process of
switching is not random, but is restricted to specific routes within
the phase space of chemical composition, as in chaotic itinerancy.
It is interesting to study the present switching among recursive states as
a stochastic version of chaotic itinerancy.

2654:

(4) {\bf Evolution through itinerant dynamics}

2656:

By introducing changes in the available reaction paths into the model, the
hypercycle network evolves to recursive production states.  Following the
itinerant dynamics above, each recursive state is eventually destabilized,
but later another recursive state evolves.  Through these successive
visits to recursive states, a cell can evolve to have a chemical
network supporting a higher growth speed.  Since the minority species in
the hypercycle network is relevant to this switching, minority molecules
are shown to be important to evolution.

2665:

(5) {\bf Universal statistics and control of fluctuations}

2667:

The statistics of the number fluctuations of each molecule species have been
studied.  We have found (i) a power-law distribution for fast switching
molecules, (ii) suppression of fluctuations in the core hypercycle species,
and (iii) ubiquity of the log-normal distribution for most other molecule
species.  The log-normal distribution generally originates from the
multiplicative stochastic process in the catalytic reaction dynamics,
as is confirmed in several other reaction network models.  On the other
hand, the suppression of the number fluctuations in the core hypercycle is
due to the high connectivity of its reaction paths with other molecules.  In
particular, the number fluctuations of the minority molecule
species, which has many catalytic connections with others, are reduced.  This
suppression of fluctuations further reinforces the minority control of
the reproduction of a cell.  A deviation from the ubiquitous log-normal
distribution thus appears, which may be important in the control of cell
function.

2683:

In the present paper, we have not discussed cell-cell interaction, and
have restricted our study to the production process of a single cell.  Of
course, cells start to interact with each other as the cell density is
increased through cell division.  Indeed, by including cell-cell
interaction in the present cell model with a reaction network, cell
differentiation and the morphogenesis of a cell aggregate have been
studied\cite{KKTY,Furusawa}.  Through the instability of
intra-cellular dynamics with cell-cell interaction, cell
differentiation, irreversible loss of plasticity in cells, and a robust
pattern formation process appear as a general course of development with
the increase of the cell number.  The relevance of minority control and
of the deviation from universal statistics to such multicellular developmental
processes will be an important issue to be studied in future.

2697:

{\sl Acknowledgments}

2699:

2700: The author is grateful to T. Yomo, C. Furusawa, W. Fontana, Y. Togashi,

2701: A. Awazu, and K. Fujimoto for discussions.  The work is partially

2702: supported by Grant-in-Aids for Scientific Research from the Ministry of

2703: Education, Science, and Culture of Japan (11CE2006).

2704:

2705:

2706:

2707: \begin{thebibliography}{999}

2708:

2709: \bibitem{whatlife}

K. Kaneko, ``What is Life?: A complex systems approach'' (in Japanese),
Univ. Tokyo Press, 2003

2712:

2713: \bibitem{minority}

2714: K. Kaneko, T. Yomo,

2715: % 2002a. On a kinetic origin of heredity :minority control in

2716: %replicating molecules.

2717: J. Theor. Biol. 214 (2002) 563-576

2718:

2719: \bibitem{Shannon}

C. Shannon and W. Weaver, ``The Mathematical Theory of Communication'',
Univ. of Illinois Press, 1949

2722:

2723: \bibitem{Brillouin}

2724: L. Brillouin,

2725: {\sl Science and Information Theory},

2726: Academic Press 1969

2727:

2728: \bibitem{Barabasi}

2729: H. Jeong, et al., {\it Nature} {\bf 407}, 651 (2000);

2730: H. Jeong, S. P. Mason, A.-L. Barab\'{a}si, {\it Nature} {\bf 411}, 41 (2001).

2731:

2732: \bibitem{Spiegelman}

2733: D.R. Mills, R.L. Peterson, and S. Spiegelman,

2734: %An extracellular Darwinian experiment with a self-duplicating nucleic acid molecule,

2735: Proc. Nat. Acad. Sci. USA 58 (1967) 217;

2736: D.R. Mills, F.R. Kramer, and S. Spiegelman,

2737: %Complete nucleotide sequence of a replicating RNA molecule,

2738: Science 180 (1973) 916

2739:

2740: \bibitem{Eigen}

2741: M. Eigen and  P. Schuster, {\sl The Hypercycle} (Springer, 1979).

2742:

2743: \bibitem{Hogeweg}

2744: M. Boerlijst and P. Hogeweg, Physica 48D (1991) 17;

2745: P.Hogeweg

2746: %``Multilevel evolution: replicators and the evolution of diversity",

2747: Physica 75 D (1994)275-291

2748:

2749: \bibitem{Dyson}

2750: F. Dyson, {\sl Origins of Life}, Cambridge Univ. Press., 1985

2751:

2752: \bibitem{Kauffman}

2753: S.A. Kauffman, {\sl The Origin of Order}, Oxford Univ. Press. 1993

2754:

2755: \bibitem{Bagley}

R. Bagley, J.D. Farmer, S. Kauffman,

2757: pp 93-140, in {\sl Artificial Life} 1989, ed. C. Langton

2758:

2759: \bibitem{Cairns-Smith}

2760: A.G. Cairns-Smith,

2761: Clay Minerals and the Origin of Life(1982), Cambridge Univ. Press.

2762:

2763: \bibitem{mtb}

2764: K. Kaneko,

2765: in {\sl Function and Regulation of Cellular Systems} (2003)

2766: %Constructive and Dynamical Systems Approach to Life "

2767: Birkh\"{a}user (ed. A. Deutsch et al.)

2768:

2769: \bibitem{Complexity}

2770: K. Kaneko,

2771: %``Life as Complex Systems: Viewpoint from Intra-Inter Dynamics'',

2772: Complexity 3 (1998) 53-60

2773:

2774: \bibitem{KKTY}

2775: K. Kaneko and T. Yomo,

2776: %`` Cell Division, Differentiation, and Dynamic Clustering",

2777: Physica 75 D (1994) 89-102;

2778: Bull. Math. Biol. 59 (1997) 139;

2779: %``Isologous Diversification for Robust Development of Cell Society ",

2780: J. Theor. Biol. 199 (1999) 243-256

2781:

2782: \bibitem{Furusawa}

2783: Furusawa C. \& Kaneko K.,

2784: %``Emergence of Rules in Cell Society: Differentiation, Hierarchy, and Stability"

2785: Bull. Math. Biol. 60 (1998) 659-687;

2786: %Furusawa C, Kaneko K. 2000a. Origin of complexity in multicellular organisms.

2787: Phys. Rev. Lett. 84 (2000) 6130-6133;

2788: %C. Furusawa and K. Kaneko;

2789: %Theory of Robustness of Irreversible Differentiation in a Stem Cell

2790: %System: Chaos Hypothesis;

2791: J. Theor. Biol.  209 (2001) 395-416;

2792: Anatomical Record, 268 (2002) 327-342;

2793: J. Theor. Biol. 224 (2003) 413-435.

2794:

2795: \bibitem{speciation}

2796: K. Kaneko and T. Yomo,

2797: % ``Sympatric Speciation: compliance with phenotype diversification from a single genotype ",

2798: Proc. Roy. Soc. B, 267 (2000) 2367-2373;

2799: K. Kaneko,

2800: %" Symbiotic Sympatric Speciation: Compliance with Interaction-driven Phenotype Differentiation from a Single Genotype "

2801: Population Ecology, 44 (2002) 71-85

2802:

2803: \bibitem{Matsuura}

2804: T. Matsuura, T. Yomo, M. Yamaguchi, N. Shibuya, E.P. Ko-Mitamura, Y. Shima, and

2805: I. Urabe,

2806: %``Importance of compartment formation for a self-encoding system",

2807: Proc. Nat. Acad. Sci. USA  99 (2002) 7514-7517

2808:

2809: \bibitem{Ko}

2810: E. Ko, T. Yomo, and I. Urabe, Physica 75 D (1994) 81-88

2811:

2812: \bibitem{Kashiwagi1}

2813: Kashiwagi A., Noumachi W., Katsuno M., Alam M.T., Urabe I., and Yomo T.

2814: %``Plasticity of Fitness and Diversification Process During an Experimental Molecular Evolution",

2815: J. Mol. Evol. {\bf 52} (2001) 502-509.

2816:

2817: \bibitem{Kashiwagi2}

2818: A. Kashiwagi, I. Urabe, K. Kaneko, T. Yomo, submitted (2003)

2819:

2820: \bibitem{Asashima}

2821: T. Ariizumi and M. Asashima,

2822: Int. J. Dev. Biol. 45 (2001) 273-279

2823:

2824: %\bibitem{McCaskill}

2825: %S.Altmeyer and J.S. McCaskill,

2826: %Phys. Rev. Lett. 86 (2001) 5819%-5822

2827:

2828: \bibitem{AL}

2829: C. Langton, ed., {\sl Artificial Life}, Addison-Wesley, 1989

2830:

2831: \bibitem{Fontana}

2832: W. Fontana and L.W. Buss,

2833: %The arrival of the fittest: Toward a theory of biological organization.

2834: Bull. Math. Biol. 56 (1994) 1-64

2835:

2836: \bibitem{Awazu}

2837: A. Awazu and K. Kaneko, preprint 2003.

2838:

2839: \bibitem{Zipf}

2840: C. Furusawa and K. Kaneko, Phys. Rev. Lett. 90 (2003) 088102.

2841:

2842: \bibitem{Cell} B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts, and J.D. Watson,

2843: {\sl The Molecular Biology of the Cell}, 1983,1989,1994,2002

2844:

2845: \bibitem{Mikhailov}

2846: B. Hess and A. Mikhailov,

2847: %Self-Organization in Living Cells

2848: Science {\bf 264}, 223 (1994);

2849: A. Mikhailov and B. Hess,

2850: J. Theor. Biol. {\bf 176}, 185-192 (1995).

2851:

2852: \bibitem{Togashi}

2853: Y. Togashi and K. Kaneko,

2854: %`` Transitions Induced by the Discreteness of Molecules

2855: %in a Small Autocatalytic System''

2856: Phys. Rev. Lett. 86 (2001) 2459;

2857: J. Phys. Soc. Japan 72 (2003) 62-68;

2858: preprint 2003.

2859:

2860: \bibitem{Szathmary}

2861: E. Szathm\'{a}ry and J. Maynard Smith,

2862: %``From Replicators to Reproducers: the First Major Transitions Leading to Life",

2863: J. Theor. Biol. 187 (1997) 555-571

2864:

2865: \bibitem{Eigen-book}

2866: M. Eigen,  Steps towards Life, Oxford Univ. Press., 1992

2867:

2868: \bibitem{KK-net}

2869: K. Kaneko, J. Biol. Phys. 28 (2002) 781;%-792

2870: Adv. in Complex Systems 6 (2003) 79-92

2871:

2872: \bibitem{KK-PRE}

2873: K. Kaneko, Phys. Rev. E 68 (2003) 031909.

2874:

2875: \bibitem{Lancet}

2876: %A recursive state in a mutually catalytic system was also discussed by as a `compositional genome',

2877: D. Segr\'{e}, D. Ben-Eli, D. Lancet,

2878: %``Compositional genomes: prebiotic information transfer in mutually catalytic noncovalent assemblies'',

2879: Proc. Natl. Acad. Sci. USA 97 (2000) 4112;

2880: D. Segr\'{e} et al., J. Theor. Biol. {\bf 213} (2001) 481;

2881: D. Segr\'{e} and D. Lancet, EMBO Reports {\bf 1} (2000) 217.

2882:

2883: \bibitem{Sigmund}

2884: J. Hofbauer and K. Sigmund,

2885: {\sl Evolutionary Games and Population Dynamics},

2886: Cambridge Univ. Press. 1998

2887:

2888: \bibitem{homeochaos}

2889: K. Kaneko and T. Ikegami,

2890: %"Homeochaos: Dynamics Stability of a symbiotic network with population dynamics and evolving mutation rates",

2891: Physica D 56 (1992) 406-429

2892:

2893: \bibitem{Ikegami}

2894: T. Ikegami and T. Hashimoto,

2895: Artificial Life 2 (1996) 305-318

2896:

2897: \bibitem{Takagi}

2898: H. Takagi and K. Kaneko, preprint (2003)

2899:

2900: \bibitem{Mikhailov-book}

2901: A. S. Mikhailov \& V. Calenbuhr, ``From Cells to Societies''

2902: Springer 2002

2903:

2904: \bibitem{Sornette}

2905: D. Sornette, {\sl Critical Phenomena in Natural Sciences}, Springer 2002

2906:

2907: \bibitem{Zipf-book}

2908: G. K. Zipf, {\it Human Behavior and the Principle of Least Effort}

2909: (Addison-Wesley, Cambridge, 1949).

2910:

2911:

2912: \bibitem{log}

2913: C. Furusawa, T. Suzuki, A. Kashiwagi, T. Yomo, and K. Kaneko,

2914: ``Ubiquity of Log-normal Distribution in Gene Expression'',

2915: preprint

2916:

2917: \bibitem{Sato}

2918: K. Sato, Y. Ito, T. Yomo, and K. Kaneko;

2919: %On the Relation between Fluctuation and Response in Biological Systems;

2920: Proc. Nat. Acad. Sci. USA 100 (2003) 14086-14090

2921:

2922: \bibitem{CI1}

2923: K. Kaneko, %``Clustering, Coding, Switching, Hierarchical Ordering,

2924: %and Control in Network of Chaotic Elements",

2925: Physica D 41 (1990) 137-172

2926:

2927: \bibitem{CI2}

2928: I. Tsuda, Neural Networks 5 (1992) 313

2929:

2930: \bibitem{CI3}

2931: K. Kaneko and I. Tsuda, eds., Focus issue on ``Chaotic Itinerancy",

2932: Chaos 13 (2003) 926

2933:

2934: \end{thebibliography}

2935: \end{document}

2936: