% 0611:q-bio0611031/main.tex

1: \documentclass[preprint,aps]{revtex4}

2: %\documentclass[a4paper,12pt]{article}

3: %\documentclass[preprint,aps,draft]{revtex4}

4: %\documentclass[preprint,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}

5:

6: \usepackage{graphicx}

7:

8: %\usepackage[dviout]{color}

9:

10: %\input{c:/sato/unix/Latex/Style/format_small.tex}

11: %\input{c:/sato/unix/Latex/Style/format_normal.tex}

12:

%% make the label names visible

14: %\input{c:/sato/unix/Latex/Style/dummy.tex}

15: %\input{c:/sato/unix/Latex/Style/emerge_label.tex}

16:

%% suppress page numbers

18: %\pagestyle{empty}

19: %\thispagestyle{empty}

20:

21: %% standard commands 2006/08/15

22: \newcommand{\integrate}{\int}

23: %\renewcommand{\<}{\left\langle}

24: %\renewcommand{\>}{\right\rangle}

25: \newcommand{\integral}{\int}

26: \newcommand{\bra}{\langle}

27: \newcommand{\ket}{\rangle}

28: \newcommand{\braket}[1]{\langle #1 \rangle}

29: \newcommand{\bubun}[2]{\frac{\partial #1}{\partial #2}}

30: \newcommand{\bibun}[2]{\frac{d #1}{d #2}}

31: \newcommand{\infinity}{\infty}

32:

33: \begin{document}

34:

35: \preprint{ }

36:

37: \title{Evolution Equation of Phenotype Distribution: General Formulation and Application

38: to Error Catastrophe} \author{Katsuhiko Sato$^1$ and Kunihiko Kaneko$^{1,2}$} \address{

39: $^1$ Complex Systems Biology Project, ERATO JST} \address{ $^2$ Department of Pure and

40: Applied Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan }

41: \email{sato@complex.c.u-tokyo.ac.jp; kaneko@complex.c.u-tokyo.ac.jp} \date{\today}

42:

43: \begin{abstract}

44: An equation describing the evolution of phenotypic distribution is derived using methods

45: developed in statistical physics.  The equation is solved by using the singular

46: perturbation method, and assuming that the number of bases in the genetic sequence is

47: large.  Applying the equation to the mutation-selection model by Eigen provides the

48: critical mutation rate for the error catastrophe.  Phenotypic fluctuation of clones

49: (individuals sharing the same gene) is introduced into this evolution equation.  With this

50: formalism, it is found that the critical mutation rate is sometimes increased by the

51: phenotypic fluctuations, i.e., noise can enhance robustness of a fitted state to mutation.

52: Our formalism is systematic and general, while approximations to derive more tractable

53: evolution equations are also discussed.

54: \end{abstract}

55:

56: \pacs{ }

57:

58: \keywords{ }

59:

60: \maketitle

61:

62: \section{Introduction}

63:

For decades, quantitative studies of evolution in laboratories have used bacteria and
other microorganisms\cite{Lenski,Lenski-Rose,Kishony}.  Changes in phenotypes, such as
enzyme activity and gene expression, introduced by mutations in genes are measured along
with the changes in the population distribution of these phenotypes
\cite{Bactreia-many-generations, Dekel-Alon,Kashiwagi-Noumachi,Ito}.  Following
such experimental advances, it is important to analyze the evolution equation for the
population distribution of the genotypes and phenotypes concerned.

71:

72: In general, fitness for reproduction is given by a phenotype, not directly by a genetic

73: sequence. Here, we consider evolution in a fixed environment, so that the fitness is given

as a fixed function of the phenotype.  A phenotype is determined by a mapping from a genetic
sequence. This phenotype is typically represented by a continuous (scalar) variable, such
as enzyme activity, protein abundance, or body size. For studying the evolution of a

77: phenotype, it is essential to establish a description of the distribution function for a

78: continuous phenotypic variable, where the fitness for survival, given as a function of

79: such a continuous variable, determines population distribution changes over generations.

80:

81: However, since a gene is originally encoded on a base sequence (such as AGCTGCTT in DNA),

82: it is represented by a symbol sequence of a large number of discrete elements. Mutation in

83: a sequence is not originally represented by a continuous change.  Since the fitness is

84: given as a function of phenotype, we need to map base sequences of a large number of

85: elements onto a continuous phenotypic variable $x$, where the fitness is represented as a

86: function of $x$, instead of the base sequence itself.  A theoretical technique and careful

87: analysis are needed to project a discrete symbol sequence onto a continuous

88: variable.

89:

90: Mutation in a nucleotide sequence is random, and is represented by a stochastic process.

91: Thus, a method of deriving a diffusion equation from a random walk is often

92: applied. However, the selection process depends on the phenotype. If a phenotype is given

93: as a function of a sequence, the fitness is represented by a continuous variable mapped

94: from a base sequence. Since the population changes through the selection of fitness, the

95: distribution of the phenotype changes accordingly. If the mapping to the phenotype

96: variable is represented properly, the evolutionary process will be described by the

97: dynamics of the distribution of the variable, akin to a Fokker-Planck equation.

98:

99: In fact, there have been several approaches to representing the gene with a continuous

100: variable \cite{footnote}

101: %[ A well established theory in population genetics, which adopts diffusion equation or

102: %Fokker-Planck type equation, is related to the frequency of genes in alleles in population

103: %as developed by Wright, Fisher, and Kimura\cite{Fisher,Wright,kimura1970}.  How new genes

104: %spread in population is analyzed by the diffusion equation. In contrast, we are concerned

105: %with the changes in the distribution of a base sequence consisting of a haploid gene.  ]

.  Kimura\cite{Kimura2} analyzed the population distribution of a continuous fitness.
Also, for certain conditions, a Fokker-Planck type equation has been analyzed by
Levine\cite{Levine}. Generalizing these studies provides a systematic derivation of an
equation describing the evolution of the distribution of the phenotypic variable. We adopt
selection-mutation models describing the molecular biological evolution discussed by
Eigen\cite{Eigen}, Kauffman\cite{kauffmann-book}, and others, take a continuum limit
assuming that the number of bases $N$ in the genetic sequence is large, and derive the
evolution equation systematically as an expansion in $1/N$.

114:

115: In particular, we refer to Eigen's equation\cite{Eigen}, originally introduced for the

evolution of RNA, where the fitness is given as a function of a sequence. Mutation of a

117: sequence is formulated by a master equation, which is transformed to a diffusion-like

118: equation.  With this representation, population dynamics over a large number of species is

119: reduced to one simple integro-differential equation with one variable. Although the

120: equation obtained is a non-linear equation for the distribution, we can adopt techniques

121: developed in the analysis of the (linear) Fokker-Planck equation, such as the

122: eigenfunction expansion and perturbation methods.

123:

So far, we have assumed a fixed, unique mapping from a genotype to a phenotype.  However,
there are phenotypic fluctuations among individuals sharing the same genotype, which have
recently been measured quantitatively in the form of stochastic gene expression
\cite{Koshland,Elowitz,Kaern-Collins,Collins,Furusawa,Ueda,noise-review}.  The relevance of
such fluctuations to evolution has also been
discussed\cite{SatoPNAS,kaneko-book,KKFurusawaJTB,Ancel}.  In this case, the mapping from a gene
gives the average of the phenotype, but the phenotype of each individual fluctuates around the
average.  In the second part of the present paper, we introduce this isogenic phenotypic
fluctuation into our evolution equation.  Indeed, our framework of Fokker-Planck type
equations is well suited to including such fluctuations, so that one can discuss the effect of
isogenic phenotypic fluctuations on evolution.

135:

136: The outline of the present paper is as follows: We first establish a sequence model in

137: section (\ref{31:setup-and-derivation}). For deriving the evolution equation from the

138: sequence model, we postulate the assumption that the transition probability of phenotype

139: values is uniquely determined by the original phenotype value.  The assumption may appear

too demanding at first sight, but we show that it is not unnatural from the viewpoint of

141: evolutionary biology. In fact, most models studied so far satisfy this postulate.  With

142: this assumption, we derive a Fokker-Planck type equation of phenotypic distribution using

143: the Kramers-Moyal expansion method from statistical

144: physics\cite{vanKampen,KuboMatsuoKitahara}.  We discuss the validity of this expansion

145: method to derive the equation, also from a biological point of view.

146:

As an example of the application of our formulation, we study Eigen's model in section
(\ref{20}), and estimate the critical mutation rate at which the error catastrophe occurs,
using a singular perturbation method.  In section (\ref{32:discussion}), we discuss the
range of applicability of our method and possible extensions of it.

151:

152: Following the formulation and application of the Fokker-Planck type equations for

153: evolution, we study the effect of isogenic phenotypic fluctuations.  While fluctuation in

154: the mapping from a genotype to phenotype modifies the fitness function in the equation,

155: our formulation itself is applicable.  We will also discuss how this fluctuation changes

156: the conditions for the error catastrophe, by adopting Eigen's model.

157:

To conclude the paper, we discuss the generality of our formulation and the relevance of
isogenic phenotypic fluctuation to evolution.

160:

161: \section{Derivation of evolution equation}

162: \label{31:setup-and-derivation}

163:

We consider a population of individuals having a haploid genotype, which is encoded on a
sequence consisting of $N$ sites (consider, for example, DNA or RNA). The gene is
represented by this symbol sequence, each site of which takes a value from a set of numbers, such as
$\{-1,1\}$. This set of numbers is denoted by $S$. By denoting the state value of the
$i$th site by $s_i$ ($ \in S$), the configuration of the sequence is represented by the
ordered set $s=\{s_1,\ldots,s_N\}$.

170:

171: We assume that a scalar phenotype variable $x$ is assigned for each sequence $s$.  This

mapping from sequence to phenotype is given as a function $x(s)$. Examples of the phenotype
include the activity of some enzyme (protein), the infection rate of a bacterial virus, and
the replication rate of RNA.  In general, the function $x(s)$ is a degenerate function, i.e.,

175: many different sequences are mapped onto the same phenotypic value $x$.

176:

Each sequence is reproduced with rate $A$, which is assumed to depend only on the
phenotypic value $x$, as $A(x)$; this assumption may be justified by choosing a
phenotypic value $x$ that is related to replication. For example, if a protein is involved in
the metabolism of a replicating cell, its activity may affect the replication rate of the
cell and of the protein itself.

182:

In the replication of the sequence, mutation generally occurs; for simplicity, we consider
only the substitution of $s_i$. With a given constant mutation rate $\mu$ over all sites
in the sequence, the state $s'_i$ of the daughter sequence is changed from $s_i$ of the
mother sequence, where the value $s'_i$ is assigned from the members of the set $S$ with
equal probability. We call this type of mutation symmetric mutation~\cite{Baake2}. The
mutation is represented by the transition probability $Q(s \rightarrow s')$ from the
mother sequence $s$ to the daughter sequence $s'$. The probability $Q$ is uniquely determined by
the sequence $s$, the mutation rate $\mu$, and the number of members of $S$.  The setup so
far is essentially the same as that adopted by Eigen et al.\cite{Eigen}, where the fitness is
given as a function of the RNA sequence or the DNA sequence of a virus.

193:

194: Now, we assume that the transition probability depends only on the phenotypic value $x$,

195: i.e., the function $Q$ can be written in terms of a probability function $W$, which

196: depends only on $x$, $W(x \rightarrow x')$, as

197: \begin{equation} \sum_{s' \in \{ s'|x'=x(s') \}} Q(s \rightarrow

198: s')=W(x(s) \rightarrow x') .

199: \label{1} \end{equation}

200:

This assumption may appear too demanding. However, most models of sequence evolution
adopt this assumption, at least implicitly. For example, in Eigen's model, fitness is given as a

203: function of the Hamming distance from a given optimal sequence.  By assigning a phenotype

204: $x$ as the Hamming distance, the above condition is satisfied (this will be discussed

205: later). In Kauffman's NK model, if we set $N \gg 1$, $K \gg 1$, and $K/N \ll 1$, this

206: assumption is also satisfied (see Appendix \ref{29}). For the RNA secondary structure

207: model\cite{Waterman}, this assumption seems to hold approximately, from statistical

208: estimates through numerical simulations. Some simulations on a cell model with chemical

209: reaction networks\cite{Furusawa,Furusawa-KK} also support the assumption. In fact, a

210: similar assumption has been made in evolution theory with a gene substitution

211: process\cite{Gillespie,Orr}.

212:

213: The validity of this assumption in experiments has to be confirmed. Consider a selection

214: experiment to enhance some function through mutation, such as the evolution of a certain

215: protein to enhance its activity\cite{Ito}. In this case, the assumption means that the

216: activity distribution over the mutant proteins is statistically similar as long as they

217: have the same activity, even though their mother protein sequences are different.

218:

219: With the above setup, we consider the population of these sequences and their dynamics,

220: allowing for overlap between generations, by taking a continuous-time

221: model\cite{Baake2}. We do not consider the death rate of the sequence explicitly since its

222: consideration introduces only an additional term, as will be shown later. The

223: time-evolution equation of the probability distribution $\hat{P}(s,t)$ of the sequence $s$

224: is given by:

225: \begin{equation} \bubun{\hat{P}(s,t)}{t} = -\bar{A}(t) \hat{P}(s,t) +

226: \sum_{s'} A(x(s')) Q(s' \rightarrow s) \hat{P}(s',t), \label{3} \end{equation} as

227: specified by Eigen\cite{Eigen}. Here the quantity $\bar{A}(t)$ is the average fitness of

228: the population at time $t$, defined by $\bar{A}(t)=\sum_{s} A(x(s)) \hat{P}(s,t)$ and $Q$

229: is the transition probability satisfying $\sum_{s} Q(s' \rightarrow s)=1$ for any $s'$.

230:

231: %By following

232: According to the assumption (\ref{1}), eq. (\ref{3}) is transformed into the equation for

233: $P(x,t)$, which is the probability distribution of the sequences having the phenotypic

234: value $x$, defined by $P(x,t)=\sum_{ s \in \{s | x = x(s)\}} \hat{P}(s,t)$. The equation

235: is given by \begin{equation} \bubun{P(x,t)}{t} = - \bar{A}(t) P(x,t) + \sum_{x'} A(x')

236: W(x' \rightarrow x) P(x',t), \label{4}

237: \end{equation}

238: where the function $W$ satisfies

239: \begin{equation} \sum_{x} W(x' \rightarrow x)=1 \qquad \mbox{for any $x'$,}

\label{2} \end{equation} \noindent as follows from the normalization of $Q$.
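
The step from eq. (\ref{3}) to eq. (\ref{4}) can be sketched as follows; this only spells
out the argument above and adds no new assumption. Summing eq. (\ref{3}) over all
sequences $s$ with $x(s)=x$ and using assumption (\ref{1}) gives
\begin{eqnarray*}
\bubun{P(x,t)}{t} &=& -\bar{A}(t) P(x,t) + \sum_{s'} A(x(s'))
\Bigl[ \sum_{s \in \{ s|x=x(s) \}} Q(s' \rightarrow s) \Bigr] \hat{P}(s',t) \\
&=& -\bar{A}(t) P(x,t) + \sum_{s'} A(x(s')) W(x(s') \rightarrow x) \hat{P}(s',t) \\
&=& -\bar{A}(t) P(x,t) + \sum_{x'} A(x') W(x' \rightarrow x) P(x',t),
\end{eqnarray*}
where the last line groups the sequences $s'$ by their phenotypic value $x'=x(s')$.
In the same way, $\sum_{x} W(x(s') \rightarrow x)=\sum_{s} Q(s' \rightarrow s)=1$.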

241:

242: Since $N$ is sufficiently large, the variable $x$ is regarded as a continuous variable. By

243: using the Kramers-Moyal expansion\cite{vanKampen,KuboMatsuoKitahara,Haken}, with the help

244: of property (\ref{2}), we obtain:

245: \begin{equation}

246: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \sum_{n=1}^{\infinity} \frac{(-1)^n }{n!}

247: \bubun{{}^n}{x^n} m_n(x) A(x) P(x,t),

248: \label{5} \end{equation}

249: where $m_n(x)$ is the $n$th moment about the value $x$, defined by

250: $m_n(x)= \int (x'-x)^n W(x \rightarrow x') dx' $.
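
The expansion can be sketched as follows, where the jump size $r=x-x'$ and the shorthand
$F(x)=A(x)P(x,t)$ are notation introduced here only for this illustration. Replacing the
sum in eq. (\ref{4}) by an integral and expanding in $r$ gives
\begin{eqnarray*}
\int F(x-r)\, W(x-r \rightarrow x)\, dr
&=& \int \sum_{n=0}^{\infinity} \frac{(-r)^n}{n!} \bubun{{}^n}{x^n}
\left[ F(x) W(x \rightarrow x+r) \right] dr \\
&=& \sum_{n=0}^{\infinity} \frac{(-1)^n}{n!} \bubun{{}^n}{x^n}
\left[ m_n(x) F(x) \right],
\end{eqnarray*}
assuming that the sum and the integral may be interchanged. The $n=0$ term equals
$A(x)P(x,t)$ because $m_0(x)=1$ by property (\ref{2}); combined with the loss term
$-\bar{A}(t)P(x,t)$, it yields the first term of eq. (\ref{5}).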

251:

252: Let us discuss the conditions for the convergence of expansion (\ref{5}), without

253: mathematical rigor.  For convergence, it is natural to assume that the function $W(x'

254: \rightarrow x)$ decays sufficiently fast as $x$ gets far from $x'$, by the definition of

255: the moment.

256:

Here, the transition $W(x' \rightarrow x)$ is a result of $n$-point mutations of the
original sequence $s'$ for $n=0,1,2,\ldots,N$. Accordingly, we introduce a set of quantities,
$w_n(x(s') \rightarrow x)$, as the phenotype distribution of the $n$-point mutants of the
original sequence $s'$ (naturally, $w_0(x(s') \rightarrow x)=\delta(x(s')-x)$, which does
not contribute to the $n$th moment $m_n$ for $n \geq 1$). Next, we introduce the probability
$p_n$ that a daughter sequence is an $n$-point mutant $(n=0,1,2,\ldots,N)$ of its mother
sequence, which is determined only by the mutation rate $\mu$ and the sequence length
$N$. Indeed, the $p_n$'s form a binomial distribution, characterized by $\mu$ and $N$.

265:

266: In terms of the quantities $w_n$ and $p_n$, we are able to write down the transition

267: probability $W$ as

268: \begin{equation}

269: W(x(s') \rightarrow x)=\sum_{n=0}^{N} p_n w_n(x(s') \rightarrow x).\label{6}

270: \end{equation}

Now, we discuss whether $W(x(s') \rightarrow x)$ decays sufficiently fast with $|x(s')-x|$.
First, we note that the width of the domain in which $w_n(x(s') \rightarrow x)$ is not
close to zero increases with $n$, since $n$-point mutants involve an increasing number of
changes in the phenotype for larger values of $n$.  Then, to satisfy the condition on
$W(x(s') \rightarrow x)$, at least the single-point-mutant transition $w_1(x(s')
\rightarrow x)$ has to decay sufficiently fast with $|x(s')-x|$. In other words, the
phenotypic value of a single-point mutant $s$ of the mother sequence $s'$ must not vary
much from that of the original sequence, i.e., $|x(s')-x(s)|$ should not be large
(``continuity condition'').

280:

In general, the width of the domain of $|x-x(s')|$ in which $w_n(x(s') \rightarrow x) \neq 0$
increases with $n$. On the other hand, the term $p_n$ decreases with $n$ through the factor
$\mu^n$. Hence, as long as the mutation rate is not large, the contribution of $w_n$ to
$W$ is expected to decay with $n$. Thus, if the continuity condition with regard to a
single-point mutant and a sufficiently low mutation rate are satisfied, the requirement on
$W(x(s') \rightarrow x)$ should be fulfilled. Hence, the convergence of the expansion is
expected.

288:

289: Following the argument, we further restrict our study to the case with a small mutation

290: rate $\mu$ such that $\mu N \ll 1$ holds. The transition probability $W$ in eq.  (\ref{6})

291: is written as

292: \begin{equation} W(x(s') \rightarrow x) \simeq (1-\mu N)

293: \delta(x(s')-x) + \mu N w_1(x(s') \rightarrow x),\label{7}

294: \end{equation}

where we have used the property that the $p_n$'s form the binomial distribution
characterized by $\mu$ and $N$. Introducing a new parameter, $\gamma$ ($\gamma=\mu N$),
that gives the average number of sites changed in a single replication, and using
the transition probability (\ref{7}), we obtain

299: \begin{equation} \bubun{P(x,t)}{t} =

300: (A(x)-\bar{A}(t)) P(x,t) + \gamma \sum_{n=1}^{\infinity} \frac{(-1)^n

301: } {n!} \bubun{{}^n}{x^n} m_n^{(1)}(x) A(x) P(x,t), \label{8}

302: \end{equation}

303: where $m_n^{(1)}(x)$ is the $n$th moment of $w_{1}(x \rightarrow x')$, i.e.,

304: $m_n^{(1)}(x)=\int (x'-x)^n w_1(x \rightarrow x') dx'$.
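
Eq. (\ref{7}) can be checked from the binomial form, assuming that $\mu$ denotes the
probability that a given site is substituted in one replication (an interpretation adopted
here for illustration):
\begin{eqnarray*}
p_n &=& {N \choose n} \mu^n (1-\mu)^{N-n}, \\
p_0 &=& (1-\mu)^N = 1-\mu N + O\left((\mu N)^2\right), \\
p_1 &=& N \mu (1-\mu)^{N-1} = \mu N + O\left((\mu N)^2\right),
\end{eqnarray*}
while $\sum_{n \geq 2} p_n = O\left((\mu N)^2\right)$. The mean number of substituted
sites is $\sum_n n p_n = \mu N = \gamma$. Since the delta-function term in eq. (\ref{7})
has vanishing moments for $n \geq 1$, one finds $m_n(x) \simeq \gamma m_n^{(1)}(x)$, which
turns eq. (\ref{5}) into eq. (\ref{8}).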

305:

306: When we stop the expansion at the second order, as is often adopted in statistical

307: physics, we obtain

308: \begin{equation}

309: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \gamma \bubun{{}}{x}

310: \left[ - m_1^{(1)}(x) + \frac{1}{2} \bubun{{}}{x} m_2^{(1)}(x) \right]

311: A(x) P(x,t). \label{9}

312: \end{equation}

Eqs. (\ref{8}) and (\ref{9}) are the basic equations for the evolution of the distribution
function.  Eq. (\ref{9}) is an approximation. However, it is often more tractable, with
the help of techniques developed for solving the Fokker-Planck equation (see Appendix
\ref{10} and \cite{PhysicalBiology}), while there is no established standard method for
solving eq. (\ref{8}).

318:

As the boundary condition, we naturally impose that there is no probability flux, which is
given by

321: \begin{equation} \left.

322: \sum_{n=1}^{\infinity} \frac{(-1)^n } {n!} \bubun{{}^{(n-1)}}{x^{(n-1)}} m_n^{(1)}(x) A(x)

323: P(x,t) \right|_{x=x_1, x_2} =0, \label{26} \end{equation} in the case of (\ref{8}) and

324: \begin{equation}

325: \left. \left[ - m_1^{(1)}(x) + \frac{1}{2} \bubun{{}}{x} m_2^{(1)}(x) \right] A(x) P(x,t)

326: \right|_{x=x_1, x_2} =0

327: \label{27}

328: \end{equation}

329: in the case of (\ref{9}), where $x_1$ and $x_2$ are the values of the left and right

330: boundaries, respectively.

331:

332: Next, as an example of the application of our formula, we derive the evolution equation

333: for Eigen's model, and estimate the error threshold, with the help of a singular

334: perturbation theory. Through this application, we can see the validity of eq. (\ref{9}) as

335: an approximation of eq. (\ref{8}).

336:

337: Two additional remarks: First, introduction of the death of individuals is rather

338: straightforward. By including the death rate $D(x)$ into the evolution equation, the first

339: term in eq. (\ref{8}) (or eq. (\ref{9})) is replaced by

340: $\left[(A(x)-D(x))-(\bar{A}(t)-\bar{D}(t))\right] P(x,t)$, where $\bar{D}(t) \equiv \int

341: D(x) P(x,t) dx$. Second, instead of deriving each term in eq. (\ref{9}) from microscopic

342: models, it may be possible to adopt it as a phenomenological equation, with parameters (or

343: functions) to be determined heuristically from experiments.

344:

345: %%---

346:

\section{Application to error threshold in Eigen model}

348: \label{20}

349:

350: In the Eigen model\cite{Eigen}, the set $S$ of the site state values is given by

351: $\{-1,1\}$, and the fitness (replication rate) of the sequence is given as a function of

352: its Hamming distance from the target sequence $\{1,...,1\}$, i.e., the fitness of an

353: individual sequence is given as a function of the number $n$ of the sites of the sequence

354: having value $1$. Hence it is appropriate to define a phenotypic value $x$ in the Eigen

355: model as a monotonic function of the number $n$; we determine it as $x=\frac{2n-N}{N}$, in

356: the range $[-1,1]$. Accordingly, the replication rate $A$ of the sequence can be written

357: as a function of $x$, i.e., $A(x)$; it is natural to postulate that $A$ is a non-negative

358: and bounded function over the whole domain. If the sequence length $N$ is sufficiently

359: large, the phenotypic variable $x$ can be regarded as a continuous variable, since the

360: step size of $x$ ($\Delta x=\frac{2}{N}$) approaches 0 as $N$ goes to infinity.

361:

362: In order to derive the evolution equation of form (\ref{8}) corresponding to the Eigen

363: model, we only need to know the function $w_1$ in that model. (Recall that in our

364: formulation the mutation rate $\mu$ is assumed to be so small that only a single-point

365: mutation is considered.) Due to the assumption of the symmetric mutation, this

366: distribution function is obtained as $w_1(x \rightarrow x - \Delta x)=\frac{1+x}{2}$,

367: $w_1(x \rightarrow x + \Delta x)=\frac{1-x}{2}$, and $w_1(x \rightarrow x')=0$ for any

368: other $x'$. Accordingly, the $n$th moment is given by $m_n^{(1)}(x)= \frac{1+x}{2}

369: (-\Delta x)^n + \frac{1-x}{2} (\Delta x)^n$. Now, we obtain

370: \begin{equation}

371: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \gamma \sum_{n=1}^{\infinity} \frac{1} {n!}

372: \bubun{{}^n}{x^n} \left[ \frac{1+x}{2} \left( \frac{2}{N} \right)^n + \frac{1-x}{2}

373: \left(- \frac{2}{N} \right)^n \right] A(x) P(x,t) \label{12}

374: \end{equation} where

375: $\gamma=N \mu$, the mutation rate per sequence. When we ignore the moment terms higher

376: than the second order, we have

377: \begin{equation} \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \frac{2

378: \gamma}{N} \bubun{}{x} \left[ x + \frac{1}{N}

379: \bubun{}{x} \right]A(x) P(x,t).

380: \label{11}

381: \end{equation}
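
The coefficients of eq. (\ref{11}) follow from the first two moments of $w_1$. With
$\Delta x = \frac{2}{N}$,
\begin{eqnarray*}
m_1^{(1)}(x) &=& \frac{1+x}{2} \left( -\frac{2}{N} \right) + \frac{1-x}{2} \left( \frac{2}{N} \right)
= -\frac{2x}{N}, \\
m_2^{(1)}(x) &=& \frac{1+x}{2} \left( \frac{2}{N} \right)^2 + \frac{1-x}{2} \left( \frac{2}{N} \right)^2
= \frac{4}{N^2}.
\end{eqnarray*}
Inserting these into eq. (\ref{9}) gives
$\gamma \bubun{}{x} \left[ \frac{2x}{N} + \frac{2}{N^2} \bubun{}{x} \right] A(x)P(x,t)
= \frac{2\gamma}{N} \bubun{}{x} \left[ x + \frac{1}{N} \bubun{}{x} \right] A(x)P(x,t)$.
The negative sign of $m_1^{(1)}(x)$ for $x>0$ expresses that mutation alone drives the
population toward $x=0$.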

382:

In fact, if we focus on a change near $x\sim 0$ (to be specific, $x \sim O(1/\sqrt{N})$),
the truncation of the expansion at the second order is justified. (Equivalently, if
we define $x'=(2n-N)/\sqrt{N}$ instead of $(2n-N)/N$, and expand eq.(3) in $1/\sqrt{N}$
instead of $1/N$, terms higher than the second order are negligible, as is also discussed
in \cite{Levine}. However, in this case, the validity is restricted to $x' \sim O(1)$
(i.e., $(n-N/2) \sim O(\sqrt{N})$), which means $x\sim O(1/\sqrt{N})$ in the original variable.)

389:

Now, we solve eq. (\ref{11}) with a standard singular perturbation method (see

391: Appendix \ref{10}), and then return to eq. (\ref{12}).  According to the analysis in

392: Appendix \ref{10}, the stationary solution of the equation of form (\ref{11}) is given by

393: the eigenfunction corresponding to the largest eigenvalue of the linear operator $L$

394: defined by $L=A(x)+2 \gamma \varepsilon \bubun{}{x} \left[ x + \varepsilon \bubun{}{x}

395: \right]A(x)$ with $\varepsilon=\frac{1}{N}$. Now we consider the eigenvalue problem

396: \begin{equation} A(x) P(x) + 2 \gamma \varepsilon \bubun{}{x} \left[ x + \varepsilon

397: \bubun{}{x} \right]A(x) P(x) = \lambda P(x) \label{23},

398: \end{equation} where

399: $P(x) \geq 0$, with $\lambda$ to be determined.

400:

401: Since $\varepsilon$ is very small (because $N$ is sufficiently large), a singular

402: perturbation method, the WKB approximation\cite{Morse-book}, is applied. Let us put

\begin{equation} P(x)=e^{\frac{1}{\varepsilon}\int_{x_0}^{x} R(\varepsilon,x') dx'},

404: \label{28} \end{equation} where $x_0$ is some constant and $R$ is a

405: function of $\varepsilon$ and $x$, which is expanded with respect to $\varepsilon$ as

406: \begin{equation} R(\varepsilon,x)=R_0(x)+\varepsilon R_1(x)+\varepsilon^2 R_2(x)+...

407: \label{22}

408: \end{equation} Retaining only the zeroth order terms in $\varepsilon$ in

409: eq. (\ref{23}), we get \begin{equation} A(x) + 2 \gamma \left[ x R_0(x) + R_0^2(x) \right]

410: A(x) =\lambda,

411: \label{24} \end{equation} which is formally solved for $R_0$ as

412: $R_0^{(\pm)}(x)= \frac{-x \pm \sqrt{g(x)}}{2}$ where $g(x)= x^2+\frac{2}{\gamma}

413: (\frac{\lambda}{A(x)}-1)$. Hence the general solution of eq. (\ref{23}) up to the zeroth

414: order in $\varepsilon$ is given by $P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_0}^{x}

415: R_0^{(+)}(x')dx'} +\beta e^{\frac{1}{\varepsilon} \int_{x_0}^{x} R_0^{(-)}(x')dx'} $ with

416: $\alpha$ and $\beta$ constants to be determined.
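
The zeroth-order balance can be verified directly, assuming that $A(x)$ is smooth at the
point considered. With the form (\ref{28}), one has $\varepsilon \bubun{P}{x} = R P$, so
that
\begin{eqnarray*}
\varepsilon \bubun{}{x} \left[ A P \right] &=& \left( R A + \varepsilon \bibun{A}{x} \right) P, \\
\varepsilon \bubun{}{x} \left[ x + \varepsilon \bubun{}{x} \right] A P
&=& \left( x R_0 + R_0^2 \right) A P + O(\varepsilon),
\end{eqnarray*}
which gives eq. (\ref{24}) after multiplication by $2\gamma$ and division by $P$.
Eq. (\ref{24}) is a quadratic equation for $R_0$,
\[
R_0^2 + x R_0 - \frac{1}{2\gamma} \left( \frac{\lambda}{A(x)} - 1 \right) = 0,
\]
whose two roots are the branches $R_0^{(\pm)}(x)$.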

417:

418: Now, recall the boundary conditions (\ref{27}); $P$ has to take the two branches in $R_0$

419: as $ P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(+)}(x')dx'} $ for $x < x_b$

420: and $ P(x)= \beta e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(-)}(x')dx'} $ for $x >

421: x_b$, where $x_b$ is defined as the value at which $g(x)$ has the minimum value. Next,

422: from the continuity of $P$

423: %$\bubun{P}{x}$

424: at $x_b$, $\alpha=\beta$ follows, while from the

425: continuity of $\bubun{P}{x}$ at $x_b$, the function $g$ has to vanish at $x=x_b$. This

426: requirement $g(x_b)=0$ determines the value of the unknown parameter $\lambda$ as

427: \begin{equation}

428: \lambda=A(x_b) (1-\frac{\gamma}{2} {x_b}^2).

429: \label{18:approximated-eigenvalue}

430: \end{equation}

From the form of $P$, we find that $P$ has its peak at the point $x=x_p$, where $R_0(x)$

432: vanishes, i.e., at $ A(x_p)=\lambda $. Then, $P(x)$ approaches $\delta(x-x_p)$ in the

433: limit $\varepsilon \rightarrow +0$. These results are consistent with the requirement that

434: the mean replication rate in the steady state be equal to the largest eigenvalue of the

435: system (see Appendix \ref{10}).
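
Both conditions reduce to one line of algebra. The requirement $g(x_b)=0$ reads
\[
{x_b}^2+\frac{2}{\gamma} \left( \frac{\lambda}{A(x_b)}-1 \right)=0
\quad \Longrightarrow \quad
\frac{\lambda}{A(x_b)} = 1-\frac{\gamma}{2} {x_b}^2 ,
\]
which is eq. (\ref{18:approximated-eigenvalue}), and setting $R_0(x_p)=0$ in
eq. (\ref{24}) leaves $A(x_p)=\lambda$.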

436:

437: The stationary solution of eq.(\ref{12}) is obtained by following the same procedure of

438: singular perturbation. Consider the eigenvalue problem

439: \begin{equation}

440: A(x)P(x)+\gamma \sum_{n=1}^{\infinity} \frac{1} {n!}

441: \bubun{{}^n}{x^n} \left[ \frac{1+x}{2} \left( 2

442: \varepsilon \right)^n + \frac{1-x}{2} \left(- 2

443: \varepsilon \right)^n \right] A(x) P(x)=\lambda

444: P(x). \label{25} \end{equation} By putting

$P(x)=e^{\frac{1}{\varepsilon}\int_{x_0}^{x} R_0(x')

446: dx'}$ and taking only the zeroth order terms in

447: $\varepsilon$, we obtain $$A(x)+\gamma \left[

448: \frac{1+x}{2} \left( e^{2 R_0(x)} - 1 \right) +

449: \frac{1-x}{2} \left( e^{-2 R_0(x)} -1 \right) \right]

450: A(x) =\lambda ,$$ which gives $$

451: R_0^{(\pm)}(x)=\frac{1}{2} \log

452: \frac{1+\frac{1}{\gamma} (\frac{\lambda}{A(x)}-1) \pm

453: \sqrt{ \hat{g}(x)}}{1+x}$$ with $\hat{g}(x)=

454: (1+\frac{1}{\gamma} (\frac{\lambda}{A(x)}-1))^2-(1-x^2)

455: $.
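
These expressions can be checked as follows, with the shorthand $y=e^{2 R_0(x)}$ and
$c=\frac{1}{\gamma} (\frac{\lambda}{A(x)}-1)$ introduced here only for illustration. At
zeroth order in $\varepsilon$, each operator $\varepsilon^n \bubun{{}^n}{x^n}$ acting on
$A(x)P(x)$ contributes a factor $R_0^n$, and the series is summed by
\[
\sum_{n=1}^{\infinity} \frac{(\pm 2 R_0)^n}{n!} = e^{\pm 2 R_0} - 1 .
\]
The zeroth-order equation then becomes $\frac{1+x}{2} y + \frac{1-x}{2} \frac{1}{y} = 1+c$,
i.e., the quadratic equation
\[
(1+x) y^2 - 2(1+c) y + (1-x) = 0, \qquad
y = \frac{(1+c) \pm \sqrt{(1+c)^2-(1-x^2)}}{1+x},
\]
and $R_0^{(\pm)}(x)=\frac{1}{2} \log y$ reproduces the form given above.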

456:

457: By defining again the value $x=x_b$ at which $\hat{g}(x)$ takes the minimum, $P$ is

458: represented as $ P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(+)}(x')dx'} $

459: for $x < x_b$ and $ P(x)= \beta e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(-)}(x')dx'}

460: $ for $x > x_b$. The continuity of $\bubun{P}{x}$ at $x=x_b$ requires $\hat{g}(x_b)=0$,

461: which determines the value of $\lambda$ as \begin{equation} \lambda=A(x_b) \left[1-\gamma

462: \left(1-\sqrt{1-{x_b}^2}\right) \right].

463: \label{15:more-exact-eigenvalue} \end{equation}

Again, $P(x)$ approaches $\delta(x-x_p)$ in the limit $\varepsilon \rightarrow +0$, with $x_p$ given

465: by the condition $A(x_p)=\lambda$. When $|x_b| \ll 1$, the form

466: (\ref{15:more-exact-eigenvalue}) approaches eq.  (\ref{18:approximated-eigenvalue})

467: asymptotically.  This implies that the time evolution equation (\ref{8}), if restricted to

468: $|x| \ll 1$, is accurately approximated by eq.(\ref{9}) that keeps the terms only up to

469: the second moment.
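
The agreement for small $|x_b|$ can be made explicit by a Taylor expansion:
\[
1-\sqrt{1-{x_b}^2} = \frac{{x_b}^2}{2} + \frac{{x_b}^4}{8} + O({x_b}^6),
\]
so that eq. (\ref{15:more-exact-eigenvalue}) becomes
\[
\lambda = A(x_b) \left[ 1 - \frac{\gamma}{2} {x_b}^2 - \frac{\gamma}{8} {x_b}^4 + O({x_b}^6) \right] .
\]
The leading terms coincide with eq. (\ref{18:approximated-eigenvalue}), and the first
correction is smaller than the retained term by a factor ${x_b}^2/4$.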

470:

471: Let us estimate the threshold mutation rate for error catastrophe. This error threshold is

472: defined as the critical mutation rate $\gamma^{*}$ at which the peak position $x_p$ of the

473: stationary distribution drops from $x_p\neq 0$ to $x_p =0$, with an increase of $\gamma$.

474: We use the following procedure to obtain the critical value $\gamma^{*}$.

475:

476: First consider an evaluation function whose form

477: corresponds to that of eigenvalue

478: (\ref{15:more-exact-eigenvalue}) as \begin{equation}

479: f(x)=A(x) \left[1-\gamma \left(1-\sqrt{1-{x}^2}\right)

480: \right], \label{30:more-exact-evaluation-function}

481: \end{equation} and find

482: the position at which the function $f(x)$ takes the maximum value. This procedure is

483: equivalent to obtaining $x_b$ in the above analysis, since the relation

484: $f(x)=\lambda-\frac{\gamma^2 A^2(x)}{\lambda-A(x) \left( 1-\gamma

485: \left(1+\sqrt{1-x^2}\right) \right)} \hat{g}(x)$ and the requirement on $x_b$ that

486: $\hat{g}(x_b)=0$ and $\left. \frac{d \hat{g}(x)}{dx} \right|_{x=x_b}=0$ lead to

487: $\left. \frac{df(x)}{dx} \right|_{x=x_b}=0$. Since $x_b$ depends on

488: $\gamma$, we denote it by $x_b(\gamma)$. The position $x_b$ determines the position

489: $x_p$ of the stationary distribution through the relation $A(x_p)=\lambda=f(x_b)$ as in

490: the above analysis. If $A$ has flat parts around $x=0$ and higher parts in the region ($x

491: > 0$), $x_p(\gamma)$ discontinuously changes from $x_p \neq 0$ to $x_p = 0$ at some

492: critical mutation rate $\gamma^{*}$, when $\gamma$ increases from zero.  A schematic

493: illustration of this transition is given in

494: Fig.(\ref{33:fig:schematical-explaination-of-estimation}).

495:

496: As a simple example of this estimate of error threshold, let us consider the case

497: \begin{equation} A(x)=1+A_0 \Theta(x-x_0),

498: \label{14:step} \end{equation} with $A_0>0$ and

499: $0<x_0<1$, and $\Theta$ as the Heaviside step function, defined as $\Theta(x)=0$ for $x <

500: 0$ and $\Theta(x)=1$ for $x \geq 0$. According to the procedure given above, the critical

501: mutation rate is straightforwardly obtained as

502: $\gamma^{*}=\frac{A_0}{(1+A_0)\left(1-\sqrt{1-{x_0}^2}\right)}$; for $\gamma<\gamma^{*}$,

503: $x_p=x_0$, and for $\gamma > \gamma^{*}$, $x_p=0$.

504:
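The steps leading to this value can be sketched as follows (this is our reading of the procedure above, written for $\gamma>0$). For $x<x_0$, $A(x)=1$ and $f(x)=1-\gamma(1-\sqrt{1-x^2})$, which is maximal at $x=0$ with $f(0)=1$. For $x \geq x_0$, $f(x)=(1+A_0)[1-\gamma(1-\sqrt{1-x^2})]$ decreases with $x$, so that its largest value is attained at $x=x_0$. Hence
$$ \max_x f(x) = \max \left\{ 1, \; (1+A_0)\left[1-\gamma\left(1-\sqrt{1-{x_0}^2}\right)\right] \right\}, $$
and $x_b$ switches from $x_0$ to $0$ when the two entries coincide, i.e., when
$$ (1+A_0)\left[1-\gamma^{*}\left(1-\sqrt{1-{x_0}^2}\right)\right]=1 . $$
Solving for $\gamma^{*}$ reproduces the expression given above. For $\gamma<\gamma^{*}$ the eigenvalue $\lambda=f(x_0)$ lies between $1$ and $1+A_0$, so that the condition $A(x_p)=\lambda$ can be met only at the jump of $A$, which gives $x_p=x_0$.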

505: {\sl Remark}

506:

507: An exact transformation from the sequence model (Eigen model\cite{Eigen}) into a class of

508: Ising models\cite{Leuthausser, Baake} has recently been reported, such that the sequence

509: model is treated analytically with methods developed in statistical physics.  Rigorous

510: estimation of the error threshold for various fitness landscapes\cite{Baake2,Taiwan} and

511: relaxation times of species distribution have been obtained\cite{Taiwan2}. In fact, our

512: estimate (above) agrees with that given by their analysis.

513:

514: Their method is indeed powerful when a microscopic model is prescribed in correspondence

515: with a spin model.  However, even if such a microscopic model is not given, our formulation

516: with a Fokker-Planck type equation will be applicable because it only requires estimation

517: of moments in the fitness landscape. Alternatively, by giving a phenomenological model

518: describing the fitness without microscopic process, it is possible to derive the evolution

519: equation of population distribution. Hence, our formulation has a broad range of potential

520: applications.

521:

522: \section{Consideration of phenotypic fluctuation}

523:

524: In this section, we include the fluctuation in the mapping from genetic sequence to the

525: phenotype into our formula, and examine how it influences the error catastrophe. We first

526: explain the term ``phenotypic fluctuation'' briefly, and show that in its presence our

527: formulation (\ref{8}) remains valid by redefining the function $A(x)$. By applying the

528: formulation, we study how the introduction of the phenotypic fluctuation changes the

529: critical mutation rate $\gamma^{*}$ for the error catastrophe.

530:

531: In general, even for individuals with identical gene sequences in a fixed environment, the

532: phenotypic values are distributed. Some examples are the activities of proteins

533: synthesized from the identical DNA \cite{Yang-et-al}, the shapes of RNA molecules of

534: identical sequences \cite{ancel-fontana}, and the numbers of specific proteins for

535: isogenic bacteria \cite{Elowitz,Kaern-Collins,Collins,Furusawa}. Thus, the phenotype $x$

536: of individuals with the same sequence $s$ is distributed; we denote this distribution by

537: $P_{phe}(s,x)$.

538:

539: We assume that the form of the distribution $P_{phe}$ is characterized only in terms of its

540: mean value, i.e., all distributions $P_{phe}$ having the same mean value take the

541: same form. By representing the mean value of the phenotype $x$ by $\bar{x}(s)$, the

542: distribution $P_{phe}$ is written as $P_{phe}(s,x)=\hat{P}_{phe}(\bar{x}(s),x)$, where

543: $\hat{P}_{phe}$ is a function of $\bar{x}$ and $x$, which is normalized with respect to

544: $x$, i.e., satisfying $\int \hat{P}_{phe}(\bar{x},x) dx =1$.

545:

546: In our formulation, the replication rate $A$ of the sequence with the phenotypic value $x$

547: is given by a function of phenotypic value $x$, denoted by $A(x)$.  The mean replication

548: rate $\hat{A}$ of the species $s$ is calculated by

549: \begin{equation} \hat{A}(\bar{x}(s))=\int \hat{P}_{phe}(\bar{x}(s),x) A(x)

550: dx. \label{phe:mean} \end{equation}

551:

552: As in the case of (\ref{1}), we assume that the transition probability from $s$ to $s'$

553: during the replication is represented only by its mean values $\bar{x}(s)$ and

554: $\bar{x}(s')$, i.e., the transition probability function is written as $W(\bar{x}(s)

555: \rightarrow \bar{x}(s'))$. With this setup, the population dynamics of the whole sequences

556: is represented in terms of the distribution of the mean value $\bar{x}$ only, so that we

557: can use our formulation (\ref{8}) even when the phenotypic fluctuation is taken into

558: account; we need only replace the replication rate $A$ in (\ref{8}) by the mean

559: replication rate $\hat{A}$ obtained from eq. (\ref{phe:mean}).

560:

561: Now, we can study the influence of phenotypic fluctuation on the error threshold by taking

562: the step fitness function $A(x)$ of eq. (\ref{14:step}) and including the phenotypic

563: fluctuation as given in eq.(\ref{phe:mean}).  We consider a simple case where the form of

564: $\hat{P}_{phe}$ is given by a constant function within a given range (we call this the

565: piecewise flat case). Our aim is to illustrate the effect of the phenotypic fluctuation on

566: the error threshold, so we evaluate the critical mutation rate $\gamma^{*}$ using the

567: simpler form $f(x)=A(x)(1-\frac{\gamma}{2} x^2)$ from

568: eq.(\ref{18:approximated-eigenvalue}), while the use of the form

569: (\ref{30:more-exact-evaluation-function}) gives the same qualitative result. With this

570: simpler evaluation function, the critical mutation rate $\gamma^{*}$ is given by

571: \begin{equation} \gamma_0^{*}=\frac{2 A_0}{(1+A_0) {x_0}^2},

572: \label{34:gamma-zero}

573: \end{equation} in the case without phenotypic fluctuation.

574: Here we examine if this critical value $\gamma^{*}_0$ increases under isogenic phenotypic

575: fluctuation.

576:
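For reference, eq. (\ref{34:gamma-zero}) can be recovered in a few lines (a sketch under the simpler evaluation function). With $A(x)$ of eq. (\ref{14:step}), $f(x)=1-\frac{\gamma}{2}x^2$ for $x<x_0$, with maximum $f(0)=1$, and $f(x)=(1+A_0)(1-\frac{\gamma}{2}x^2)$ for $x\geq x_0$, with maximum at $x=x_0$. Equating the two maxima gives
$$ (1+A_0)\left(1-\frac{\gamma_0^{*}}{2}{x_0}^2\right)=1 \quad \Longrightarrow \quad \gamma_0^{*}=\frac{2A_0}{(1+A_0){x_0}^2}. $$
As a numerical illustration, the parameters of Fig. (\ref{35:fig:profile-of-A}), $A_0=0.1$ and $x_0=0.5$, give $\gamma_0^{*}=0.2/(1.1 \times 0.25) \simeq 0.727$.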

577: We make two further technical assumptions in the following analysis: first we assume that

578: $A_0$ in the form (\ref{14:step}) is sufficiently small, so that the critical value

579: $\gamma^{*}$ is not large. Second, we extend the range of $x$ to $(-\infinity,\infinity)$

580: for simplicity.  This does not cause problems because we have set the range of $x_0$ to

581: $(0,1)$. Hence, the stationary distribution has its peak around the range $0 \leq x < 1$;

582: everywhere outside this range, the distribution vanishes.

583:

584: We consider the case in which distribution $\hat{P}_{ phe }$ of the phenotype of the

585: species $s$ is given by

586: \begin{equation}

587: \hat{P}_{ phe }^{(F)}(\bar{x}(s), x) = \left\{

588: \begin{array}{ll} 0 & \quad \mbox{for $ x

589: <\bar{x}-\ell$}\\ \frac{1}{2 \ell} & \quad \mbox{for $ \bar{x}-\ell \leq x \leq \bar{x} +

590: \ell$}\\ 0 & \quad \mbox{for $ \bar{x} + \ell < x $, }

591: \end{array} \right.

592: \label{36:flat-case}

593: \end{equation}

594: where $\ell$ gives the half-width of the distribution (the superscript $(F)$ denotes the

595: piecewise-flat case). Then, $\hat{A}$ is calculated as

596: $$ \hat{A}^{(F)}(x) = \left\{ \begin{array}{ll} 1 & \quad \mbox{for $x<x_0-\ell$}\\ 1+

597: \frac{A_0}{2 \ell} (x-(x_0-\ell)) & \quad \mbox{for $x_0-\ell \leq x \leq x_0 + \ell$}\\

598: 1+ A_0 & \quad \mbox{for $x_0 + \ell < x$. } \end{array} \right.$$ An example of

599: $\hat{A}^{(F)}(x)$ is shown in Fig. (\ref{35:fig:profile-of-A}). The evaluation function

600: $f$ in section (\ref{20}) is given by $ f^{(F)}(x)=\hat{A}^{(F)}(x) (1-\frac{\gamma}{2}

601: x^2) $.

602:
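The expression for $\hat{A}^{(F)}$ follows directly from eq. (\ref{phe:mean}) with eqs. (\ref{14:step}) and (\ref{36:flat-case}); the intermediate step, written out here as a sketch, is
$$ \hat{A}^{(F)}(\bar{x}) = \frac{1}{2\ell}\int_{\bar{x}-\ell}^{\bar{x}+\ell} \left[1+A_0\Theta(x-x_0)\right] dx = 1+\frac{A_0}{2\ell}\, \mu(\bar{x}), $$
where $\mu(\bar{x})$ (a symbol introduced only for this remark) is the length of the part of the interval $[\bar{x}-\ell,\bar{x}+\ell]$ that lies above $x_0$: $\mu=0$ for $\bar{x}<x_0-\ell$, $\mu=\bar{x}+\ell-x_0$ for $x_0-\ell \leq \bar{x} \leq x_0+\ell$, and $\mu=2\ell$ for $\bar{x}>x_0+\ell$. For $A_0=0.1$, $x_0=0.5$, and $\ell=0.25$, the ramp extends from $0.25$ to $0.75$ with slope $A_0/(2\ell)=0.2$, and $\hat{A}^{(F)}(x_0)=1+A_0/2=1.05$.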

603: We study the case where the position ${x_b^{*}}^{(F)} (\equiv {x_b^{(F)}}(\gamma^{*}))$ is

604: within the range $[x_0-\ell,x_0]$ because the profile of $\hat{A}^{(F)}$ shows that

605: ${\gamma^{*}}^{(F)}$ is smaller than $\gamma^{*}_0$ if ${x_b^{*}}^{(F)}>x_0$.  If

606: $\frac{x_0}{2+A_0} \leq \ell < x_0$, the position ${x_b^{*}}^{(F)}$ is within the range

607: $[x_0-\ell,x_0]$.  In that case, ${\gamma^{*}}^{(F)}$ is given by ${\gamma^{*}}^{(F)}

608: \simeq \frac{A_0}{4 \ell (x_0-\ell)} $ to the first order of $A_0$. Comparing

609: ${\gamma^{*}}^{(F)}$ with $\gamma^{*}_0$ in (\ref{34:gamma-zero}), we conclude that

610: ${\gamma^{*}}^{(F)} < \gamma^{*}_0$ for $ 0 <\ell<\frac{2+\sqrt{2}}{4} x_0 $, and

611: ${\gamma^{*}}^{(F)} > \gamma^{*}_0$ for $ \frac{2+\sqrt{2}}{4} x_0 <\ell< x_0$.  Hence,

612: when the half width $\ell$ of the distribution $P_{phe}$ is within the range

613: $(\frac{2+\sqrt{2}}{4} x_0,x_0)$, the critical mutation rate for the error catastrophe

614: is increased.  In other words, the isogenic phenotypic fluctuation increases the

615: robustness of the high-fitness state against mutation.

616:
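The first-order expression for ${\gamma^{*}}^{(F)}$ can be recovered as follows (a sketch; we treat $\gamma$ as $O(A_0)$ and drop terms of $O(A_0^2)$, which is our reading of ``to the first order of $A_0$''). On the ramp $x_0-\ell \leq x \leq x_0+\ell$,
$$ f^{(F)}(x) \simeq 1+\frac{A_0}{2\ell}(x-x_0+\ell)-\frac{\gamma}{2}x^2 , $$
which is maximal at $x_b^{(F)} \simeq A_0/(2\ell\gamma)$, where it takes the value
$$ f^{(F)}(x_b^{(F)}) \simeq 1+\frac{A_0^2}{8\ell^2\gamma}-\frac{A_0(x_0-\ell)}{2\ell}. $$
The transition occurs when this value equals $f^{(F)}(0)=1$, which gives ${\gamma^{*}}^{(F)} \simeq A_0/[4\ell(x_0-\ell)]$ and ${x_b^{*}}^{(F)} \simeq 2(x_0-\ell)$. At this order ${x_b^{*}}^{(F)} \leq x_0$ requires $\ell \geq x_0/2$, in line with the condition quoted above up to corrections in $A_0$. Comparing with $\gamma^{*}_0 \simeq 2A_0/{x_0}^2$ at the same order,
$$ {\gamma^{*}}^{(F)} > \gamma^{*}_0 \quad \Longleftrightarrow \quad 8\ell^2-8x_0\ell+{x_0}^2>0 , $$
whose roots are $\ell=\frac{2\pm\sqrt{2}}{4}x_0$. The smaller root, about $0.146\,x_0$, lies below the range of $\ell$ in which the expression applies, so only the larger root matters. For example, $A_0=0.1$, $x_0=0.5$, and $\ell=0.25$ give ${\gamma^{*}}^{(F)} \simeq 0.4$, which is below $\gamma^{*}_0$, as expected for $\ell<\frac{2+\sqrt{2}}{4}x_0 \simeq 0.427$.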

617: We also studied the case in which $ \hat{P}_{ phe } (\bar{x}, x)$ decreases linearly

618: around its peak, i.e., with a triangular form.  In this case, the phenotypic fluctuation

619: decreases the critical mutation rate as long as $A_0$ is small, while it can increase this rate for

620: sufficiently large values of $A_0$ and for a certain range of widths of the

621: phenotypic fluctuation.

622:

623: \section{Discussion}

624: \label{32:discussion}

625:

626: In the present paper, we have presented a general formulation to describe the evolution of

627: phenotype distribution.  A partial differential equation describing the temporal evolution

628: of phenotype distribution is presented with a self-consistently determined growth term.

629: Once a microscopic model is provided, each term in this evolution equation is explicitly

630: determined so that one can derive the evolution of phenotype distribution

631: straightforwardly.  Equation (\ref{8}) is obtained from the Kramers-Moyal expansion,

632: which includes derivatives of infinite order.  However, this expansion is often summed to

633: a single term in the large number limit of base sequences, with the aid of singular

634: perturbation.

635:

636: If the value of a phenotype variable $|x|$ is much smaller than unity (which is the

637: maximal possible value giving rise to the fittest state), the terms higher than the second

638: order can be neglected, so that a Fokker-Planck type equation with a self-consistent

639: growth term is derived.  The validity of this truncation is confirmed by putting

640: $x'=(2n-N)/\sqrt{N}$ and verifying that the third or higher order moment is negligible

641: compared with the second-order moment. Thus the equation up to its second order,

642: (\ref{9}), is relevant to analyzing the initial stage of evolution starting from a

643: low-fitness value.

644:

645: As a starting point for our formalism, we adopted eq. (\ref{3}), which is called the

646: ``coupled'' mutation-selection equation\cite{Hofbauer}.  Although it is a natural and

647: general choice for studying the evolution, a simpler and approximate form may be used if

648: the mutation rate and the selection pressure are sufficiently small.  This form given by

649: $\bubun{\hat{P}(s,t)}{t} = -\bar{A}(t) \hat{P}(s,t) + \sum_{s'} Q(s' \rightarrow s)

650: \hat{P}(s',t)$, is called the ``parallel'' mutation-selection

651: equation\cite{kimura1970,Akin}.  It approaches the coupled mutation-selection equation

652: (\ref{3}), in the limits of small mutation rate and selection pressure, as shown in

653: \cite{Hofbauer}. If we start from this approximate, parallel mutation-selection equation,

654: and follow the procedure presented in this paper, we obtain $\bubun{P(x,t)}{t} =

655: (A(x)-\bar{A}(t)) P(x,t) + \gamma \bubun{{}}{x} \left[ - m_1^{(1)}(x) + \frac{1}{2}

656: \bubun{{}}{x} m_2^{(1)}(x) \right] P(x,t)$.

657:

658: In general, this equation is more tractable than eq. (\ref{9}), as the techniques

659: developed for Fokker-Planck equations are straightforwardly applied, as discussed in

660: \cite{PhysicalBiology}, and it is also useful for describing evolution.  Setting

661: $A(x)=x^2$ and replacing $m_1^{(1)}$ and $m_2^{(1)}$ with some constants, the equation is

662: reduced to that introduced by Kimura\cite{Kimura2}; while setting $A(x)=x$, $m_1^{(1)}(x)

663: \propto x$, and replacing $m_2^{(1)}$ with some constant yields the equation by

664: Levine\cite{Levine}.  Because our formalism is general, the equations of these earlier studies are recovered

665: by suitably approximating our evolution equation.

666:

667: Besides the generality, another merit of our formulation lies in its use of the phenotype

668: as a variable describing the distribution, rather than the fitness (as adopted by Kimura).

669: Whereas the phenotype is an inherent variable directly mapped from the genetic sequence,

670: the fitness is a function of the phenotype and environment, and strongly influenced by

671: environmental conditions.  The evaluation of the transition matrix by mutation in

672: eq.(\ref{8}) would be more complicated if we used the fitness as a variable, due to

673: crucial dependence of fitness values on the environmental conditions.  In the formalism based on the

674: phenotype distribution, an environmental change is incorporated by changing the growth term

675: $A(x)$ accordingly. Our formalism does include the fitness-based equation as a special

676: case, by setting $A(x)=x$.

677:

678: Another merit in our formulation is that it easily takes isogenic phenotypic fluctuation

679: into account without changing the form of the equation, but only by modifying $A(x)$.  By

680: applying this equation, we obtained the influence of isogenic phenotype fluctuations on

681: error catastrophe.  The critical mutation rate for the error catastrophe increases because

682: of the fluctuation in certain cases.  This implies that the fluctuation can enhance the

683: robustness of a high-fitness state against mutation.

684:

685: In fact, the relevance of isogenic phenotypic fluctuations to evolution has recently been

686: proposed\cite{SatoPNAS,kaneko-book,KKFurusawaJTB}, and change in phenotypic fluctuation

687: through evolution has been experimentally verified\cite{Ito,SatoPNAS}.  In general,

688: phenotypic fluctuations and a mutation-selection process for artificial evolution have

689: been extensively studied recently.  The present formulation will be useful in analysing

690: such experimental data, as well as in elucidating the relevance of phenotypic fluctuations

691: to evolution.

692:

693: \newpage

694:

695: {\bf Figures}

696:

697: \begin{figure}[hbtp] \begin{minipage}[t]{15cm} \begin{center}

698: \scalebox{ 0.45 }{\includegraphics{Fig1.eps}}

699: \caption{ Examples of profiles of the evaluation function $f$ for three values of

700: $\gamma$. The red, purple, and blue curves give the profiles of $f$ for $\gamma=0.31$,

701: $\gamma=0.386$, and $\gamma=0.49$, respectively, where $f$ is defined by $f(x)=A(x)

702: (1-\gamma (1-\sqrt{1-x^2}))$ and $A$ is given by $A(x)=1+0.2 (x-0.25) \Theta(x-0.25)

703: \Theta(0.75-x)+ 0.1 \Theta(x-0.75)$; the profile of $A$ is indicated by the black

704: curve. This illustrates determination of $x_b$ and $x_p$; $x_b$ is given by the position

705: where $f$ takes a maximum, while $x_p$ is given as the position where the line $y=f(x_b)$

706: crosses the curve of $A$. For $\gamma < 0.386$, $f(x)$ has a maximum value at $x=x_b$, and

707: thus the critical mutation rate for the error threshold is estimated to be

708: $\gamma^{*}=0.386$.  }

709: \label{33:fig:schematical-explaination-of-estimation}

710: \end{center} \end{minipage} \end{figure}

711:

712: \begin{figure}[hbtp] \begin{minipage}[t]{15cm} \begin{center}

713: \scalebox{ 0.4 }{\includegraphics{Fig2.eps}}

714: \caption{Example profiles of the mean fitness function without phenotypic fluctuation

715: (black) and with the piecewise-flat phenotypic fluctuation given by

716: eq.(\ref{36:flat-case}) (red), where we set $A(x)=1+0.1 \Theta(x-0.5)$ and $\ell=0.25$.  }

717: \label{35:fig:profile-of-A}

718: \end{center} \end{minipage} \end{figure}

719:

720: %% shaji 2006/11/06

721: \acknowledgements The authors would like to thank P. Marcq, S. Sasa, and T. Yomo for

722: useful discussions.

723:

724: %%---

725:

726: \appendix

727:

729: \section{Estimation of the transition probability in the NK model}

730: \label{29}

731:

732: In the NK model\cite{well-written-NK,kauffmann-book}, the fitness $f$ of a sequence $s$ is

733: given by $$ f(s)=\frac{1}{N} \sum_{i=1}^{N} \omega_i(s) ,$$ where $\omega_i$ is the

734: contribution of the $i$th site to the fitness, which is a function of $s_i$ and the state

735: values of other $K$ sites. The function $\omega_i$ takes a value chosen uniformly from

736: $[0,1]$ at random. We assume that the phenotype $x$ of the sequence $s$ is given by

737: $x=f(s)$.

738:

739: When $N \gg 1$, $K \gg 1$, and $K/N \ll 1$, the phenotype distribution of mutants of a

740: given sequence $s$ (whose phenotype is $x$) is characterized only by the phenotype $x$

741: (without the need to specify the sequence $s$). For showing this, we first examine the

742: one-point mutant case.

743:

744: We consider the ``number of changed sites,'' i.e., the number of sites at which the values of $\omega$ are changed due

745: to a single-point mutation. By assuming that the average number of changed sites is $K$,

746: the distribution of the number of changed sites $n$, denoted by $P_{site}(n)$, is

747: approximately given by

748: \begin{equation} P_{site}(n) \simeq e^{-\frac{(n-K)^2}{2 K }},

749: \label{1:appen:site} \end{equation} with the help of

750: the limiting form of binomial distribution.  Here, we have omitted the normalization

751: constant.

752:

753: Next, we study the distribution of the difference between the phenotype $x$ of the

754: original sequence and the phenotype $x'$ of its one-point mutant, given the number $n$ of

755: changed sites of the single-point mutant.  We denote the distribution as $P_{dif \!

756: f}(n;X)$, where $X=x'-x$. Here the average of $x'$ is $x(N-n)/N$, since $(N-n)$ sites are

757: unchanged. Thus, according to the central limit theorem, the distribution is estimated as

758: \begin{equation}

759: P_{dif \! f}(n;X) \simeq \exp \left[ {-\frac{(X+\frac{n}{N}

760: x)^2}{2 n \frac{\sigma^2}{N^2} }} \right] ,

761: \label{2:appen:diff} \end{equation} where $\sigma^2$ is the

762: variance of the distribution of the value of $\omega$.  This variance is estimated from

763: the probability distribution $P_{(s,\{\omega_i\})}(\omega)$ that the value $\omega$ is

764: generated. Although the explicit form of $P_{(s,\{\omega_i\})}$ is hard to obtain unless

765: $\{\omega_i\}$ and $s$ are given, it is estimated by means of the ``most probable

766: distribution,'' obtained as

767: %in the following way

768: follows: Find the distribution that maximizes the evaluation function $S$ (called

769: ``entropy'') defined by $S=-\int_{0}^{1} P(\omega) \log P(\omega) d\omega$ under the

770: conditions $\int_{0}^{1} P(\omega) d\omega=1$ and $\int_{0}^{1} \omega P(\omega)

771: d\omega=x$. Accordingly the variance $\sigma^2$ may depend on $x$.

772:
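For concreteness, this constrained maximization has the standard exponential solution, sketched here with a Lagrange multiplier $\beta$ (a symbol not used elsewhere in this paper):
$$ P(\omega)=\frac{\beta e^{\beta\omega}}{e^{\beta}-1} \quad (0 \leq \omega \leq 1), \qquad x=\frac{1}{1-e^{-\beta}}-\frac{1}{\beta}. $$
The second relation fixes $\beta$ as a function of $x$, and the variance then follows as $\sigma^2(x)=\int_{0}^{1}\omega^2 P(\omega) d\omega - x^2$. For $x=1/2$ one has $\beta \rightarrow 0$, the distribution is uniform, and $\sigma^2=1/12$; $\sigma^2$ becomes smaller as $x$ approaches $0$ or $1$.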

773: Combining these distributions (\ref{1:appen:site}) and (\ref{2:appen:diff}) gives the

774: distribution of $X$ without constraint on the number of changed sites:

775: $$ P(X) = \sum_{n=1}^{N} P_{site}(n) P_{dif \! f}(n;X) \simeq \exp \left[

776: {-\frac{(X+\frac{K}{N} x)^2}{2 K \frac{(\sigma^2+x^2)}{N^2}}} \right] .$$ This result

777: indicates that the phenotype distribution of single-point mutants from the original

778: sequence $s$ having the phenotype $x$ is characterized by its phenotype $x$ only; $s$ is

779: not necessary. Similarly, one can show that phenotype distribution of $n$-point mutants is

780: also characterized only by $x$. Hence, the transition probability in the NK model is

781: described only in terms of the phenotypes of the sequences, when $N \gg 1$, $K \gg 1$, and

782: $K/N \ll 1$.

783:
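The mean and variance appearing in $P(X)$ can be checked with the laws of total expectation and total variance (a sketch, in which $n$ is taken to have mean $K$ and variance $K$, as in eq. (\ref{1:appen:site})). From eq. (\ref{2:appen:diff}), for a given $n$ the quantity $X$ has mean $-nx/N$ and variance $n\sigma^2/N^2$, so that
$$ \braket{X} = -\frac{\braket{n}}{N}x = -\frac{K}{N}x, $$
$$ {\rm Var}(X) = \braket{n}\frac{\sigma^2}{N^2} + \frac{x^2}{N^2}{\rm Var}(n) = \frac{K(\sigma^2+x^2)}{N^2}. $$
The Gaussian shape of $P(X)$ is itself an additional approximation, which we expect to be reasonable for $K \gg 1$, where $n$ is sharply concentrated around $K$.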

784: \section{ Mathematical structure of the equation of form (\ref{9})} \label{10}

785:

786: We first rewrite eq. (\ref{9}) as

787: \begin{equation} \bubun{P(x,t)}{t} = -\bar{A}(t) P(x,t) + L(x) P(x,t),

788: \label{17} \end{equation}

789: where $L$ is a linear operator, defined by $L(x)=A(x) +\bubun{{}}{x} f(x) +

790: \bubun{{}^2}{x^2} g(x) $ with $f(x)= - \gamma m_1^{(1)}(x) A(x)$ and $g(x)=

791: \frac{\gamma}{2} m_2^{(1)}(x) A(x)$.

792:

793: The linear operator $L$ is transformed to a Hermitian operator using variable

794: transformations (see below) so that $L$ is represented by a complete set of eigenfunctions

795: and corresponding eigenvalues, which are denoted by $\{\phi_i(x)\}$ and $\{\lambda_i\}$

796: ($i=0,1,2,...$), respectively. Eigenvalues are real and nondegenerate, so that they are

797: arranged as $\lambda_0 > \lambda_1 > \lambda_2 >...$.

798:

799: According to \cite{PhysicalBiology}, $P(x,t)$ is expanded as

800: \begin{equation} P(x,t)=\sum_{i=0}^{\infinity} a_i(t) \phi_i(x), \label{13}

801: \end{equation}

802: where $a_i$ satisfies

803: \begin{equation} \frac{d a_i(t)}{dt}= a_i(t)

804: (\lambda_i-{\sum_{j=0}^{\infinity}}' a_j(t) \lambda_j). \label{19}

805: \end{equation}

806: The prime over the sum symbol indicates that the summation is taken except for those of

807: noncontributing eigenfunctions as defined in \cite{PhysicalBiology}.

808:

809: Stationary solutions of eq. (\ref{19}) are given by $\{ a_{k}=1$ and $a_i=0$ for $i \ne k

810: \}$. Among these stationary solutions, only the solution $\{a_0=1$ and $a_i=0$ for $i \ne

811: 0\}$ is stable. Hence, the eigenfunction for the largest eigenvalue (the largest

812: replication rate) gives the stationary distribution function. Now it is important to

813: obtain eigenfunctions and eigenvalues of $L$, in particular the largest eigenvalue

814: $\lambda_0$ and its corresponding eigenfunction $\phi_0$. Hence, we focus our attention on

815: the eigenvalue problem

816: \begin{equation} \left[ A(x) + \bubun{{}}{x} f(x) + \bubun{{}^2}{x^2} g(x)

817: \right] P(x) =\lambda P(x), \label{16} \end{equation} where $\lambda$ is a constant and $P$

818: is a function of $x$.

819:
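The stability statement for the stationary solutions of eq. (\ref{19}) can be checked by linearization (a sketch, assuming for simplicity that every eigenfunction contributes to the primed sum and that the coefficients are normalized as $\sum_j a_j=1$, a condition preserved by eq. (\ref{19})). Near the solution $a_k=1$, write $a_i=\delta_i$ with $|\delta_i| \ll 1$ for $i \neq k$. Since $\sum_j a_j \lambda_j = \lambda_k + O(\delta)$, eq. (\ref{19}) gives to first order
$$ \frac{d \delta_i(t)}{dt} \simeq (\lambda_i-\lambda_k)\, \delta_i(t) \quad (i \neq k). $$
A perturbation along $\phi_i$ therefore grows whenever $\lambda_i>\lambda_k$, and all perturbations decay only for $k=0$.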

820: We can transform eigenvalue problem (\ref{16}) to a Schr\"odinger-type eigenvalue

821: problem as follows: First we introduce a new variable $y$ related to $x$ as

822: $y(x)=\int_{x_0}^{x} \sqrt{\frac{h}{g(x')}} dx'$ where $x_0$ and $h$ are constants. Next,

823: we consider a new function $\Psi(y)$ related to $P(x)$ as

824: $$\Psi(y)= \left. \sqrt{\frac{g(x)}{h}} e^{{\int_{y_0}^{y} \frac{\hat{f}(y')}{2 h} dy'}}

825: P(x) \right|_{x=x(y)} $$ where $y_0$ is some constant, $x(y)$ the inverse function of

826: $y(x)$, and $\hat{f}$ a function of $y$ defined by

827: $$\hat{f}(y)= \left. \sqrt{\frac{h}{g(x)}} (f(x)+\frac{1}{2} \frac{d

828: g(x)}{dx}) \right|_{x=x(y)} .$$

829:

830: Using these new quantities $y$ and $\Psi$ and rewriting eigenvalue problem (\ref{16})

831: suitably, we get

832: \begin{equation} \left[ V(y) + h \bubun{{}^2}{y^2} \right] \Psi(y) =\lambda

833: \Psi(y), \label{21} \end{equation} where $V(y)=\hat{A}(y) + \frac{\frac{d \hat{f} (y)}{d

834: y}}{2} - \frac{\hat{f}^2(y)}{4 h}$ with $\hat{A}(y)=A(x(y))$.

835:
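The steps between eqs. (\ref{16}) and (\ref{21}) can be sketched as follows; the auxiliary symbols $u$ and $\Phi$ are introduced only for this remark. Let $u(x)=\sqrt{h/g(x)}$, so that $\bubun{{}}{x}=u \bubun{{}}{y}$, and let $\Phi(y)=\sqrt{g(x)/h}\, P(x)=P(x)/u$. Using $g P=h\Phi/u$ and $\frac{1}{2}u\frac{dg}{dx}=-\frac{h}{u}\frac{du}{dy}$, one finds
$$ f P+\bubun{{}}{x}\left(g P\right) = \hat{f}(y)\Phi(y)+h\bubun{\Phi(y)}{y}, $$
so that eq. (\ref{16}), divided by $u$, becomes
$$ \left[ \hat{A}(y)+\bubun{{}}{y}\hat{f}(y)+h\bubun{{}^2}{y^2} \right] \Phi(y)=\lambda \Phi(y). $$
This has a constant coefficient $h$ in front of the second derivative. Substituting $\Phi(y)=e^{-\int_{y_0}^{y} \frac{\hat{f}(y')}{2h} dy'}\Psi(y)$ cancels the terms containing $\bubun{\Psi}{y}$, and the remaining terms are
$$ h\bubun{{}^2 \Psi}{y^2}+\left[ \hat{A}(y)+\frac{1}{2}\frac{d\hat{f}(y)}{dy}-\frac{\hat{f}^2(y)}{4h} \right]\Psi=\lambda\Psi, $$
which is eq. (\ref{21}) with the potential $V(y)$ given above.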

836: %%---

837:

838:

839: %% bunken

840: \begin{thebibliography}{99}

841:

842: %(1)

843: \bibitem{Lenski}

844: S. F. Elena and R. E. Lenski, Nat. Rev. Genet. {\bf 4}, 457 (2003).

845: %Nature Reviews

846: %authors: Santiago F. Elena and Richard E. Lenski

847: %title:Evolution experiments with microorganisms: the dynamics and genetic bases of

848: %adaptation

849:

850: %(2)

851: \bibitem{Lenski-Rose}

852: R. E. Lenski, M. R. Rose, S. C. Simpson, and S. C. Tadler, Am. Nat. {\bf 138}, 1315-1341

853: (1991).

854: %S. F. Elena and R. E. Lenski, Nat. Rev. Genet. {\bf 4}, 457 (2003).

855:

856: %(3)

857: \bibitem{Kishony}

858: M. Hegreness, N. Shoresh, D. Hartl, and R. Kishony, Science {\bf 311}, 1615 (2006).

859: %authors: Matthew Hegreness, Noam Shoresh, Daniel Hartl, and Roy Kishony

860: %title:An Equivalence Principle for the Incorporation of Favorable Mutations in Asexual

861: %Populations

862:

863: %(4)

864: \bibitem{Bactreia-many-generations}

865: A. E. Mayo, Y. Setty, S. Shavit, A. Zaslaver, and U. Alon, PLoS Biol. {\bf 4}, 556-561 (2006).

866: %???

867:

868: %(5)

869: \bibitem{Dekel-Alon}

870: E. Dekel and U. Alon, Nature {\bf 436}, 588-592 (2005).

871: %title:Optimality and evolutionary tuning of the expression level of a protein.

872:

873: \bibitem{Kashiwagi-Noumachi} A. Kashiwagi, W. Noumachi, M. Katsuno, M. T. Alam, I. Urabe,

874: and T. Yomo, J. Mol. Evol. {\bf 52}, 502-509 (2001).

875: %* Plasticity of Fitness and Diversification Process During an Experimental Molecular

876: %     Evolution

877: %*Journal of Molecular Evolution

878: %*Authors

879: %     *Akiko Kashiwagi, Wataru Noumachi, Masato Katsuno, Mohammad T. Alam, Itaru Urabe,

880: %          Tetsuya Yomo

881:

882: \bibitem{Ito}

883: Y. Ito, T. Kawama, I. Urabe, and T. Yomo, J. Mol. Evol. {\bf 58}, 196-202 (2004).

884:

885: \bibitem{footnote}

886: {A well established theory in population genetics,

887: which adopts diffusion equation or Fokker-Planck type equation, is related to the

888: frequency of genes in alleles in population as developed by Wright, Fisher, and

889: Kimura\cite{Fisher,Wright,kimura1970}.  How new genes spread in population is analyzed by

890: the diffusion equation. In contrast, we are concerned with the changes in the distribution

891: of a base sequence consisting of a haploid gene.}

892:

893: %\bibitem{Kashiwagi}

894: %A. Kashiwagi, I. Urabe, K. Kaneko, and T. Yomo, PLOS ONE.

895: %details are still unknown for now (2006/10/25).

896: %title:Adaptive response of a gene network to environmental changes by attractor selection

897:

898: \bibitem{Fisher}

899:

900: R. A. Fisher, Proc. Roy. Soc. Edinb. {\bf 50}, 205 (1930); {\sl The genetical

901: theory of natural selection} (Oxford University Press, Oxford, 1999).

902: %*article:Proceedings of the Royal Society of Edinburgh

904:

905: \bibitem{Wright} S. Wright, Proc. Natl. Acad. Sci. USA {\bf 31}, 382 (1945); {\sl The

906: theory of gene frequencies} (University of Chicago Press, Chicago, 1969).

907: %*full name:Sewall Wright

909:

910: \bibitem{kimura1970}

911: J. F. Crow and M. Kimura, {\sl An introduction to population genetics theory}

912: (Harper \verb|&| Row, New York, 1970).

913:

914: \bibitem{Kimura2}

915: M. Kimura, Proc.  Natl. Acad.  Sci. USA {\bf 54}, 731-736 (1965).

916: %author: Motoo Kimura

917:

918: \bibitem{Levine}

919: L. S. Tsimring, H. Levine, and D.A. Kessler, Phys. Rev. Lett. {\bf 76}, 4440 (1996).

920: %authors:Lev S. Tsimring and Herbert Levine

921: %title:RNA Virus Evolution via a Fitness-Space Model

922:

923: \bibitem{Eigen}

924: M. Eigen, J.  McCaskill, P. Schuster, J. Phys. Chem. {\bf 92}, 6881-6891 (1988).

925: %title:Molecular Quasi-Species

926: %authors: Manfred Eigen, John McCaskill, and Peter Schuster

927:

928: \bibitem{kauffmann-book}

929: S. Kauffman, { \sl The Origins of Order} (Oxford University Press, New York, 1993).

930:

931: \bibitem{Koshland}

932: J. Spudich and D. Koshland, Nature {\bf 262}, 467-471 (1976).

933: %title:Non-genetic individuality: chance in the single cell.

934:

935: \bibitem{Elowitz} M. B. Elowitz, A. J. Levine, E. D. Siggia, and P. S. Swain, Science {\bf

936: 297}, 1183 (2002).

937: %Elowitz M B, Levine A J, Siggia E D and Swain P S

938: %title:Stochastic Gene Expression in a Single Cell

939:

940: \bibitem{Kaern-Collins} M. Kaern, T. C. Elston, W. J. Blake, and J.J. Collins,

941: Nat. Rev. Genet. {\bf 6}, 451-464 (2005).

942: %*title:Stochasticity in gene expression: from theories to phenotypes.

943:

944: \bibitem{Collins}

945: J. Hasty, J. Pradines, M. Dolnik, and J. J. Collins, Proc.  Natl. Acad.

946:          Sci. USA {\bf 97}, 2075-2080 (2000).

947: %title:Noise-based switches and amplifiers for gene expression

948:

949: \bibitem{Furusawa} C. Furusawa, T. Suzuki, A. Kashiwagi, T. Yomo, and K. Kaneko,

950: BIOPHYSICS {\bf 1}, 25 (2005).

951: %title:Ubiquity of log-normal distributions in intra-cellular reaction dynamics

952:

953: %\bibitem{Eigen-Schuster}

954: %M. Eigen and P. Schuster, Naturwissenschaften {\bf 64}, 541-565 (1977).

955:

956: \bibitem{Ueda}

957: M. Ueda, Y. Sako, T. Tanaka, P. Devreotes, and T. Yanagida, Science {\bf 294},

958:          864 (2001).

959: %title:Single-Molecule Analysis of Chemotactic Signaling in Dictyostelium Cells

960:

961: \bibitem{noise-review}

962: A. Bar-Even, J. Paulsson, N. Maheshri, M. Carmi, E. O'Shea, Y. Pilpel, and

963: N. Barkai, Nat. Genet. {\bf 38}, 636-643 (2006).

964: %*title:Noise in protein expression scales with natural protein abundance.

965:

966: \bibitem{SatoPNAS}

967: K. Sato, Y. Ito, T. Yomo, and K. Kaneko, Proc. Natl. Acad. Sci. USA {\bf

968: 100}, 14086 (2003).

969: %title:On the relation between fluctuation and response in biological systems

970:

971: \bibitem{kaneko-book} K. Kaneko, {\sl Life: An Introduction to Complex Systems Biology}

972: (Springer, Berlin, 2006).

973:

974: \bibitem{KKFurusawaJTB}

975: K. Kaneko and C. Furusawa, J. Theo. Biol. {\bf 240}, 78-86 (2006).

976: %An evolutionary relationship between genetic variation and phenotypic fluctuation

977:

978: \bibitem{Ancel}

979: L. W. Ancel, Theor. Popul. Biol. {\bf 58}, 307-319 (2000).

980: %*title:Undermining the Baldwin expediting effect: does phenotypic plasticity

981: %accelerate evolution?

982:

983: \bibitem{vanKampen}

984: N. G. van Kampen, {\sl Stochastic processes in physics and chemistry} (North-Holland,

985:          Amsterdam, 1992).

986:

987: \bibitem{KuboMatsuoKitahara}

988: R. Kubo, K. Matsuo, and K. Kitahara, J. Stat. Phys. {\bf 9}, 51 (1973).

989: %Journal of Statistical Physics

990: %authors: Ryogo Kubo, Kazuhiro Matsuo, and Kazuo Kitahara

991: %title:Fluctuation and Relaxation of Macrovariables

992:

993: \bibitem{Baake2}

994: E. Baake and H. Wagner, Genet. Res. {\bf 78}, 93-117 (2001).

995: %title:Mutation-selection models solved exactly with methods of statistical mechanics

996: %authors: Ellen Baake and Holger Wagner

998:

999: \bibitem{Waterman}

1000: M. S. Waterman, {\sl Introduction to computational biology : maps, sequences and

1001:          genomes} (Chapman and Hall, London, 1995).

1002: %author: Michael S. Waterman

1003:

1004: \bibitem{Furusawa-KK}

1005: C. Furusawa, K. Kaneko, Phys. Rev. E {\bf 73}, 011912 (2006).

1006:

1007: \bibitem{Gillespie}

1008: J. H. Gillespie, Theor. Popul. Biol. {\bf 23}, 202-215 (1983).

1009: %Title:A simple stochastic gene substitution process.

1010: %author:Gillespie, J. H.

1011:

1012: \bibitem{Orr}

1013: H. A. Orr, Evolution {\bf 56}, 1317-1330 (2002).

1014: %author:H. ALLEN Orr

1015: %title:THE POPULATION GENETICS OF ADAPTATION: THE ADAPTATION OF DNA SEQUENCES

1016:

1017: \bibitem{Haken}

1018: H. Haken, {\sl Synergetics: an introduction: nonequilibrium phase transitions and

1019:          self-organization in physics, chemistry and biology} (Springer-Verlag, Berlin, 1978).

1020: %edition: 2nd edn

1021:

1022: \bibitem{PhysicalBiology}

1023: K. Sato and K. Kaneko, Phys. Biol. {\bf 3}, 74-82 (2006).

1024: %title:On the distribution of state values of reproducing cells

1025: %authors:Katsuhiko Sato and Kunihiko Kaneko

1026:

1027: \bibitem{Morse-book}

1028: P. M. Morse and H. Feshbach, {\sl Methods of theoretical physics} (McGraw-Hill, New

1029:          York, 1953), pp. 1092-1106.

1030: %International student edition

1031: %authors:Philip M. Morse and Herman Feshbach

1032:

1033: \bibitem{Leuthausser}

1034: I. Leuth\"ausser, J. Stat. Phys. {\bf 48}, 343-360 (1987).

1035: %title:Statistical Mechanics of Eigen's Evolution Model

1036: %author:Ira Leuthausser


1039:

1040: \bibitem{Baake}

1041: E. Baake, M. Baake, and H. Wagner, Phys. Rev. Lett. {\bf 78}, 559 (1997).

1042: %title:Ising Quantum Chain is Equivalent to a Model of Biological Evolution


1044:

1045: \bibitem{Taiwan}

1046: D. B. Saakian and C. K. Hu, Proc. Natl. Acad. Sci. USA {\bf 103}, 4935-4939

1047:          (2006).

1048: %authors:David B. Saakian and Chin-Kun Hu

1049: %title:Exact solution of the Eigen model with general fitness functions and

1050: %         degradation rates

1051:

1052: \bibitem{Taiwan2}

1053: D. Saakian and C. K. Hu, Phys. Rev. E {\bf 69}, 021913 (2004).

1054: %authors: David Saakian and Chin-Kun Hu

1055: %title:Eigen model as a quantum spin chain: Exact dynamics

1056:

1057: \bibitem{Yang-et-al}

1058: H. Yang, G. Luo, P. Karnchanaphanurach, T. M. Louie, I. Rech, S. Cova, L. Xun,

1059: and X. S. Xie, Science {\bf 302}, 262-266 (2003).

1060: %%*title: Protein conformational dynamics probed by single-molecule electron

1061: %               transfer.

1062:

1063: \bibitem{ancel-fontana}

1064: L. W. Ancel and W. Fontana, J. Exp. Zool. {\bf 288}, 242-283 (2000).

1065: %*title:Plasticity, evolvability, and modularity in RNA.

1066:

1067: \bibitem{Hofbauer}

1068: J. Hofbauer, J. Math. Biol. {\bf 23}, 41-53 (1985).

1069:

1070: \bibitem{Akin}

1071: E. Akin, {\sl The geometry of population genetics} (Springer-Verlag, Berlin, 1979).

1072: %Ethan Akin

1073:

1074: \bibitem{well-written-NK}

1075: B. Levitan and S. Kauffman, Mol. Divers. {\bf 1}, 53-68 (1995).

1076: %article: Molecular Diversity, 1 (1995) 53-68

1077: %authors : Bennett Levitan and Stuart Kauffman

1078: %title:Adaptive walks with noisy fitness measurements

1079:

1080: \end{thebibliography}

1081:

1082: \end{document}

1083: