q-bio0611031/main.tex
1: \documentclass[preprint,aps]{revtex4}
2: %\documentclass[a4paper,12pt]{article}
3: %\documentclass[preprint,aps,draft]{revtex4}
4: %\documentclass[preprint,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}
5: 
6: \usepackage{graphicx}
7: 
8: %\usepackage[dviout]{color}
9: 
10: %\input{c:/sato/unix/Latex/Style/format_small.tex}
11: %\input{c:/sato/unix/Latex/Style/format_normal.tex}
12: 
13: %% Show label names
14: %\input{c:/sato/unix/Latex/Style/dummy.tex}
15: %\input{c:/sato/unix/Latex/Style/emerge_label.tex}
16: 
17: %% Suppress page numbers
18: %\pagestyle{empty}
19: %\thispagestyle{empty}
20: 
21: %% standard commands 2006/08/15
22: \newcommand{\integrate}{\int}
23: %\renewcommand{\<}{\left\langle}
24: %\renewcommand{\>}{\right\rangle}
25: \newcommand{\integral}{\int}
26: \newcommand{\bra}{\langle}
27: \newcommand{\ket}{\rangle}
28: \newcommand{\braket}[1]{\langle #1 \rangle}
29: \newcommand{\bubun}[2]{\frac{\partial #1}{\partial #2}}
30: \newcommand{\bibun}[2]{\frac{d #1}{d #2}}
31: \newcommand{\infinity}{\infty}
32: 
33: \begin{document}
34: 
35: \preprint{ }
36: 
37: \title{Evolution Equation of Phenotype Distribution: General Formulation and Application
38: to Error Catastrophe} \author{Katsuhiko Sato$^1$ and Kunihiko Kaneko$^{1,2}$} \address{
39: $^1$ Complex Systems Biology Project, ERATO JST} \address{ $^2$ Department of Pure and
40: Applied Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan }
41: \email{sato@complex.c.u-tokyo.ac.jp; kaneko@complex.c.u-tokyo.ac.jp} \date{\today}
42: 
43: \begin{abstract} 
44: An equation describing the evolution of phenotypic distribution is derived using methods
45: developed in statistical physics.  The equation is solved by using the singular
46: perturbation method, and assuming that the number of bases in the genetic sequence is
47: large.  Applying the equation to the mutation-selection model by Eigen provides the
48: critical mutation rate for the error catastrophe.  Phenotypic fluctuation of clones
49: (individuals sharing the same gene) is introduced into this evolution equation.  With this
50: formalism, it is found that the critical mutation rate is sometimes increased by the
51: phenotypic fluctuations, i.e., noise can enhance robustness of a fitted state to mutation. 
52: Our formalism is systematic and general, while approximations to derive more tractable
53: evolution equations are also discussed.
54: \end{abstract}
55: 
56: \pacs{ }
57: 
58: \keywords{ }
59: 
60: \maketitle
61: 
62: \section{Introduction}
63: 
64: For decades, quantitative studies of evolution in laboratories have used bacteria and
65: other microorganisms\cite{Lenski,Lenski-Rose,Kishony}.  Changes in phenotypes, such as
66: enzyme activity and gene expressions introduced by mutations in genes, are measured along
67: with the changes in their population distribution in phenotypes
68: \cite{Bactreia-many-generations, Dekel-Alon,Kashiwagi-Noumachi,Ito}.  Following
69: such experimental advances, it is important to analyze the evolution equation of
70: population distribution of concerned genotypes and phenotypes.
71: 
72: In general, fitness for reproduction is given by a phenotype, not directly by a genetic
73: sequence. Here, we consider evolution in a fixed environment, so that the fitness is given
74: as a fixed function of the phenotype.  A phenotype is determined by mapping a genetic
75: sequence. This phenotype is typically represented by a continuous (scalar) variable, such
76: as enzyme activity, protein abundances, and body size. For studying the evolution of a
77: phenotype, it is essential to establish a description of the distribution function for a
78: continuous phenotypic variable, where the fitness for survival, given as a function of
79: such a continuous variable, determines population distribution changes over generations.
80: 
81: However, since a gene is originally encoded on a base sequence (such as AGCTGCTT in DNA),
82: it is represented by a symbol sequence of a large number of discrete elements. Mutation in
83: a sequence is not originally represented by a continuous change.  Since the fitness is
84: given as a function of phenotype, we need to map base sequences of a large number of
85: elements onto a continuous phenotypic variable $x$, where the fitness is represented as a
86: function of $x$, instead of the base sequence itself.  A theoretical technique and careful
87: analysis are needed to project a discrete symbol sequence onto a continuous
88: variable. 
89: 
90: Mutation in a nucleotide sequence is random, and is represented by a stochastic process.
91: Thus, a method of deriving a diffusion equation from a random walk is often
92: applied. However, the selection process depends on the phenotype. If a phenotype is given
93: as a function of a sequence, the fitness is represented by a continuous variable mapped
94: from a base sequence. Since the population changes through the selection of fitness, the
95: distribution of the phenotype changes accordingly. If the mapping to the phenotype
96: variable is represented properly, the evolutionary process will be described by the
97: dynamics of the distribution of the variable, akin to a Fokker-Planck equation.
98: 
99: In fact, there have been several approaches to representing the gene with a continuous
100: variable \cite{footnote}
101: %[ A well established theory in population genetics, which adopts diffusion equation or
102: %Fokker-Planck type equation, is related to the frequency of genes in alleles in population
103: %as developed by Wright, Fisher, and Kimura\cite{Fisher,Wright,kimura1970}.  How new genes
104: %spread in population is analyzed by the diffusion equation. In contrast, we are concerned
105: %with the changes in the distribution of a base sequence consisting of a haploid gene.  ] 
106: .  Kimura\cite{Kimura2} analyzed the population distribution of a continuous fitness.
107: Also, for certain conditions, a Fokker-Planck type equation has been analyzed by
108: Levine\cite{Levine}. Generalizing these studies provides a systematic derivation of an
109: equation describing the evolution of the distribution of the phenotypic variable. We adopt
110: selection-mutation models describing the molecular biological evolution discussed by
111: Eigen\cite{Eigen}, Kauffman\cite{kauffmann-book}, and others, and take a continuum limit
112: assuming that the number of bases $N$ in the genetic sequence is large, and derive the
113: evolution equation systematically in terms of the expansion of $1/N$.
114: 
115: In particular, we refer to Eigen's equation\cite{Eigen}, originally introduced for the
116: evolution of RNA, where the fitness is given as a function of a sequence. Mutation into a
117: sequence is formulated by a master equation, which is transformed to a diffusion-like
118: equation.  With this representation, population dynamics over a large number of species is
119: reduced to one simple integro-differential equation with one variable. Although the
120: equation obtained is a non-linear equation for the distribution, we can adopt techniques
121: developed in the analysis of the (linear) Fokker-Planck equation, such as the
122: eigenfunction expansion and perturbation methods.
123: 
124: So far, we have assumed a fixed, unique mapping from a genotype to a phenotype.  However,
125: there are phenotypic fluctuations in individuals sharing the same genotype, which has
126: recently been measured quantitatively as a stochastic gene expression
127: \cite{Koshland,Elowitz,Kaern-Collins,Collins,Furusawa,Ueda,noise-review}.  Relevance of
128: such fluctuations to evolution has also been
129: discussed\cite{SatoPNAS,kaneko-book,KKFurusawaJTB,Ancel}.  In this case, mapping from a gene
130: gives the average of the phenotype, but phenotype of each individual fluctuates around the
131: average.  In the second part of the present paper, we introduce this isogenic phenotypic
132: fluctuation into our evolution equation.  Indeed, our framework of Fokker-Planck type
133: equations is well suited to including such fluctuations, so that one can discuss the effect of
134: isogenic phenotypic fluctuations on the evolution.
135: 
136: The outline of the present paper is as follows: We first establish a sequence model in
137: section (\ref{31:setup-and-derivation}). For deriving the evolution equation from the
138: sequence model, we postulate the assumption that the transition probability of phenotype
139: values is uniquely determined by the original phenotype value.  The assumption may appear
140: too demanding at first sight, but we show that it is not unnatural from the viewpoint of
141: evolutionary biology. In fact, most models studied so far satisfy this postulate.  With
142: this assumption, we derive a Fokker-Planck type equation of phenotypic distribution using
143: the Kramers-Moyal expansion method from statistical
144: physics\cite{vanKampen,KuboMatsuoKitahara}.  We discuss the validity of this expansion
145: method to derive the equation, also from a biological point of view.
146: 
147: As an example of the application of our formulation, we study Eigen's model in section
148: (\ref{20}), and estimate the critical mutation rates at which error catastrophe occurs,
149: using a singular perturbation method.  In section (\ref{32:discussion}), we discuss the
150: range of the applicability of our method and discuss possible extensions to it.
151: 
152: Following the formulation and application of the Fokker-Planck type equations for
153: evolution, we study the effect of isogenic phenotypic fluctuations.  While fluctuation in
154: the mapping from a genotype to phenotype modifies the fitness function in the equation,
155: our formulation itself is applicable.  We will also discuss how this fluctuation changes
156: the conditions for the error catastrophe, by adopting Eigen's model.
157: 
158: To conclude the paper, we discuss the generality of our formulation, and the relevance of
159: isogenic phenotypic fluctuation to evolution.
160: 
161: \section{Derivation of evolution equation}
162: \label{31:setup-and-derivation}
163: 
164: We consider a population of individuals having a haploid genotype, which is encoded on a
165: sequence consisting of $N$ sites (consider, for example, DNA or RNA). The gene is
166: represented by this symbol sequence, which is assigned from a set of numbers, such as
167: $\{-1,1\}$. This set of numbers is denoted by $S$. By denoting the state value of the
168: $i$th site by $s_i$ ($ \in S$), the configuration of the sequence is represented by the
169: ordered set $s=\{s_1,...,s_N\}$.
170: 
171: We assume that a scalar phenotype variable $x$ is assigned for each sequence $s$.  This
172: mapping from sequence to phenotype is given as function $x(s)$. Examples of the phenotype
173: include the activity of some enzyme (protein), infection rate of a bacterial virus, and
174: replication rate of RNA.  In general, the function $x(s)$ is a degenerate function, i.e.,
175: many different sequences are mapped onto the same phenotypic value $x$.
176: 
177: Each sequence is reproduced with rate $A$, which is assumed to depend only on the
178: phenotypic value $x$, as $A(x)$; this assumption may be justified by choosing the
179: phenotypic value $x$ to relate to the replication. For example, if a protein is involved in
180: the metabolism of a replicating cell, its activity may affect the replication rate of the
181: cell and of the protein itself.
182: 
183: In the replication of the sequence, mutation generally occurs; for simplicity, we consider
184: only the substitution of $s_i$. With a given constant mutation rate $\mu$ over all sites
185: in the sequence, the state $s'_i$ of the daughter sequence is changed from $s_i$ of the
186: mother sequence, where the value $s'_i$ is assigned from the members of the set $S$ with
187: an equal probability. We call this type of mutation symmetric mutation~\cite{Baake2}. The
188: mutation is represented by the transition probability $Q(s \rightarrow s')$, from the
189: mother $s$ to the daughter sequence $s'$. The probability $Q$ is uniquely determined from
190: the sequence $s$, the mutation rate $\mu$, and the number of members of $S$.  The setup so
191: far is essentially the same as adopted by Eigen et al.\cite{Eigen}, where the fitness is
192: given as a function of the RNA sequence or DNA sequence of virus.
193: 
194: Now, we assume that the transition probability depends only on the phenotypic value $x$,
195: i.e., the function $Q$ can be written in terms of a probability function $W$, which
196: depends only on $x$, $W(x \rightarrow x')$, as
197: \begin{equation} \sum_{s' \in \{ s'|x'=x(s') \}} Q(s \rightarrow
198: s')=W(x(s) \rightarrow x') .
199: \label{1} \end{equation}
200: 
201: This assumption may appear too demanding. However, most models of sequence evolution
202: somehow adopt this assumption. For example, in Eigen's model, fitness is given as a
203: function of the Hamming distance from a given optimal sequence.  By assigning a phenotype
204: $x$ as the Hamming distance, the above condition is satisfied (this will be discussed
205: later). In Kauffman's NK model, if we set $N \gg 1$, $K \gg 1$, and $K/N \ll 1$, this
206: assumption is also satisfied (see Appendix \ref{29}). For the RNA secondary structure
207: model\cite{Waterman}, this assumption seems to hold approximately, from statistical
208: estimates through numerical simulations. Some simulations on a cell model with chemical
209: reaction networks\cite{Furusawa,Furusawa-KK} also support the assumption. In fact, a
210: similar assumption has been made in evolution theory with a gene substitution
211: process\cite{Gillespie,Orr}.
212: 
213: The validity of this assumption in experiments has to be confirmed. Consider a selection
214: experiment to enhance some function through mutation, such as the evolution of a certain
215: protein to enhance its activity\cite{Ito}. In this case, the assumption means that the
216: activity distribution over the mutant proteins is statistically similar as long as they
217: have the same activity, even though their mother protein sequences are different.
218: 
219: With the above setup, we consider the population of these sequences and their dynamics,
220: allowing for overlap between generations, by taking a continuous-time
221: model\cite{Baake2}. We do not consider the death rate of the sequence explicitly since its
222: consideration introduces only an additional term, as will be shown later. The
223: time-evolution equation of the probability distribution $\hat{P}(s,t)$ of the sequence $s$
224: is given by:
225: \begin{equation} \bubun{\hat{P}(s,t)}{t} = -\bar{A}(t) \hat{P}(s,t) +
226: \sum_{s'} A(x(s')) Q(s' \rightarrow s) \hat{P}(s',t), \label{3} \end{equation} as
227: specified by Eigen\cite{Eigen}. Here the quantity $\bar{A}(t)$ is the average fitness of
228: the population at time $t$, defined by $\bar{A}(t)=\sum_{s} A(x(s)) \hat{P}(s,t)$ and $Q$
229: is the transition probability satisfying $\sum_{s} Q(s' \rightarrow s)=1$ for any $s'$.
230: 
231: %By following 
232: According to the assumption (\ref{1}), eq. (\ref{3}) is transformed into the equation for
233: $P(x,t)$, which is the probability distribution of the sequences having the phenotypic
234: value $x$, defined by $P(x,t)=\sum_{ s \in \{s | x = x(s)\}} \hat{P}(s,t)$. The equation
235: is given by \begin{equation} \bubun{P(x,t)}{t} = - \bar{A}(t) P(x,t) + \sum_{x'} A(x')
236: W(x' \rightarrow x) P(x',t), \label{4}
237: \end{equation}
238: where the function $W$ satisfies
239: \begin{equation} \sum_{x} W(x' \rightarrow x)=1 \qquad \mbox{for any $x'$,}
240: \label{2} \end{equation} \noindent as shown.
241: 
242: Since $N$ is sufficiently large, the variable $x$ is regarded as a continuous variable. By
243: using the Kramers-Moyal expansion\cite{vanKampen,KuboMatsuoKitahara,Haken}, with the help
244: of property (\ref{2}), we obtain:
245: \begin{equation}
246: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \sum_{n=1}^{\infinity} \frac{(-1)^n }{n!} 
247: \bubun{{}^n}{x^n} m_n(x) A(x) P(x,t),
248: \label{5} \end{equation}
249: where $m_n(x)$ is the $n$th moment about the value $x$, defined by
250: $m_n(x)= \int (x'-x)^n W(x \rightarrow x') dx' $.
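To make the step from eq. (\ref{4}) to eq. (\ref{5}) explicit, the following is a sketch of the standard Kramers-Moyal argument rather than a rigorous derivation. The jump size $r=x-x'$, the function $\tilde{W}$, and the shorthand $F(x)=A(x)P(x,t)$ are notation introduced here only for illustration, and the sum over $x'$ is replaced by an integral in the continuum limit. Writing $W(x' \rightarrow x)=\tilde{W}(x';r)$ and expanding in $r$,
\begin{eqnarray}
\int \tilde{W}(x-r;r) F(x-r) dr &=& \sum_{n=0}^{\infinity} \frac{(-1)^n}{n!}
\bubun{{}^n}{x^n} \left[ \int r^n \tilde{W}(x;r) dr \right] F(x) \nonumber \\
&=& A(x)P(x,t) + \sum_{n=1}^{\infinity} \frac{(-1)^n}{n!}
\bubun{{}^n}{x^n} m_n(x) A(x) P(x,t) , \nonumber
\end{eqnarray}
where $\int r^n \tilde{W}(x;r) dr = m_n(x)$ by the definition of the moment, and the $n=0$ term reduces to $A(x)P(x,t)$ because of the normalization (\ref{2}).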
251: 
252: Let us discuss the conditions for the convergence of expansion (\ref{5}), without
253: mathematical rigor.  For convergence, it is natural to assume that the function $W(x'
254: \rightarrow x)$ decays sufficiently fast as $x$ gets far from $x'$, by the definition of
255: the moment.
256: 
257: Here, the transition $W(x' \rightarrow x)$ is a result of $n$ point mutants of the
258: original sequence $s'$ for $n=0,1,2,...,N$. Accordingly, we introduce a set of quantities,
259: $w_n(x(s') \rightarrow x)$, as the fitness distribution of $n$ point mutants of the
260: original sequence $s'$ (Naturally, $w_0(x(s') \rightarrow x)=\delta(x(s')-x)$, which does
261: not contribute to the $n$th moment $m_n$ ($n \geq 1$)). Next, we introduce the probability
262: $p_n$ that a daughter sequence is an $n$ point mutant $(n=0,1,2,...,N)$ from her mother
263: sequence, which are determined only by the mutation rate $\mu$ and the sequence length
264: $N$. Indeed, the $p_n$'s form a binomial distribution, characterized by $\mu$ and $N$.
265: 
266: In terms of the quantities $w_n$ and $p_n$, we are able to write down the transition
267: probability $W$ as
268: \begin{equation}
269: W(x(s') \rightarrow x)=\sum_{n=0}^{N} p_n w_n(x(s') \rightarrow x).\label{6}
270: \end{equation}
271: Now, we discuss if $W(x(s') \rightarrow x)$ decays sufficiently fast with $|x(s')-x|$.
272: First, we note that the width of the domain, in which $w_n(x(s') \rightarrow x)$ is not
273: close to zero, increases with $n$ since $n$-point mutants involve increasing number of
274: changes in the phenotype with larger values of $n$.  Then, to satisfy the condition for
275: $W(x(s') \rightarrow x)$, at least the single-point-mutant transition $w_1(x(s')
276: \rightarrow x)$ has to decay sufficiently fast with $|x(s')-x|$. In other words, the
277: phenotypic value of a single-point mutant $s$ of the mother sequence $s'$ must not vary
278: much from that of the original sequence, i.e., $|x(s')-x(s)|$ should not be large
279: (``continuity condition'').
280: 
281: In general, the domain $|x-x(s')|$, in which $w_n(x(s') \rightarrow x) \neq 0$, increases
282: with $n$. On the other hand, the term $p_n$ decreases with $n$ as the power
283: $\mu^n$. Hence, as long as the mutation rate is not large, the contribution of $w_n$ to
284: $W$ is expected to decay with $n$. Thus, if the continuity condition with regards to a
285: single-point mutant and a sufficiently low mutation rate are satisfied, the requirement on
286: $W(x(s') \rightarrow x)$ should be fulfilled. Hence, the convergence of the expansion is
287: expected.
288: 
289: Following the argument, we further restrict our study to the case with a small mutation
290: rate $\mu$ such that $\mu N \ll 1$ holds. The transition probability $W$ in eq.  (\ref{6})
291: is written as
292: \begin{equation} W(x(s') \rightarrow x) \simeq (1-\mu N)
293: \delta(x(s')-x) + \mu N w_1(x(s') \rightarrow x),\label{7}
294: \end{equation}
295: where we have used the property that the $p_n$'s form the binomial distribution
296: characterized by $\mu$ and $N$. Introducing a new parameter, $\gamma$ ($\gamma=\mu N$),
297: that gives the average number of sites changed per replication, and using
298: the transition probability (\ref{7}), we obtain
299: \begin{equation} \bubun{P(x,t)}{t} =
300: (A(x)-\bar{A}(t)) P(x,t) + \gamma \sum_{n=1}^{\infinity} \frac{(-1)^n
301: } {n!} \bubun{{}^n}{x^n} m_n^{(1)}(x) A(x) P(x,t), \label{8}
302: \end{equation}
303: where $m_n^{(1)}(x)$ is the $n$th moment of $w_{1}(x \rightarrow x')$, i.e.,
304: $m_n^{(1)}(x)=\int (x'-x)^n w_1(x \rightarrow x') dx'$.
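For reference, the leading terms used above can be written out as follows. This is a sketch that takes the binomial form of $p_n$ stated earlier at face value and keeps only the lowest order in $\mu N$:
\begin{eqnarray}
p_n &=& \frac{N!}{n!(N-n)!} \mu^n (1-\mu)^{N-n}, \nonumber \\
p_0 &\simeq& 1-\mu N, \qquad p_1 \simeq \mu N, \qquad p_n = O\left((\mu N)^n\right) \quad (n \geq 2), \nonumber \\
m_n(x) &\simeq& (1-\mu N) \int (x'-x)^n \delta(x-x') dx' + \mu N \, m_n^{(1)}(x)
= \gamma \, m_n^{(1)}(x) \quad (n \geq 1). \nonumber
\end{eqnarray}
The $\delta$-function part does not contribute for $n \geq 1$, so each moment in eq. (\ref{5}) is simply replaced by $\gamma m_n^{(1)}(x)$.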
305: 
306: When we stop the expansion at the second order, as is often adopted in statistical
307: physics, we obtain
308: \begin{equation}
309: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \gamma \bubun{{}}{x}
310: \left[ - m_1^{(1)}(x) + \frac{1}{2} \bubun{{}}{x} m_2^{(1)}(x) \right]
311: A(x) P(x,t). \label{9}
312: \end{equation}
313: Eqs. (\ref{8}) and (\ref{9}) are basic equations for the evolution of distribution
314: function.  Eq. (\ref{9}) is an approximation. However, it is often more tractable, with
315: the help of techniques developed for solving the Fokker-Planck equation (see Appendix
316: \ref{10} and \cite{PhysicalBiology}), while there is no established standard method for
317: solving eq.  (\ref{8}).
318: 
319: As the boundary condition, we naturally impose that there is no probability flux, which is
320: given by
321: \begin{equation} \left.
322: \sum_{n=1}^{\infinity} \frac{(-1)^n } {n!} \bubun{{}^{(n-1)}}{x^{(n-1)}} m_n^{(1)}(x) A(x)
323: P(x,t) \right|_{x=x_1, x_2} =0, \label{26} \end{equation} in the case of (\ref{8}) and
324: \begin{equation}
325: \left. \left[ - m_1^{(1)}(x) + \frac{1}{2} \bubun{{}}{x} m_2^{(1)}(x) \right] A(x) P(x,t)
326: \right|_{x=x_1, x_2} =0
327: \label{27}
328: \end{equation}
329: in the case of (\ref{9}), where $x_1$ and $x_2$ are the values of the left and right
330: boundaries, respectively.
331: 
332: Next, as an example of the application of our formula, we derive the evolution equation
333: for Eigen's model, and estimate the error threshold, with the help of a singular
334: perturbation theory. Through this application, we can see the validity of eq. (\ref{9}) as
335: an approximation of eq. (\ref{8}).
336: 
337: Two additional remarks: First, introduction of the death of individuals is rather
338: straightforward. By including the death rate $D(x)$ into the evolution equation, the first
339: term in eq. (\ref{8}) (or eq. (\ref{9})) is replaced by
340: $\left[(A(x)-D(x))-(\bar{A}(t)-\bar{D}(t))\right] P(x,t)$, where $\bar{D}(t) \equiv \int
341: D(x) P(x,t) dx$. Second, instead of deriving each term in eq. (\ref{9}) from microscopic
342: models, it may be possible to adopt it as a phenomenological equation, with parameters (or
343: functions) to be determined heuristically from experiments.
344: 
345: %%---
346: 
347: \section{Application of error threshold in Eigen model}
348: \label{20}
349: 
350: In the Eigen model\cite{Eigen}, the set $S$ of the site state values is given by
351: $\{-1,1\}$, and the fitness (replication rate) of the sequence is given as a function of
352: its Hamming distance from the target sequence $\{1,...,1\}$, i.e., the fitness of an
353: individual sequence is given as a function of the number $n$ of the sites of the sequence
354: having value $1$. Hence it is appropriate to define a phenotypic value $x$ in the Eigen
355: model as a monotonic function of the number $n$; we determine it as $x=\frac{2n-N}{N}$, in
356: the range $[-1,1]$. Accordingly, the replication rate $A$ of the sequence can be written
357: as a function of $x$, i.e., $A(x)$; it is natural to postulate that $A$ is a non-negative
358: and bounded function over the whole domain. If the sequence length $N$ is sufficiently
359: large, the phenotypic variable $x$ can be regarded as a continuous variable, since the
360: step size of $x$ ($\Delta x=\frac{2}{N}$) approaches 0 as $N$ goes to infinity.
361: 
362: In order to derive the evolution equation of form (\ref{8}) corresponding to the Eigen
363: model, we only need to know the function $w_1$ in that model. (Recall that in our
364: formulation the mutation rate $\mu$ is assumed to be so small that only a single-point
365: mutation is considered.) Due to the assumption of the symmetric mutation, this
366: distribution function is obtained as $w_1(x \rightarrow x - \Delta x)=\frac{1+x}{2}$,
367: $w_1(x \rightarrow x + \Delta x)=\frac{1-x}{2}$, and $w_1(x \rightarrow x')=0$ for any
368: other $x'$. Accordingly, the $n$th moment is given by $m_n^{(1)}(x)= \frac{1+x}{2}
369: (-\Delta x)^n + \frac{1-x}{2} (\Delta x)^n$. Now, we obtain
370: \begin{equation}
371: \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \gamma \sum_{n=1}^{\infinity} \frac{1} {n!}
372: \bubun{{}^n}{x^n} \left[ \frac{1+x}{2} \left( \frac{2}{N} \right)^n + \frac{1-x}{2}
373: \left(- \frac{2}{N} \right)^n \right] A(x) P(x,t) \label{12}
374: \end{equation} where
375: $\gamma=N \mu$, the mutation rate per sequence. When we ignore the moment terms higher
376: than the second order, we have
377: \begin{equation} \bubun{P(x,t)}{t} = (A(x)-\bar{A}(t)) P(x,t) + \frac{2
378: \gamma}{N} \bubun{}{x} \left[ x + \frac{1}{N}
379: \bubun{}{x} \right]A(x) P(x,t).
380: \label{11}
381: \end{equation}
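As a check of eq. (\ref{11}), the first two moments can be evaluated directly from $w_1$; this is a worked instance of eq. (\ref{9}) with $\Delta x = 2/N$:
\begin{eqnarray}
m_1^{(1)}(x) &=& -\frac{1+x}{2} \frac{2}{N} + \frac{1-x}{2} \frac{2}{N} = -\frac{2x}{N}, \nonumber \\
m_2^{(1)}(x) &=& \left( \frac{1+x}{2} + \frac{1-x}{2} \right) \left( \frac{2}{N} \right)^2 = \frac{4}{N^2}. \nonumber
\end{eqnarray}
Substituting these into eq. (\ref{9}) gives $\gamma \bubun{}{x} \left[ \frac{2x}{N} + \frac{2}{N^2} \bubun{}{x} \right] A(x) P(x,t)$, which is the second term of eq. (\ref{11}). The negative sign of the first moment for $x>0$ represents a drift toward $x=0$, where the number of sequences is largest.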
382: 
383: In fact, if we focus on a change near $x\sim 0$ (to be specific, $x \sim O(1/\sqrt{N})$),
384: the truncation of the expansion up to the second order is valid (or equivalently, if
385: we define $x'=(2n-N)/\sqrt{N}$ instead of $(2n-N)/N$, and expand eq.(3) by $1/\sqrt{N}$
386: instead of $1/N$, terms higher than the second order are negligible, as is also discussed
387: in \cite{Levine}. However, in this case, the validity is restricted to $x' \sim O(1)$
388: (i.e., $(n-N/2) \sim O(\sqrt{N})$), which means $x\sim O(1/\sqrt{N})$ in the original variable).
389: 
390: Now, we solve eq. (\ref{11}) with a standard singular perturbation method (see
391: Appendix \ref{10}), and then return to eq. (\ref{12}).  According to the analysis in
392: Appendix \ref{10}, the stationary solution of the equation of form (\ref{11}) is given by
393: the eigenfunction corresponding to the largest eigenvalue of the linear operator $L$
394: defined by $L=A(x)+2 \gamma \varepsilon \bubun{}{x} \left[ x + \varepsilon \bubun{}{x}
395: \right]A(x)$ with $\varepsilon=\frac{1}{N}$. Now we consider the eigenvalue problem
396: \begin{equation} A(x) P(x) + 2 \gamma \varepsilon \bubun{}{x} \left[ x + \varepsilon
397: \bubun{}{x} \right]A(x) P(x) = \lambda P(x) \label{23},
398: \end{equation} where
399: $P(x) \geq 0$, with $\lambda$ to be determined.
400: 
401: Since $\varepsilon$ is very small (because $N$ is sufficiently large), a singular
402: perturbation method, the WKB approximation\cite{Morse-book}, is applied. Let us put
403: \begin{equation} P(x)=e^{\frac{1}{\varepsilon}\int_{x_0}^{x} R(\varepsilon,x') dx'},
404: \label{28} \end{equation} where $x_0$ is some constant and $R$ is a
405: function of $\varepsilon$ and $x$, which is expanded with respect to $\varepsilon$ as
406: \begin{equation} R(\varepsilon,x)=R_0(x)+\varepsilon R_1(x)+\varepsilon^2 R_2(x)+...
407: \label{22}
408: \end{equation} Retaining only the zeroth order terms in $\varepsilon$ in
409: eq. (\ref{23}), we get \begin{equation} A(x) + 2 \gamma \left[ x R_0(x) + R_0^2(x) \right]
410: A(x) =\lambda,
411: \label{24} \end{equation} which is formally solved for $R_0$ as
412: $R_0^{(\pm)}(x)= \frac{-x \pm \sqrt{g(x)}}{2}$ where $g(x)= x^2+\frac{2}{\gamma}
413: (\frac{\lambda}{A(x)}-1)$. Hence the general solution of eq. (\ref{23}) up to the zeroth
414: order in $\varepsilon$ is given by $P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_0}^{x}
415: R_0^{(+)}(x')dx'} +\beta e^{\frac{1}{\varepsilon} \int_{x_0}^{x} R_0^{(-)}(x')dx'} $ with
416: $\alpha$ and $\beta$ constants to be determined.
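The origin of eq. (\ref{24}) can be seen by a short calculation, sketched here under the assumption that $A(x)$ varies smoothly on the scale $\varepsilon$. With the form (\ref{28}), each derivative acting on $A(x)P(x)$ brings down a factor $R/\varepsilon$ at leading order:
\begin{eqnarray}
\varepsilon \bubun{}{x} A(x)P(x) &=& \left[ R_0(x) + O(\varepsilon) \right] A(x) P(x), \nonumber \\
2 \gamma \varepsilon \bubun{}{x} \left[ x + \varepsilon \bubun{}{x} \right] A(x)P(x)
&=& 2 \gamma \left[ x R_0(x) + R_0^2(x) + O(\varepsilon) \right] A(x)P(x). \nonumber
\end{eqnarray}
Dividing eq. (\ref{23}) by $P(x)$ then yields a quadratic equation for $R_0$, namely $R_0^2 + x R_0 - \frac{1}{2\gamma} \left( \frac{\lambda}{A(x)}-1 \right)=0$, whose two roots are $R_0^{(\pm)}$.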
417: 
418: Now, recall the boundary conditions (\ref{27}); $P$ has to take the two branches in $R_0$
419: as $ P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(+)}(x')dx'} $ for $x < x_b$
420: and $ P(x)= \beta e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(-)}(x')dx'} $ for $x >
421: x_b$, where $x_b$ is defined as the value at which $g(x)$ has the minimum value. Next,
422: from the continuity of $P$
423: %$\bubun{P}{x}$ 
424: at $x_b$, $\alpha=\beta$ follows, while from the
425: continuity of $\bubun{P}{x}$ at $x_b$, the function $g$ has to vanish at $x=x_b$. This
426: requirement $g(x_b)=0$ determines the value of the unknown parameter $\lambda$ as
427: \begin{equation}
428: \lambda=A(x_b) (1-\frac{\gamma}{2} {x_b}^2).
429: \label{18:approximated-eigenvalue}
430: \end{equation}
431: From function $P$, we find that $P$ has its peak at the point $x=x_p$, where $R_0(x)$
432: vanishes, i.e., at $ A(x_p)=\lambda $. Then, $P(x)$ approaches $\delta(x-x_p)$ in the
433: limit $\varepsilon \rightarrow +0$. These results are consistent with the requirement that
434: the mean replication rate in the steady state be equal to the largest eigenvalue of the
435: system (see Appendix \ref{10}).
436: 
437: The stationary solution of eq.(\ref{12}) is obtained by following the same procedure of
438: singular perturbation. Consider the eigenvalue problem
439: \begin{equation}
440: A(x)P(x)+\gamma \sum_{n=1}^{\infinity} \frac{1} {n!}
441: \bubun{{}^n}{x^n} \left[ \frac{1+x}{2} \left( 2
442: \varepsilon \right)^n + \frac{1-x}{2} \left(- 2
443: \varepsilon \right)^n \right] A(x) P(x)=\lambda
444: P(x). \label{25} \end{equation} By putting
445: $P(x)=e^{\frac{1}{\varepsilon}\int_{x_0}^{x} R_0(x')
446: dx'}$ and taking only the zeroth order terms in
447: $\varepsilon$, we obtain $$A(x)+\gamma \left[
448: \frac{1+x}{2} \left( e^{2 R_0(x)} - 1 \right) +
449: \frac{1-x}{2} \left( e^{-2 R_0(x)} -1 \right) \right]
450: A(x) =\lambda ,$$ which gives $$
451: R_0^{(\pm)}(x)=\frac{1}{2} \log
452: \frac{1+\frac{1}{\gamma} (\frac{\lambda}{A(x)}-1) \pm
453: \sqrt{ \hat{g}(x)}}{1+x}$$ with $\hat{g}(x)=
454: (1+\frac{1}{\gamma} (\frac{\lambda}{A(x)}-1))^2-(1-x^2)
455: $.
456: 
457: By defining again the value $x=x_b$ at which $\hat{g}(x)$ takes the minimum, $P$ is
458: represented as $ P(x)=\alpha e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(+)}(x')dx'} $
459: for $x < x_b$ and $ P(x)= \beta e^{\frac{1}{\varepsilon} \int_{x_b}^{x} R_0^{(-)}(x')dx'}
460: $ for $x > x_b$. The continuity of $\bubun{P}{x}$ at $x=x_b$ requires $\hat{g}(x_b)=0$,
461: which determines the value of $\lambda$ as \begin{equation} \lambda=A(x_b) \left[1-\gamma
462: \left(1-\sqrt{1-{x_b}^2}\right) \right].
463: \label{15:more-exact-eigenvalue} \end{equation}
464: Again, $P(x)=\delta(x-x_p)$, in the limit $\varepsilon \rightarrow +0$, with $x_p$ given
465: by the condition $A(x_p)=\lambda$. When $|x_b| \ll 1$, the form
466: (\ref{15:more-exact-eigenvalue}) approaches eq.  (\ref{18:approximated-eigenvalue})
467: asymptotically.  This implies that the time evolution equation (\ref{8}), if restricted to
468: $|x| \ll 1$, is accurately approximated by eq.(\ref{9}) that keeps the terms only up to
469: the second moment.
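The asymptotic agreement can be made explicit by expanding the square root; the following is a routine Taylor expansion, included only as a check:
\begin{eqnarray}
1-\sqrt{1-{x_b}^2} &=& \frac{{x_b}^2}{2} + \frac{{x_b}^4}{8} + O({x_b}^6), \nonumber \\
\lambda &=& A(x_b) \left[ 1 - \frac{\gamma}{2} {x_b}^2 - \frac{\gamma}{8} {x_b}^4 + O({x_b}^6) \right]. \nonumber
\end{eqnarray}
The leading correction to eq. (\ref{18:approximated-eigenvalue}) is therefore smaller than the retained term $\frac{\gamma}{2}{x_b}^2$ by a factor ${x_b}^2/4$.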
470: 
471: Let us estimate the threshold mutation rate for error catastrophe. This error threshold is
472: defined as the critical mutation rate $\gamma^{*}$ at which the peak position $x_p$ of the
473: stationary distribution drops from $x_p\neq 0$ to $x_p =0$, with an increase of $\gamma$.
474: We use the following procedure to obtain the critical value $\gamma^{*}$.
475: 
476: First consider an evaluation function whose form
477: corresponds to that of eigenvalue
478: (\ref{15:more-exact-eigenvalue}) as \begin{equation}
479: f(x)=A(x) \left[1-\gamma \left(1-\sqrt{1-{x}^2}\right)
480: \right], \label{30:more-exact-evaluation-function}
481: \end{equation} and find
482: the position at which the function $f(x)$ takes the maximum value. This procedure is
483: equivalent to obtaining $x_b$ in the above analysis, since the relation
484: $f(x)=\lambda-\frac{\gamma^2 A^2(x)}{\lambda-A(x) \left( 1-\gamma
485: \left(1+\sqrt{1-x^2}\right) \right)} \hat{g}(x)$ and the requirement on $x_b$ that
486: $\hat{g}(x_b)=0$ and $\left. \frac{d \hat{g}(x)}{dx} \right|_{x=x_b}=0$ lead to
487: $\left. \frac{df(x)}{dx} \right|_{x=x_b}=0$. Obviously, $x_b$ is given as a function of
488: $\gamma$, thus, we denote it by $x_b(\gamma)$. The position $x_b$ determines the position
489: $x_p$ of the stationary distribution through the relation $A(x_p)=\lambda=f(x_b)$ as in
490: the above analysis. If $A$ has flat parts around $x=0$ and higher parts in the region ($x
491: > 0$), $x_p(\gamma)$ discontinuously changes from $x_p \neq 0$ to $x_p = 0$ at some
492: critical mutation rate $\gamma^{*}$, when $\gamma$ increases from zero.  A schematic
493: illustration of this transition is given in
494: Fig.(\ref{33:fig:schematical-explaination-of-estimation}).
495: 
496: As a simple example of this estimate of error threshold, let us consider the case
497: \begin{equation} A(x)=1+A_0 \Theta(x-x_0),
498: \label{14:step} \end{equation} with $A_0>0$ and
499: $0<x_0<1$, and $\Theta$ as the Heaviside step function, defined as $\Theta(x)=0$ for $x <
500: 0$ and $\Theta(x)=1$ for $x \geq 0$. According to the procedure given above, the critical
mutation rate is straightforwardly obtained as
$\gamma^{*}=\frac{A_0}{(1+A_0)\left(1-\sqrt{1-{x_0}^2}\right)}$; the peak position is
$x_p=x_0$ for $\gamma<\gamma^{*}$ and $x_p=0$ for $\gamma > \gamma^{*}$.
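
The critical value for the step landscape (\ref{14:step}) can be checked in a few lines. This is only a sketch, restricted to $x \geq 0$, and the numerical values at the end are chosen here purely for illustration. Wherever $A$ is constant, the function $f$ of (\ref{30:more-exact-evaluation-function}) decreases with $x$, so the only candidates for its maximum are $x=0$ and $x=x_0$:
$$ f(0)=1, \qquad f(x_0)=(1+A_0)\left[1-\gamma\left(1-\sqrt{1-{x_0}^2}\right)\right]. $$
The maximum switches from $x_0$ to $0$ when $f(x_0)=f(0)$, i.e.,
$$ (1+A_0)\,\gamma^{*}\left(1-\sqrt{1-{x_0}^2}\right)=A_0 \quad \Longrightarrow \quad
\gamma^{*}=\frac{A_0}{(1+A_0)\left(1-\sqrt{1-{x_0}^2}\right)}. $$
For $x_0 \ll 1$ one has $1-\sqrt{1-{x_0}^2} \simeq {x_0}^2/2$, so that $\gamma^{*} \simeq 2A_0/[(1+A_0){x_0}^2]$. As an illustrative example, $A_0=0.1$ and $x_0=0.5$ give $\gamma^{*} \approx 0.68$ from the full expression and $\gamma^{*} \approx 0.73$ from the small-$x_0$ form.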
504: 
505: {\sl Remark}
506: 
507: An exact transformation from the sequence model (Eigen model\cite{Eigen}) into a class of
508: Ising models\cite{Leuthausser, Baake} has recently been reported, such that the sequence
509: model is treated analytically with methods developed in statistical physics.  Rigorous
510: estimation of the error threshold for various fitness landscapes\cite{Baake2,Taiwan} and
511: relaxation times of species distribution have been obtained\cite{Taiwan2}. In fact, our
512: estimate (above) agrees with that given by their analysis.
513: 
514: Their method is indeed powerful when a microscopic model is prescribed in correspondence
with a spin model.  However, even if such a microscopic model is not given, our formulation
with a Fokker-Planck type equation remains applicable because it requires only an estimate
of the moments in the fitness landscape. Alternatively, by giving a phenomenological model
that describes the fitness without a microscopic process, it is possible to derive the evolution
equation of the population distribution. Hence, our formulation has a broad range of potential
applications.
521: 
522: \section{Consideration of phenotypic fluctuation}
523: 
524: In this section, we include the fluctuation in the mapping from genetic sequence to the
525: phenotype into our formula, and examine how it influences the error catastrophe. We first
526: explain the term ``phenotypic fluctuation'' briefly, and show that in its presence our
527: formulation (\ref{8}) remains valid by redefining the function $A(x)$. By applying the
528: formulation, we study how the introduction of the phenotypic fluctuation changes the
529: critical mutation rate $\gamma^{*}$ for the error catastrophe.
530: 
531: In general, even for individuals with identical gene sequences in a fixed environment, the
532: phenotypic values are distributed. Some examples are the activities of proteins
533: synthesized from the identical DNA \cite{Yang-et-al}, the shapes of RNA molecules of
534: identical sequences \cite{ancel-fontana}, and the numbers of specific proteins for
isogenic bacteria \cite{Elowitz,Kaern-Collins,Collins,Furusawa}. Thus, the phenotype $x$
of individuals with the sequence $s$ follows a distribution, which we denote by
$P_{phe}(s,x)$.
538: 
We assume that the form of the distribution $P_{phe}$ is characterized only by its
mean value, i.e., all distributions $P_{phe}$ having the same mean value take the
same form. Denoting the mean value of the phenotype $x$ by $\bar{x}(s)$, the
542: distribution $P_{phe}$ is written as $P_{phe}(s,x)=\hat{P}_{phe}(\bar{x}(s),x)$, where
543: $\hat{P}_{phe}$ is a function of $\bar{x}$ and $x$, which is normalized with respect to
544: $x$, i.e., satisfying $\int \hat{P}_{phe}(\bar{x},x) dx =1$.
545: 
546: In our formulation, the replication rate $A$ of the sequence with the phenotypic value $x$
547: is given by a function of phenotypic value $x$, denoted by $A(x)$.  The mean replication
548: rate $\hat{A}$ of the species $s$ is calculated by
549: \begin{equation} \hat{A}(\bar{x}(s))=\int \hat{P}_{phe}(\bar{x}(s),x) A(x)
550: dx. \label{phe:mean} \end{equation} 
551: 
552: As in the case of (\ref{1}), we assume that the transition probability from $s$ to $s'$
553: during the replication is represented only by its mean values $\bar{x}(s)$ and
554: $\bar{x}(s')$, i.e., the transition probability function is written as $W(\bar{x}(s)
555: \rightarrow \bar{x}(s'))$. With this setup, the population dynamics of the whole sequences
556: is represented in terms of the distribution of the mean value $\bar{x}$ only, so that we
557: can use our formulation (\ref{8}) even when the phenotypic fluctuation is taken into
558: account; we need only replace the replication rate $A$ in (\ref{8}) by the mean
559: replication rate $\hat{A}$ obtained from eq. (\ref{phe:mean}).
560: 
561: Now, we can study the influence of phenotypic fluctuation on the error threshold by taking
562: the step fitness function $A(x)$ of eq. (\ref{14:step}) and including the phenotypic
563: fluctuation as given in eq.(\ref{phe:mean} ).  We consider a simple case where the form of
564: $\hat{P}_{phe}$ is given by a constant function within a given range (we call this the
565: piecewise flat case). Our aim is to illustrate the effect of the phenotypic fluctuation on
566: the error threshold, so we evaluate the critical mutation rate $\gamma^{*}$ using the
567: simpler form $f(x)=A(x)(1-\frac{\gamma}{2} x^2)$ from
568: eq.(\ref{18:approximated-eigenvalue}), while the use of the form
569: (\ref{30:more-exact-evaluation-function}) gives the same qualitative result. With this
570: simpler evaluation function, the critical mutation rate $\gamma^{*}$ is given by
571: \begin{equation} \gamma_0^{*}=\frac{2 A_0}{(1+A_0) {x_0}^2},
572: \label{34:gamma-zero}
573: \end{equation} in the case without phenotypic fluctuation.
574: Here we examine if this critical value $\gamma^{*}_0$ increases under isogenic phenotypic
575: fluctuation.
576: 
We make two further technical assumptions in the following analysis. First, we assume that
$A_0$ in the form (\ref{14:step}) is sufficiently small, so that the critical value
$\gamma^{*}$ is not large. Second, we extend the range of $x$ to $(-\infinity,\infinity)$
for simplicity.  This does not cause problems because we have set the range of $x_0$ to
$(0,1)$. Hence, the stationary distribution has its peak within the range $0 \leq x < 1$;
everywhere outside this range, the distribution vanishes.
583: 
584: We consider the case in which distribution $\hat{P}_{ phe }$ of the phenotype of the
585: species $s$ is given by
586: \begin{equation}
587: \hat{P}_{ phe }^{(F)}(\bar{x}(s), x) = \left\{
588: \begin{array}{ll} 0 & \quad \mbox{for $ x
589: <\bar{x}-\ell$}\\ \frac{1}{2 \ell} & \quad \mbox{for $ \bar{x}-\ell \leq x \leq \bar{x} +
590: \ell$}\\ 0 & \quad \mbox{for $ \bar{x} + \ell < x $, }
591: \end{array} \right.
592: \label{36:flat-case}
593: \end{equation}
594: where $\ell$ gives the half-width of the distribution. ($(F)$ represents the
595: piecewise-flat distribution case). Then, $\hat{A}$ is calculated by
596: $$ \hat{A}^{(F)}(x) = \left\{ \begin{array}{ll} 1 & \quad \mbox{for $x<x_0-\ell$}\\ 1+
597: \frac{A_0}{2 \ell} (x-(x_0-\ell)) & \quad \mbox{for $x_0-\ell \leq x \leq x_0 + \ell$}\\
598: 1+ A_0 & \quad \mbox{for $x_0 + \ell < x$. } \end{array} \right.$$ An example of
599: $\hat{A}^{(F)}(x)$ is shown in Fig. (\ref{35:fig:profile-of-A}). The evaluation function
600: $f$ in section (\ref{20}) is given by $ f^{(F)}(x)=\hat{A}^{(F)}(x) (1-\frac{\gamma}{2}
601: x^2) $.
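
The piecewise-linear form of $\hat{A}^{(F)}$ follows directly from eq. (\ref{phe:mean}). As a short check, using only the definitions (\ref{14:step}) and (\ref{36:flat-case}),
$$ \hat{A}^{(F)}(\bar{x}) = \frac{1}{2\ell}\int_{\bar{x}-\ell}^{\bar{x}+\ell} \left[ 1+A_0 \Theta(x-x_0) \right] dx
= 1+\frac{A_0}{2\ell}\, \Delta(\bar{x}), $$
where $\Delta(\bar{x})$ (a symbol introduced here only for this check) is the length of the part of the interval $[\bar{x}-\ell,\bar{x}+\ell]$ that lies above $x_0$: $\Delta=0$ for $\bar{x}<x_0-\ell$, $\Delta=\bar{x}-(x_0-\ell)$ for $x_0-\ell \leq \bar{x} \leq x_0+\ell$, and $\Delta=2\ell$ for $\bar{x}>x_0+\ell$. With the parameters quoted in the caption of Fig. (\ref{35:fig:profile-of-A}), $A_0=0.1$, $x_0=0.5$, and $\ell=0.25$, the mean replication rate rises linearly from $1$ at $\bar{x}=0.25$ to $1.1$ at $\bar{x}=0.75$, with slope $A_0/(2\ell)=0.2$.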
602: 
603: We study the case where the position ${x_b^{*}}^{(F)} (\equiv {x_b^{(F)}}(\gamma^{*}))$ is
604: within the range $[x_0-\ell,x_0]$ because the profile of $\hat{A}^{(F)}$ shows that
605: ${\gamma^{*}}^{(F)}$ is smaller than $\gamma^{*}_0$ if ${x_b^{*}}^{(F)}>x_0$.  If
606: $\frac{x_0}{2+A_0} \leq \ell < x_0$, the position ${x_b^{*}}^{(F)}$ is within the range
607: $[x_0-\ell,x_0]$.  In that case, ${\gamma^{*}}^{(F)}$ is given by ${\gamma^{*}}^{(F)}
608: \simeq \frac{A_0}{4 \ell (x_0-\ell)} $ to the first order of $A_0$. Comparing
609: ${\gamma^{*}}^{(F)}$ with $\gamma^{*}_0$ in (\ref{34:gamma-zero}), we conclude that
610: ${\gamma^{*}}^{(F)} < \gamma^{*}_0$ for $ 0 <\ell<\frac{2+\sqrt{2}}{4} x_0 $, and
611: ${\gamma^{*}}^{(F)} > \gamma^{*}_0$ for $ \frac{2+\sqrt{2}}{4} x_0 <\ell< x_0$.  Hence,
when the half-width $\ell$ of the distribution $P_{phe}$ is within the range
$(\frac{2+\sqrt{2}}{4} x_0,x_0)$, the critical mutation rate for the error catastrophe
is increased.  In other words, isogenic phenotypic fluctuation of this width increases the
robustness of the high-fitness state against mutation.
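
The expression for ${\gamma^{*}}^{(F)}$ and the crossover width can be recovered as follows. This is only a leading-order sketch in $A_0$ (with $\gamma$ of order $A_0$), assuming $x_0-\ell>0$ so that $f^{(F)}(0)=1$. On the ramp $x_0-\ell \leq x \leq x_0+\ell$, dropping terms of order ${A_0}^2$,
$$ f^{(F)}(x) \simeq 1+\frac{A_0}{2\ell}\left(x-(x_0-\ell)\right)-\frac{\gamma}{2}x^2 , $$
which is maximized at $x_b=A_0/(2\ell\gamma)$, where it takes the value
$$ f^{(F)}(x_b) \simeq 1+\frac{{A_0}^2}{8\ell^2\gamma}-\frac{A_0}{2\ell}(x_0-\ell). $$
The transition occurs when this maximum equals $f^{(F)}(0)=1$, which gives
$$ {\gamma^{*}}^{(F)} \simeq \frac{A_0}{4\ell(x_0-\ell)}, \qquad {x_b^{*}}^{(F)} \simeq 2(x_0-\ell). $$
Comparing with $\gamma^{*}_0 \simeq 2A_0/{x_0}^2$ at the same order, the condition ${\gamma^{*}}^{(F)}>\gamma^{*}_0$ becomes $8\ell^2-8x_0\ell+{x_0}^2>0$, whose roots are $\ell=\frac{2\pm\sqrt{2}}{4}x_0$. The upper root, $\frac{2+\sqrt{2}}{4}x_0 \approx 0.854\,x_0$, is the one that falls within the range of $\ell$ considered here.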
616: 
We also studied the case in which $\hat{P}_{phe}(\bar{x}, x)$ decreases linearly
away from its peak, i.e., has a triangular form.  In this case, the phenotypic fluctuation
decreases the critical mutation rate as long as $A_0$ is small, whereas it can increase the
critical mutation rate for sufficiently large values of $A_0$ and for a certain range of
the width of the phenotypic fluctuation.
622: 
623: \section{Discussion}
624: \label{32:discussion}
625: 
626: In the present paper, we have presented a general formulation to describe the evolution of
627: phenotype distribution.  A partial differential equation describing the temporal evolution
628: of phenotype distribution is presented with a self-consistently determined growth term.
629: Once a microscopic model is provided, each term in this evolution equation is explicitly
630: determined so that one can derive the evolution of phenotype distribution
straightforwardly.  Equation (\ref{8}) is obtained from the Kramers-Moyal expansion,
which includes derivatives of infinite order.  However, this expansion can often be summed into
a single term in the limit of a large number of bases in the sequence, with the aid of singular
perturbation.
635: 
636: If the value of a phenotype variable $|x|$ is much smaller than unity (which is the
637: maximal possible value giving rise to the fittest state), the terms higher than the second
638: order can be neglected, so that a Fokker-Planck type equation with a self-consistent
639: growth term is derived.  The validity of this truncation is confirmed by putting
640: $x'=(2n-N)/\sqrt{N}$ and verifying that the third or higher order moment is negligible
641: compared with the second-order moment. Thus the equation up to its second order,
642: (\ref{9}), is relevant to analyzing the initial stage of evolution starting from a
643: low-fitness value.
644: 
645: As a starting point for our formalism, we adopted eq. (\ref{3}), which is called the
646: ``coupled'' mutation-selection equation\cite{Hofbauer}.  Although it is a natural and
647: general choice for studying the evolution, a simpler and approximate form may be used if
648: the mutation rate and the selection pressure are sufficiently small.  This form given by
649: $\bubun{\hat{P}(s,t)}{t} = -\bar{A}(t) \hat{P}(s,t) + \sum_{s'} Q(s' \rightarrow s)
650: \hat{P}(s',t)$, is called the ``parallel'' mutation-selection
651: equation\cite{kimura1970,Akin}.  It approaches the coupled mutation-selection equation
652: (\ref{3}), in the limits of small mutation rate and selection pressure, as shown in
653: \cite{Hofbauer}. If we start from this approximate, parallel mutation-selection equation,
654: and follow the procedure presented in this paper, we obtain $\bubun{P(x,t)}{t} =
655: (A(x)-\bar{A}(t)) P(x,t) + \gamma \bubun{{}}{x} \left[ - m_1^{(1)}(x) + \frac{1}{2}
656: \bubun{{}}{x} m_2^{(1)}(x) \right] P(x,t)$.
657: 
In general, this equation is more tractable than eq. (\ref{9}), because the techniques
developed for Fokker-Planck equations can be applied straightforwardly, as discussed in
\cite{PhysicalBiology}, and it is also useful in describing evolution.  Setting
$A(x)=x^2$ and replacing $m_1^{(1)}$ and $m_2^{(1)}$ with constants reduces the equation
to that introduced by Kimura\cite{Kimura2}, while setting $A(x)=x$, $m_1^{(1)}(x)
\propto x$, and replacing $m_2^{(1)}$ with a constant yields the equation of
Levine\cite{Levine}.  Because our formalism is general, the equations of these earlier studies
are recovered by approximating our evolution equation suitably.
666: 
667: Besides the generality, another merit of our formulation lies in its use of the phenotype
668: as a variable describing the distribution, rather than the fitness (as adopted by Kimura).
669: Whereas the phenotype is an inherent variable directly mapped from the genetic sequence,
670: the fitness is a function of the phenotype and environment, and strongly influenced by
671: environmental conditions.  The evaluation of the transition matrix by mutation in
672: eq.(\ref{8}) would be more complicated if we used the fitness as a variable, due to
the crucial dependence of fitness values on the environmental conditions.  In the formalism
based on the phenotype distribution, an environmental change is incorporated by changing the
growth term $A(x)$ accordingly. Our formalism does include the fitness-based equation as a
special case, by setting $A(x)=x$.
677: 
678: Another merit in our formulation is that it easily takes isogenic phenotypic fluctuation
679: into account without changing the form of the equation, but only by modifying $A(x)$.  By
680: applying this equation, we obtained the influence of isogenic phenotype fluctuations on
681: error catastrophe.  The critical mutation rate for the error catastrophe increases because
682: of the fluctuation, in a certain case.  This implies that the fluctuation can enhance the
683: robustness of a high-fitness state against mutation.
684: 
In fact, the relevance of isogenic phenotypic fluctuations to evolution has recently been
proposed\cite{SatoPNAS,kaneko-book,KKFurusawaJTB}, and a change in phenotypic fluctuation
through evolution has been experimentally verified\cite{Ito,SatoPNAS}.  More generally,
phenotypic fluctuations and mutation-selection processes in artificial evolution have
689: been extensively studied recently.  The present formulation will be useful in analysing
690: such experimental data, as well as in elucidating the relevance of phenotypic fluctuations
691: to evolution.
692: 
693: \newpage
694: 
695: {\bf Figures}
696: 
697: \begin{figure}[hbtp] \begin{minipage}[t]{15cm} \begin{center}
698: \scalebox{ 0.45 }{\includegraphics{Fig1.eps}}
699: \caption{ Examples of profiles of the evaluation function $f$ for three values of
700: $\gamma$. The red, purple, and blue curves give the profiles of $f$ for $\gamma=0.31$,
701: $\gamma=0.386$, and $\gamma=0.49$, respectively, where $f$ is defined by $f(x)=A(x)
702: (1-\gamma (1-\sqrt{1-x^2}))$ and $A$ is given by $A(x)=1+0.2 (x-0.25) \Theta(x-0.25)
703: \Theta(0.75-x)+ 0.1 \Theta(x-0.75)$; the profile of $A$ is indicated by the black
704: curve. This illustrates determination of $x_b$ and $x_p$; $x_b$ is given by the position
705: where $f$ takes a maximum, while $x_p$ is given as the position where the line $y=f(x_b)$
706: crosses the curve of $A$. For $\gamma < 0.386$, $f(x)$ has a maximum value at $x=x_b$, and
707: thus the critical mutation rate for the error threshold is estimated to be
708: $\gamma^{*}=0.386$.  }
709: \label{33:fig:schematical-explaination-of-estimation}
710: \end{center} \end{minipage} \end{figure}
711: 
712: \begin{figure}[hbtp] \begin{minipage}[t]{15cm} \begin{center}
713: \scalebox{ 0.4 }{\includegraphics{Fig2.eps}}
\caption{Example profiles of the mean fitness function without phenotypic fluctuation
(black) and with a phenotypic fluctuation that is constant over a given range, as given by
eq.(\ref{36:flat-case}) (red), where we set $A(x)=1+0.1 \Theta(x-0.5)$ and $\ell=0.25$.  }
717: \label{35:fig:profile-of-A}
718: \end{center} \end{minipage} \end{figure}
719: 
%% acknowledgements 2006/11/06
721: \acknowledgements The authors would like to thank P. Marcq, S. Sasa, and T. Yomo for
722: useful discussion.
723: 
724: %%---
725: 
726: \appendix 
727: 
%This appendix appears first in the text, so it is placed first. 2006/11/06
729: \section{Estimation of the transition probability in the NK model}
730: \label{29}
731: 
732: In the NK model\cite{well-written-NK,kauffmann-book}, the fitness $f$ of a sequence $s$ is
733: given by $$ f(s)=\frac{1}{N} \sum_{i=1}^{N} \omega_i(s) ,$$ where $\omega_i$ is the
734: contribution of the $i$th site to the fitness, which is a function of $s_i$ and the state
735: values of other $K$ sites. The function $\omega_i$ takes a value chosen uniformly from
736: $[0,1]$ at random. We assume that the phenotype $x$ of the sequence $s$ is given by
737: $x=f(s)$.
738: 
739: When $N \gg 1$, $K \gg 1$, and $K/N \ll 1$, the phenotype distribution of mutants of a
740: given sequence $s$ (whose phenotype is $x$) is characterized only by the phenotype $x$
741: (without the need to specify the sequence $s$). For showing this, we first examine the
742: one-point mutant case.
743: 
We consider the number of ``changed sites,'' i.e., sites at which the values of $\omega_i$ change due
to a single-point mutation. Assuming that the average number of changed sites is $K$,
746: the distribution of the number of changed sites $n$, denoted by $P_{site}(n)$, is
747: approximately given by
748: \begin{equation} P_{site}(n) \simeq e^{-\frac{(n-K)^2}{2 K }},
749: \label{1:appen:site} \end{equation} with the help of
750: the limiting form of binomial distribution.  Here, we have omitted the normalization
751: constant.
752: 
753: Next, we study the distribution of the difference between the phenotype $x$ of the
754: original sequence and the phenotype $x'$ of its one-point mutant, given the number $n$ of
755: changed sites of the single-point mutant.  We denote the distribution as $P_{dif \! 
756: f}(n;X)$, where $X=x'-x$. Here the average of $x'$ is $x(N-n)/N$, since $(N-n)$ sites are
757: unchanged. Thus, according to the central limit theorem, the distribution is estimated as
758: \begin{equation}
759: P_{dif \! f}(n;X) \simeq \exp \left[ {-\frac{(X+\frac{n}{N}
760: x)^2}{2 n \frac{\sigma^2}{N^2} }} \right] ,
761: \label{2:appen:diff} \end{equation} where $\sigma^2$ is the
762: variance of the distribution of the value of $\omega$.  This variance is estimated from
the probability distribution $P_{(s,\{\omega_i\})}(\omega)$ that the value $\omega$ is
generated. Although the explicit form of $P_{(s,\{\omega_i\})}$ is hard to obtain unless
765: $\{\omega_i\}$ and $s$ are given, it is estimated by means of the ``most probable
766: distribution,'' obtained as 
767: %in the following way
768: follows: Find the distribution that maximizes the evaluation function $S$ (called
769: ``entropy'') defined by $S=-\int_{0}^{1} P(\omega) \log P(\omega) d\omega$ under the
770: conditions $\int_{0}^{1} P(\omega) d\omega=1$ and $\int_{0}^{1} \omega P(\omega)
771: d\omega=x$. Accordingly the variance $\sigma^2$ may depend on $x$.
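
As an illustration of this maximization (a sketch; the Lagrange multiplier $\beta$ is notation introduced here and does not appear elsewhere in the paper), the entropy $S$ under the two constraints is maximized by an exponential distribution on $[0,1]$,
$$ P(\omega)=\frac{\beta e^{\beta \omega}}{e^{\beta}-1}, \qquad 0 \leq \omega \leq 1, $$
where $\beta$ is fixed by the condition on the mean,
$$ x=\int_{0}^{1} \omega P(\omega) d\omega = \frac{1}{1-e^{-\beta}}-\frac{1}{\beta}. $$
The variance then follows from
$$ \sigma^2(x)=\frac{dx}{d\beta}=\frac{1}{\beta^2}-\frac{e^{\beta}}{(e^{\beta}-1)^2}, $$
which makes the dependence of $\sigma^2$ on $x$ explicit. For $x=1/2$ one finds $\beta \rightarrow 0$, so that $P(\omega)$ is uniform and $\sigma^2=1/12$. The variance decreases as $x$ moves toward $0$ or $1$.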
772: 
773: Combining these distributions (\ref{1:appen:site}) and (\ref{2:appen:diff}) gives the
774: distribution of $X$ without constraint on the number of changed sites:
775: $$ P(X) = \sum_{n=1}^{N} P_{site}(n) P_{dif \! f}(n;X) \simeq \exp \left[
776: {-\frac{(X+\frac{K}{N} x)^2}{2 K \frac{(\sigma^2+x^2)}{N^2}}} \right] .$$ This result
777: indicates that the phenotype distribution of single-point mutants from the original
778: sequence $s$ having the phenotype $x$ is characterized by its phenotype $x$ only; $s$ is
779: not necessary. Similarly, one can show that phenotype distribution of $n$-point mutants is
780: also characterized only by $x$. Hence, the transition probability in the NK model is
781: described only in terms of the phenotypes of the sequences, when $N \gg 1$, $K \gg 1$, and
782: $K/N \ll 1$.
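
The mean and variance of the Gaussian form of $P(X)$ above can be checked with the laws of total expectation and total variance. This sketch assumes, as in eq. (\ref{1:appen:site}), that the number of changed sites $n$ has mean $K$ and variance $K$. According to eq. (\ref{2:appen:diff}), for a fixed $n$ the difference $X$ has mean $-\frac{n}{N}x$ and variance $n\frac{\sigma^2}{N^2}$. Averaging over $n$ gives
$$ \braket{X}=-\frac{K}{N}x, \qquad
\mathrm{Var}(X)=\frac{\sigma^2}{N^2}\braket{n}+\frac{x^2}{N^2}\mathrm{Var}(n)
=\frac{K(\sigma^2+x^2)}{N^2}. $$
The first term in the variance comes from the spread of $X$ at fixed $n$, and the second from the fluctuation of $n$ itself, which is the origin of the additional $x^2$ in the denominator of the exponent.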
783: 
784: \section{ Mathematical structure of the equation of form (\ref{9})} \label{10}
785: 
786: We first rewrite eq. (\ref{9}) as
787: \begin{equation} \bubun{P(x,t)}{t} = -\bar{A}(t) P(x,t) + L(x) P(x,t),
788: \label{17} \end{equation}
789: where $L$ is a linear operator, defined by $L(x)=A(x) +\bubun{{}}{x} f(x) +
790: \bubun{{}^2}{x^2} g(x) $ with $f(x)= - \gamma m_1^{(1)}(x) A(x)$ and $g(x)=
791: \frac{\gamma}{2} m_2^{(1)}(x) A(x)$.
792: 
The linear operator $L$ is transformed to a Hermitian operator using variable
transformations (see below), so that $L$ is represented by a complete set of eigenfunctions
and corresponding eigenvalues, which are denoted by $\{\phi_i(x)\}$ and $\{\lambda_i\}$
($i=0,1,2,...$), respectively. The eigenvalues are real and nondegenerate, so that they can be
arranged as $\lambda_0 > \lambda_1 > \lambda_2 >...$.
798: 
799: According to \cite{PhysicalBiology}, $P(x,t)$ is expanded as
800: \begin{equation} P(x,t)=\sum_{i=0}^{\infinity} a_i(t) \phi_i(x), \label{13}
801: \end{equation}
802: where $a_i$ satisfies
803: \begin{equation} \frac{d a_i(t)}{dt}= a_i(t)
804: (\lambda_i-{\sum_{j=0}^{\infinity}}' a_j(t) \lambda_j). \label{19}
805: \end{equation}
806: The prime over the sum symbol indicates that the summation is taken except for those of
807: noncontributing eigenfunctions as defined in \cite{PhysicalBiology}.
808: 
809: Stationary solutions of eq. (\ref{19}) are given by $\{ a_{k}=1$ and $a_i=0$ for $i \ne k
810: \}$. Among these stationary solutions, only the solution $\{a_0=1$ and $a_i=0$ for $i \ne
811: 0\}$ is stable. Hence, the eigenfunction for the largest eigenvalue (the largest
812: replication rate) gives the stationary distribution function. Now it is important to
813: obtain eigenfunctions and eigenvalues of $L$, in particular the largest eigenvalue
814: $\lambda_0$ and its corresponding eigenfunction $\phi_0$. Hence, we focus our attention on
815: the eigenvalue problem
816: \begin{equation} \left[ A(x) + \bubun{{}}{x} f(x) + \bubun{{}^2}{x^2} g(x)
\right] P(x) =\lambda P(x), \label{16} \end{equation} where $\lambda$ is a constant and $P$
is a function of $x$.
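
The stability of the solution with $a_0=1$ can be seen directly from eq. (\ref{19}); the following is a sketch that assumes $a_0(t) \neq 0$. Because the sum in eq. (\ref{19}) is common to all components, it cancels in the equation for the ratio $a_i/a_0$:
$$ \frac{d}{dt}\left(\frac{a_i(t)}{a_0(t)}\right)=(\lambda_i-\lambda_0)\frac{a_i(t)}{a_0(t)}
\quad \Longrightarrow \quad
\frac{a_i(t)}{a_0(t)}=\frac{a_i(0)}{a_0(0)}\, e^{-(\lambda_0-\lambda_i)t}. $$
Since $\lambda_0>\lambda_i$ for all $i \geq 1$, every such ratio decays exponentially, so that any initial distribution with $a_0(0) \neq 0$ relaxes to the eigenfunction $\phi_0$, with the slowest relaxation rate set by the gap $\lambda_0-\lambda_1$.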
819: 
We can transform the eigenvalue problem (\ref{16}) to a Schr\"odinger-type eigenvalue
821: problem as follows: First we introduce a new variable $y$ related to $x$ as
822: $y(x)=\int_{x_0}^{x} \sqrt{\frac{h}{g(x')}} dx'$ where $x_0$ and $h$ are constants. Next,
823: we consider a new function $\Psi(y)$ related to $P(x)$ as
824: $$\Psi(y)= \left. \sqrt{\frac{g(x)}{h}} e^{{\int_{y_0}^{y} \frac{\hat{f}(y')}{2 h} dy'}}
825: P(x) \right|_{x=x(y)} $$ where $y_0$ is some constant, $x(y)$ the inverse function of
826: $y(x)$, and $\hat{f}$ a function of $y$ defined by
827: $$\hat{f}(y)= \left. \sqrt{\frac{h}{g(x)}} (f(x)+\frac{1}{2} \frac{d
828: g(x)}{dx}) \right|_{x=x(y)} .$$
829: 
830: Using these new quantities $y$ and $\Psi$ and rewriting eigenvalue problem (\ref{16})
831: suitably, we get
832: \begin{equation} \left[ V(y) + h \bubun{{}^2}{y^2} \right] \Psi(y) =\lambda
833: \Psi(y), \label{21} \end{equation} where $V(y)=\hat{A}(y) + \frac{\frac{d \hat{f} (y)}{d
834: y}}{2} - \frac{\hat{f}^2(y)}{4 h}$ with $\hat{A}(y)=A(x(y))$.
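
As a consistency check of this transformation, consider the special case of constant $g(x)=h$ (an assumption made here only for illustration). Then $y=x-x_0$, $\hat{f}=f$, and $\Psi=e^{\phi}P$, where $\phi$ (notation introduced here) satisfies $\phi'=f/(2h)$ and a prime denotes differentiation with respect to $x$. Substituting $P=e^{-\phi}\Psi$ into the left-hand side of eq. (\ref{16}) gives
$$ A P+(fP)'+hP''=e^{-\phi}\left[ h\Psi''+(f-2h\phi')\Psi'
+\left(A+f'-f\phi'+h{\phi'}^2-h\phi''\right)\Psi \right]. $$
With $\phi'=f/(2h)$ the first-derivative term vanishes, and the coefficient of $\Psi$ reduces to $A+\frac{1}{2}f'-\frac{f^2}{4h}$, which is the potential $V$ in eq. (\ref{21}).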
835: 
836: %%---
837: 
838: 
%% references
840: \begin{thebibliography}{99} 
841: 
842: %(1)
843: \bibitem{Lenski}
844: S. F. Elena and R. E. Lenski, Nat. Rev. Genet. {\bf 4}, 457 (2003).
845: %Nature Reviews
846: %authors: Santiago F. Elena and Richard E. Lenski
847: %title:Evolution experiments with microorganisms: the dynamics and genetic bases of
848: %adaptation
849: 
850: %(2)
851: \bibitem{Lenski-Rose}
R. E. Lenski, M. R. Rose, S. C. Simpson, and S. C. Tadler, Am. Nat. {\bf 138}, 1315-1341
(1991).
854: %S. F. Elena and R. E. Lenski, Nat. Rev. Genet. {\bf 4}, 457 (2003).
855: 
856: %(3)
857: \bibitem{Kishony}
858: M. Hegreness, N. Shoresh, D. Hartl, and R. Kishony, Science {\bf 311}, 1615 (2006).
859: %authors: Matthew Hegreness, Noam Shoresh, Daniel Hartl, and Roy Kishony
860: %title:An Equivalence Principle for the Incorporation of Favorable Mutations in Asexual
861: %Populations
862: 
863: %(4)
864: \bibitem{Bactreia-many-generations}
865: A. E. Mayo, Y. Setty, S. Shavit, A. Zaslaver, and U. Alon, PLoS Biol. {\bf 4}, 556-561 (2006).
866: %???
867: 
868: %(5)
869: \bibitem{Dekel-Alon}
870: E. Dekel and U. Alon, Nature {\bf 436}, 588-592 (2005).
871: %title:Optimality and evolutionary tuning of the expression level of a protein.
872: 
873: \bibitem{Kashiwagi-Noumachi} A. Kashiwagi, W. Noumachi, M. Katsuno, M. T. Alam, I. Urabe,
874: and T. Yomo, J. Mol. Evol. {\bf 52}, 502-509 (2001).
875: %* Plasticity of Fitness and Diversification Process During an Experimental Molecular
876: %     Evolution 
877: %*Journal of Molecular Evolution
878: %*Authors
879: %     *Akiko Kashiwagi, Wataru Noumachi, Masato Katsuno, Mohammad T. Alam, Itaru Urabe,
880: %          Tetsuya Yomo
881: 
882: \bibitem{Ito}
Y. Ito, T. Kawama, I. Urabe, and T. Yomo, J. Mol. Evol. {\bf 58(2)}, 196-202 (2004).
884: 
885: \bibitem{footnote}
886: {A well established theory in population genetics,
887: which adopts diffusion equation or Fokker-Planck type equation, is related to the
888: frequency of genes in alleles in population as developed by Wright, Fisher, and
889: Kimura\cite{Fisher,Wright,kimura1970}.  How new genes spread in population is analyzed by
890: the diffusion equation. In contrast, we are concerned with the changes in the distribution
891: of a base sequence consisting of a haploid gene.}
892: 
893: %\bibitem{Kashiwagi}
894: %A. Kashiwagi, I. Urabe, K. Kaneko, and T. Yomo, PLOS ONE.
895: %details are still unknown for now (2006/10/25).
896: %title:Adaptive response of a gene network to environmental changes by attractor selection
897: 
898: \bibitem{Fisher} 
899: 
900: R. A. Fisher, Proc. Roy. Soc. Edinb. {\bf 50}, 205 (1930); {\sl The genetical
901: theory of natural selection} (Oxford University Press, Oxford, 1999).
902: %*article:Proceedings of the Royal Society of Edinburgh
%*This is reportedly the best-written one. 2006/10/13
904: 
905: \bibitem{Wright} S. Wright, Proc. Natl. Acad. Sci. USA {\bf 31}, 382 (1945); {\sl The
906: theory of gene frequencies} (University of Chicago Press, Chicago, 1969).
907: %*full name:Sewall Wright
%*This is reportedly the book that summarizes the subject.
909: 
910: \bibitem{kimura1970}
911: J. F. Crow and M. Kimura, {\sl An introduction to population genetics theory}
912: (Harper \verb|&| Row, New York, 1970).
913: 
914: \bibitem{Kimura2}
915: M. Kimura, Proc.  Natl. Acad.  Sci. USA {\bf 54}, 731-736 (1965).
916: %author: Motoo Kimura
917: 
918: \bibitem{Levine} 
919: L. S. Tsimring, H. Levine, and D.A. Kessler, Phys. Rev. Lett. {\bf 76}, 4440 (1996).
920: %authors:Lev S. Tsimring and Herbert Levine
921: %title:RNA Virus Evolution via a Fitness-Space Model
922: 
923: \bibitem{Eigen} 
924: M. Eigen, J.  McCaskill, P. Schuster, J. Phys. Chem. {\bf 92}, 6881-6891 (1988).
925: %title:Molecular Quasi-Species
926: %authors: Manfred Eigen, John McCaskill, and Peter Schuster
927: 
928: \bibitem{kauffmann-book} 
929: S. Kauffman, { \sl The Origins of Order} (Oxford University Press, New York, 1993).
930: 
931: \bibitem{Koshland}
932: J. Spudich and D. Koshland, Nature {\bf 262}, 467-471 (1976).
933: %title:Non-genetic individuality: chance in the single cell.
934: 
935: \bibitem{Elowitz} M. B. Elowitz, A. J. Levine, E. D. Siggia, and P. S. Swain, Science {\bf
936: 297}, 1183 (2002).
937: %Elowitz M B, Levine A J, Siggia E D and Swain P S
938: %title:Stochastic Gene Expression in a Single Cell
939: 
940: \bibitem{Kaern-Collins} M. Kaern, T. C. Elston, W. J. Blake, and J.J. Collins,
941: Nat. Rev. Genet. {\bf 6}, 451-464 (2005).
942: %*title:Stochasticity in gene expression: from theories to phenotypes.
943: 
944: \bibitem{Collins}
945: J. Hasty, J. Pradines, M. Dolnik, and J. J. Collins, Proc.  Natl. Acad.
946:          Sci. USA {\bf 97}, 2075-2080 (2000).
947: %title:Noise-based switches and amplifiers for gene expression
948: 
949: \bibitem{Furusawa} C. Furusawa, T. Suzuki, A. Kashiwagi, T. Yomo, and K. Kaneko,
950: BIOPHYSICS {\bf 1}, 25 (2005).
951: %title:Ubiquity of log-normal distributions in intra-cellular reaction dynamics
952: 
953: %\bibitem{Eigen-Schuster}
954: %M. Eigen and P. Schuster, Naturwissenschaften {\bf 64}, 541-565 (1977).
955: 
956: \bibitem{Ueda}
957: M. Ueda, Y. Sako, T. Tanaka, P. Devreotes, and T. Yanagida, Science {\bf 294},
958:          864 (2001).
959: %title:Single-Molecule Analysis of Chemotactic Signaling in Dictyostelium Cells
960: 
961: \bibitem{noise-review}
962: A. Bar-Even, J. Paulsson, N. Maheshri, M. Carmi, E. O'Shea, Y. Pilpel, and
963: N. Barkai, Nat. Genet. {\bf 38}, 636-643 (2006).
964: %*title:Noise in protein expression scales with natural protein abundance.
965: 
966: \bibitem{SatoPNAS} 
K. Sato, Y. Ito, T. Yomo, and K. Kaneko, Proc. Natl. Acad. Sci. USA {\bf
100}, 14086 (2003).
969: %title:On the relation between fluctuation and response in biological systems
970: 
\bibitem{kaneko-book} K. Kaneko, {\sl Life: An Introduction to Complex Systems Biology}
(Springer, Berlin, 2006).
973: 
974: \bibitem{KKFurusawaJTB}
975: K. Kaneko and C. Furusawa, J. Theo. Biol. {\bf 240}, 78-86 (2006).
976: %An evolutionary relationship between genetic variation and phenotypic fluctuation
977: 
978: \bibitem{Ancel}
979: L. W. Ancel, Theor. Popul. Biol. {\bf 58}, 307-319 (2000).
980: %*title:Undermining the Baldwin expediting effect: does phenotypic plasticity
981: %accelerate evolution?
982: 
983: \bibitem{vanKampen}
984: N. G. van Kampen, {\sl Stochastic processes in physics and chemistry} (North-Holland,
985:          Amsterdam, 1992).
986: 
987: \bibitem{KuboMatsuoKitahara}
988: R. Kubo, K. Matsuo, and K. Kitahara, J. Stat. Phys. {\bf 9}, 51 (1973).
989: %Journal of Statistical Physics
990: %authors: Ryogo Kubo, Kazuhiro Matsuo, and Kazuo Kitahara
991: %title:Fluctuation and Relaxation of Macrovariables
992: 
993: \bibitem{Baake2} 
994: E. Baake and H. Wagner, Genet. Res. {\bf 78}, 93-117 (2001).
995: %title:Mutation-selection models solved exactly with methods of statistical mechanics
996: %authors: Ellen Baake and Holger Wagner
%the review paper
998: 
999: \bibitem{Waterman}
1000: M. S. Waterman, {\sl Introduction to computational biology : maps, sequences and
1001:          genomes} (Chapman and Hall, London, 1995).
1002: %author: Michael S. Waterman
1003: 
1004: \bibitem{Furusawa-KK}
1005: C. Furusawa, K. Kaneko, Phys. Rev. E {\bf 73}, 011912 (2006).
1006: 
1007: \bibitem{Gillespie}
1008: J. H. Gillespie, Theor. Popul. Biol. {\bf 23}, 202-215 (1983).
1009: %Title:A simple stochastic gene substitution process.
1010: %author:Gillespie, J. H.
1011: 
1012: \bibitem{Orr}
1013: H. A. Orr, Evolution {\bf 56}, 1317-1330 (2002).
1014: %author:H. ALLEN Orr
1015: %title:THE POPULATION GENETICS OF ADAPTATION: THE ADAPTATION OF DNA SEQUENCES
1016: 
1017: \bibitem{Haken}
1018: H. Haken, {\sl Synergetics: an introduction nonequilibrium phase transitions and
1019:          self-organization in physics, chemistry and biology} (Springer-Verlag, Berlin, 1978).
1020: %edition: 2nd edn
1021: 
1022: \bibitem{PhysicalBiology}
1023: K. Sato and K. Kaneko, Phys. Biol. {\bf 3}, 74-82 (2006).
1024: %title:On the distribution of state values of reproducing cells
1025: %authors:Katsuhiko Sato and Kunihiko Kaneko
1026: 
1027: \bibitem{Morse-book}
1028: P. M. Morse and H. Feshbach, {\sl Methods of theoretical physics} (McGraw-Hill, New
1029:          York, 1953), pp. 1092-1106.
1030: %International student edition
1031: %autors:Philip M. Morse and Herman Feshbach
1032: 
1033: \bibitem{Leuthausser} 
1034: I. Leuthausser, J. Stat. Phys. {\bf 48}, 343-360 (1987).
1035: %title:Statistical Mechanics of Eigen's Evolution Model
1036: %author:Ira Leuthausser
%The author who first established the correspondence with the Ising model. 2006/07/13
%This is the one that argues in discrete time.
1039: 
1040: \bibitem{Baake}
1041: E. Baake, M. Baake and H. Wagner, Phys. Rev. Lett. {\bf 78}, 559 (1997).
1042: %title:Ising Quantum Chain is Equivalent to a Model of Biological Evolution
%The paper that established the correspondence with spins.
1044: 
1045: \bibitem{Taiwan}
1046: D. B. Saakian and C. K. Hu, Proc.  Natl. Acad.  Sci. USA {\bf 103}, 4935-4939
1047:          (2006).
1048: %authors:David B. Saakian and Chin-Kun Hun
1049: %title:Exact solution of the Eigen model with general fitness functions and
1050: %         degradation rates
1051: 
1052: \bibitem{Taiwan2} 
1053: D. Saakian and C. K. Hu, Phys. Rev. E {\bf 69}, 021913 (2004).
1054: %authors: David Saakian and Chin-Kun Hu
1055: %title:Eigen model as a quantum spin chain: Exact dynamics
1056: 
1057: \bibitem{Yang-et-al} 
1058: H. Yang, G. Luo, P. Karnchanaphanurach, T. M. Louie, I. Rech, S. Cova, L. Xun,
1059: and X. S. Xie, Science {\bf 302}, 262-266 (2003).
1060: %%*title: Protein conformational dynamics probed by single-molecule electron
1061: %               transfer.
1062: 
1063: \bibitem{ancel-fontana} 
1064: L. W. Ancel and W. Fontana, J. Exp. Zool. {\bf 288}, 242-283 (2000).
1065: %*title:Plasticity, evolvability, and modularity in RNA.
1066: 
1067: \bibitem{Hofbauer} 
1068: J. Hofbauer, J. Math. Biol. {\bf 23}, 41-53 (1985).
1069: 
1070: \bibitem{Akin} 
1071: E. Akin, {\sl The geometry of population genetics} (Springer-Verlag, Berlin, 1979).
1072: %Ethan Akin
1073: 
1074: \bibitem{well-written-NK}
1075: B. Levitan and S. Kauffman, Mol. Divers.  {\bf 1}, 53-68 (1995).
1076: %article: Molecular Diversity, 1 (1995) 53-68
1077: %authors : Bennett Levitan and Stuart Kauffman
1078: %title:Adaptive walks with noisy fitness measurements
1079: 
1080: \end{thebibliography}
1081: 
1082: \end{document}
1083: