q-bio0511039/evol.tex
1: \documentclass[11pt]{amsart}
2: 
3: \usepackage{graphicx}
4: \usepackage{hyperref}
5: \usepackage{url}
6: 
7: \input{pstricks}
8: \input{pst-node}
9: \usepackage{pst-tree}
10: 
11: % trees and tree nodes:
12: \newcommand{\tree}[2]{\pstree[treemode=U,arrows=->,treefit=tight,treesep=0.5cm,levelsep=1cm]{#1}{#2}}
13: \newcommand{\node}[1]{\Tr{\psframebox[linecolor=white,framearc=.5]{#1}}}
14: %\newcommand{\node}[1]{\Toval{#1}}
15: %%\renewcommand{\root}[1]{\Tr{\psframebox[linecolor=white,framearc=.5]{#1}}}
16: \renewcommand{\root}[1]{\Toval{#1}}
17: 
18: % xy pic:
19: %\input xy
20: %\xyoption{all}
21: %\CompileMatrices
22: 
23: \DeclareMathOperator{\rank}{rank}
24: \DeclareMathOperator{\Prob}{Prob}
25: \DeclareMathOperator{\diag}{diag}
26: 
27: % "A independent of B given C"
28: \newcommand{\ind}{\mbox{$\perp \kern-5.5pt \perp$}}
29: \newcommand{\nind}{\mbox{$\not\hspace{-4pt}\ind$}}
30: 
31: \newcommand{\one}{\mathbf 1}
32: \newcommand{\pa}{\mathrm{pa}}  % parent
33: \newcommand{\ch}{\mathrm{ch}}  % child
34: \newcommand{\cT}{\mathcal{T}}  % mutagenetic tree
35: \newcommand{\cM}{\mathcal{M}}  % mixture model
36: \newcommand{\cI}{\mathcal{I}}  % states
37: \newcommand{\cC}{\mathcal{C}}  % compatible states
38: \newcommand{\cS}{\mathcal{S}}  % star
39: \newcommand{\cE}{\mathcal{E}}
40: \newcommand{\cG}{\mathcal{G}}
41: \newcommand{\cB}{\mathcal{B}}
42: \newcommand{\R}{\mathbb{R}}
43: 
44: \newcommand{\RP}{\mathcal R}  % risk polynomial
45: \newcommand{\ba}{\mathbf a}
46: \newcommand{\bU}{\mathbf U}
47: \newcommand{\bI}{\mathbf I}
48: \newcommand{\muta}[3]{\rho_{#1,#2}^{#3}}
49: \newcommand{\thet}[1]{\theta^e_{#1_{\pa(e)}, #1_e}}
50: 
51: \newtheorem{thm}{Theorem}
52: \newtheorem{lemma}[thm]{Lemma}
53: \newtheorem{prop}[thm]{Proposition}
54: \newtheorem{cor}[thm]{Corollary}
55: \newtheorem{prob}[thm]{Problem}
56: \newtheorem{conj}[thm]{Conjecture}
57: \newtheorem{alg}[thm]{Algorithm}
58: 
59: \newtheorem{ex}[thm]{Example}
60: \newtheorem{df}[thm]{Definition}
61: 
62: \title{Evolution on distributive lattices}
63: 
64: \author[Beerenwinkel, Eriksson, and Sturmfels]{
65: Niko Beerenwinkel$^*$ \and Nicholas Eriksson  \and Bernd Sturmfels\\
66: Department of Mathematics\\
67: University of California\\
68: Berkeley, CA 94720, USA\\
69: $\{$niko,eriksson,bernd$\}$@math.berkeley.edu\\
70: $^*$Corresponding Author:\\
71: phone: +1 (510) 642-3529, fax: +1 (510) 642-8204
72: }
73: 
74: 
75: %\date{\today}
76: 
77: \begin{document}
78: 
79: \begin{abstract}
80: We consider the directed evolution of a population after an 
81: intervention that has significantly altered the underlying 
82: fitness landscape. 
83: We model the space of genotypes as a distributive lattice;
84: the fitness landscape is a real-valued function on 
85: that lattice. The risk of escape from intervention, i.e., the 
86: probability that the population
87: develops an escape mutant before extinction, is
88: encoded in the  risk polynomial.
89: Tools from algebraic combinatorics are applied
90: to compute the risk polynomial in terms of
91: the fitness landscape. In an application to 
92:  the development of drug
93: resistance in HIV, we study the
94:  risk of viral escape from
95: treatment with the protease inhibitors ritonavir
96: and indinavir.     
97: \end{abstract}
98: 
99: \maketitle
100: 
101: \begin{quote}
102: \noindent {\bf Keywords:}
103: fitness landscape, distributive lattice, directed evolution, 
104: risk polynomial, chain polynomial,
105: HIV drug resistance, Bayesian network, mutagenetic tree 
106: \end{quote}
107: 
108: 
109: 
110: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
111: \section{Introduction}
112: 
113: The evolutionary fate of a population is determined by the replication
114: dynamics of the ensemble and by the reproductive success of its individuals.
115: We are interested in scenarios where most individuals have a low fitness,
116: eventually leading to extinction, and only a few types of individuals
117: (``escape mutants'')
118: can survive permanently. These situations often arise due to  
119: a significant change of the underlying fitness landscape. 
120: For example, a virus
121: that has been transmitted to a new host is confronted with a new immune
122: response. Likewise, medical interventions such as radiation therapy,
123: vaccination, or chemotherapy result in altered fitness landscapes for the 
124: targeted agents, which may be bacteria, viruses, or cancer cells.
125: 
126: Given a population and such a hostile fitness landscape, the central question
127: is whether the population will survive.
128: In the case of medical interventions we wish to know the probability
129: of successful treatment. Answering this question involves computing
130: the risk of evolutionary escape, i.e., the probability that the
131: population develops an escape mutant before extinction.     
132: We present a mathematical framework for computing such probabilities.
133: 
134: Our primary application is the evolution of drug resistance
135: during treatment of HIV-infected patients \cite{Clavel2004}.
136: We consider therapy with two different protease inhibitors (PIs).
137: These compounds interfere with HIV particle maturation
138: by inhibiting the viral protease enzyme.
139: The effectiveness of PI therapy is limited
140: by the development of drug resistance.
141: Rapid and highly error-prone replication of a large virus
142: population generates mutants that resist the selective pressure of
143: drug therapy. PI resistance is caused by mutations in the protease gene
144: that reduce the binding affinity of the drug to the enzyme.
145: These mutations have been shown to accumulate in a stepwise manner
146: \cite{Berkhout1999}. For most PIs, no single mutation confers
147: a significant level of resistance, but multiple mutations are
148: required for escape from drug pressure.
149: Quantitative predictions of the probability of successful PI treatment
150: would help in finding effective antiretroviral
151: combination therapies. Selecting a drug combination
152: amounts to controlling the viral fitness landscape.
153: 
154: We regard the directed evolution of a population towards an escape state
155: as a fluctuation on a fitness landscape. The space of
156: genotypes is modeled as follows. We start with a 
157: finite partially ordered set (poset) $\cE$ whose elements are called 
158: \emph{events}. The events are non-reversible
159: mutations with some constraints on their order of occurrence.
160: Such constraints are primarily due to
161: epistatic effects between different loci in a genome
162: \cite{Bonhoeffer2000}.
163: The event constraints define the poset structure: 
164: $\,e_1 < e_2 \,$ in $\cE$ means that
165: event $e_1$ must occur before event $e_2$ can occur.
166: Each genotype $g$ is represented by a subset of $\cE$, namely,
167: the set of all events that occurred to create $g$. 
168: Thus a genotype $g$ is an \emph{order ideal} in the  poset $\cE$.
169: The space of genotypes $\cG$ is the set of
170: all order ideals in $\cE$, which is a {\em distributive lattice}
171: \cite[Sec.~3.4]{Stanley1999}.
172: The order relation on $\cG$ is set inclusion and
173: corresponds to the accumulation of mutations. 
174: This mathematical formulation is reasonable in the above situations,
175: where a population is exposed to strong selective pressure.  
176: 
177: \begin{figure}
178: \includegraphics[width=\textwidth]{landscape}
179: \caption{An event poset, its genotype lattice, and a fitness landscape.}
180: \label{fig:ex1}
181: \end{figure}
182: 
183: The risk of escape is governed by the structure of $\cG$,
184: the fitness function on $\cG$, and the population dynamics
185: (such as the mutation rates and population size). Our focus
186: is on the dependency of the risk of escape
187: on the assigned fitness values for each genotype $g \in \cG$.
188: This leads us to the \emph{risk polynomial},
189: which is shown to be equivalent to a well-known object in
190: algebraic combinatorics. Indeed, one of the objectives of this
191: work is to provide a bridge between algebraic combinatorics 
192: and evolutionary biology.
193: 
194: 
195: This paper  is organized as follows. In Section~\ref{sec:fitness}
196: we formalize our
197: model of a static fitness landscape on the genotype lattice $\cG$
198: derived from an event poset $\cE$,
199: and we discuss evolution on the lattice $\cG$. 
200: In Section~\ref{sec:branching} we review the multistate
201: branching process studied by Iwasa, Michor and Nowak 
202: \cite{Iwasa2003,Iwasa2004}.
203: 
204: In Section~\ref{sec:bayes} we study the Bayesian networks 
205: which arise from identifying the events in $\cE$
206: with binary random variables. These
207: statistical models can be used
208: to infer the genotype space from
209: given data. For conjunctive Bayesian networks
210: we recover the distributive lattice of order ideals in $\cE$.
211: Of particular interest is
212: the case where $\cE$ is a directed forest: here the Bayesian network
213: is a mutagenetic tree model \cite{Beerenwinkel2005c,Beerenwinkel2005f}.
214: The application of our methods
215: to the development of PI resistance in HIV 
216: is presented in  Section~\ref{sec:apply}.
217: 
218: The Appendix summarizes various representations
219: of the risk polynomial in terms of structures from
220: algebraic combinatorics. Efficient methods for computing
221: the risk polynomial and their implementation are presented.
222: 
223: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
224: 
225: \section{Fitness landscapes on distributive lattices}   \label{sec:fitness}
226: 
227: A partially ordered set (or poset) is a set $\cE$
228: together with a binary relation, denoted ``$\leq$'', which is
229: reflexive, antisymmetric, and transitive. Here
230: we fix a finite poset $\cE$  whose elements are called \emph{events}.
231: If the number of events is $n$ then we
232: often identify the set underlying $\cE$ with
233: the set $\,[n] = \{1,2,\ldots,n\}$. In this way,
234: the subsets of $\cE$ are encoded by the $2^n$ binary strings of length $n$.
235: The empty subset of $\cE$ is
236: encoded by the all-zero string $\hat{0} = 0 0 \cdots 0$ 
237: which represents the \emph{wild type}, and
238: the full set $\cE$ is
239: encoded by the  all-one string $\hat{1} = 11  \cdots 1$
240: which represents the \emph{escape state}.
241: 
242: An order ideal $g$ in a poset $\cE$ is a subset of $\cE$ 
243: that is closed downward;
244: that is, if $e_2 \in g$ and $e_1 \le e_2$, then $e_1 \in g$. 
245: The set of all order ideals of $\cE$ forms a distributive lattice 
246: $J(\cE)$ under inclusion. Birkhoff's Representation Theorem
247: \cite[Thm.~3.4.1]{Stanley1999}
248: states that every finite distributive lattice has the form 
249: $J(\cE)$ for some finite poset $\cE$.
250: We write $\cG = J(\cE)$, and we
251: call $\cG$ the {\em genotype lattice}.
252: 
253: \begin{ex} \rm
254: Let $\cE$ be the trivial poset, 
255: where no two events are comparable,
256: with $|\cE| = n$.
257: Then $\cG = J(\cE)$ is the Boolean lattice consisting
258: of all subsets of $\cE$ ordered by inclusion.
259: This means that every combination of mutations
260: is possible, and the mutations can occur in any order. Each of
261: the $2^n$ binary strings $g \in \{0,1\}^n$ represents
262: a mutational pattern, or genotype.
263: \end{ex}
264: 
265: In general, the event poset $\cE$ does have non-trivial
266: relations $e_1 < e_2$. The relation $e_1 < e_2$ excludes all
267: genotypes $g$ with $g_{e_1} = 0$ and 
268: $g_{e_2} = 1$ from $\cG$. The remaining genotypes
269: $g$ form a sublattice of the Boolean lattice $\{0,1\}^n$,
270: and this is precisely our distributive lattice
271: $\cG = J(\cE)$. Note that the
272: lattice $\cG$ is ranked, with the rank function given by
273: $\rank(g) = |g|$.
274: 
275: \begin{ex} \rm \label{FourEvents}
276: Consider a scenario with $n=4$ mutation events, labeled $\cE = \{1,2,3,4\}$. 
277: Suppose that event $3$ can only  occur after
278: events $1$ and $2$,
279: and event $4$ can only occur after event $2$.
280: This allows for precisely eight genotypes
281: \[
282: \cG \,\, = \,\, \bigl\{
283: 0000, 1000, 0100, 1100, 0101, 1110, 1101, 1111  \bigr\}.
284: \]
285: The event poset $\cE$ and the genotype lattice $\cG$ are
286: shown in Figure~\ref{fig:ex1}.
287: \end{ex}
288: 
289: A fitness landscape associates to each possible genotype
290: a number which quantifies the reproductive capacity of
291: an individual with that genotype \cite{Reidys2002}. We define a
292: \emph{fitness landscape} on the distributive lattice $\cG$ 
293:  to be any function ${\mathbf f} \colon \cG \to \mathbb{R}$.
294: The value  ${\mathbf f}(g)$ at any $g \in \cG$ 
295: is the  \emph{fitness} of the genotype $g$.
296: Thus, the space of all fitness landscapes is the finite-dimensional
297: vector space $\mathbb{R}^\cG$.
298: 
299: We shall consider certain special models of fitness landscapes,
300: which are represented by linear subspaces of $\mathbb{R}^\cG$.
301: In the following definitions, a genotype $g$ is regarded
302: as a subset of the event poset $\cE$, where $|\cE| = n$.
303: A \emph{constant fitness landscape} has the
304: form ${\mathbf f}(g) \equiv a$ for some constant $a$.
305: Thus the constant landscapes form a
306: line through the origin in $\mathbb{R}^\cG$.
307: A \emph{graded fitness landscape} is a landscape on 
308: $\cG$ whose fitness values depend only on the rank. Equivalently, we have
309: ${\mathbf f}(g) = a_{|g|}$ for
310: constants $a_0,a_1,\ldots,a_n$. Thus, graded fitness landscapes
311: form an $(n+1)$-dimensional linear subspace of $\mathbb{R}^\cG$.
312: 
313: Our biological application in Section~\ref{sec:apply} uses
314: the graded fitness landscape model, which means that the
315: fitness of a virus type depends only on the number of mutations it
316: harbors. We shall  
317: model situations where a virus escapes from a wild
318: type $\hat{0}$ to a drug-resistant type $\hat{1}$.  In this case, we
319: assume a graded fitness landscape that is
320: monotonically increasing with rank, i.e.,
321: \[ 
322:    a_0 \,<\, a_1 \,< \,a_2 \,<\, \cdots \,<\, a_n. 
323: \]
324:   This implies that the fitness landscape ${\mathbf f}$ has a unique
325: local (and global) maximum at the drug-resistant type $\hat{1}$,
326: which is the top element in $\cG$.
327: 
328: We next introduce the mathematical framework
329: for evolution on a fitness landscape. The general
330: setup is as in the work of Reidys and Stadler
331: \cite{Reidys2002}, but this is adapted here to our specific
332: situation, where the genotypes form a 
333:  distributive lattice $\cG$. The order relation on $\cG$,
334:  which comes from inclusion of subsets of $\cE$,  induces a
335: neighborhood structure on $\cG$ where the neighbors
336: of $g \in \cG$ are the genotypes that strictly contain $g$,
337: \begin{equation}   \label{eq:neighborhood}
338:    N(g) \, := \, \bigl\{ h \in \cG \,\mid \, g \subset h \bigr\}.
339: \end{equation}
340: Unlike the typical situation considered in \cite{Reidys2002},
341: this notion of neighborhood is not symmetric. To be precise,
342: we have that $h \in N(g)$ implies $g \not\in N(h)$.
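
As a small illustration of (\ref{eq:neighborhood}), worked out here from
the eight genotypes listed in Example~\ref{FourEvents}, we have
\[
   N(1100) \, = \, \{1110, 1101, 1111\}
   \qquad \hbox{and} \qquad
   N(0101) \, = \, \{1101, 1111\}.
\]
The genotype $1110$ is not a neighbor of $0101$ because it does not
contain event $4$.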
343: 
344: This neighborhood structure implies that mutational 
345: changes are possible only upward in the genotype lattice. 
346: This structure models a directed evolutionary 
347: process from the wild type $\hat{0}$ towards the escape state
348: $\hat{1}$. Typically, our configuration space $\cG$ is a small subset
349: of the Boolean lattice $\{0,1\}^n$ of all binary strings.
350: Indeed, in the course of viral evolution,
351:  a population will visit only a small fraction of $\{0,1\}^n$,
352:  as most mutants are not viable.
353: 
354: 
355: Suppose that the number of genotypes in $\cG$ is $m$.
356:  We wish to define dynamics between the states of $\cG$.
357:  To this end, we fix a linear extension of $\cG$, and we
358:   introduce an
359:  $m \times m$ matrix of transition rates, written
360:  ${\bf U} = (u_{gh})$, whose rows and columns
361:  are indexed by genotypes $g,h \in \cG$.
362: Each entry $u_{gh}$ of the matrix ${\bf U}$ is a non-negative 
363: real number which is zero unless $h \in N(g)$.
364: In the framework of algebraic combinatorics, it
365: is convenient to think of the matrix ${\bf U}$ as an element in the
366: incidence algebra of $\cG$;
367: see \cite[Sec.~3.6]{Stanley1999}.
368: 
369: We further assume that the non-zero mutation rates
370: $u_{gh}$ depend only on the events in $h \backslash g$.
371: Equivalently, the rate at which a collection of mutation events
372: occurs is independent of which other mutations have
373: already occurred. With this assumption, there are only $n$ free
374: parameters $\mu_1,\ldots,\mu_n$ in the matrix ${\bf U}$,
375: where $\mu_e$ is the mutation rate of event $e$.
376: Then 
377: \begin{equation} \label{eq:muta}
378: u_{gh} \,\,=\,\, \begin{cases}
379:    \,\,\,  \prod_{e \in h\backslash g} \mu_e & \text{if $g \subset h$}\\
380:    \,\,\,  0                               & \text{otherwise}.
381:    \end{cases} 
382: \end{equation} 
383: In particular, if all rates are the same, say $\mu = \mu_1 = \dots = \mu_n$, then
384: the entries of $\bU$ are $\, u_{gh} \,=\,  \mu^{|h \backslash g|}\,$
385: if $g \subset h$ and $\,u_{gh} = 0\,$ otherwise.
386: 
387: \begin{ex} \rm \label{FourEvents2}
388: For the genotype lattice $\cG$ in
389: Figure~\ref{fig:ex1}, the matrix $\bU$ equals
390: \[
391: \bordermatrix{ & \! 0000 \! &
392: \! 1000 \! & \! 0100 \! & \! 1100 \! & \! 0101 \! & 1110 & 1101 & 1111 \cr
393: 0000 &           0  & \mu_1 & \mu_2 & \mu_1 \mu_2 & \mu_2 \mu_4 & 
394: \! \mu_1 \mu_2 \mu_3 \! & \! \mu_1 \mu_2 \mu_4 \!&
395: \! \mu_1 \mu_2 \mu_3 \mu_4 \! \cr
396: 1000 & 0 & 0 & 0 &\mu_2 & 0 & \mu_2 \mu_3 & \mu_2 \mu_4 & \mu_2 \mu_3 \mu_4 \cr
397: 0100 & 0 & 0 & 0 &\mu_1 & \mu_4 & \mu_1 \mu_3 & \mu_1 \mu_4 & \mu_1 \mu_3 \mu_4
398:  \cr
399: 1100 &        0    &  0   &  0   &  0   &  0   & \mu_3  & \mu_4 & \mu_3 \mu_4  
400: \cr
401: 0101 &        0    &  0   &  0   &  0   &  0   &  0  &  \mu_1 & \mu_1 \mu_3 \cr
402: 1110 &        0    &  0   &  0   &  0   &  0   &  0  &  0   &  \mu_4   \cr
403: 1101 &        0    &  0   &  0   &  0   &  0   &  0  &  0   &  \mu_3   \cr
404: 1111 &        0    &  0   &  0   &  0   &  0   &  0  &  0   &  0   \cr}
405: \]
406: Note that the entry in row $g$ and column $h$ of
407: any power $\bU^k$ equals $u_{gh}$ times the number
408: of paths of length $k$ from $g$ to $h$ in $\cG$. In particular,
409: $\,\bU^5 = 0 $.
410: \end{ex}
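
The path-counting statement can be checked by hand on individual entries.
The following sketch uses only the entries of $\bU$ displayed above.
The genotypes strictly between $0000$ and $1100$ are $1000$ and $0100$, so
\[
(\bU^2)_{0000,1100} \,\, = \,\,
u_{0000,1000} \, u_{1000,1100} \, + \, u_{0000,0100} \, u_{0100,1100}
\,\, = \,\, \mu_1 \mu_2 + \mu_2 \mu_1 \,\, = \,\, 2 \, u_{0000,1100},
\]
in agreement with the two paths of length two from $0000$ to $1100$.
Similarly, there are five paths of length four from $0000$ to $1111$,
so the only non-zero entry of $\bU^4$ is
$\,(\bU^4)_{0000,1111} = 5 \, \mu_1 \mu_2 \mu_3 \mu_4$.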
411: 
412: Let ${\mathbf f}$ be a fitness landscape on $\cG$ and  $\,{\mathbf F} \,=\, 
413: \diag\bigl({\bf f}(g) \mid g \in \cG \bigr)\,$  the $m \times m$ diagonal 
414: matrix whose entries are the fitness values.
415: The entry of the matrix product ${\bU} {\mathbf F}$ in row $g$ and column $h$
416: represents the  probability of genotype $g$ transitioning
417: into genotype $h$ in one step.
418: A precise probabilistic derivation and interpretation 
419: will be given in the next section.
420: 
421: We are interested
422: in \emph{all} mutational pathways that lead from the wild type
423: $\hat{0}$ to the escape state $\hat{1}$.
424: Towards this end, note that the entry $(g,h)$ of the matrix
425: $({\bU}{\mathbf F})^k$ represents the probability of
426: genotype $g$ evolving to genotype $h$
427: along any mutational pathway (chain) of length $k$ in 
428: the genotype lattice $\cG$.
429: The chains from $\hat{0}$ to $\hat{1}$ in $\cG$ 
430: are accounted for by the 
431: upper right hand entry of $({\bU}{\mathbf F})^k$.
432: Note that the matrix $\,({\bU}{\mathbf F})^k\,$ is zero for $k > n$.
433: 
434: To account for chains of arbitrary length, we consider the matrix
435: \begin{equation}
436: \label{GeometricSeries}
437: (\bI - {\bU} {\mathbf F})^{-1} - \bI \,\,\, = \,\,\,
438:   {\bU}{\mathbf F}
439:  +  ({\bU}{\mathbf F})^2
440:  +  ({\bU}{\mathbf F})^3
441:  + \cdots  +   ({\bU}{\mathbf F})^n,
442: \end{equation}
443: where $\bI$ is the $m \times m$ identity matrix.
444: We summarize our discussion in the following proposition,
445: which is proved by elementary matrix algebra.
446: 
447: \begin{prop}
448: \label{ZeroUnless}
449: The entry of the matrix (\ref{GeometricSeries})
450: in row $g$ and column $h$ is zero unless
451: $g \subset h$, in which case it is
452: $\,u_{gh} \cdot {\mathbf f}(h) \cdot P_{gh}({\mathbf f}) \,$
453: where $P_{gh}$ is a polynomial function of degree
454: $|h \backslash g|-1$ on the space
455: of all fitness landscapes $\,\mathbb{R}^\cG $.
456: \end{prop}
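
A brief sketch of the computation behind this statement, using only
the conventions introduced above, runs as follows.
Since $({\bU}{\mathbf F})^{n+1} = 0$, we have
\[
(\bI - {\bU}{\mathbf F}) \cdot
\bigl( \bI + {\bU}{\mathbf F} + \cdots + ({\bU}{\mathbf F})^n \bigr)
\,\, = \,\, \bI - ({\bU}{\mathbf F})^{n+1} \,\, = \,\, \bI ,
\]
which gives the identity (\ref{GeometricSeries}).
The entry of ${\bU}{\mathbf F}$ in row $g$ and column $h$ is
$u_{gh} \cdot {\mathbf f}(h)$. Hence a chain
$\, g = g_0 \subset g_1 \subset \cdots \subset g_k = h \,$
contributes to the entry $(g,h)$ of $({\bU}{\mathbf F})^k$ the product
\[
\prod_{i=1}^{k} u_{g_{i-1} g_i} \, {\mathbf f}(g_i)
\,\, = \,\,
u_{gh} \cdot {\mathbf f}(h) \cdot
{\mathbf f}(g_1) \, {\mathbf f}(g_2) \cdots {\mathbf f}(g_{k-1}).
\]
Here we use that the sets $g_i \backslash g_{i-1}$ partition
$h \backslash g$, so that the product of the $u_{g_{i-1} g_i}$
equals $u_{gh}$ by (\ref{eq:muta}). Since $k \leq |h \backslash g|$,
each chain contributes a monomial of degree at most $|h \backslash g| - 1$.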
457: 
458: The polynomial  $\,P_{gh}({\mathbf f}) \,$  is the
459: generating function for all chains from $g$ to $h$ in $\cG$.
460: This will be made precise in the following corollary.
461: We shall restrict ourselves to the most important case
462: when  $ g = \hat{0}$ is the wild type
463: and $h = \hat{1}$ is the escape state.
464: Studying $\,P_{\hat{0} \hat{1}}({\mathbf f})\,$ only
465: is no loss of generality because any
466: interval of a distributive lattice
467: is again a distributive lattice.
468: 
469: Proposition~\ref{ZeroUnless} tells us
470: that $\,P_{\hat{0} \hat{1}}({\mathbf f})  \,$
471: is a polynomial of   degree $n-1$
472: in the unknown fitness values ${\bf f}(g)$,
473: which are also written as $f_g$, where $g \in \cG$.
474: 
475: 
476: \begin{cor} \label{AllThoseChains}
477: The polynomial $\,P_{\hat{0} \hat{1}}({\mathbf f})  \,$
478: in the upper-right entry of
479: (\ref{GeometricSeries}) equals
480: \begin{equation}
481: \label{RISK}
482: P_{\hat{0} \hat{1}}({\mathbf f}) 
483: \quad  = \sum_{\hat{0}=g_0 \subset g_1 \subset \dots \subset g_k = \hat{1}} 
484: \!\!\!\!\!\!  f_{g_1} f_{g_2} \cdots f_{g_{k-1}},
485: \end{equation}
486: where the sum runs over all chains
487: from $\hat{0}$ to $\hat{1}$ in 
488: the genotype lattice $\cG$.
489: \end{cor}
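
As a sanity check on the smallest non-trivial case (an illustration
added here, not one of the running examples), let $\cE = \{1,2\}$ be
the antichain on two events, so that
$\cG = \{00, 10, 01, 11\}$ with $\hat{0} = 00$ and $\hat{1} = 11$.
The upper-right entry of $\,{\bU}{\mathbf F} + ({\bU}{\mathbf F})^2\,$ is
\[
\mu_1 \mu_2 f_{11}
\, + \, (\mu_1 f_{10}) (\mu_2 f_{11})
\, + \, (\mu_2 f_{01}) (\mu_1 f_{11})
\,\, = \,\,
u_{\hat{0} \hat{1}} \cdot f_{\hat{1}} \cdot
\bigl( 1 + f_{10} + f_{01} \bigr).
\]
Thus $\,P_{\hat{0} \hat{1}}({\mathbf f}) = 1 + f_{10} + f_{01}$,
a polynomial of degree $n - 1 = 1$ whose three terms correspond to
the three chains $\,00 \subset 11$, $\,00 \subset 10 \subset 11\,$ and
$\,00 \subset 01 \subset 11\,$ in (\ref{RISK}).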
490: 
491: 
492: 
493: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
494: 
495: \section{The risk of escape}   \label{sec:branching}
496: 
497: For a poset of events $\cE$ and 
498: the corresponding distributive lattice $\cG = J(\cE)$, 
499: the \emph{risk polynomial} of $\cG$ is defined as the
500: polynomial (\ref{RISK}), which we denote by $\,\RP(\cG;{\mathbf f})$.
501: The risk polynomial was introduced
502:   in \cite{Iwasa2003,Iwasa2004}.
503: In this section we review the evolutionary dynamics model
504: proposed in these papers, and we
505: discuss the probabilistic meaning
506: of the risk polynomial. 
507: 
508: \begin{ex} \label{ex:rp} \rm
509: Let $\cG$ be the genotype lattice in Figure~\ref{fig:ex1}.
510: Then the risk polynomial
511: $\,\RP(\cG;{\mathbf f})\,$ is the following 
512: polynomial of degree three in six unknowns:
513: \begin{eqnarray*}
514: & 1 + f_{1000} + f_{0100} + f_{1100} + f_{0101} + f_{1110} + f_{1101} \\
515: &  + f_{1000} f_{1100} + f_{0100} f_{1100} + 
516:     f_{0100} f_{0101}  + f_{1000} f_{1110} + f_{0100} f_{1110}
517:  \\ &   + f_{1000} f_{1101} + f_{0100} f_{1101} +  f_{1100} f_{1110}
518:   + f_{1100} f_{1101} + f_{0101} f_{1101} \\
519: &   + f_{1000} f_{1100}  f_{1110} + f_{0100} f_{1100} f_{1110} 
520:  + f_{1000} f_{1100} f_{1101}  \\ &
521:      +  f_{0100} f_{1100} f_{1101} + f_{0100} f_{0101} f_{1101}.
522: \end{eqnarray*}
523: \end{ex}
524: 
525: If we restrict the fitness landscape
526:  ${\mathbf f}$ to lie in a linear subspace of $\mathbb{R}^\cG$,
527: then $\,\RP(\cG;{\mathbf f})$ specializes to a polynomial in fewer unknowns.
528: For example, the risk polynomial for graded fitness landscapes 
529: is obtained from the specialization
530: ${\mathbf f}(g) = a_{|g|}$. That risk polynomial has
531: degree $n-1$ and is denoted by $\RP(\cG; a_1,\ldots,a_{n-1})$.
532: For instance, $\,\RP(\cG;{\mathbf f})\,$ in
533: Example \ref{ex:rp} specializes to
534: \[
535:    \RP(\cG; a_1,a_2,a_3) =
536:        1 + 2 a_1 + 2 a_2 + 2 a_3
537:      + 3 a_1 a_2 + 4 a_1 a_3 
538:      + 3 a_2 a_3 + 5 a_1 a_2 a_3.
539: \]
540: For constant fitness landscapes 
541: $\, {\mathbf f} \equiv a \,$, the risk polynomial is a polynomial in one unknown $\,a$.
542: It is denoted $\RP(\cG; a)$. In our running example,
543: \[ 
544:    \RP(\cG; a) \, = \, 1 + 6 a + 10 a^2 + 5 a^3. 
545: \]
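
These specializations can be cross-checked against each other.
Setting $a_1 = a_2 = a_3 = a$ in the graded risk polynomial gives
\[
   1 + (2+2+2) \, a + (3+4+3) \, a^2 + 5 \, a^3
   \,\, = \,\, 1 + 6 a + 10 a^2 + 5 a^3 ,
\]
which is the univariate polynomial above. Moreover,
$\,\RP(\cG; 1) = 1 + 6 + 10 + 5 = 22\,$ is the number of
terms listed in Example \ref{ex:rp}, that is, the number of chains
from $\hat{0}$ to $\hat{1}$ in $\cG$.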
546: 
547: We now make precise the notion of {\em risk of escape}, which will
548: justify our definition of the risk polynomial. 
549: Our derivation is based on the model 
550: for the dynamics of a replicating population
551: on a fitness landscape studied by
552: Iwasa, Michor and Nowak  \cite{Iwasa2003,Iwasa2004}.
553: See also the work of Wilke \cite{Wilke2003}
554: and the references given therein for approaches
555: to computing fixation probabilities.
556: 
557: 
558: A {\em multistate branching process} \cite{Athreya1972} consists
559: of a set of genotypes along with a fitness landscape and mutation
560: rates between genotypes.  We assume a discrete time process, where
561: in one generation an individual with genotype $g$
562: has a random number of offspring following a Poisson distribution
563: with mean $R_g$.  Some of these offspring may be mutants according to
564: the mutation rates $u_{gh}$.
565: The parameter $R_g$ is the {\em basic
566: reproductive ratio} \cite[Chap.~3]{Nowak2000}.
567: 
568: We assume there is no interaction between individuals; each reproduces
569: at a rate independent of the distribution of the population.
570: Let $\muta{g}{h}{k}$ be the probability
571: that one individual of genotype $g$ has $k$ children of type $h$.  Then, \begin{equation}\label{eq:1}
572: \muta{g}{h}{k}  \, = \, 
573: \frac {(u_{gh}R_g)^k \cdot e^{-u_{gh}R_g}} {k!}.
574: \end{equation}
575: The {\em reproductive fitness} $f_g$ is related to 
576: the reproductive ratio $R_g$ by
577: \begin{equation}
578: \label{Randf}
579:  f_g \, = \,  \frac {R_g} {1-R_g}
580: \qquad \hbox{and} \qquad
581: R_g \, = \, \frac{f_g}{1+f_g}. 
582: \end{equation}
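
As a quick numerical illustration of (\ref{Randf}), with values chosen
here only for the sake of example, $R_g = 1/2$ gives $f_g = 1$, and
$R_g = 1/10$ gives $f_g = 1/9$. For $0 \leq R_g < 1$ we have the
geometric series
\[
  f_g \,\, = \,\, \frac{R_g}{1 - R_g} \,\, = \,\,
  R_g + R_g^2 + R_g^3 + \cdots ,
\]
so that $f_g \approx R_g$ when $R_g$ is small. The two formulas in
(\ref{Randf}) are inverse to each other, since
\[
  \frac{f_g}{1 + f_g} \,\, = \,\,
  \frac{R_g / (1 - R_g)}{1 / (1 - R_g)} \,\, = \,\, R_g .
\]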
583: 
584: Let $\xi_g$ be the probability of escape  starting with one individual of
585: genotype $g$, so $1 - \xi_g$ is the probability of extinction. 
586: In particular, $\xi_{\hat{1}}$ is the probability that one resistant
587: virus will not become extinct.
588: Each of these probabilities is a function
589: of the mutation rates $u_{gh}$ and the reproductive ratios $R_g$.
590: We assume that the $u_{gh}$ are as in 
591: (\ref{eq:muta}), but with $u_{gg} = 1$.
592: Thus, each escape probability $\xi_g$  can be expressed
593: as a function of the $\mu_e$
594: for $e \in \cE$ and  (using the relation (\ref{Randf}))
595:  the fitness values $f_g$ for $g \in \cG$.
596: 
597: \begin{thm} \label{thm:1}
598: If $\xi_g \ll 1$ for $g \neq \hat{1}$, then
599: the probability of escape on
600: the fitness landscape $\mathbf{f} \in \mathbb{R}^{\cG}$ starting with one
601: individual of wild type $\hat{0}$ satisfies
602: \begin{equation} \label{eq:2}
603:   \xi_{\hat{0}} \quad  \approx  \quad \xi_{\hat{1}} \cdot f_{\hat{0}} \cdot
604:     \prod_{e \in \cE} {\mu_e} \cdot \RP(\cG;\mathbf{f}).
605: \end{equation}
606: \end{thm}
607: 
608: \begin{proof}
609: The probability of extinction
610: satisfies the recursive formula
611: \begin{equation}
612: \label{michorproof}
613:    1-\xi_g \quad = \quad  \prod_{h \supseteq g} \sum_{k=0}^{\infty}
614: (1-\xi_h)^k \cdot 
615:        \muta{g}{h}{k} .
616:   \end{equation}
617: Using (\ref{eq:1}), the right hand side
618: of (\ref{michorproof}) can be rewritten as follows:
619: \begin{equation*}
620: \label{michorproof2}
621:       \prod_{h \supseteq g} 
622:        {\rm exp}({(1-\xi_h)u_{gh}R_g} ) \cdot {\rm exp} ({-u_{gh}R_g}) 
623:      \quad = \quad \exp\left(\sum_{h \supseteq g} -\xi_h u_{gh} R_g\right). 
624: \end{equation*}
625: We conclude that
626: \[
627:    \log(1-\xi_g) \quad = \quad - \sum_{h\supseteq g} \xi_h u_{gh} R_g \quad
628: \qquad \hbox{for all} \,\, g \in \cG.
629: \]
630: Under the assumption that $\xi_g \ll 1$ for $g \neq \hat{1}$, we can 
631: linearize the logarithms using
632: the relation $\,\log(1-\xi_g) \approx -\xi_g$. This implies,
633: for $\, g \in \cG \backslash \{\hat{1}\}$, after moving the term $h = g$ to the left-hand side and using $u_{gg} = 1$ and (\ref{Randf}),
634: \begin{eqnarray*}
635:    \xi_g \quad  \approx  & R_g \cdot \sum_{h \supseteq g} \xi_h u_{gh}, \qquad \hbox{and hence} \\
636:    \xi_g \quad  \approx  & \frac {R_g} {1-R_g u_{gg}} \cdot \sum_{h \supset g} \xi_h u_{gh} \\
637:      = & f_g \cdot \sum_{h \supset g} \xi_h u_{gh}.
638: \end{eqnarray*}
639: 
640: The theorem now
641: follows by setting $g = \hat{0}$ and expanding the last equation recursively.
642: Here we are using the fact  from (\ref{eq:muta}) that the
643: product of the $u_{gh}$ over any
644: chain from $\hat{0}$ to $\hat{1}$ in $\cG$ 
645: equals $\,\prod_{e \in \cE} \mu_e$.
646: \end{proof}
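
To illustrate the recursive expansion in the last step of the proof,
consider the smallest non-trivial case (an illustration added here):
let $\cE = \{1,2\}$ be the antichain on two events, so that
$\cG = \{00,10,01,11\}$. The linearized recursion reads
\[
\xi_{10} \approx f_{10} \, \mu_2 \, \xi_{11}, \qquad
\xi_{01} \approx f_{01} \, \mu_1 \, \xi_{11}, \qquad
\xi_{00} \approx f_{00} \bigl( \mu_1 \xi_{10} + \mu_2 \xi_{01}
                    + \mu_1 \mu_2 \, \xi_{11} \bigr).
\]
Substituting the first two relations into the third gives
\[
\xi_{00} \,\, \approx \,\,
\xi_{11} \cdot f_{00} \cdot \mu_1 \mu_2 \cdot
\bigl( 1 + f_{10} + f_{01} \bigr),
\]
which is (\ref{eq:2}) with
$\,\RP(\cG;\mathbf{f}) = 1 + f_{10} + f_{01}$.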
647: 
648: The typical situation of interest is a fitness landscape for which
649: only the escape state has a basic reproductive ratio greater than one,
650: i.e.,
651: \[
652:    R_{\hat{1}} > 1 \qquad \mbox{and} \qquad 
653:    R_g < 1 \quad \mbox{for all} \quad g \not= \hat{1}.
654: \]
655: When the positive numbers $R_g$ are very small for 
656:  $g  \in \cG \backslash \{\hat{1}\}$, then the approximation 
657: (\ref{eq:2}) is valid, and
658: it shows the crucial role that the risk polynomial
659: $\RP(\cG;\mathbf{f})$ plays in 
660: assessing the risk of escape from the wild type $\hat{0}$
661: to the escape state $\hat{1}$.
662: The theorem implies that the risk of escape 
663: of a population of $N$ wild type viruses   
664: is $1-(1-\xi_{\hat{0}})^N$, because all $N$ independent lineages become extinct with probability $(1-\xi_{\hat{0}})^N$. In Section~\ref{sec:discussion} we
665: discuss the situation in which the population is not homogeneous
666: at the time of intervention.
667: 
668: \smallskip
669: 
670: The risk of escape is an important quantity in analyzing the
671: invasiveness of pathogens and in assessing
672: the success probability of medical interventions such as
673: chemotherapy. However, putting this concept into practice
674: depends on our ability to actually compute the risk polynomial. 
675: It turns out that methods from algebraic combinatorics lead
676: to efficient algorithms for this task.
677: In the Appendix, several methods are presented in detail.
678: 
679: \begin{figure}
680: \centering
681: \includegraphics[width=.6\textwidth]{fence}
682: %\begin{verbatim}
683: %                      7 8 9 101112
684: %                      |/|/|/|/|/|
685: %                      1 2 3 4 5 6
686: %\end{verbatim}
687: \caption{Example of an event poset whose general risk polynomial 
688: is of degree 11 in 375 unknowns.}
689: \label{fig:poset}
690: \end{figure}
691: 
692: Our method of choice from a practical perspective
693: relies on computing linear extensions 
694: of the event poset $\cE$ (Theorem~\ref{thm:linearExtensions}, Appendix). 
695: Our software implementation is available at
696: \url{http://bio.math.berkeley.edu/riskpoly/}.
697: For an example of the efficiency of the software, 
698: let $\cE$ be the poset in Figure~\ref{fig:poset} 
699: on $n=12$ events with cover relations
700: $i < 6 + i$ for $1 \leq i \leq 6$ and $i < 7 + i$ for $1 \leq i \leq 5$.
701: Here the genotype lattice $\cG$ consists of $377$ genotypes, of which $375$ lie strictly between $\hat{0}$ and $\hat{1}$.
702: The risk polynomial $\RP(\cG; {\mathbf f})$ is a polynomial
703: of degree 11 in 375 unknowns $f_g$. 
704: This polynomial has 224,750,298 monomials in the 375
705: unknowns, but we represent it as a sum of
706: 2,702,765 products, one for each 
707: linear extension of the event poset $\cE$.
708: Our software takes about ten seconds to compute
709: this representation of $\RP(\cG; {\mathbf f})$.
710: The result takes up 200MB of disk space.
711: 
712:  The univariate risk polynomial for this example is
713: \begin{multline*}
714: 1 + 375a + 19088a^2 + 324498 a^3 + 2610169 a^4 + 11729394 a^5 +
715: 32080336 a^6 +\\ 55597909 a^7 + 61448965 a^8 + 42020208 a^9 + 16216590
716: a^{10} + 2702765a^{11}.
717: \end{multline*}
718: Thus, exact symbolic computations, as opposed to numerical approximations,
719: may be necessary and feasible when one is interested in
720: assessing the risk of escape in applications like
721: the one described in Section \ref{sec:apply} below.
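
As a consistency check on the univariate polynomial displayed above, its coefficients
sum to the number of monomials of the multivariate risk polynomial quoted earlier:
\begin{multline*}
1 + 375 + 19088 + 324498 + 2610169 + 11729394 + 32080336 \\
+ 55597909 + 61448965 + 42020208 + 16216590 + 2702765 \,=\, 224750298 .
\end{multline*}
The same bookkeeping can be carried out by hand on a toy instance that is not part of
the computation reported above: the analogous fence on four events with cover relations
$1 < 3$, $2 < 4$ and $1 < 4$. Assuming, as the numbers above suggest, that the
coefficient of $a^k$ counts the chains of $k$ genotypes strictly between $\hat{0}$ and
$\hat{1}$, the six interior genotypes
$\{1\}, \{2\}, \{1,2\}, \{1,3\}, \{1,2,3\}, \{1,2,4\}$
form ten comparable pairs and five maximal chains, which gives
\[
1 + 6a + 10a^2 + 5a^3 .
\]
The leading coefficient $5$ equals the number of linear extensions
$1234$, $1243$, $1324$, $2134$, $2143$ of this small poset.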
722: 
723: 
724: 
725: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
726: 
727: \section{Distributive lattices from Bayesian networks}   \label{sec:bayes}
728: 
729: In this section, we present a family of statistical models that naturally
730: gives rise to distributive lattices.  This statistical interpretation
731: provides a method for deriving the genotype lattice $\cG$ directly from data.
732: The basic idea is to estimate the poset structure on $\cE$ from
733: observed genotypes, by applying model selection techniques
734: to a range of Bayesian networks, and to
735:  define $\cG$ as the set of all genotypes with
736: non-zero probability in the model. 
737: 
738: 
739: We first make precise the derivation of a genotype space
740: from a statistical model.
741: Let $\cE$ be an unordered set of $n$ genetic events. 
742: The events are labeled by $1,2,\ldots,n$. Subsets
743: of $\cE$ are identified with binary strings $g \in \{0,1\}^n$.
744: They are the possible genotypes.
745: We consider binary random variables $X_{\cE} = (X_1, \dots, X_n)$,
746: where $X_e = 1$ indicates the occurrence of event $e$. 
747: Let $\Delta$ denote the $(2^n-1)$-dimensional simplex
748: of probability distributions on $ \{0,1\}^n$. A {\em statistical
749: model} for $X_{\cE}$ is a map $\,p  \colon \Theta \to \Delta$,
750: where $\Theta$ is some parameter space.
751: The $g$-th coordinate of $p$, denoted
752: $p_g$,  is the
753: probability of genotype $g \in \{0,1\}^n$ under the model $p$.
754: The \emph{induced genotype space} of the model
755: $\,p \colon \Theta \to \Delta\,$ is the set
756: $\cG_p $ of all strings $\, g \in \{0,1\}^n \,$ such that
757: $p_g$ is not the zero function on $\Theta$.
758: We regard $\cG_p$ as a poset ordered by inclusion.
759: 
760: Now consider a directed acyclic graph on the set of events $\cE$. 
761: We will also call this graph $\cE$.
762: The {\em Bayesian network model}, or directed acyclic graphical model,
763: defined by $\cE$ is the family of joint distributions
764: that factor as
765: \begin{equation*}   \label{eqn:bayesnet}
766:    \Pr(X_1, \dots, X_n) 
767:      \quad = \quad \prod_{e \in \cE} \Pr(X_e \mid X_{\pa(e)}),
768: \end{equation*}
769: where $\pa(e)$ denotes the set of parents of $e$ in $\cE$.
Equivalently, a Bayesian network is specified by a set of conditional independence
statements: each node is independent of its non-descendants given its parents.
772: See \cite{Lauritzen1996} for an introduction  to the relevant
773: statistical theory and \cite{Garcia2005} for an algebraic perspective.
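
For illustration, consider the chain $1 \to 2 \to 3$, a toy graph chosen here only
to make the definition concrete. The factorization reads
\[
   \Pr(X_1, X_2, X_3) \,\,=\,\, \Pr(X_1) \, \Pr(X_2 \mid X_1) \, \Pr(X_3 \mid X_2),
\]
and the only non-trivial conditional independence statement is
$X_3 \,\ind\, X_1 \mid X_2$.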
774: 
775: The parameters for a Bayesian network are specified by
776: providing, for each event $e \in \cE$,
777: a $2^{|\pa(e)|} \times 2$
778: matrix $\theta^e$. The matrix entries are
779: \[   \theta^e_{g_{\pa(e)},g_e} 
780:      \quad = \quad \Pr\left( X_e = g_e \mid X_{\pa(e)} 
781:      = g_{\pa(e)} \right),  
782: \]
783: for  $ \, g_{\pa(e)} \in \{0,1\}^{\pa(e)},
784: \, g_e \in \{0,1\}$.
785: These conditional probabilities satisfy
786: \begin{equation}
787: \label{SumToOne}
788: \theta^e_{g_{\pa(e)},0} \geq 0\,,\,\,\,
789: \theta^e_{g_{\pa(e)},1} \geq 0 \,\, \quad \hbox{and} \,\,\quad
790:   \theta^e_{g_{\pa(e)},0}\, +\, \theta^e_{g_{\pa(e)},1} \,\,= \,\,1 . 
791: \end{equation}
792:     
793: Set $d = \sum_{e \in \cE} 2^{|\pa(e)|}$ and $\Theta = [0,1]^d$.
794: The points in the cube $\Theta$ are identified with $n$-tuples
795: of matrices $\,\theta = (\theta^e \,|\, e \in \cE)\,$ as above.
796: The {\em general Bayesian network} is the polynomial map
797: $\, p  \colon  \Theta \, \rightarrow \,\Delta \,$
798:  whose coordinates are
799:   \begin{equation}   \label{eqn:bayesfactor}
800:    p_g(\theta) \,\,\,= \,\,\, \prod_{e \in \cE} \theta^e_{g_{\pa(e)}, g_e}.
801: \end{equation}
802: The general Bayesian network on $\cE$ induces the
803: genotype space $\cG_p = \{0,1\}^n$, the Boolean lattice on $\cE$.
804: Indeed, the factorization~(\ref{eqn:bayesfactor}) implies
805: that no genotype $g \in \{0,1\}^n$ has probability zero for all
806: parameter values. 
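
As a small illustration on a toy graph (the chain $1 \to 2 \to 3$, not used elsewhere
in this paper), the parameter count is $d = 2^0 + 2^1 + 2^1 = 5$, so $\Theta = [0,1]^5$.
Writing $\theta^1_{g_1}$ for the entries of the single row of $\theta^1$, a shorthand
adopted only here, the coordinate indexed by $g = 101$ is
\[
   p_{101}(\theta) \,\,=\,\, \theta^1_{1} \, \theta^2_{1,0} \, \theta^3_{0,1} ,
\]
which is positive whenever all entries of $\theta$ lie strictly between $0$ and $1$.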
807: 
808: To obtain other genotype spaces, we replace the
809: cube $\Theta = [0,1]^d$ by one of its faces, as follows.
810: For each event $e \in \cE$ consider a Boolean function
811: $\,\beta_e \colon \{0,1\}^{\pa(e)} \rightarrow \{0,1\}$.
If $\beta_e(g_{\pa(e)}) = 0$ then
the row of the $2^{|\pa(e)|} \times 2$-matrix $\theta^e$
indexed by $g_{\pa(e)}$ is fixed
815: to be the vector $(1,0)$;
816: otherwise that row remains indeterminate
817: subject to the constraints (\ref{SumToOne}).
818: Let $\Theta^\beta$ denote the face of $\Theta$
819: determined by these requirements
820: and  $\,p^\beta \colon \Theta^\beta \,\rightarrow \,\Delta\,$
821:  the restriction of the polynomial map $p$ to $\Theta^\beta$.
822: The resulting model is the Bayesian network on $\cE$ constrained by the
Boolean functions $\beta_e$.
824: 
If all Boolean functions $\beta_e$ are disjunctions
826: then we get the {\em disjunctive Bayesian network} on $ \cE$.
827: In this model, an event $e$ can only occur if at least one
828: of its parent events has already occurred.
If all Boolean functions $\beta_e$ are conjunctions
830: then we get the {\em conjunctive Bayesian network} on $\cE$.
831: In this model, an event $e$ can only occur if all
832: of its parent events have already occurred.
833: These restricted Bayesian network models induce 
834:  interesting genotype spaces. 
835: Our main result in this section concerns the conjunctive case.
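
To make the two constraints concrete, suppose an event $e$ has two parents, with the
rows of $\theta^e$ indexed by $g_{\pa(e)} = 00, 01, 10, 11$; the parameter names
$c_{ij}$ follow Example~\ref{ex:conjunctive}, and the subscripts ``disj'' and ``conj''
are labels used only in this display. Then
\[
\theta^e_{\rm disj} = \left( \begin{array}{cc}
   1 & 0 \\
   c_{01} & 1 - c_{01} \\
   c_{10} & 1 - c_{10} \\
   c_{11} & 1 - c_{11}
  \end{array} \right),
\qquad
\theta^e_{\rm conj} = \left( \begin{array}{cc}
   1 & 0 \\
   1 & 0 \\
   1 & 0 \\
   c_{11} & 1 - c_{11}
  \end{array} \right),
\]
so the disjunction leaves three free parameters for $e$ and the conjunction exactly one.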
836: 
837: 
838: We regard the given directed acyclic graph $\cE$ as a poset by setting $e_1
839: \leq e_2$ if there exists a path from $e_1$ to $e_2$.
840: We write $\,p^{\rm conj} \colon [0,1]^n \rightarrow \Delta\,$
841: for the conjunctive Bayesian network on $\cE$,
842: since it has precisely $n$ free parameters.
843: 
844: \begin{thm} \label{fromBNtoDL}
845: The genotype space induced by the conjunctive
846: Bayesian network on $\cE$ is the distributive lattice of order ideals
847: in $\cE$, i.e., $\cG_{p^{\rm conj}} = J(\cE)$.
848: \end{thm}
849: 
850: \begin{proof}
851: The possible genotypes $g $ are binary strings whose coordinates $g_e$
852: indicate whether or not the event $e$ has occurred. If $p$ is
853: any of the Bayesian network models discussed above, then
854:  (\ref{eqn:bayesfactor}) implies that $g \in \cG_p$ if and only if
855: each $\thet{g}$ is non-zero. Consider now the
856:  conjunctive model $\,p = p^{\rm conj}$.
857: Here, the conditional probability
858:  $\thet{g}$ is non-zero if and
859: only if $g_e = 1$ implies $g_{\pa(e)} = (1, \dots, 1)$.  This is
860: precisely the condition for $g$ to be an order ideal in $\cE$.
861: Thus $\cG_p$ is the distributive lattice of order ideals of $\cE$.
862: \end{proof}
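
The smallest non-trivial case already shows the mechanism. Let $\cE$ be the chain
$1 < 2$, and write $a = \Pr(X_1 = 0)$ and $b = \Pr(X_2 = 0 \mid X_1 = 1)$; these two
parameter names are chosen here for illustration. The conjunctive model fixes the row
of $\theta^2$ indexed by $g_1 = 0$ to $(1,0)$, so that
\[
 p_{00} = a, \qquad p_{01} = 0, \qquad p_{10} = (1-a)\,b, \qquad p_{11} = (1-a)(1-b),
\]
and $\cG_{p^{\rm conj}} = \{00, 10, 11\}$ consists exactly of the three order ideals
$\emptyset$, $\{1\}$, $\{1,2\}$ of the chain.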
863: 
864: The following example illustrates Theorem \ref{fromBNtoDL},
865: and it compares the genotype spaces induced by 
866: the disjunctive and the conjunctive  Bayesian network.
In the example, the former is not even a lattice,
whereas the latter is always a distributive lattice.
869: 
870: \begin{ex} \label{ex:conjunctive} \rm
871: Let $\cE$ be the event poset in Figure~\ref{fig:ex1}.
872: The general Bayesian network model defined by $\cE$
873: is parametrized by the following four matrices:
874: 
875: \vspace{1ex}
876: \parbox{3.5cm}{ \centering
877: $
878:   \begin{array}{l}
879:     \theta^1 = 
880:     \left( \begin{array}{cc}
881:       a & 1-a 
882:     \end{array} \right), \\[2ex] 
883:     \theta^2 =
884:      \left( \begin{array}{cc}
885:       b & 1-b 
886:     \end{array} \right),
887:   \end{array}
888: $
889: }
890: \parbox{4.5cm}{ \centering
891: $
892:   \theta^3 = \left(
893:   \begin{array}{cc} 
894:    c_{00} & 1 - c_{00} \\
895:    c_{01} & 1 - c_{01} \\
896:    c_{10} & 1 - c_{10} \\
897:    c_{11} & 1 - c_{11}
898:   \end{array} \right),
899: $
900: }
901: \parbox{4cm}{ \centering
902: $
903:     \theta^4 =
904:      \left( \begin{array}{cc}
905:       d_0 & 1 - d_0 \\
906:       d_1 & 1 - d_1 
907:     \end{array} \right).   
908: $
909: }\\
910: \vspace{1ex}
911: 
912: \noindent The map $p \colon [0,1]^8 \to \Delta$ has coordinates
913: \begin{eqnarray*} &
914:  p_{0000} \, = \, a b c_{00} d_0, &
915:  p_{0001} \, = \, a b c_{00} (1-d_0) , \\ &
916:  p_{0010} \, = \, a b (1-c_{00}) d_0, &
917:  p_{0011} \, = \, a b (1-c_{00}) (1-d_0), \\ &
918:  p_{0100} \, = \, a (1-b) c_{01} d_1, &
919:  p_{0101} \, = \, a (1-b) c_{01} (1-d_1), \\ &
920:  p_{0110} \, = \, a (1-b) (1-c_{01}) d_1, &
921:  p_{0111} \, = \, a (1-b) (1-c_{01}) (1-d_1), \\ &
922:  p_{1000} \, = \, (1-a) b c_{10} d_0, &
923:  p_{1001} \, = \, (1-a) b c_{10} (1-d_0), \\ &
924:  p_{1010} \, = \, (1-a) b (1-c_{10}) d_0, &
925:  p_{1011} \, = \, (1-a) b (1-c_{10}) (1-d_0), \\ &
926:  p_{1100} \, = \, (1-a) (1-b) c_{11} d_1, &
927:  p_{1101} \, = \, (1-a) (1-b) c_{11} (1-d_1), \\ &
928:  p_{1110} \, = \, (1-a) (1-b) (1-c_{11}) d_1, &
929:  p_{1111} \, = \, (1\!-\!a) (1\!-\!b) (1 \! - \! c_{11}) (1 \! - \! d_1).
930: \end{eqnarray*}
931: This model induces the Boolean lattice $\{0,1\}^4$ as genotype space.
932: 
933: The disjunctive Bayesian network is the 
934: six-dimensional  submodel  obtained by setting
935: $\,c_{00}=1 \,$ and $ \,d_0=1 $. This substitution implies
936: \[
937: p_{0001} \,=\, p_{0010} \, = \, p_{0011} \,=\,
938: p_{1001}  \,=\, p_{1011} \,\, = \,\, 0.
939: \]
940: The genotype space $\,\cG_{p^{\rm disj}}$
941: consists of the remaining eleven strings in
942: $\{0,1\}^4$. Note that 
943: $\,\cG_{p^{\rm disj}} \,$ is not
944: a  lattice because it is not
945: closed under intersections. For instance,
946: $\,1010$ and $ 0110 $ are in $ \cG_{p^{\rm disj}} \,$ 
947: but $\,0010 =  1010\,\cap \, 0110
948:  \not\in \cG_{p^{\rm disj}} $.
949: 
950: The conjunctive Bayesian network is the 
951: four-dimensional  submodel  obtained by setting
952: $\, c_{00}= c_{01}= c_{10}= d_0 = 1$. The
953: remaining eight non-zero probabilities are
954: indexed by the eight genotypes in Figure~\ref{fig:ex1}:
955: \begin{eqnarray*}
956: &  p_{0000} \, = \, a b   \, ,\,\,&
957:  p_{0100} \, = \, a (1-b)  d_1 \, ,\,\,\\
958: & p_{0101} \, = \, a (1-b) (1-d_1) \, ,\,\, &
959:  p_{1000} \, = \, (1-a) b \,,\,\,\, \\
960: & p_{1100} \, = \, (1-a) (1-b) c_{11} d_1 \, ,\,\, &
961:  p_{1101} \, = \, (1-a) (1-b) c_{11} (1-d_1) \, ,\,\, \\
962: & p_{1110} \, = \, (1-a) (1-b) (1-c_{11}) d_1 \, ,\,\, &
963:  p_{1111} \, = \, (1\! - \! a) (1\! - \! b)
964:  (1 \! - \! c_{11}) (1 \! - \! d_1).
965: \end{eqnarray*}
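As a check, these eight probabilities sum to one for all parameter values:
\begin{eqnarray*}
\sum_{g \in \cG_{p^{\rm conj}}} p_g
 &=& ab \,+\, a(1-b)\bigl[d_1 + (1-d_1)\bigr] \,+\, (1-a)b \\
 & & +\,\, (1-a)(1-b)\bigl[c_{11} + (1-c_{11})\bigr]\bigl[d_1 + (1-d_1)\bigr] \\
 &=& ab + a(1-b) + (1-a)b + (1-a)(1-b) \,\,=\,\, 1 .
\end{eqnarray*}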
966: \end{ex}
967: 
968: 
969: If $\cE$ is a directed forest, i.e.,
970: if every $e \in \cE$ has at most one parent,
971: then we can augment $\cE$ to a tree $\cE^T$
972: by adding an auxiliary root node $0$
which points to the roots (nodes with no parents) of the forest.
974: On the resulting tree $\cE^T$ we consider the
975: {\em mutagenetic tree model} of \cite{Beerenwinkel2005f, Desper1999}.
976: 
977: \begin{prop}   \label{prop:forest}
978: If $\cE$ is a directed forest then the following three statistical
979: models coincide: the disjunctive Bayesian network on $\cE$,
980: the conjunctive Bayesian network on $\cE$, and the
981:  mutagenetic tree model on $\cE^T$.
982: \end{prop}
983: 
984: \begin{proof}
985: The disjunctive and the conjunctive networks 
986: coincide because they are defined by the same
987: specializations of the parameters $\,\theta^e$.
988: The identification with the mutagenetic tree model follows from
989: \cite[Thm.~14.6]{Beerenwinkel2005c}.
990: \end{proof}
991: 
992: Mutagenetic tree models can be learned from observed data by an efficient
993: combinatorial algorithm.
994: With appropriate edge weights that depend on the pairwise
995: probabilities of events, a mutagenetic tree can be obtained as the maximum
996: weight branching rooted at 0 in the complete graph on $\{0,\dots,n\}$; see
997: \cite{Desper1999}. This gives an efficient method for learning
998: the poset $\cE$, and hence the genotype lattice $\cG = J(\cE)$, from
999: data. It would be interesting to extend this model selection
1000: technique to arbitrary  conjunctive Bayesian networks.
1001: 
1002: %If $\cE$ is a directed forest, the algebraic geometry of the
1003: %Bayesian network model is well-understood and the risk polynomial
1004: %can be derived directly from the algebraic invariants of the model.
1005: %Details of this algebraic statistical perspective are given
1006: %in the Appendix.
1007: 
1008: 
1009: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1010: 
1011: \section{Applications to HIV drug resistance}   \label{sec:apply}
1012: 
We investigate the development of resistance during treatment of
HIV-infected patients with two different PIs. Consider the seven genetic events
1015: \[
1016:    \cE \, = \, \left\{ \mbox{K20R,~M36I,~M46I,~I54V,~A71V,~V82A,~I84V} \right\},
1017: \]
1018: where K20R stands for the amino acid change from lysine (K) to arginine (R) 
1019: at position 20 of the protease chain, etc. 
1020: The occurrence of these mutations confers broad cross-resistance to the 
1021: entire class of PIs. Appearance of the virus with
all seven mutations renders most of the PIs ineffective for subsequent
1023: treatment.  We analyze the risk of reaching this escape state under 
1024: therapy with the PIs ritonavir (RTV) and indinavir (IDV)
1025: \cite{Condra1996, Molla1996}.
1026: 
1027: 
1028: We use mutagenetic trees for estimating preferred mutational pathways
1029: and for defining genotype lattices.
1030: For both drugs, a tree $\cE^T$ is learned from genotypes derived 
1031: from patients under the respective therapy. We used 112 and 691 samples
1032: from the Stanford HIV Drug Resistance Database \cite{Rhee2003} 
1033: for ritonavir and indinavir, respectively. 
1034: Figure~\ref{fig:trees} shows the inferred mutagenetic trees.
1035: The models indicate that the evolution of ritonavir
1036: resistance is partly a linear process, whereas indinavir resistance
1037: develops in a less ordered fashion. This is consistent with
1038: previous studies \cite{Condra1996, Molla1996}.
1039: The genotype lattices $\cG$  have size
1040: $16 $  for ritonavir and $45$ for indinavir.
1041: We study the risk polynomials on these 
1042: lattices under different fitness landscape models.  
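
These lattice sizes can be checked by hand from Figure~\ref{fig:trees}. For a forest,
the order ideals of different components are chosen independently, and a tree with
root $e$ contributes the empty ideal plus one ideal for each choice of ideals in the
subtrees below $e$. Writing $N(e)$ for the resulting count, a notation introduced only
for this check, we have $N(e) = 1 + \prod_{c \in \ch(e)} N(c)$, with $N(e) = 2$ at a leaf.
This gives
\begin{eqnarray*}
 \mbox{ritonavir:} & & N(\mbox{V82A}) \,=\, 1 + N(\mbox{M46I}) \cdot N(\mbox{I54V}) \,=\, 1 + 3 \cdot 5 \,=\, 16, \\
 \mbox{indinavir:} & & N(\mbox{M36I}) \cdot N(\mbox{V82A}) \cdot N(\mbox{M46I}) \,=\, 3 \cdot 5 \cdot 3 \,=\, 45 .
\end{eqnarray*}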
1043: 
1044: \begin{figure}[!tpb]
1045: \centering
1046: \begin{tabular}{ccc}
1047: \tree{\root{0}}{
1048: 	\tree{\node{V82A}}{
1049: 		\tree{\node{M46I}}{
1050: 			\node{I84V}
1051: 			}
1052: 		\tree{\node{I54V}}{
1053: 			\tree{\node{A71V}}{
1054: 				\tree{\node{K20R}}{
1055: 					\node{M36I}
1056: 					}
1057: 				}
1058: 			}
1059: 		}
1060: 	}
1061: & ~~~~~~~~~~~~~ &
1062: \tree{\root{0}}{
1063: 	\tree{\node{M36I}}{
1064: 		\node{K20R}
1065: 		}
1066: 	\tree{\node{V82A}}{
1067: 		\node{I54V}
1068: 		\node{A71V}
1069: 		}
1070: 	\tree{\node{M46I}}{
1071: 		\node{I84V}
1072: 		}
1073: 	}\\[3ex]
1074: (a) & & (b)
1075: \end{tabular}
1076: \caption{Mutagenetic tree $\cE^T$ for the development of resistance 
1077: to (a) ritonavir and (b) indinavir in the HIV-1 protease.
1078: The event poset $\cE$ is obtained by removing the 
1079: root node ``0''.}
1080: \label{fig:trees} 
1081: \end{figure}
1082: 
1083: 
1084: 
1085: For the constant fitness landscape on $\,\cG \backslash 
1086: \{\hat{0}, \hat{1}\}$, we obtain
1087: \begin{eqnarray*}
1088:   \RP_{\rm RTV}(a) &=& 15a^6+70a^5+131a^4+124a^3+61a^2+14a+1, \\
1089:   \RP_{\rm IDV}(a) &=& 420a^6+1470a^5+1970a^4+1250a^3+372a^2+43a+1.
1090: \end{eqnarray*}
1091: Thus, the risk of developing all seven PI resistance mutations 
1092: is higher under indinavir therapy than under ritonavir:
1093: $  \RP_{\rm IDV}(a) >   \RP_{\rm RTV}(a)$ for $a > 0$.
1094: Intuitively, the risk under ritonavir is lower because
1095: the mutations must occur in a certain order. Likewise,
1096: the high risk under indinavir results from many mutations occurring
1097: independently, which gives rise to a large genotype lattice and to many
1098: mutational pathways from the wild type to the escape state. 
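
The inequality can be read off coefficientwise, since
\[
 \RP_{\rm IDV}(a) - \RP_{\rm RTV}(a) \,=\, 405a^6+1400a^5+1839a^4+1126a^3+311a^2+29a ,
\]
which is positive for $a > 0$. The leading coefficients can also be verified
independently, on the assumption (consistent with the linear extension representation
used earlier) that they count the linear extensions of $\cE$. For ritonavir,
V82A must come first and the two chains of lengths $2$ and $4$ that follow it can be
interleaved in $\binom{6}{2} = 15$ ways. For indinavir, the hook length formula for
forests gives $7!/(2 \cdot 3 \cdot 2) = 420$.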
1099: 
1100: More realistic fitness landscapes may be derived by modeling viral fitness
1101: as a function of drug concentration. We follow the approach pursued
1102: in \cite{Stilianakis1997a} and use a simple saturation function for
1103: this dependency. Specifically, we assume viral fitness to be the following
1104: function of drug concentration $D$,  
1105: \begin{equation}   \label{eqn:drugfitness}
1106:    f_g(D) \quad = \quad \frac{\phi_g}{1 + D/r_g},
1107: \end{equation}
1108: where $\phi_g$ denotes the fitness of genotype $g$ in the absence of drug
1109: and $r_g$ the IC$_{50}$ value of $g$, i.e., the drug concentration necessary
1110: to inhibit viral replication \emph{in vitro} by 50\%. The IC$_{50}$ value 
1111: is a measure of resistance. We will assume
1112: throughout that all $\phi_g \equiv \phi$ are equal.
1113: If we assume, in addition,
1114: that the resistance landscape is constant on $\cG \backslash \{\hat{0},\hat{1}\}$,
1115: with $r_g \equiv r$,
1116: then the substitution (\ref{eqn:drugfitness}) turns
1117: the risk polynomial into a rational function in $\phi$, $D$, and $r$.
1118: For example, for ritonavir, this rational function is
1119: \[
1120:    \frac{(15\phi^2r^2+10\phi Dr+10\phi r^2+D^2+2Dr+r^2)(\phi r+D+r)^4}{(D+r)^6}.
1121: \]
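
This expression can be derived in two steps. The ritonavir polynomial factors as
\[
 \RP_{\rm RTV}(a) \,=\, (15a^2 + 10a + 1)\,(a+1)^4 ,
\]
and under the stated assumptions the substitution (\ref{eqn:drugfitness}) amounts to
$a = \phi/(1 + D/r) = \phi r/(D+r)$, whence
\[
 15a^2 + 10a + 1 \,=\, \frac{15\phi^2 r^2 + 10 \phi r (D+r) + (D+r)^2}{(D+r)^2}
 \qquad \hbox{and} \qquad
 (a+1)^4 \,=\, \frac{(\phi r + D + r)^4}{(D+r)^4} .
\]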
1122: 
1123: \begin{figure}
1124: \centering
1125: \includegraphics[width=.8\textwidth,angle=270]{gfl}
1126: \caption{Graded resistance landscapes for ritonavir (RTV, bullets)
1127: and indinavir (IDV, squares). Resistance is quantified as the
1128: drug concentration necessary to inhibit viral replication \emph{in vitro}
1129: by 50\% (IC$_{50}$).}
1130: \label{fig:gfl}
1131: \end{figure}
1132: 
1133: In general, the IC$_{50}$ values $r_g$ are distinct and can be determined
1134: experimentally for some genotypes
1135: by phenotypic resistance testing \cite{Walter1999},
1136: and may be predicted for all genotypes using regression techniques
1137: \cite{Beerenwinkel2003d}.
PI phenotypic resistance data suggest a graded resistance landscape;
1139: see \cite{Berkhout1999} and \cite[Tab.~3]{Condra1996}.
1140: Hence, we estimate the resistance $r \in \mathbb{R}^8$
1141: for ritonavir and indinavir by defining $r_k$ 
1142: as the mean predicted IC$_{50}$ of all 
1143: genotypes of rank~$k$. The resulting resistance landscapes
1144: are shown in Figure~\ref{fig:gfl}.
1145:   
1146: \begin{figure}
1147: \centering
1148: \includegraphics[height=\textwidth,angle=270]{drugfit}
1149: \caption{Drug dependent risk. The log of the risk polynomial
1150: for ritonavir (a) and indinavir (b) 
1151: is displayed as a function of plasma drug concentration $D$. Marked
1152: values denote mean trough ($C_{\min}$) and peak ($C_{\max}$)
1153: levels observed in clinical studies. The parameter $\phi$ is
1154: the relative fitness of mutants as compared to the wild type
1155: in the absence of drug.}
1156: \label{fig:drugfit}
1157: \end{figure}
1158: 
1159: The graded risk polynomials $\RP(a_1,a_2,a_3,a_4,a_5,a_6)$ have 64 terms. After
1160: substituting $a_k = \phi/(1 + D/r_k)$, we obtain rational risk functions in $D$
1161: with parameter $\phi$. Figure~\ref{fig:drugfit} illustrates the dependency of
1162: the risk on drug concentration for three different values of $\phi$. For both
1163: drugs we indicate published mean plasma trough ($C_{\min}$) and peak ($C_{\max}$) levels 
1164: observed in clinical settings.
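
The count of $64 = 2^6$ terms is what one expects if there is one monomial
$\prod_{k \in S} a_k$ for each subset $S \subseteq \{1,\dots,6\}$ of ranks, since a
chain contains at most one genotype of each rank. As an illustration, and assuming
that the graded landscape assigns the value $a_k$ to every genotype of rank $k$, the
linear part of the graded polynomial for ritonavir would be
\[
 a_1 + 2a_2 + 3a_3 + 3a_4 + 3a_5 + 2a_6 ,
\]
where the coefficients count the genotypes of each rank in the lattice of size $16$.
Setting all $a_k = a$ recovers the term $14a$ of $\RP_{\rm RTV}(a)$.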
1165: 
1166: This example illustrates how the risk
1167: polynomial can be used to study viral escape as a function of
1168: different parameters. For instance, given a pharmacokinetics model
1169: of antiretroviral drug therapy, we can compute
1170: the risk of developing resistance after a patient has missed a dose. 
1171: Thus, our mathematical framework may help in designing robust drug combinations.
1172: 
1173: 
1174: 
1175: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1176: 
1177: \section{Discussion}   \label{sec:discussion}
1178: 
1179: We have presented a computational framework for assessing
1180: the risk of escape of an evolving population of pathogens.
1181: The risk of escape is the probability that the population
1182: reaches an escape state before extinction.
In virus transmission, for example, this probability is
1184: the chance of survival in the new host. In the situation
1185: of antiretroviral therapy, the risk of escape is the
1186: probability of therapy failure due to the development
1187: of drug resistance.
1188: 
1189: The general setup we consider for computing the risk of escape
1190: includes an event poset, a fitness landscape on its induced
1191: genotype lattice, and a branching process on this lattice.
1192: The event poset $\cE$ consists of all mutational events that can
1193: occur and encodes the constraints which apply to their order of 
1194: occurrence. From this structure the genotype space $\cG$ is obtained
1195: by considering all mutational pathways that respect the order
1196: constraints. This natural construction endows $\cG$ with the
1197: mathematical structure of a distributive lattice.
1198: The risk polynomial, the crucial factor in
1199: computing the risk of escape, turns out to coincide with the chain
1200: polynomial of the genotype lattice. We have presented
1201: methods from algebraic combinatorics that exploit
1202: this connection and that result in efficient algorithms.     
1203: 
1204: The space of genotypes may also be inferred from 
1205: observed genotype data using statistical model selection tools.
1206: We have identified a class of Bayesian network models,
1207: the conjunctive Bayesian networks, whose support induces
1208: a genotype lattice. 
1209: Mutagenetic tree models arise as important special cases.
1210: Here, both statistical model selection
1211: and risk computation are particularly efficient, and readily available
1212: with existing software \cite{Beerenwinkel2005b} 
1213: coupled with our implementation of the linear extensions
1214: method (Theorem~\ref{thm:linearExtensions}, Appendix). 
1215: 
1216: \smallskip
1217: 
1218: %The risk polynomial is a crucial factor in assessing the risk
1219: %of escape from strong selective pressure experienced by
1220: %a population evolving according to a multitype branching process. 
1221: We have focused on the dependency of the risk polynomial
1222: on the fitness landscape and considered throughout a homogeneous
1223: wild type population prior to intervention. However, the risk of
1224: escape is calculated  similarly for a quasispecies 
1225: distribution at the time of intervention. In fact,
1226: this involves computing the risk polynomial of
1227: the prior fitness landscape \cite{Iwasa2003}.
In contrast, the branching process
model cannot account
1230: for recombination, horizontal gene transfer, or frequency 
1231: dependent selection, since evolution is assumed to take place
1232: in multiple lineages independently. 
1233: 
1234: The main challenge in using our method to compute the risk
1235: of escape from antiretroviral therapy lies in accurately
1236: modeling the fitness landscape. 
1237: The dependency (\ref{eqn:drugfitness}) of the fitness on drug 
1238: concentration may be improved by experimentally determined
1239: viral replicative capacities in the
1240: absence of drugs. An alternative approach to derive a 
1241: fitness landscape for HIV-1 proteases is based on estimating
1242: the binding affinity of the drug to the mutant protease, and
1243: the mutant's ability to cleave its natural substrates
1244: \cite{Rosin1999a}. 
1245: These calculations are based on simplified molecular
1246: modeling techniques. 
1247: The resulting fitness landscape does not account for different
1248: drug levels, but it is independent of experimental 
1249: resistance and fitness data.
1250:  
1251: Escape from indinavir and ritonavir therapy may in some cases
1252: involve mutations other than the seven we considered, although those
1253: are the most frequent mutations observed after therapy failure
1254: \cite{Condra1996,Molla1996}.
1255: On the other hand, viral escape might be accomplished with
1256: genotypes that harbor fewer than all of the mutations.
1257: Thus it would be desirable to compute the risk of reaching
1258: any of several escape states, rather than only the $11\cdots 1$ type.
1259: This computation will involve similar techniques to those presented
1260: in Section~\ref{sec:branching} and the Appendix.
1261: 
1262: Finally, the PIs form only one out of four distinct
1263: classes of antiretroviral drugs
1264: that are in current clinical use. The standard of care is combination
1265: therapy with at least three different drugs from two different drug 
1266: classes. Modeling the fitness landscape of combination therapy in
1267: terms of viral drug resistance and drug exposure is even more
1268: challenging, but can eventually help in designing optimal 
1269: antiretroviral therapies.  Algebraic combinatorics offers
1270: tools for the mathematical analysis of these 
1271: biomedical problems.
1272: 
1273: 
1274: 
1275: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1276: 
1277: \section*{Acknowledgements} 
1278: 
1279: Niko Beerenwinkel is supported by Deutsche Forschungsgemeinschaft under
1280: grant No.\ BE~3217/1-1.
1281: Nicholas Eriksson and Bernd Sturmfels are supported by
1282: the U.S.~National Science Foundation,
under grants EF-0331494 and DMS-0456960,
1284: respectively, and by the DARPA program
1285: {\em Fundamental Laws in Biology} (HR0011-05-1-0057).
1286: 
1287: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1288: 
1289: \bigskip
1290: 
1291: \bibliographystyle{plain}
1292: 
1293: \begin{thebibliography}{10}
1294: 
1295: \bibitem{Athreya1972}
1296: K.B. Athreya and P.E. Ney.
1297: \newblock {\em Branching processes}.
1298: \newblock Dover, Mineola, New York, 1972.
1299: 
1300: \bibitem{Beerenwinkel2003d}
1301: N. Beerenwinkel, M. D{\"a}umer, M. Oette, K. Korn, D. Hoffmann,
1302:   R. Kaiser, T. Lengauer, J. Selbig, and H. Walter.
1303: \newblock Geno2pheno: {E}stimating phenotypic drug resistance from {HIV}-1
1304:   genotypes.
1305: \newblock {\em Nucl. Acids Res.}, 31(13):3850--3855, Jul 2003.
1306: 
1307: \bibitem{Beerenwinkel2005c}
1308: N. Beerenwinkel and M. Drton.
1309: \newblock Mutagenetic tree models.
1310: \newblock In L.~Pachter and B.~Sturmfels, editors, {\em Algebraic Statistics
1311:   for Computational Biology}, chapter~14, pages 278--290. Cambridge University
1312:   Press, Cambridge, UK, 2005.
1313: 
1314: \bibitem{Beerenwinkel2005f}
1315: N. Beerenwinkel, J. Rahnenf{\"u}hrer, M. D{\"a}umer, D.
1316:   Hoffmann, R. Kaiser, J. Selbig, and T. Lengauer.
1317: \newblock Learning multiple evolutionary pathways from cross-sectional data.
1318: \newblock {\em J. Comput. Biol.}, 12(6):584--598, 2005.
1319: 
1320: \bibitem{Beerenwinkel2005b}
1321: N. Beerenwinkel, J. Rahnenf{\"u}hrer, R. Kaiser, D. Hoffmann,
1322:   J. Selbig, and T. Lengauer.
1323: \newblock Mtreemix: a software package for learning and using mixture models of
1324:   mutagenetic trees.
1325: \newblock {\em Bioinformatics}, 21(9):2106--2107, May 2005.
1326: 
1327: \bibitem{Berkhout1999}
1328: B.~Berkhout.
1329: \newblock {HIV}-1 evolution under pressure of protease inhibitors: Climbing the
1330:   stairs of viral fitness.
1331: \newblock {\em J. Biomed. Sci.}, 6:298--305, 1999.
1332: 
1333: \bibitem{Bonhoeffer2000}
1334: S. Bonhoeffer, C. Chappey, N.T. Parkin, J.M. Whitcomb, and C.J. Petropoulos.  
\newblock Evidence for positive epistasis in {HIV}-1.
1336: \newblock {\em Science}, 306:1547--1550, 2004.  
1337: 
1338: \bibitem{brightwell}
1339: G. Brightwell and P. Winkler.
1340: \newblock Counting linear extensions.
1341: \newblock {\em Order}, 8(3):225--242, 1991.
1342: 
1343: \bibitem{Clavel2004}
1344: F. Clavel and A.J. Hance.
\newblock {HIV} drug resistance.
1346: \newblock {\em N. Engl. J. Med.}, 350(10):1023--1035, Mar 2004.
1347: 
1348: \bibitem{Condra1996}
1349: J.H. Condra, D.J. Holder, W.A. Schleif, O.M. Blahy, R.M. Danovich, L.J.
1350:   Gabryelski, D.J. Graham, D.~Laird, J.C. Quintero, A.~Rhodes, H.L. Robbins,
1351:   E.~Roth, M.~Shivaprakash, T.~Yang, J.A. Chodakewitz, P.J. Deutsch, R.Y.
1352:   Leavitt, F.E. Massari, J.W. Mellors, K.E. Squires, R.T. Steigbigel,
1353:   H.~Teppler, and E.A. Emini.
1354: \newblock Genetic correlates of in vivo viral resistance to indinavir, a human
1355:   immunodeficiency virus type 1 protease inhibitor.
1356: \newblock {\em J. Virol.}, 70(12):8270--8276, 1996.
1357: 
1358: \bibitem{Desper1999}
1359: R.~Desper, F.~Jiang, O.P. Kallioniemi, H.~Moch, C.H. Papadimitriou, and A.A.
1360:   Sch{\"a}ffer.
1361: \newblock Inferring tree models for oncogenesis from comparative genome
1362:   hybridization data.
1363: \newblock {\em J. Comput. Biol.}, 6(1):37--51, 1999.
1364: 
1365: \bibitem{ehrenborg1996}
1366: R. Ehrenborg.
1367: \newblock On posets and {H}opf algebras.
1368: \newblock {\em Adv. Math.}, 119(1):1--25, 1996.
1369: 
1370: \bibitem{Garcia2005}
1371:  L.~Garcia, M.~Stillman, and B.~Sturmfels.
1372: \newblock Algebraic geometry of {B}ayesian networks.
1373: \newblock {\em J. Symbol. Comput.}, 39:331--355, 2005.
1374: 
1375: \bibitem{Iwasa2003}
1376: Y. Iwasa, F. Michor, and M.A. Nowak.
1377: \newblock Evolutionary dynamics of escape from biomedical intervention.
1378: \newblock {\em Proc. Biol. Sci.}, 270(1533):2573--2578, Dec 2003.
1379: 
1380: \bibitem{Iwasa2004}
1381: Y. Iwasa, F. Michor, and M.A. Nowak.
1382: \newblock Evolutionary dynamics of invasion and escape.
1383: \newblock {\em J. Theor. Biol.}, 226(2):205--214, Jan 2004.
1384: 
1385: %\bibitem{Kimmel2002}
1386: %M. Kimmel and D.E. Axelrod.
1387: %\newblock {\em Branching Processes in Biology}.
1388: %\newblock Springer, 2002.
1389: 
1390: \bibitem{Lauritzen1996}
1391: S.L. Lauritzen.
1392: \newblock {\em Graphical Models}.
1393: \newblock Clarendon Press, 1996.
1394: 
1395: \bibitem{Miller2004}
1396: E. Miller and B. Sturmfels.
1397: \newblock {\em Combinatorial commutative algebra}, volume 227 of {\em Graduate
1398:   Texts in Mathematics}.
1399: \newblock Springer, New York, 2005.
1400: 
1401: \bibitem{Molla1996}
1402: A.~Molla, M.~Korneyeva, Q.~Gao, S.~Vasavanonda, P.J. Schipper, H.M. Mo,
1403:   M.~Markowitz, T.~Chernyavskiy, P.~Niu, N.~Lyons, A.~Hsu, G.R. Granneman,
1404:   D.D. Ho, C.A. Boucher, J.M. Leonard, D.W. Norbeck, and D.J. Kempf.
1405: \newblock Ordered accumulation of mutations in {HIV} protease confers
1406:   resistance to ritonavir.
1407: \newblock {\em Nat. Med.}, 2(7):760--766, Jul 1996.
1408: 
1409: \bibitem{Nowak2000}
1410: M.A. Nowak and R.M. May.
1411: \newblock {\em Virus dynamics}.
1412: \newblock Oxford University Press, 2000.
1413: 
1414: %\bibitem{Pachter2005}
1415: %L. Pachter and B. Sturmfels, editors.
1416: %\newblock {\em Algebraic Statistics for Computational Biology}.
1417: %\newblock Oxford University Press, 2005.
1418: 
1419: \bibitem{pruesse1994}
1420: G. Pruesse and F. Ruskey.
1421: \newblock Generating linear extensions fast.
1422: \newblock {\em SIAM J. Comput.}, 23(2):373--386, 1994.
1423: 
1424: \bibitem{Reidys2002}
1425: C.M. Reidys and P.F. Stadler.
1426: \newblock Combinatorial landscapes.
1427: \newblock {\em SIAM Review}, 44:3--54, 2002.
1428: 
1429: \bibitem{Rhee2003}
1430: S.-Y. Rhee, M.J. Gonzales, R. Kantor, B.J. Betts, J. Ravela,
1431:   and R.W. Shafer.
1432: \newblock Human immunodeficiency virus reverse transcriptase and protease
1433:   sequence database.
1434: \newblock {\em Nucl. Acids Res.}, 31(1):298--303, Jan 2003.
1435: 
1436: 
1437: \bibitem{Rosin1999a}
1438: C.D. Rosin, R.K. Belew, G.M. Morris, A.J. Olson, and D.S. Goodsell.
1439: \newblock Coevolutionary analysis of resistance-evading peptidomimetic
1440:   inhibitors of {HIV-1} protease.
1441: \newblock {\em Proc. Natl. Acad. Sci. U. S. A.}, 96:1369--1374, 1999.
1442: 
1443: \bibitem{Stanley1996}
1444: R.P. Stanley.
1445: \newblock A matrix for counting paths in acyclic digraphs.
1446: \newblock {\em J. Combin. Theory Ser. A}, 74(1):169--172, 1996.
1447: 
1448: \bibitem{Stanley1999}
1449: R.P. Stanley.
1450: \newblock {\em Enumerative combinatorics. {V}ol. 1}, volume~49 of {\em
1451:   Cambridge Studies in Advanced Mathematics}.
1452: \newblock Cambridge University Press, Cambridge, 1997.
1453: %\newblock With a foreword by Gian-Carlo Rota, Corrected reprint of the 1986
1454: %  original.
1455: 
1456: \bibitem{Stilianakis1997a}
1457: N.I. Stilianakis, C.A. Boucher, M.D.~De Jong, R.~Van Leeuwen, R.~Schuurman, and
1458:   R.J.~De Boer.
1459: \newblock Clinical data sets of human immunodeficiency virus type 1 reverse
1460:   transcriptase resistant mutants explained by a mathematical model.
1461: \newblock {\em J. Virol.}, 71(1):161--168, 1997.
1462: 
1463: \bibitem{Varol1981}
1464: Y.L.~Varol and D.~Rotem.
1465: \newblock An algorithm to generate all topological sorting arrangements.
1466: \newblock {\em Comput. J.}, 24(1):83--84, 1981.
1467: 
1468: \bibitem{Walter1999}
1469: H.~Walter, B.~Schmidt, K.~Korn, A.~M. Vandamme, T.~Harrer, and K.~{\"U}berla.
1470: \newblock Rapid, phenotypic {HIV-1} drug sensitivity assay for protease and
1471:   reverse transcriptase inhibitors.
1472: \newblock {\em J. Clin. Virol.}, 13:71--80, 1999.
1473: 
1474: \bibitem{Wilke2003}
1475: C.O.~Wilke.
1476: \newblock Probability of fixation of an advantageous mutant 
1477: in a viral quasispecies.
1478: \newblock {\em Genetics}, 163:467--474, 2003.
1479: 
1480: \end{thebibliography}
1481: 
1482: \section*{Appendix: Mathematics and computation of the risk polynomial}
1483: 
Here we discuss the mathematical properties of the risk polynomial
in more detail and present several methods for computing it.
The given data consist of an $n$-element poset $\cE$ 
1487: and its induced genotype lattice $\cG$, which is the distributive 
1488: lattice of order ideals in $\cE$. We assume that $\cG$ has 
1489: $m$ elements, which are encoded either
1490: as subsets of $\cE$ or as binary strings in $\{0,1\}^n$.
1491: The risk polynomial is the polynomial $\,\RP(\cG;{\bf f})\,$
1492: in the $m$ unknowns $f_g = {\bf f}(g)$, 
1493: one for each genotype $g$.
1494: We are also interested in  specializations of
1495: $\RP(\cG;{\bf f})$ obtained by setting some (or all) of the unknowns
1496: equal to each other, such as
1497: the graded risk polynomial and the univariate risk polynomial.
1498: 
1499: 
1500: \subsection*{Stanley's linear algebra method}
1501: 
1502: A direct method for computing the risk polynomial is given 
1503: in Section~\ref{sec:branching}.
1504:  Namely, we can set all $\mu_e$  equal to one
1505: in the matrix ${\bf U}$ and then compute the upper right
1506: entry of the matrix $\,({\bf I} - {\bf UF})^{-1} - {\bf I} \,$ of
1507: equation (\ref{GeometricSeries}).
In practice, one would compute this entry
by a dynamic program that runs in time $O(m^2)$.
This dynamic program is obtained by unrolling the recursion
in the last equation of the proof of Theorem~\ref{thm:1}.
1512: 
1513: 
1514: The following alternative linear algebra technique for 
1515: computing polynomials similar to our risk polynomials 
1516: was given by Stanley  in \cite{Stanley1996}.
1517: Let $\,\cG' = \cG  \backslash \{\hat{0}, \hat{1}\} \,$ denote
1518: the genotype lattice with the top element
1519: $\hat{1}$ and the bottom element $\hat{0}$ removed.
1520: We define ${\bf A} $ to be the {\em anti-adjacency matrix} of the truncated 
1521: genotype lattice $\cG'$. Thus ${\bf A}$ is the $(m-2) \times (m-2)$-matrix
with rows and columns indexed by $\cG'$, and whose entry
in row $g$ and column $h$ is $0$ if $g$ is a proper subset of $h$
and is $1$ otherwise; in particular, every diagonal entry is $1$.
We write ${\bf I}$ for the 
1525: $(m-2) \times (m-2)$ identity matrix and 
1526: $\, {\bf F}' = {\rm diag} \bigl( \,{\bf f}(g) \,|\, g \in \cG' \bigr)\,$ for the
1527: $ (m-2) \times (m-2)$-diagonal matrix whose entries are the
1528: fitness values. Stanley's result reads as follows.
1529: 
1530: \begin{thm}[Stanley \cite{Stanley1996}]
1531: \label{stanley}
1532: The risk polynomial $\,\RP(\cG; {\bf f})\,$ equals
1533: the determinant of the $(m-2) \times (m-2)$-matrix $\, {\bf I} \, +\, {\bf F}' \cdot {\bf A}$.
1534: \end{thm}
1535: 
1536: \begin{ex} \rm
1537: Let $\cG$ be the genotype lattice in Figure~\ref{fig:ex1}. Then $m =8$ and
1538:  $\, {\bf I} \, +\, {\bf F}' \cdot {\bf A}\,$ is the $6 \times 6$-matrix
1539: \[
1540: \bordermatrix{ &  1000 & 0100 & 1100 & 0101 & 1110 & 1101 \cr
1541: 1000 & 1 + f_{1000} &   f_{1000} &    0  &   f_{1000} &   0 &   0 \cr
1542: 0100 &       f_{0100} &    1 + f_{0100} &    0  &    0  &    0 &    0 \cr
1543: 1100 & f_{1100} &   f_{1100} &   1 + f_{1100 } &   f_{1100 } & 0 &  0 \cr
1544: 0101 &  f_{0101} & f_{0101} &  f_{0101 } & 1 + f_{0101} &  f_{0101} & 0 \cr
1545: 1110 &  f_{1110 } &  f_{1110} &  f_{1110} & f_{1110} &  1 + f_{1110} &  f_{1110} \cr
1546: 1101 & f_{1101} &  f_{1101} & f_{1101} &  f_{1101} &  f_{1101} &  1 + f_{1101} \cr}.
1547: \]
1548: The determinant of this matrix is
1549: the risk polynomial of Example~\ref{ex:rp}.
1550: \end{ex}
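
As a quick sanity check of Theorem~\ref{stanley}, consider the smallest
nontrivial case (an illustration of our own, not taken from
Figure~\ref{fig:ex1}): let $\cE$ be an antichain consisting of two events.
Then $\cG = \{00,10,01,11\}$ and $\cG' = \{10,01\}$. Since the two genotypes
in $\cG'$ are incomparable, every entry of ${\bf A}$ equals $1$, and hence
\[
\det \bigl( {\bf I} + {\bf F}' \cdot {\bf A} \bigr) \,\, = \,\,
\det \begin{pmatrix} 1 + f_{10} & f_{10} \\ f_{01} & 1 + f_{01} \end{pmatrix}
\,\, = \,\, (1+f_{10})(1+f_{01}) - f_{10} f_{01}
\,\, = \,\, 1 + f_{10} + f_{01}.
\]
The three terms correspond to the three chains in $\cG'$, namely the
empty chain, $\{10\}$ and $\{01\}$.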
1551: 
1552: 
1553: \subsection*{The Hilbert series method}
1554: 
1555: A more conceptual way of thinking about the risk polynomial
1556: is based on the following algebraic construction.
1557: The {\em Stanley-Reisner ideal} $\,I_{\cG'}\,$ of $\cG'$
1558: is the ideal generated by all quadratic monomials
1559: $\,f_g \cdot f_h \,$ where $g$ and $h$
1560: are genotypes that are incomparable,
1561: i.e., neither $g \subseteq h$ nor $h \subseteq g$ holds.
1562: The ambient polynomial ring $\,S = \mathbb{R}[{\bf f}] $ 
1563: is generated by the unknowns $f_g$ where $g \in \cG'$. 
The {\em Hilbert series} of the quotient $\,S/I_{\cG'}\,$
is the formal sum of all monomials
$\,{\bf f}^u \, = \,\prod_{g \in \cG'} f_g^{u_g}\,$
which are not in the ideal $\,I_{\cG'}$.
1568: This is a formal generating function which can be
1569: written as a rational function of the following form
1570: \[
1571: H(S/I_{\cG'}; {\bf f}) \quad = \quad
1572: \frac{K_\cG({\bf f})}{\prod_{g \in \cG'} (1-f_g)}.
1573: \]
1574: Here $K_\cG({\bf f})$ is a polynomial
1575: in the unknowns $f_g$ with integer coefficients.
1576: The polynomial $K_\cG({\bf f})$
1577: is known as the {\em K-polynomial} of the ideal $I_{\cG'}$.
1578: We refer to \cite{Miller2004} for an introduction
1579: to Stanley-Reisner ideals and their K-polynomials.
1580: 
1581: If $\cE$ is a directed forest (and we identify $f_g = p_g$) 
1582: then Proposition \ref{prop:forest} and
1583: \cite[Thm.~14.11]{Beerenwinkel2005c} imply that
1584: the ideal $I_{\cG'}$ is an initial monomial ideal
1585: of the conjunctive Bayesian network on $\cE$.
In a forthcoming paper we shall prove
that this initial ideal property holds
for all event posets (not just forests).
1589: 
1590: \begin{ex} \rm
1591: Let $\cG$ be the genotype lattice in Figure~\ref{fig:ex1}.
1592: Then 
1593: \[
1594: I _{\cG'} \quad = \quad \langle\,
1595: f_{0101} f_{1110},\,
1596: f_{1101} f_{1110},\,
1597: f_{0101} f_{1100},\,
1598: f_{0101} f_{1000},\,
1599: f_{0100} f_{1000}
1600: \rangle
1601: \]
1602: %Comparing these monomials to the underlined initial monomials in 
1603: %Example~\ref{ex:conjunctive}, we see that 
1604: %$I_{\cG'}$ 
1605: is indeed the initial monomial ideal
1606: of the conjunctive Bayesian network
1607: % in that example.
1608: in Example~\ref{ex:conjunctive}.
1609: The K-polynomial $K_{\cG}({\bf f})$ equals
1610: \begin{eqnarray*}
1611: & 1
1612: - f_{0101} f_{1110}
1613: - f_{1101} f_{1110}  
1614: - f_{0101} f_{1100}
1615: - f_{0101} f_{1000}
1616: - f_{0100} f_{1000} \\ &
1617: + f_{0100} f_{1000} f_{0101}
1618: + f_{1000} f_{0101} f_{1100}
1619: + f_{1000} f_{0101} f_{1110}
1620: + f_{0101} f_{1100} f_{1110} \\ &
1621: + f_{0101} f_{1110} f_{1101}
1622: + f_{0100} f_{1000} f_{1110} f_{1101} \\ &
1623: - f_{1000} f_{0101} f_{1100} f_{1110}
1624: - f_{0100} f_{1000} f_{0101} f_{1110} f_{1101}.
1625: \end{eqnarray*}
1626: \end{ex}
1627: 
1628: \smallskip
1629: 
1630: %Just as in the proof of Corollary~\ref{cor:gb}, we see
1631: Again using Proposition~\ref{prop:forest} and 
1632: Theorem~14.11 in \cite{Beerenwinkel2005c}
1633: we see that 
1634: the risk polynomial  $\,\RP(\cG; {\bf f})\,$
1635: is the sum of all squarefree monomials
1636: in the expansion of the Hilbert series $H(S/I_{\cG'}; {\bf f})$.
1637: Equivalently, $\,\RP(\cG; {\bf f})\,$ is the reduction of
1638: $H(S/I_{\cG'}; {\bf f})$ modulo the ideal generated
1639: by the squares $\,f_g^2 \,$ of the unknowns.
1640: Since $\,1/(1-f_g)\,$ equals $\,1+f_g\,$ modulo 
1641: $\,\langle \, f_g^2 \, \rangle $, we have the following result.
1642: 
1643: \begin{prop} \label{reisner}
1644: The risk polynomial  $\,\RP(\cG; {\bf f})\,$ 
1645: of the genotype lattice $\cG$ is the sum of
1646: all squarefree terms in the expansion of
1647: \[
1648: K_\cG({\bf f}) \cdot \prod_{g \in \cG'} (1+f_g),
1649: \]
1650: where $K_\cG({\bf f})$ is the $K$-polynomial
1651: of the Stanley-Reisner ideal $I_{\cG'}$.
1652: \end{prop}
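
To illustrate Proposition~\ref{reisner} in a small case (a hypothetical
three-event poset chosen only for illustration, not one of the examples
above), let $\cE = \{1,2,3\}$ with the single relation $1 < 3$. Then
$\cG' = \{100, 010, 110, 101\}$, the incomparable pairs are
$\{100,010\}$, $\{010,101\}$ and $\{110,101\}$, and inclusion-exclusion
over the three generators of $I_{\cG'}$ gives
\[
K_\cG({\bf f}) \,\, = \,\,
1 - f_{100} f_{010} - f_{010} f_{101} - f_{110} f_{101}
+ f_{100} f_{010} f_{101} + f_{010} f_{110} f_{101}.
\]
Multiplying by $\prod_{g \in \cG'} (1+f_g)$ and discarding every term
divisible by a square leaves
\[
1 + f_{100} + f_{010} + f_{110} + f_{101}
+ f_{100} f_{110} + f_{100} f_{101} + f_{010} f_{110}.
\]
These eight terms are exactly the chains of $\cG'$: the empty chain,
the four single genotypes, and the three comparable pairs
$100 \subset 110$, $100 \subset 101$ and $010 \subset 110$.
For instance, the monomial $f_{100} f_{010}$ occurs with coefficient $+1$
in $\prod_{g \in \cG'} (1+f_g)$ and with coefficient $-1$ coming from the
term $-f_{100} f_{010}$ of $K_\cG({\bf f})$, so it cancels, as it must for
an incomparable pair.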
1653: 
1654: The univariate risk polynomial $\,\RP(\cG; a) \,$
1655:  is derived from $\,\RP(\cG;{\bf f})\,$
1656: by replacing each $f_g$ by the scalar unknown $a$.
1657: We have
1658: \[
1659: \RP(\cG;a) \quad = \quad
1660: c_0 + c_1 a + c_2 a^2 + \cdots + c_{n-1} a^{n-1},
1661: \]
where $c_i$ is the number of chains with $i$ elements in $\cG'$
(so $c_0 = 1$ counts the empty chain). Thus, 
$(c_0,\ldots,c_{n-1})$ is the $f$-vector
of the simplicial complex of chains in $\cG'$.
1665: Likewise, we get the graded risk polynomial from
1666: $\RP(\cG;{\bf f})$ by replacing each $f_g$ by
1667: $a_{|g|}$. We note that the graded risk polynomial is  related to
1668: Ehrenborg's quasi-symmetric function encoding \cite{ehrenborg1996}
1669: of the flag $f$-vector of the chain complex of $\cG'$.
1670: 
1671: 
1672: \subsection*{The linear extensions method}
1673: 
1674: One advantage of both Theorem~\ref{stanley}
1675: and Proposition~\ref{reisner} is that these
1676: formulas do not actually depend on the
1677: fact that $\cG$ is a distributive lattice.
1678: They also apply if the set
1679: $\cG$ of genotypes is an arbitrary
1680: poset. This is relevant for our
1681: discussion of the statistical models in Section~\ref{sec:bayes},
1682: where we introduced a more general
1683: class of posets $\cG_p \subseteq \{0,1\}^n$.
1684: 
This generality comes at a cost: 
Theorem~\ref{stanley} and Proposition~\ref{reisner}
do not give the most efficient methods for
computing $\RP(\cG;{\bf f})$ when $\cG$ is the distributive lattice
induced by an event poset $\cE$. In what follows
we present a specialized and more efficient
algorithm for the risk polynomial.
The input to this algorithm is
the event poset $\cE$. It is not necessary
to compute the genotype lattice $\cG$ beforehand:
the genotypes arise as a byproduct of our approach,
which computes the risk polynomial $\RP(\cG;{\bf f})$ directly from $\cE$.
1697:   
1698: As before, we assume that $\cE$ has $n$ elements, and 
1699: we write $[n]$ for the linearly ordered set $\{1,2,\ldots,n\}$.
1700: A {\em linear extension} of $\cE$ is an order-preserving
1701: bijection $\,\pi \colon \cE \rightarrow [n]$. This means that
1702: $e < e'$ in $\cE$ implies $\pi(e) < \pi(e')$.
1703: Every linear extension  $\,\pi \colon \cE \rightarrow [n]$
1704: gives rise to an ordered list of $n-1$ genotypes 
1705: $\,g^{(1)},g^{(2)}, \ldots,g^{(n-1)}\,$ in
1706: $\,\cG' = \cG \backslash \{\hat{0},\hat{1}\}$ as follows.
1707: The genotype $g^{(i)}$ is
1708: the subset of $\cE$ consisting of all
1709: events whose image under $\pi$ 
1710: is among the first $i$ positive integers. In symbols,
1711: $\, g^{(i)} \,= \, \pi^{-1}(\{1,2,\ldots,i\}) $.
1712: The sequence $g^{(1)}, g^{(2)}, \ldots, g^{(n-1)}$, derived from $\pi$,
1713: represents a mutational pathway in $\cG$.
1714: 
1715: We now fix one distinguished linear extension of $\cE$,
1716: that is, we identify the set underlying $\cE$ with $[n]$ itself.
1717: Then a linear extension is simply
1718: any permutation $\pi$ of $[n]$ which preserves the
1719: order relations in $\cE$.  We define
1720: \begin{equation}
1721: \label{FPi}
1722: {\bf f}(\pi) \quad = \quad 
1723: \prod_{i: \pi(i) < \pi(i+1)} ( f_{g^{(i)}} + 1) 
1724: \cdot
1725: \prod_{i: \pi(i) > \pi(i+1)}  f_{g^{(i)}}  ,
1726: \end{equation}
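where $i$ runs over $\{1,2,\ldots,n-1\}$. Here the linear extension is
written as a word that lists the events in the order in which they occur,
and $\pi(i)$ denotes the event in position $i$ of this word, so that
$g^{(i)} = \{\pi(1),\ldots,\pi(i)\}$ (in terms of the bijection
$\cE \rightarrow [n]$ above, this event is the preimage of $i$).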
1728: Our algorithm amounts to evaluating
1729: the  risk polynomial by means of the
1730: following explicit summation formula.
1731: 
1732: \begin{thm}   \label{thm:linearExtensions}
1733: The risk polynomial  $\RP(\cG;{\bf f}) $ 
1734: equals the sum of the products ${\bf f}(\pi)$
1735: where $\pi$ runs over all linear extensions of 
1736: the event poset $\cE$.
1737: \end{thm}
1738: 
1739: \begin{proof}
1740: The relationship between chains in $\cG$ and
1741: linear extensions of $\cE$ is the content of
1742: \cite[Prop.~3.5.2]{Stanley1999}.
1743: The distributive lattice $\cG$ has a canonical
1744: {\em R-labeling} \cite[Sec.~3.13]{Stanley1999}
1745: which assigns to each edge of the Hasse diagram of
1746: $\cG$ the corresponding element of $\cE$.
1747: In view of this R-labeling, Exercise~59d in \cite[Chap.~3]{Stanley1999} 
1748: tells us that the poset $\,\cG'  = \cG \backslash \{\hat{0},\hat{1}\}\,$ is
1749: {\em chain-partitionable}.
1750: Each product  ${\bf f}(\pi)$ as in (\ref{FPi})
1751:  is the generating function for
1752:  all the chains in precisely one part of that chain
1753: partition of $\cG'$. Adding up all products
1754: gives the generating function for all chains,
1755: which is the risk polynomial.
1756: \end{proof}
1757: 
1758: \begin{ex} \rm
1759: The event poset $\cE$ in Figure~\ref{fig:ex1} has five linear extensions $\pi$:
1760: \begin{eqnarray*}
1761: \pi \quad \,\,\,& {\bf f}(\pi) \\
1762: (1, 2, 3, 4) &  (1+f_{1000})(1+f_{1100})(1+f_{1110})  \\
1763: (1, 2, 4, 3) & (1 + f_{1000}) (1+f_{1100}) f_{1101}         \\
1764: (2, 1, 3, 4) & f_{0100}(1+f_{1100})(1+f_{1110})          \\
1765: (2, 1, 4, 3) & f_{0100}(1+f_{1100}) f_{1101}                 \\
1766: (2, 4, 1, 3) & (1+f_{0100}) f_{0101} (1+f_{1101})
1767: \end{eqnarray*}
1768: The sum of these five products equals the risk polynomial  $\RP(\cG;{\bf f}) $.
1769: \end{ex}
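
The chain partition used in the proof of
Theorem~\ref{thm:linearExtensions} can be made explicit for the last row
of the table above. Expanding that product gives
\[
(1+f_{0100}) \, f_{0101} \, (1+f_{1101}) \,\, = \,\,
f_{0101} + f_{0100} f_{0101} + f_{0101} f_{1101}
+ f_{0100} f_{0101} f_{1101}.
\]
In $\cG'$, the genotype $0101$ is comparable only to $0100$ and $1101$,
so these four monomials are precisely the chains of $\cG'$ that
contain $0101$. None of them arises from the other four products, since
the unknown $f_{0101}$ does not occur in any of them.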
1770: 
1771: 
1772: \subsection*{Implementation}
1773: 
1774: Pruesse and Ruskey \cite{pruesse1994} showed that
1775: the linear extensions of a poset $\cE$ can be computed in time linear in 
1776: the number of linear extensions.
1777: Thus, their algorithm computes $\RP(\cG;{\bf f}) $ in
1778: time linear in the size of the output of
1779: Theorem~\ref{thm:linearExtensions}.  That output is in
1780: factored form (\ref{FPi}) and is always more compact than the
1781: expanded risk polynomial.  In this manner, we compute the risk
1782: polynomial in time sublinear in the size of the expanded risk
1783: polynomial.  
1784: 
1785: To obtain the univariate risk polynomial, we take the sum of the terms
1786: $\,(1+a)^{n-1-\delta} a^\delta$, where $\delta = \delta(\pi)$
1787: is the number of descents of the linear extension $\pi$.
1788: Similarly, the graded risk polynomial $\RP(\cG; a_1,\ldots,a_{n-1})$ is found by
1789: keeping track of the descent set of each linear extension $\pi$.
We believe that this method is best possible for general posets
$\cE$. Notice that the leading coefficient of
the univariate risk polynomial is the number of linear extensions of
$\cE$, and it is \#P-complete to count linear extensions \cite{brightwell}.  
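
As a worked check (our own computation, reading each linear extension as
the word of events in order of occurrence), the five linear extensions
$(1,2,3,4)$, $(1,2,4,3)$, $(2,1,3,4)$, $(2,1,4,3)$ and $(2,4,1,3)$ of the
event poset in Figure~\ref{fig:ex1} have $\delta = 0,1,1,2,1$ descents,
respectively. With $n = 4$ this gives
\[
\RP(\cG;a) \,\, = \,\, (1+a)^3 \,+\, 3\,a(1+a)^2 \,+\, a^2(1+a)
\,\, = \,\, 1 + 6a + 10a^2 + 5a^3 .
\]
The coefficients $1, 6, 10, 5$ count the chains in $\cG'$ with
$0,1,2,3$ elements, and the coefficient of $a^3$ is $5$, the number of
linear extensions.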
1794: 
1795: When $\cE$ is a directed forest, the
1796: recursive structure can be used to help compute the risk polynomial.
1797: In this case, $\cE$ is built up by the operations of disjoint union
1798: and ordinal sum from the one element poset.  For example, in the univariate case,
1799: the zeta polynomial \cite[Sec.~3.11]{Stanley1999} of $\cG$ behaves nicely under these operations and
1800: can be used to write down the risk polynomial. Based on these
1801: considerations, we can design an efficient algorithm for
1802: computing the univariate risk polynomial of a directed forest.
1803: 
1804: Using the method of Theorem~\ref{thm:linearExtensions}, we have developed software
1805: for computing risk polynomials.
1806: The input to our program is an arbitrary event poset $\cE$,
1807: and the output is 
1808: the risk polynomial, the graded risk polynomial
1809: or the univariate risk polynomial.  Optionally, the user can also
1810: input either exact fitness values or upper and lower bounds for each fitness
1811: value.  The output in this case is either the exact risk of escape 
1812: or upper and lower bounds for the risk.
The program is designed to integrate with the package 
1814: \texttt{Mtreemix} \cite{Beerenwinkel2005b},
1815: allowing the user to start with data, infer a mutagenetic 
1816: tree, and then easily compute the risk 
1817: polynomial.
1818: Our software is available at
1819: \[ 
1820:    \url{http://bio.math.berkeley.edu/riskpoly/}
1821: \] 
We use the algorithm of \cite{Varol1981} for computing linear
extensions.  Although this algorithm is not asymptotically optimal, as
shown in \cite{pruesse1994}, it
is simple to implement and efficient in practice.
1826: 
1827: 
1828: 
1829: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1830: 
1831: 
1832: 
1833: 
1834: \end{document}
1835: