
1: \documentclass[matbio]{svjour}

2: \usepackage{t1enc}

3: \usepackage{amssymb,amsmath,graphicx}

4: \usepackage[latin1]{inputenc}

5: \usepackage[english]{babel}

6: \markright{}

7: \setlength{\parskip}{12pt}

8: \setlength{\parindent}{.2in}

9: \newcommand{\com}[2]{\ensuremath{\left [ #1 , #2 \right]}}

10: \newcommand{\anticom}[2]{\ensuremath{\left \{ #1 , #2 \right\}}}

11: \newcommand{\bra}[1]{\ensuremath{\left\langle {#1} \right|}}

12: \newcommand{\ket}[1]{\ensuremath{\left| {#1} \right\rangle}}

13: \newcommand{\inner}[2]{\ensuremath{\left\langle {#1}| {#2}\right\rangle}}

14: \newcommand{\fra}[2]{\textstyle{\frac{#1}{#2}}}

15: \newcommand{\beqn}{\begin{eqnarray}\begin{aligned}}

16: \newcommand{\eqn}{\end{aligned}\end{eqnarray}}

17:

18:

19:

20:

21: \begin{document}

22:

23:

24: \title{Entanglement Invariants and Phylogenetic Branching}

25: \author{J G Sumner$^*$\thanks{$^*$ Australian Postgraduate Award} \and P D Jarvis$^\dagger$\thanks{$^\dagger$ Alexander von Humboldt Fellow}}

26: \institute{School of Mathematics and Physics, University of Tasmania GPO Box 252-21, Hobart Tas 7001, Australia. \email{jsumner@utas.edu.au}}\keywords{phylogenetics--entanglement--invariants--Markov}

27: \maketitle

28: \abstract{It is possible to consider stochastic models of sequence evolution in phylogenetics in the context of a dynamical tensor description inspired by physics. Approaching the problem in this framework allows the well-developed methods of mathematical physics to be exploited in the biological arena. We present the tensor description of the homogeneous continuous time Markov chain model of phylogenetics with branching events generated by dynamical operations. Standard results from phylogenetics are shown to be derivable from the tensor framework. We summarize a powerful approach to entanglement measures in quantum physics and present its relevance to phylogenetic analysis. Entanglement measures are found to give distance measures that are equivalent to, and expand upon, those already known in phylogenetics. In particular we make the connection between the group invariant functions of phylogenetic data and phylogenetic distance functions. We introduce a new distance measure valid for three taxa based on the group invariant function known in physics as the ``tangle''. All work is presented for the homogeneous continuous time Markov chain model with arbitrary rate matrices.}

29:

30:

31:

32:

33:

34:

35:

36:

37: \section{Introduction}

38:

39: Stochastic methods which model character distributions in aligned gene sequences are part of the standard armoury of phylogenetic analysis \cite{stee,fels,fels2,rodr,nei}. The evolutionary relationships are usually represented as a bifurcating tree directed in time. It is remarkable that there is a strong conceptual and mathematical analogy between the construction of phylogenetic trees using stochastic methods, and the process of scattering in particle physics \cite{jarv}. It is the purpose of the present work to show that there is much potential in taking an algebraic, group theoretical approach to the problem where the inherent symmetries of the system can be fully appreciated and utilized.\\\indent

40: Entanglement is of considerable interest in physics and there has been much effort to elucidate the nature of this most curious of physical phenomena \cite{wern,lind,bern,dur,guhn}. Entanglement has its origin in the manner in which the state probabilities of a quantum mechanical system must be constructed from the individual state probabilities of its various subsystems. Whenever there are global conserved quantities, such as spin, it is the case that there exist entangled states where the choice of measurement of one subsystem can affect the measurement outcome of another subsystem no matter how spatially separated the two subsystems are. This curious physical property is represented mathematically by nonseparable tensor states. Remarkably, if the pattern frequencies of phylogenetic analysis are interpreted in a tensor framework it is possible to show that the branching process itself introduces entanglement into the state. This is a mathematical curiosity that can be studied using methods from quantum physics. This is a novel way of approaching phylogenetic analysis which has not been explored before. \\\indent

41: In section \ref{tensor} we begin by considering the stochastic model of sequence evolution in phylogenetics using continuous time Markov chains (CTMCs)\footnote{The reader should note that the model considered in this paper is the general Markov model on a tree together with the additional assumption that the transition matrices are non-singular and arise as an analytical continuation from the identity matrix. For a recent review of the hierarchy of phylogenetic models see \cite{erik}.}. We go on to present this model in a dynamical tensor description where the probability distribution is given by the components of a tensor in a preferred basis and the time evolution is generated by linear operators acting on the space. The phylogenetic branching process is then developed formally in section \ref{branching} by introducing a linear operator which introduces an extra product in the tensor space. This operator is shown to be unique given that the probability distribution must be conditionally independent from branch to branch. We also show that the branching process introduces entanglement into the state space. The stationary states of the system and the pulley principle, which describes the unrootedness of phylogenetic trees, are presented in the tensor framework in sections \ref{stationarysection} and \ref{pulley} respectively. Section \ref{orbitssection} is a short review of current methods of analysing entanglement in terms of group orbits and invariant functions.  In sections \ref{2characters} and \ref{canonical} the work specializes to the cases of two phylogenetic characters and small numbers of taxa making up the analysis. In section \ref{concurrencesection} the group orbits and invariant functions for the case of two taxa are presented and explicitly solved to show that the invariant function is the well known $\log\det$ distance. 
In section \ref{tangle} we go on to study the case of three taxa where the invariant function known in physics as the tangle is shown to give a new distance measure for phylogenetics. This previously unstudied distance measure is found to be a useful analytical tool in the reconstruction of phylogenetic trees (section \ref{distances}).

42:

43:

44:

45: \section{Tensor methods in phylogenetic branching}\label{tensor}

46: We consider a system consisting of $N$ sites each of which takes on one of $K$ distinct characters. Associated with such a system we have the set of frequencies

47: \beqn

48: \widehat{p}_i:&=\frac{\text{total number of occurrences of character }i}{N},\\i&=0,1,...,K-1.\nonumber

49: \eqn

50: We model these frequencies by defining a set of probabilities which are the theoretical limit

51: \beqn

52: p_i=\lim_{N\rightarrow\infty}\widehat{p}_i.\nonumber

53: \eqn

54:  Introducing the $K$ dimensional vector space $V$ with preferred basis \{$e_i$\}, we can associate the set of probabilities with a unique vector

55: \beqn

56: p=p_0e_0+p_1e_1+...+p_{K-1}e_{K-1}.\nonumber

57: \eqn

58: The probabilities are assumed to evolve in time as a homogeneous CTMC \cite{hagg,rind}. This amounts to assuming that the character

59: state at any time $t$, conditional on the character

60:  state at any time $t' < t$, is independent of the

61:  character state at any earlier time $t'' < t'$

62: \cite{fels,fels2,rodr}. The defining relation for the time evolution is

63: \beqn

64: \frac{d}{dt}p(t)=R\cdot p(t),\label{rate}

65: \eqn

66: where $R$ is a linear operator. To preserve reality of the probabilities and the property $\sum_i p_i(t)=1,\forall t$ it follows that in the preferred basis $R$ is a real valued zero column sum matrix. A formal solution of (\ref{rate}) for time independent $R$ is found by exponentiating

67: \beqn\label{exp}

68: p(t)=&e^{Rt}p(0)\\

69: :=&M(t)p(0).

70: \eqn

71: We refer to $M(t)$ as a Markov operator. Taking its derivative

72: \beqn

73: \frac{d}{dt}M(t)=RM(t),\nonumber

74: \eqn

75: we observe that

76: \beqn

77: \frac{d}{dt}M(t)|_{t=0}=R.\nonumber

78: \eqn

79: As is well known, in order to conserve positivity of the probabilities it must also be the case that

80: \beqn

81: R_{ij}&\geq0,\quad\forall i\neq j;\nonumber\\

82: R_{ii}&\leq0.

83: \eqn

84: \\\indent

85: In phylogenetics we consider the case where we have multiple, aligned, $N$ site, $K$ character systems labelled by $\{1,2,...,L\}$. We refer to the individual systems as taxa. What is now of interest is the set of frequencies

86: \beqn

87: \widehat{p}_{i_1i_2...i_L}:=&\frac{\text{total number of occurrences of pattern }i_1i_2...i_L}{N},\nonumber\\

88: i_1,i_2,...,i_L=&0,...,K-1.\nonumber

89: \eqn

90: We model these frequencies by again defining a set of probabilities which are the theoretical limit

91: \beqn

92: p_{i_1i_2...i_L}:=\lim_{N\rightarrow\infty}\widehat{p}_{i_1i_2...i_L}.\nonumber

93: \eqn

94: The system is assumed to have evolved in time as a homogeneous CTMC.\\\indent

95: Introducing the random variables $x_1,x_2,...,x_L$ each of which takes on values in the individual character spaces $\{i_1,i_2,...,i_L=0,...,K-1\}$ and $x=(x_1,x_2,...,x_L)$ which takes on values in the $K^L$ dimensional character space $\{i_1i_2...i_L\}$ we can write the transition probabilities of the Markov chain as

96: \beqn

97: \mathbb{P}&(x\!=i_1i_2...i_L,t|x\!=\!j_1j_2...j_L,0)\nonumber\\&=\mathbb{P}(x_1\!=\!i_1,t|x_1\!=\!j_1,0)\mathbb{P}(x_2\!=\!i_2,t|x_2\!=\!j_2,0)...\mathbb{P}(x_L\!=\!i_L,t|x_L\!=\!j_L,0)\nonumber\\

98: :\!&=M^1_{i_1j_1}(t)M^2_{i_2j_2}(t)...M^L_{i_Lj_L}(t),\nonumber\\

99: &=\sum_{k_1,k_2,...,k_L}M^1_{i_1k_1}(t)M^2_{i_2k_2}(t)...M^L_{i_Lk_L}(t)\delta^{k_1}_{j_1}\delta^{k_2}_{j_2}...\delta^{k_L}_{j_L}.

100: \eqn

101: From this we notice that it is possible to construct the state space setting of the tensor product $V\otimes V\otimes ...\otimes V=V^{\otimes L}$ where the probabilities are associated with the tensor

102: \beqn

103: P(t)=\sum_{i_1,...,i_L}p_{i_1i_2...i_L}(t)e_{i_1}\otimes e_{i_2}\otimes ...\otimes e_{i_L}.\nonumber

104: \eqn

105: Time evolution of this system is generated by the transition probabilities of the Markov chain which in tensor notation can be represented as linear operators acting on the initial pattern distribution

106: \beqn

107: P(t)=M^1(t)\otimes M^2(t)\otimes ...\otimes M^L(t)P(0),\nonumber

108: \eqn

109: where distinct rate parameters have been allowed on each component of the tensor product space. (The reader should note that $M^l$ refers to the $l$th component of the tensor product space and is \textit{not} meant to indicate the $l$th power of the operator $M$.)\\\indent
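As an illustrative sketch (the choice $L=2$ is made here only to keep the indices short), the operator expression above reads in components
\beqn
p_{i_1i_2}(t)=\sum_{j_1,j_2}M^1_{i_1j_1}(t)M^2_{i_2j_2}(t)\,p_{j_1j_2}(0),\nonumber
\eqn
so that each index of the tensor is transformed by the Markov operator of its own taxon.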

110: If we have a phylogenetic tensor $P(t)$ which describes the pattern distribution of characters for $L$ taxa, it is possible to find the reduced tensor $\overline{P}(t)$ which gives the pattern distribution for a subset of $l$ taxa taken from the original set of $L$ taxa. The correct operation is given by

111: \beqn\label{reduced}

112: p_{i_1i_2...i_L}(t)&\rightarrow \overline{p}_{i_1i_2...i_l}(t)=\sum_{i_s,i_t,...}p_{i_1i_2...i_L}(t),\\

113: \overline{P}(t)&=\sum_{\text{all i's}}\overline{p}_{i_1i_2...i_l}(t)e_{i_1}\otimes e_{i_2}\otimes ...\otimes e_{i_l},

114: \eqn

115: where $s,t,...,$ label the taxa which are not in the subset. Such a reduced tensor will be referred to as a \textit{marginal} distribution. (In this work we will follow the convention from here on that if a summation sign has no suffixes it is assumed that \textit{all} indices inside the expression are to be summed over.)
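For instance (an illustrative case with $L=3$ and $l=2$, chosen here only to make the index bookkeeping explicit), removing the second taxon gives
\beqn
\overline{p}_{i_1i_3}(t)=\sum_{i_2}p_{i_1i_2i_3}(t),\nonumber
\eqn
and the normalization is inherited from the full tensor, since $\sum_{i_1,i_3}\overline{p}_{i_1i_3}(t)=\sum_{i_1,i_2,i_3}p_{i_1i_2i_3}(t)=1$.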

116:

117: \section{Phylogenetic Branching}

118: \label{branching}

119:

120: Having developed the general tensor description of the homogeneous CTMC model of sequence evolution in phylogenetics, in this section we will now introduce a formalism for describing the branching events. We do this by defining a formal operation on the tensor space. \\\indent

121: Consider the case where we have a single taxon branching into $L=2$ taxa. The corresponding mathematical operation is $V\rightarrow V\otimes V$. If the branching event occurs at $t=\tau$ we are required to determine the appropriate pattern probabilities $p_{i_1i_2}(\tau)$ given the probabilities $p_{i}(\tau)$. (In this paper $\tau$ is considered to be an additional model parameter alongside the parameters in the rate matrices). Intuitively a reasonable choice is the initial set $p_{ii}(\tau)=p_i(\tau)$ and $p_{ij}(\tau)=0,\forall i\neq j$. \\\indent

122: In order to formalize this we introduce the \textit{splitting} operator $\delta:V\rightarrow V\otimes V$. The most general action of $\delta$ on the basis elements of $V$ can be expressed as

123: \beqn\label{splitting}

124: \delta\cdot e_i=\sum_{j,k}\Gamma _i^{jk}e_j\otimes e_k,

125: \eqn

126: where $\Gamma _i^{jk}$ are an arbitrary set of coefficients. Standard models of phylogenetics assume conditional independence upon the distinct branches of the tree \cite{steel,fels,fels2,nei}. This assumption will presently be used to determine the exact form of the splitting operator. It is only necessary to consider initial probabilities of the form \beqn p^{(\gamma)}_{i}(\tau)&=\delta_i^{\gamma},\nonumber\\\gamma &=0,1,...,K-1\eqn so that the initial single taxon state is

127: \beqn

128: p^{(\gamma)}(\tau)&=\sum_{i}p^{(\gamma)}_i(\tau)e_i,\nonumber\\&=\sum_{i}\delta_i^\gamma e_i.\nonumber\\

129: \eqn

130: Directly subsequent to the branching event the 2 taxa state is given by

131: \beqn

132: P^{(\gamma)}(\tau)&=\delta\cdot p^{(\gamma)}(\tau),\nonumber\\

133: &=\sum_{i,j,k}\delta^\gamma_i\Gamma_i^{jk}e_j\otimes e_k.

134: \eqn

135: We implement the conditional independence upon the branches by setting

136: \beqn\label{condindep}

137: \mathbb{P}(x\!=\!i_1i_2,&t\!=\!t'|x_1\!=\!x_2=\gamma,t\!=\!\tau)\\

138: =\mathbb{P}&(x_1\!=\!i_1,t\!=\!t'|x_1\!=\!\gamma,t\!=\!\tau)\mathbb{P}(x_2\!=\!i_2,t\!=\!t'|x_2\!=\!\gamma,t\!=\!\tau).

139: \eqn

140: Using the tensor formalism the transition probabilities can be expressed as

141: \beqn

142: &\mathbb{P}(x_1=i_1,t=t'|x_1=\gamma,t=\tau)=\sum_{k}M^1_{i_1k}(t'-\tau)\delta_k^{\gamma},\nonumber\\

143: &\mathbb{P}(x_2=i_2,t=t'|x_2=\gamma,t=\tau)=\sum_{l}M^2_{i_2l}(t'-\tau)\delta_l^{\gamma},\nonumber\\

144: &\mathbb{P}(x=i_1i_2,t=t'|x_1=x_2=\gamma,t=\tau)\nonumber\\

145: &\hspace{100pt}

146: =\sum_{k,l,m}M^1_{i_1k}(t'-\tau)M^2_{i_2l}(t'-\tau)\delta_m^\gamma\Gamma_{m}^{kl}.\nonumber

147: \eqn

148: Implementing (\ref{condindep}) leads to the requirement that

149: \beqn\label{splitcond}

150: \Gamma_\gamma^{kl}=\delta_k^\gamma\delta_l^\gamma

151: \eqn

152: and the basis dependent definition of the splitting operator

153: \beqn\label{delta}

154: \delta\cdot e_i=e_i\otimes e_i.

155: \eqn

156: The action on the components of a vector is such that

157: \beqn\label{comp}

158: p_{ij}(\tau)&=p_i(\tau),\quad i=j;\\

159: &=0,\quad i\neq j;\nonumber

160: \eqn

161: which is consistent with our intuitive guess. As will become apparent, the operation defined in (\ref{delta}) takes disentangled states into entangled states.\\\indent
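A short check for $K=2$ (an illustrative special case, not needed for the general argument) makes the last statement concrete. From (\ref{delta}),
\beqn
\delta\cdot p=p_0e_0\otimes e_0+p_1e_1\otimes e_1.\nonumber
\eqn
If this tensor were a product $u\otimes v$ with $u,v\in V$ (symbols introduced here for the check), its components would satisfy $u_0v_0=p_0$, $u_1v_1=p_1$ and $u_0v_1=u_1v_0=0$, so that
\beqn
p_0p_1=(u_0v_0)(u_1v_1)=(u_0v_1)(u_1v_0)=0.\nonumber
\eqn
Hence $\delta\cdot p$ can be separable only when one of $p_0,p_1$ vanishes.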

162: The splitting operator is an important structural element of the tensor description and its symmetry properties \cite{bashjarv} are intimately related to the existence of discrete transform methods for particular classes of phylogenetic model. More general forms of the conditions (\ref{splitcond}) can be envisaged under weaker assumptions than considered here. Finally, in the particle scattering picture for phylogenetic branching \cite{jarv} the splitting operator is implemented as an interaction term. For present purposes the utility of $\delta$ is that it allows us to write down a formal expression for a system which undergoes a branching event.  Suppose a system described by the tensor $P\in V^{\otimes L}$ undergoes a branching event on its $r$th branch. The new system is then described by

163: \beqn\label{coords}

164: P\rightarrow(1_1\otimes 1_2\otimes...\otimes 1_{r-1}\otimes\delta\otimes 1_{r+1}\otimes...\otimes 1_L)P\in V^{\otimes (L+1)},

165: \eqn

166: where $1_s$ is the identity operator on the $s$th component of the tensor product space.

167: We introduce the convention that the tensor space is labelled so that under the action (\ref{coords}) the probabilities are given by

168: \beqn

169: p_{i_1i_2...i_L}\rightarrow p_{i_1i_2...i_ri_{r+1}i_{r+2}...i_{L}i_{L+1}}=p_{i_1i_2...i_ri_{r+2}...i_Li_{L+1}}\delta_{i_ri_{r+1}}.\nonumber

170: \eqn

171: We introduce parameter sets labelled on the edges of the tree by $\{\epsilon_a,a=1,2,...\}$ and defined as $\epsilon_a=\{\alpha_a,\beta_a,...;t_a\}$ to distinguish entries in the rate matrices and branch lengths on different edges. Given the solution (\ref{exp}) it should be noted that there is an edge scaling symmetry \beqn

172: \{\alpha_a,\beta_a,...\}&\rightarrow\{\lambda\alpha_a,\lambda\beta_a,...\},\\t_a&\rightarrow \lambda^{-1}t_a\nonumber

173: \eqn

174: which leaves the model invariant. This symmetry is well known in the literature \cite{fels,fels2} and indicates that it is not possible to distinguish between a fast rate of evolution and a long time period of evolution. When the rate parameters $\{\alpha_a,\beta_a,...\}$ are identical on all edges of the tree, a ``molecular clock'' is said to be in operation. Under the circumstances of a molecular clock it is possible in principle to determine the time period. \\\indent
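The symmetry can be checked in one line. Writing the Markov operator on edge $a$ as $M_{\epsilon_a}=e^{R_at_a}$, as in (\ref{exp}), where $R_a$ denotes the rate matrix on edge $a$ (a label introduced here for illustration), and noting that $R_a$ is linear in its rate parameters, the rescaling sends $R_a\rightarrow\lambda R_a$ and hence
\beqn
e^{(\lambda R_a)(\lambda^{-1}t_a)}=e^{R_at_a}=M_{\epsilon_a},\nonumber
\eqn
for any $\lambda>0$.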

175: The edges of a tree are labelled using an away-from-the-root and left-to-right ordering convention. As an example the expression which defines the most general homogeneous CTMC on the tree $(1((23)4))$ is given by

176: \beqn

177: P&_{\tau_2\tau_3}(t)\nonumber\\

178: &=(M_{\epsilon_1}\!\otimes\!M_{\epsilon_5}\!\otimes\!M_{\epsilon_6}\!\otimes\!M_{\epsilon_4})1\!\otimes\!\delta\!\otimes\!1(1\!\otimes\!M_{\epsilon_3}\!\otimes\!1)1\!\otimes\!\delta(1\!\otimes\!M_{\epsilon_2})\delta\cdot p\nonumber

179: \eqn

180: where $p$ is the initial single taxon distribution, $\tau_2$,$\tau_3$ define the branching times and $t_1=t$, $t_4=t-\tau_2$, $t_5=t_6=t-\tau_2-\tau_3$, (see figure \ref{fourtaxa}).\\\indent

181:  In this work we will derive results which are independent of the particular rate parameters which occur in the Markov operators of the model.
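As a sketch of how the operator expression for the tree $(1((23)4))$ unpacks, one can apply the definition (\ref{delta}) of $\delta$ factor by factor, reading from right to left. Writing $(M_{\epsilon})_{ij}$ for the matrix elements of a Markov operator in the preferred basis (a notation introduced here), the pattern probabilities become
\beqn
p_{i_1i_2i_3i_4}=\sum_{k,l,m}(M_{\epsilon_1})_{i_1k}(M_{\epsilon_5})_{i_2m}(M_{\epsilon_6})_{i_3m}(M_{\epsilon_4})_{i_4l}(M_{\epsilon_3})_{ml}(M_{\epsilon_2})_{lk}\,p_k,\nonumber
\eqn
where $k$, $l$ and $m$ label the character states at the root and at the first and second internal branching points respectively.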

182:

183: \begin{figure}[h]

184: \centering

185: \resizebox{0.3\hsize}{!}{\rotatebox{270}{\includegraphics{fig2.eps}}}

186: \caption{The CTMC model of four taxa on the tree (1((23)4)).}

187: \label{fourtaxa}

188: \end{figure}

189:

190: \section{Stationary states}

191: \label{stationarysection}

192:

193: A stationary state of a homogeneous CTMC is defined as the vector, $\pi$, which satisfies

194: \beqn\label{stationary}

195: R\cdot\pi=0.

196: \eqn

197: The stationary state can be generalized to the case of a tensor, $\Pi\in V^{\otimes L}$, associated with a set of pattern probabilities which satisfies

198: \beqn

199: R_1\otimes R_2\otimes...\otimes R_L\cdot\Pi=0\nonumber

200: \eqn

201: and has solution

202: \beqn

203: \Pi=\pi_1\otimes\pi_2\otimes...\otimes\pi_L.\nonumber

204: \eqn

205: It should be noted that $\Pi$ is a completely separable state and that \textit{any} state tends to the stationary state as $t\rightarrow \infty$ \cite{hagg}.\\
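For orientation, consider a single taxon with the two-character rate matrix $R$ whose off-diagonal entries are $\alpha$ and $\beta$ (the parametrization used later in the text for $K=2$), and assume $\alpha+\beta>0$. In this illustrative case the stationary state can be written down directly,
\beqn
\pi=\fra{1}{\alpha+\beta}(\beta e_0+\alpha e_1),\qquad
R\cdot\pi=\fra{1}{\alpha+\beta}\left(\begin{array}{cc} -\alpha & \beta \\ \alpha & -\beta
\end{array}\right)\left(\begin{array}{c} \beta \\ \alpha
\end{array}\right)=0,\nonumber
\eqn
and a direct computation gives $p(t)=\pi+e^{-(\alpha+\beta)t}\left[p(0)-\pi\right]$, which exhibits the approach to the stationary state as $t\rightarrow\infty$.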

206:

207: \section{The pulley principle}\label{pulley}

208:

209: We consider the case of a single taxon which is in state $p\!\in\! V$ at time $t=0$. We implement a branching event at $t=0$ and let the system evolve under arbitrary Markov operators to produce the state at a later time $t$

210: \beqn\label{see}

211: P(t)=M_{\epsilon_1}\otimes M_{\epsilon_2}\delta\cdot p\quad\in\! V\otimes V,

212: \eqn

213: (see figure \ref{twotaxa}).

214:

215: \begin{figure}[h]

216: \centering

217: \resizebox{0.3\hsize}{!}{\rotatebox{0}{\includegraphics{fig3.eps}}}

218: \caption{The CTMC model of two taxa.}

219: \label{twotaxa}

220: \end{figure}

221:

222:  The dual vector space $V^*$ has a basis which consists of the linear maps \{$f^i:V\rightarrow F,i=0,...,K-1$\} which satisfy $f^i(e_j)=\delta^i_j$. Using this dual basis we can define an isomorphism $\phi: V\otimes V\rightarrow L(V)$ by $\phi(e_i\otimes e_j)=e_i f^j$ and rewrite (\ref{see}) as

223: \beqn

224: \phi(P(t))=M_{\epsilon_1}\phi(\delta\cdot p)M^{\intercal}_{\epsilon_2},\nonumber

225: \eqn

226: where ${\intercal}$ indicates matrix transposition.\\\indent
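For $K=2$ the objects in this rewriting can be displayed explicitly (an illustrative special case). Since $\delta\cdot p=\sum_kp_ke_k\otimes e_k$, we have
\beqn
\phi(\delta\cdot p)=\left(\begin{array}{cc} p_0 & 0 \\ 0 & p_1
\end{array}\right),\nonumber
\eqn
and the matrix elements of $\phi(P(t))$ are the pattern probabilities
\beqn
p_{ij}(t)=\sum_k(M_{\epsilon_1})_{ik}\,p_k\,(M_{\epsilon_2})_{jk},\nonumber
\eqn
where $(M_{\epsilon})_{ij}$ denotes matrix elements in the preferred basis (a notation introduced here).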

227: The pulley principle is then a direct consequence of the existence of states $p$ such that $

228: M\phi(\delta\cdot p)=\phi(\delta\cdot p)M^\intercal$

229: for a given Markov operator $M$. A solution can be found to be

230: \beqn

231: p&=\pi,\nonumber\\

232: \eqn

233: where $M\cdot\pi=\pi$ so that $\pi$ is a stationary state under $M$. Putting this together we can conclude that

234: \beqn

235: P_1(t):=\left[M_{\epsilon_1}\otimes M_{\epsilon_2}\right]\delta\cdot\pi_1=&\left[1\otimes M_{\epsilon_2}M_{\epsilon_1}\right]\delta\cdot\pi_1,\nonumber

236: \eqn

237: or

238: \beqn

239: P_2(t):=\left[M_{\epsilon_1}\otimes M_{\epsilon_2}\right]\delta\cdot\pi_2=&\left[M_{\epsilon_1}M_{\epsilon_2}\otimes 1\right]\delta\cdot\pi_2,\nonumber

240: \eqn

241: for the special case where $M_{\epsilon_1}\cdot\pi_1=\pi_1$ or $M_{\epsilon_2}\cdot\pi_2=\pi_2$, respectively. This tells us that in the case where the initial distribution is a stationary state of one of the rate matrices on edges 1 and/or 2 the placement of the root of the tree is not strictly determined. This property has been observed previously to hold when the Markov chain is reversible \cite{fels,fels2}. However in the tensor framework presented here we have refined the pulley principle by showing that one requires only that the initial distribution $\pi$ be a stationary distribution upon either branch of the tree for one to be able to ``pull'' that same branch through the initial distribution. This is a less stringent requirement than that of reversibility \cite{hagg}. \\\indent
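A quick consistency check is possible for $K=2$. Assume the two-character rate matrix $R$ with off-diagonal entries $\alpha,\beta$ used later in the text, with $\alpha+\beta>0$, and its stationary vector $\pi=\fra{1}{\alpha+\beta}(\beta e_0+\alpha e_1)$ (this explicit form is computed here for illustration). Then $\phi(\delta\cdot\pi)$ is the diagonal matrix with entries $\pi_0,\pi_1$ and one finds
\beqn
R\,\phi(\delta\cdot\pi)=\fra{\alpha\beta}{\alpha+\beta}\left(\begin{array}{cc} -1 & 1 \\ 1 & -1
\end{array}\right)=\phi(\delta\cdot\pi)R^{\intercal},\nonumber
\eqn
where the second equality holds because the matrix in the middle is symmetric. It follows that $R^n\phi(\delta\cdot\pi)=\phi(\delta\cdot\pi)(R^{\intercal})^n$ for every $n$, and summing the exponential series gives $M\phi(\delta\cdot\pi)=\phi(\delta\cdot\pi)M^{\intercal}$ for $M=e^{Rt}$. This check covers only the two-character case.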

242:

243: \section{Group invariants and orbit classes}

244: \label{orbitssection}

245:

246: In quantum physics there has been much interest in quantifying and/or classifying the phenomenon of entanglement between multiple non-local systems \cite{wern}. The correct description involves expressing the total state vector as belonging to the multi-linear space built from the tensor product of the individual state spaces. Entangled states exhibit non-local behaviour and correspond mathematically to the non-separable property of such state vectors.\\\indent

247: A systematic approach to the classification problem is to study the orbit classes of the tensor product space under a group action which is designed to preserve the essential non-local properties of entanglement. The orbit of an element $h$ belonging to the (multi)-linear space $H$ under the group action $G$ is defined as the set of elements $\{h'\in H:h'=gh\text{ for some }g\in G\}$.\\\indent

248: In quantum physics the appropriate group action is known to be the set of SLOCC operators, (Stochastic Local Operations with Classical Communication) \cite{lind,dur,guhn,nei2,miya}. Mathematically SLOCC operators correspond to the ability to transform the individual parts of the tensor product space $H\cong H_1\otimes H_2\otimes ...\otimes H_n$ with arbitrary invertible, linear operations. These operators are expressed by group elements of the form

249: \beqn

250: g=g_1\otimes g_2\otimes...\otimes g_n,\nonumber

251: \eqn

252: where $n$ is the number of individual spaces making up the tensor product, and $g_i\in GL(H_i)$.\\\indent

253:  The task is to identify the orbit classes of a given tensor product space under the general set of SLOCC operators. Powerful tools in this analysis are the methods of classical invariant theory. If $H$ is defined on a field $\mathbb{F}$, the set of invariant functions $I(G)$ is defined as

254: \beqn

255: I(G)=\{f:H\rightarrow\mathbb{F},\text{ s.t.

256: }f(gh)=[\det{g}]^kf(h),\text{ }k\in\{0,1,2,...\}, \text{ }\forall g\in

257: G,h\in H\}\nonumber.

258: \eqn

259: Clearly such invariants are relatively constant up to the determinant upon each orbit class of $H$. The set of invariants can, after some trivial definitions, be given the structure of a (graded) ring and it can be shown that there exists (under the action of the general linear group at least) a \textit{finite} set of elements which generate the full set on a given linear space. It can also be shown that the set of orbit classes of a given linear space can be completely classified given a full set of invariants on that space \cite{olver}. \\\indent

260: The motivation of the present work is the possibility that the study of orbit classes can be used to elucidate interesting results in phylogenetics.

261:

262:

263: \section{$K=2$ characters and qubits.}

264: \label{2characters}

265:

266: From here on we specialize to the case where the set of characters consists of $K=2$ elements $\{0,1\}$. When we are dealing with a single taxon the phylogenetic state $p$ mathematically corresponds to a vector belonging to $\mathbb{R}^2$. In quantum physics the corresponding two dimensional object is the ``qubit'', which belongs to the vector space $\mathbb{C}^2$, and if we take multiple qubits the correct state space is $\mathcal{H}=\mathbb{C}^2\otimes\mathbb{C}^2\otimes ...\otimes\mathbb{C}^2$. As we showed previously, in the phylogenetics of multiple taxa the corresponding state space is $H={\mathbb{R}^{2}\otimes\mathbb{R}^{2}\otimes ...\otimes\mathbb{R}^{2} }$. In the following work we will be implicitly taking advantage of the fact that $\mathbb{R}\subset\mathbb{C}$.

267:

268: \section{Canonical Forms}\label{canonical}

269:

270: We wish to construct the orbit classes of $\mathcal{H}=\mathbb{C}^2\otimes\mathbb{C}^2$ under the group action $GL(\mathbb{C}^2)\times GL(\mathbb{C}^2)$. We have seen that for any state $h\in \mathcal{H}$ we can find an isomorphic state $\phi(h)\in L(\mathbb{C}^2)$ which transforms under the group action as $\phi\rightarrow\phi'=g_1\phi g_2^{\intercal}$. Hence we can answer the orbit class problem by taking a canonical $2\times2$ matrix $X$ and considering the set of matrices $M:M=AXB;A,B\in GL(\mathbb{C}^2)$.

271:

272: \begin{theorem}

273: The vector space $V\otimes V$ where $V\equiv\mathbb{C}^2$ has three orbits under the group action $GL(V)\times GL(V)$. Under the isomorphism $V\otimes V\cong L(V)$ the orbits are characterized by the following canonical forms:

274: (i) Null-orbit $X=\left(\begin{array}{cc} 0 & 0 \\ 0 & 0\nonumber

275: \end{array}\right)

276: $;

277: (ii) Separable-orbit $Y=\left(\begin{array}{cc} 1 & 0 \\ 0 & 0\nonumber

278: \end{array}\right)

279: $;

280: (iii) Entangled-orbit $Z=\left(\begin{array}{cc} 1 & 0 \\ 0 & 1\nonumber

281: \end{array}\right)

282: $. The separable and entangled-orbits can be distinguished by the determinant function.

283: \end{theorem}

284:

285: \begin{proof}

286: (i) The null-orbit has only one member, the null vector; which is of course unchanged by the group action.

287: (ii) We are required to show that the set of $2\times 2$ matrices $\mathcal{M}=\{S:S=AYB;A,B\in GL(V)\}$ is the set of all non-zero matrices such that $\det(S)=0$. We begin by taking a general member of $\mathcal{M}$, $S=\left(\begin{array}{cc} a & b \\ c & d\nonumber

288: \end{array}\right)$ with $ad-bc=0$. Clearly the matrices

289: \beqn

290: S':&=\left(\begin{array}{cc} 0 & 1 \\ 1 & 0

291: \end{array}\right)S=\left(\begin{array}{cc} c & d \\ a & b

292: \end{array}\right),\nonumber

293: \quad

294: S'':&=S\left(\begin{array}{cc} 0 & 1 \\ 1 & 0

295: \end{array}\right)=\left(\begin{array}{cc} b & a \\ d & c

296: \end{array}\right)\nonumber,\quad\mbox{and}

297: \eqn

298: \beqn

299: S''':&=\left(\begin{array}{cc} 0 & 1 \\ 1 & 0

300: \end{array}\right)S\left(\begin{array}{cc} 0 & 1 \\ 1 & 0

301: \end{array}\right)=\left(\begin{array}{cc} d & c \\ b & a

302: \end{array}\right)\nonumber

303: \eqn

304: also belong to $\mathcal{M}$. So without loss of generality we can take $a\neq 0$ and it is an easy computation to show that

305: \beqn

306: S=\left(\begin{array}{cc} 1 & 0 \\ c/a & 1\nonumber

307: \end{array}\right)Y\left(\begin{array}{cc} a & b \\ 0 & 1\nonumber

308: \end{array}\right),

309: \eqn

310: so that $\mathcal{M}$ is the set of non-zero $2\times 2$ matrices with vanishing determinant.

311: (iii) Clearly any $2\times 2$ matrix $N$ with non-zero determinant can be written as $N=AZB$ where $A,B\in GL(\mathbb{C}^2)$.\qed

312: \end{proof}

313: \begin{corollary}

314: The orbits of $\mathcal{H}=\mathbb{C}^2\otimes \mathbb{C}^2$ under $SL(\mathbb{C}^2)\times SL(\mathbb{C}^2)$ are labelled by the determinant function $\det[\phi(h)].$\end{corollary}

315: For further discussion see \cite{bern,lind,dur}.
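As a concrete instance of step (ii) of the proof (the numerical entries are chosen here purely for illustration), take $a=2$, $b=4$, $c=1$, $d=2$, so that $ad-bc=0$. The factorization then reads
\beqn
\left(\begin{array}{cc} 1 & 0 \\ \fra{1}{2} & 1
\end{array}\right)\left(\begin{array}{cc} 1 & 0 \\ 0 & 0
\end{array}\right)\left(\begin{array}{cc} 2 & 4 \\ 0 & 1
\end{array}\right)=\left(\begin{array}{cc} 2 & 4 \\ 1 & 2
\end{array}\right),\nonumber
\eqn
and the two outer factors have determinants $1$ and $2$, so both are invertible and the matrix on the right lies on the separable orbit.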

316:

317: \section{The concurrence}

318: \label{concurrencesection}

319:

320: We consider the case of $L=2$ taxa derived from the branching of a single taxon at $t=0$ followed by arbitrary Markov evolution. The state is represented by a tensor in $H=\mathbb{R}^2\otimes \mathbb{R}^2$ and is expressed as

321: \beqn\label{L=2}

322: P(t)=M_{\epsilon_1}\otimes M_{\epsilon_2}\delta\cdot p,

323: \eqn

324: where $t_1=t_2=t$, (see figure \ref{twotaxa}).\\\indent

325: The most general rate matrix depends on 2 parameters and can be expressed as

326: \beqn

327: R=\left(\begin{array}{cc} -\alpha & \beta \\ \alpha & -\beta\nonumber

328: \end{array}\right)

329: \eqn

330: where $\alpha$ and $\beta$ are real. A simple free parameter count in expression (\ref{L=2}) yields, taking into account the scaling symmetry on edges, 1 free parameter due to each transition matrix and 1 free parameter due to the initial state $p$. Hence there are a total of 3 free parameters, and given that the $K^2=4$ components of $P(t)$ are probabilities which sum to one, leaving 3 independent components, we conclude that all free parameters are accounted for.\\\indent
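Because $R^2=-(\alpha+\beta)R$ for this rate matrix, the exponential series in (\ref{exp}) can be summed in closed form. As a sketch, assuming $\alpha+\beta>0$,
\beqn
M(t)=e^{Rt}=1+\fra{1-e^{-(\alpha+\beta)t}}{\alpha+\beta}R,\nonumber
\eqn
whose eigenvalues are $1$ and $e^{-(\alpha+\beta)t}$, so that $\det[M(t)]=e^{-(\alpha+\beta)t}$.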

331: In quantum physics the tensors representing 2 qubits correspond in phylogenetics to the case of $L=2$ taxa with $K=2$ characters. As we have shown in the previous section, there exist 2 nontrivial orbit classes which are completely distinguished by the relative invariant known as the concurrence, $\mathcal{C}:\mathcal{H}=\mathbb{C}^2\otimes\mathbb{C}^2\rightarrow \mathbb{C}$. Using the formalism we have developed we can express the concurrence of the state $h\in \mathcal{H}$ as

332: \beqn\label{concurrence}

333: \mathcal{C}(h)=\det[\phi(h)],

334: \eqn

335: which satisfies

336: \beqn

337: \mathcal{C}(h'):&=\mathcal{C}(g_1\otimes g_2h)\nonumber\\

338: &=\det[g_1\phi(h)g_2^\intercal]\nonumber\\

339: &=\det[g_1]\det[g_2]\mathcal{C}(h),\nonumber

340: \eqn

341: so the concurrence is truly a relative invariant.

342: This can also be expressed explicitly as

343: \beqn\label{tensorconc}

344: \mathcal{C}(h)=\fra{1}{2}\sum h_{ij}h_{kl}\epsilon_{ik}\epsilon_{jl},

345: \eqn

346: where $\epsilon$ is the completely anti-symmetric tensor with $\epsilon_{01}=1$.

347: The two orbit classes correspond to the completely entangled Bell state and the completely disentangled, and hence separable, state.  The entangled orbit is the set of states equivalent to the Bell state

348: \beqn

349: h_{bell}=\fra{1}{\sqrt{2}}(e_0\otimes e_0+e_1\otimes e_1),\nonumber

350: \eqn

351: whereas the disentangled orbit is the set of states which take on the separable form

352: \beqn

353: h=u\otimes v,\nonumber

354: \eqn

355: where $u,v\in \mathbb{C}^2$. The concurrence vanishes if and only if the state belongs to the separable orbit class. This property can be used to distinguish the orbit classes.\\\indent
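Both cases can be verified directly from (\ref{concurrence}). For the Bell state, $\phi(h_{bell})$ is $\fra{1}{\sqrt{2}}$ times the identity matrix, while for a product state $\phi(u\otimes v)$ has entries $u_iv_j$, so that
\beqn
\mathcal{C}(h_{bell})&=\det\left[\fra{1}{\sqrt{2}}\left(\begin{array}{cc} 1 & 0 \\ 0 & 1
\end{array}\right)\right]=\fra{1}{2},\nonumber\\
\mathcal{C}(u\otimes v)&=(u_0v_0)(u_1v_1)-(u_0v_1)(u_1v_0)=0.\nonumber
\eqn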

356: In phylogenetic analysis the concurrence can be used to establish the magnitude of divergence between a pair of taxa derived from a single branching event. The case where there is no phylogenetic relation cannot be distinguished from the case of infinite divergence.  When there has been infinite divergence we have

357: \beqn

358: \lim_{t\rightarrow\infty}P(t)=\Pi=\pi_{\epsilon_1}\otimes\pi_{\epsilon_2},\nonumber

359: \eqn

360: which is a separable state and hence has concurrence

361: \beqn

362: \mathcal{C}(\pi_{\epsilon_1}\otimes\pi_{\epsilon_2})=(\beta_1\beta_2)(\alpha_1\alpha_2)-(\beta_1\alpha_2)(\alpha_1\beta_2)=0.\nonumber

363: \eqn

364: The concurrence of the phylogenetic state (\ref{L=2}) is given by

365: \beqn

366: \mathcal{C}(P(t))&=\det[M_{\epsilon_1}]\det[M_{\epsilon_2}]\det[\phi(\delta\cdot p)],\nonumber

367: \eqn

368: which, using the operator identity $\det[e^X]=e^{trX}$, is easily computed to be

369: \beqn\label{concurrence1}

370: \mathcal{C}(P(t))&=\displaystyle{e^{tr[R_1t]}e^{tr[R_2t]}p_0p_1}\\

371: &=e^{-(\alpha_1+\beta_1+\alpha_2+\beta_2)t}p_0p_1.

372: \eqn
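The two ingredients of (\ref{concurrence1}) can be made explicit. The sketch below assumes that the $K=2$ rate matrix is written with columns summing to zero and that $\delta\cdot p=\sum_i p_ie_i\otimes e_i$. Only the trace of $R$ and the diagonal form of $\phi(\delta\cdot p)$ are actually used, so the result does not depend on whether the row or the column convention for $R$ is adopted.
\beqn
R&=\begin{pmatrix}-\alpha&\beta\\ \alpha&-\beta\end{pmatrix},\quad tr[R]=-(\alpha+\beta),\quad \det[e^{Rt}]=e^{-(\alpha+\beta)t},\nonumber\\
\det[\phi(\delta\cdot p)]&=\det\begin{pmatrix}p_0&0\\0&p_1\end{pmatrix}=p_0p_1.\nonumber
\eqn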

373: From this explicit form it can be seen that the concurrence decays exponentially with the total rate-weighted time separating the two taxa, and so provides a measure of phylogenetic divergence. From the concurrence we would like to construct a formal distance function. In the case of general $L$ taxa it is of course possible to construct a reduced tensor which represents the pattern distribution upon any pair of the taxa. This is achieved using the prescription defined in equation (\ref{reduced}). One can then go on to calculate the concurrence between any given pair of taxa taken from the set of $L$. We define a distance function, $d_{ij}$, between any pair of taxa taken from a set of $L$ as

374: \beqn\label{distance}

375: d_{ij}:&=-\log{\mathcal{C}\left(\sum P_{a_1a_2...a_L}e_{a_i}\otimes e_{a_j}\right)},i\neq j\\

376: d_{ii}:&=0.

377: \eqn
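For the two-taxon state (\ref{L=2}) this definition can be evaluated in closed form from (\ref{concurrence1}), which gives a useful reference point for the general case:
\beqn
d_{12}&=-\log\left[e^{-(\alpha_1+\beta_1+\alpha_2+\beta_2)t}p_0p_1\right]\nonumber\\
&=(\alpha_1+\beta_1+\alpha_2+\beta_2)t-\log{p_0p_1}.\nonumber
\eqn
The distance is therefore the sum of the rate-weighted edge lengths plus a term which depends only on the distribution at the root.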

378: From the definition of the concurrence (\ref{tensorconc}) it is trivial to show that $d_{ij}=d_{ji}$. At the time, $\tau$, of the branching event at which the pair of taxa under consideration were created the concurrence took on the value $p_0p_1$ where

379: \beqn\label{marginal}

380: p_\gamma=\sum_{\text{all $a$'s}} P_{a_1a_2...a_{i-1}\gamma a_{i+1}...a_{j-1}\gamma a_{j+1}...a_L}(\tau).

381: \eqn

382: Of course we have $0\leq p_0p_1\leq1$ and after this time the concurrence scales with the determinant $\det[M_i]\det[M_j]$ which is also strictly positive and less than unity. We can conclude that the concurrence between a pair of taxa is always strictly positive and less than unity and that the distance function, $d_{ij}$, is also strictly positive. The triangle inequality

383: \beqn

384: d_{ij}+d_{jk}\geq d_{ik}\nonumber

385: \eqn

386:  is equivalent to the statement that

387: \beqn

388: \mathcal{C}\left(\sum P_{a_1a_2...a_L}e_{a_i}\otimes e_{a_j}\right)\mathcal{C}&\left(\sum P_{a_1a_2...a_L}e_{a_j}\otimes e_{a_k}\right)&\nonumber\\&\leq \mathcal{C}\left(\sum P_{a_1a_2...a_L}e_{a_i}\otimes e_{a_k}\right),\nonumber

389: \eqn

390: which invoking (\ref{concurrence1}) can be expressed as

391: \beqn

392: e^{-(\alpha_i+\beta_i+\alpha_j+\beta_j)t}p^{(ij)}_0p^{(ij)}_1e^{-(\alpha_j+\beta_j+\alpha_k+\beta_k)t}&p^{(jk)}_0p^{(jk)}_1\nonumber\\&\leq e^{-(\alpha_i+\beta_i+\alpha_k+\beta_k)t}p^{(ik)}_0p^{(ik)}_1.\nonumber

393: \eqn

394: Here $p^{(ij)}$ is the single taxon marginal distribution existing at the node closest to the root which joins taxon $i$ to taxon $j$. These marginal distributions are calculated as in (\ref{marginal}).

395: Now depending on the branching structure of the tree we have $p^{(ij)}=p^{(jk)}$ or $p^{(jk)}=p^{(ik)}$ so that the distance function satisfies the triangle inequality.\\\indent

396: The distance function (\ref{distance}) is well known in phylogenetics as the $\log\det$ distance \cite{steel,fels}.

397:

398: \section{The tangle}\label{tangle}

399:

400: We consider the case of $L=3$ taxa derived from the branching of a single taxon at $t=0$ followed by arbitrary Markov evolution, an additional branching event on edge 1 or 2 at $t=\tau$ and then additional arbitrary Markov evolution. For the case when the second branching event occurs on edge 2 the tree is represented by (1(23)) and the state is represented by a tensor in $H=\mathbb{R}^2\otimes \mathbb{R}^2\otimes \mathbb{R}^2$ as

401: \beqn\label{L=3}

402: P_\tau(t)=[M_{\epsilon_1}\otimes M_{\epsilon_3}\otimes M_{\epsilon_4}]1\otimes\delta [1\otimes M_{\epsilon_2}]\delta\cdot p,

403: \eqn

404: where $t_1\!=t$, $t_2=\tau$, $t_3=t_4=t-\tau$, (see figure [\ref{threetaxa}])\footnote{The use of pattern frequencies for the case of three taxa has been studied in relation to the problem of tree reconstruction by Pearl and Tarsi \cite{pearl} and Chang \cite{chang}.}.

405:

406:

407: \begin{figure}[h]

408: \centering

409: \resizebox{0.3\hsize}{!}{\rotatebox{0}{\includegraphics{fig1.eps}}}

410: \caption{The CTMC model of three taxa on the tree (1(23)).}

411: \label{threetaxa}

412: \end{figure}

413:

414:

415: It is known that there are 6 orbit classes of $\mathbb{C}^2\otimes \mathbb{C}^2\otimes \mathbb{C}^2$ under $GL(\mathbb{C}^2)\times GL(\mathbb{C}^2)\times GL(\mathbb{C}^2)$ that can be distinguished by functions of the concurrence and another relative invariant known as the tangle \cite{dur,guhn}. We begin by defining a partial concurrence operation $\{\mathcal{C}_a:\mathbb{C}^2\otimes \mathbb{C}^2\otimes \mathbb{C}^2\rightarrow \mathbb{C}^2\otimes \mathbb{C}^2,a=1,2,3.\}$ as

416: \beqn\label{pconcurrence}

417: \mathcal{C}_1(h)=\sum h_{ijk}h_{lmn}\epsilon_{jm}\epsilon_{kn}e_i\otimes e_l,\\

418: \mathcal{C}_2(h)=\sum h_{ijk}h_{lmn}\epsilon_{il}\epsilon_{kn}e_j\otimes e_m,\\

419: \mathcal{C}_3(h)=\sum h_{ijk}h_{lmn}\epsilon_{il}\epsilon_{jm}e_k\otimes e_n.

420: \eqn

421: From these definitions it is easy to see that

422: \beqn

423: \mathcal{C}_1(h'):&=\mathcal{C}_1(g_1\otimes g_2 \otimes g_3 h)\nonumber\\

424: &=[\det(g_2)\det(g_3)]g_1\otimes g_1\mathcal{C}_1(h),

425: \eqn

426: with similar expressions for $\mathcal{C}_2$ and $\mathcal{C}_3$. The tangle, $\{\mathcal{T}:\mathbb{C}^2\otimes \mathbb{C}^2\otimes \mathbb{C}^2\rightarrow \mathbb{C}\}$, can be defined as

427: \beqn

428: \mathcal{T}=\mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_a,\nonumber

429: \eqn

430: where we will confirm shortly that $\mathcal{T}$ is independent of the choice of $a$. The tangle is a relative invariant satisfying

431: \beqn

432: \mathcal{T}(h')=[\det(g_1)\det(g_2)\det(g_3)]^2\mathcal{T}(h),

433: \eqn

434: and, in analogy to (\ref{tensorconc}), can also be written in the form

435: \beqn

436: \mathcal{T}(h)=\sum h_{a_1a_2a_3}h_{b_1b_2b_3}h_{c_1c_2c_3}h_{d_1d_2d_3}\epsilon_{a_1b_1}\epsilon_{a_2b_2}\epsilon_{c_1d_1}\epsilon_{c_2d_2}\epsilon_{a_3c_3}\epsilon_{b_3d_3}.

437: \eqn
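The determinant factors in the transformation rules for the partial concurrence and the tangle all originate from one identity for $2\times 2$ matrices. It is sketched here under the assumption that the group acts on components as $h'_{ijk}=\sum (g_1)_{ii'}(g_2)_{jj'}(g_3)_{kk'}h_{i'j'k'}$:
\beqn
\sum_{j,m}g_{jj'}g_{mm'}\epsilon_{jm}=\det(g)\,\epsilon_{j'm'}.\nonumber
\eqn
Each contraction of a pair of tensor slots with $\epsilon$ therefore contributes one factor of the determinant of the group element acting on that slot. In $\mathcal{C}_1$ the second and third slots are each contracted once, giving $\det(g_2)\det(g_3)$, whereas in the tangle each of the three slots is contracted twice, giving $[\det(g_1)\det(g_2)\det(g_3)]^2$.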

438: \\\indent

439: The 6 orbit classes are described by the completely dis-entangled states

440: \beqn

441: h&=u\otimes v\otimes w,\quad u,v,w\in \mathbb{C}^2;\nonumber

442: \eqn

443: the partially entangled states

444: \beqn

445: h_a:\quad a=1,2,3,\nonumber

446: \eqn

447:  which form 3 orbit classes characterized by the separability of the canonical tensors

448: \beqn

449: \text{(1) }h_{ijk}=u_iv_{jk},\\\nonumber

450: \text{(2) }h_{ijk}=u_{ik}v_j,\\\nonumber

451: \text{(3) }h_{ijk}=u_{ij}v_k;\nonumber

452: \eqn

453: the completely entangled states equivalent to the $GHZ$ state

454: \beqn

455: h_{ghz}=\fra{1}{\sqrt{2}}(e_0\otimes e_0\otimes e_0+e_1\otimes e_1\otimes e_1);\nonumber

456: \eqn

457: and the completely entangled states equivalent to the $W$ state

458: \beqn

459: h_w=\fra{1}{\sqrt{3}}(e_0\otimes e_0\otimes e_1+e_0\otimes e_1\otimes e_0+e_1\otimes e_0\otimes e_0).\nonumber

460: \eqn

461: The tangle and the concurrence and its partial counterparts can be used to fully distinguish these orbit classes.  For the completely dis-entangled tensors we have

462: \beqn

463: \mathcal{C}_a(h)=0,\quad\forall a;\nonumber

464: \eqn

465: whereas for the partially entangled states we have

466: \beqn

467: \mathcal{C}_a(h_{a'})=0,\quad\text{ iff }\delta_a^{a'}=0.\nonumber

468: \eqn

469: The $GHZ$ and $W$ orbits are distinguished by calculating the tangle

470: \beqn

471: \mathcal{T}(h_{ghz})&\neq 0,\quad\mathcal{T}(h_w)&=0.\nonumber

472: \eqn
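These two values can be checked by hand using $\mathcal{T}=\mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_3$. The computation below is a sketch which evaluates the outer concurrence with the sum form (\ref{tensorconc}); using the determinant form (\ref{concurrence}) instead changes the nonzero value by a factor of 2 but not the conclusion. For the $GHZ$ state only $h_{000}=h_{111}=\fra{1}{\sqrt{2}}$ are nonzero, and for the $W$ state only $h_{001}=h_{010}=h_{100}=\fra{1}{\sqrt{3}}$, so that
\beqn
\mathcal{C}_3(h_{ghz})&=\fra{1}{2}(e_0\otimes e_1+e_1\otimes e_0),\quad &\mathcal{T}(h_{ghz})&=2\left[0-\fra{1}{4}\right]=-\fra{1}{2},\nonumber\\
\mathcal{C}_3(h_w)&=-\fra{2}{3}\,e_0\otimes e_0,\quad &\mathcal{T}(h_w)&=2\left[0-0\right]=0.\nonumber
\eqn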

473: From these properties we will now show that the tangle is indeed independent of the choice of the partial concurrence. Begin by introducing an action of the symmetric group, $S_3$, on the tensor product space defined by

474: \beqn

475: \sigma\in S_3:V_1\otimes V_2\otimes V_3\rightarrow V_{\sigma 1}\otimes V_{\sigma 2}\otimes V_{\sigma 3}.\nonumber

476: \eqn

477: Now since the tangle vanishes everywhere except on the $GHZ$ orbit and $\sigma h_{ghz}=h_{ghz}$ we need only consider the value of the tangle on elements which lie on the $GHZ$ orbit. We take as our element $x=g_1\otimes g_2\otimes g_3h_{ghz}$ and proceed. From its definition the tangle satisfies

478: \beqn

479: \mathcal{T}(\sigma h)=\mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_a(\sigma h)&=\mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_{\sigma a}(h),\nonumber\\

480: \eqn

481: $\forall h\in \mathcal{H}$. For our element $x$ we have

482: \beqn

483: \mathcal{T}(\sigma x)&=\mathcal{T}(g_{\sigma 1}\otimes g_{\sigma 2}\otimes g_{\sigma 3}h_{ghz}),\nonumber\\

484: &=[\det(g_1)\det(g_2)\det(g_3)]^2\mathcal{T}(h_{ghz}),\nonumber\\

485: &=\mathcal{T}(x).

486: \eqn

487: This shows that

488: \beqn

489: \mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_a&=\mathcal{C}\raisebox{.3ex}{\scriptsize o} \mathcal{C}_{\sigma a},\nonumber\\

490: \forall \sigma&\in S_3.

491: \eqn

492: \\\indent

493: We now determine which orbit the $L=3$ phylogenetic state (\ref{L=3}) lies in. The easiest way to do this is to calculate the various invariants at the time of the branching event. To this end we use (\ref{comp}) so that components of the state are

494: \beqn\label{bulldust}

495: p_{i_1i_2i_3}(\tau)=p_{i_1i_2}(\tau)\delta_{i_2i_3}.

496: \eqn

497: From this expression we might expect that the entanglement of the state given by the tangle just after branching can be expressed as a function of the entanglement in the state given by the concurrence just before branching. This is indeed the case. By direct computation using (\ref{bulldust}) it can be shown that at the time of branching the tangle is given by

498: \beqn

499: \mathcal{T}(P_{\tau}(\tau))&=\mathcal{C}\raisebox{.3ex}{\scriptsize o}\mathcal{C}_3(P_{\tau}(\tau))\nonumber\\

500: &=-2[\mathcal{C}(M_{\epsilon_1'}\otimes M_{\epsilon_2}\delta\cdot p)]^2,\nonumber\\

501: \eqn

502: where $\epsilon_1'=\{\alpha_1,\beta_1;t_2\}$.

503: The tangle has the value

504: \beqn

505: \mathcal{T}(P_{\tau}(\tau))=-2e^{-2(\alpha_1+\beta_1+\alpha_2+\beta_2)\tau}[p_0p_1]^2.

506: \eqn
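The direct computation can be sketched as follows. Write $q_{ij}=p_{ij}(\tau)$ for the two-taxon components just before branching and $D=\det[q]$ (both symbols are introduced here only for this check), and evaluate the outer concurrence with the sum form (\ref{tensorconc}). Substituting (\ref{bulldust}) into (\ref{pconcurrence}) gives
\beqn
\mathcal{C}_3(P_\tau(\tau))&=\sum q_{ik}q_{ln}\epsilon_{il}\epsilon_{kn}\,e_k\otimes e_n=D\,(e_0\otimes e_1+e_1\otimes e_0),\nonumber\\
\mathcal{T}(P_\tau(\tau))&=2\left[0-D^2\right]=-2D^2,\nonumber\\
D&=\det[M_{\epsilon_1'}]\det[M_{\epsilon_2}]\,p_0p_1=e^{-(\alpha_1+\beta_1+\alpha_2+\beta_2)\tau}p_0p_1.\nonumber
\eqn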

507:  Subsequent to this the tangle takes on the value

508: \beqn

509: \mathcal{T}(P_{\tau}(t))&=(\det[M_{\epsilon_1''}]\det[M_{\epsilon_3}]\det[M_{\epsilon_4}])^2\mathcal{T}(P_{\tau}(\tau))\nonumber\\

510: &=-2e^{-2(\alpha_1+\beta_1+\alpha_3+\beta_3+\alpha_4+\beta_4)(t-\tau)}e^{-2(\alpha_1+\beta_1+\alpha_2+\beta_2)\tau}(p_0p_1)^2\nonumber,

511: \eqn

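512: where $\epsilon_1''=\{\alpha_1,\beta_1;t-t_2\}$, so that, provided $p_0p_1\neq 0$, the tangle is nonzero and the phylogenetic state belongs to the $GHZ$ orbit for all finite $t$. This equivalence has in fact been observed in a different context in \cite{lake}.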

513: It should be noted that as $t\rightarrow\infty$ the tangle tends to zero and the state becomes the disentangled, stationary state which corresponds phylogenetically to the case where the taxa are unrelated. This is of course what we would expect if the taxa have diverged so much that there is no longer a possibility of establishing any relation between the taxa.\\\indent

514:

515: \section{The tangle and distance functions}\label{distances}

516:

517: The tangle gives us a new tool for calculating the phylogenetic distance between a set of three taxa. As was the case with the concurrence it is possible to calculate the value of the tangle for any subset of three taxa taken from a set of $L$ taxa. We use the tangle to define a three taxa phylogenetic distance  given by

518: \beqn

519: d_{ijk}:=\fra{1}{2}\log{2}-\fra{1}{2}\log{\left[-\mathcal{T}\left(\sum P_{a_1a_2...a_L}e_{a_i}\otimes e_{a_j}\otimes e_{a_k}\right)\right]}.

520: \eqn

521: For the case under consideration this three taxa distance takes on the value

522: \beqn

523: d_{123}=(\alpha_1+\beta_1)&t+(\alpha_2+\beta_2)\tau\nonumber\\&+(\alpha_3+\beta_3+\alpha_4+\beta_4)(t-\tau)-\log{p_0p_1}.\nonumber

524: \eqn
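This value follows from the expression for $\mathcal{T}(P_\tau(t))$ given in section \ref{tangle}. The intermediate step is
\beqn
-\fra{1}{2}\log\left[-\mathcal{T}(P_\tau(t))\right]=-\fra{1}{2}\log{2}&+(\alpha_1+\beta_1+\alpha_3+\beta_3+\alpha_4+\beta_4)(t-\tau)\nonumber\\&+(\alpha_1+\beta_1+\alpha_2+\beta_2)\tau-\log{p_0p_1},\nonumber
\eqn
and the two contributions from edge 1 combine as $(\alpha_1+\beta_1)(t-\tau)+(\alpha_1+\beta_1)\tau=(\alpha_1+\beta_1)t$. The constant $\fra{1}{2}\log{2}$ in the definition of $d_{ijk}$ simply cancels the factor of 2 in the tangle.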

525: \\\indent

526: At this point it is illuminating to compare the values of all the partial distance functions $d_{12},d_{23}$ and $d_{13}$ with the value of the tangle for the case of the branching structure $(1(23))$. The distance functions take on the values

527: \beqn

528: d_{12}&=(\alpha_1+\beta_1)t+(\alpha_2+\beta_2)\tau +(\alpha_3+\beta_3)(t-\tau)-\log{p_0p_1},\nonumber\\

529: d_{23}&=(\alpha_3+\beta_3+\alpha_4+\beta_4)(t-\tau)-\log{p_0'p_1'},\nonumber\\

530: d_{13}&=(\alpha_1+\beta_1)t+(\alpha_2+\beta_2)\tau +(\alpha_4+\beta_4)(t-\tau)-\log{p_0p_1},

531: \eqn

532: where $p'=M_{\epsilon_2}p$.

533: We define weights on the edges $1,2,3,4$ of the tree to be

534: \beqn

535: x&=(\alpha_1+\beta_1)t,\nonumber\\

536: y&=(\alpha_2+\beta_2)\tau,\nonumber\\

537: z&=(\alpha_3+\beta_3)(t-\tau),\nonumber\\

538: w&=(\alpha_4+\beta_4)(t-\tau),\nonumber\\

539: \eqn

540: respectively.

541: It is possible to solve the distance function equations for the weights

542: \beqn

543: x+y&=d_{12}+d_{13}-d_{123}+\log{p_0p_1},\nonumber\\

544: z&=d_{123}-d_{13},\nonumber\\

545: w&=d_{123}-d_{12},\nonumber\\

546: \log{p_0'p_1'}&=2d_{123}-d_{12}-d_{13}-d_{23}.

547: \eqn
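These solutions can be verified by writing the four distances in terms of the edge weights. As a shorthand used only in this check, let $\ell=\log{p_0p_1}$ and $\ell'=\log{p_0'p_1'}$, so that
\beqn
d_{12}&=x+y+z-\ell,\quad &d_{13}&=x+y+w-\ell,\nonumber\\
d_{23}&=z+w-\ell',\quad &d_{123}&=x+y+z+w-\ell.\nonumber
\eqn
Then, for example,
\beqn
d_{123}-d_{13}&=z,\nonumber\\
d_{12}+d_{13}-d_{123}+\ell&=x+y,\nonumber\\
2d_{123}-d_{12}-d_{13}-d_{23}&=2(x+y+z+w)-2\ell-(2x+2y+z+w-2\ell)-(z+w-\ell')=\ell'.\nonumber
\eqn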

548: In summary we find that, if we assume the branching structure of the tree, we now have a prescription that gives us the evolutionary distances between three taxa up to errors caused by the fact we cannot determine the marginal distribution, $p$, which lies at the top node of the tree. This marginal distribution must be estimated using some reasonable prescription.  \\\indent

549: To elucidate the value of including the tangle in the analysis we present the corresponding set of branch lengths calculated without using the tangle

550: \beqn

551: x+y&=\fra{1}{2}(d_{12}+d_{13}-d_{23})+\log{p_0p_1}-\fra{1}{2}\log{p'_0p'_1},\nonumber\\

552: z&=\fra{1}{2}(d_{12}+d_{23}-d_{13})+\fra{1}{2}\log{p_0'p_1'},\nonumber\\

553: w&=\fra{1}{2}(d_{23}+d_{13}-d_{12})+\fra{1}{2}\log{p_0'p_1'}.

554: \eqn

555: \\\indent

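556: This comparison makes clear the advantage of including the tangle in the analysis of branch lengths: with the tangle the weights $z$ and $w$ are given by the distance functions alone, whereas without it every weight carries a correction from the unknown term $\log{p_0'p_1'}$.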

557:

558: \section{Conclusion}

559:

560: We have shown that it is possible to present the continuous time Markov chain model of phylogenetics using a dynamical, tensor state space description. We have shown that the branching process introduces entanglement into the description, and that the group invariant approach to entanglement in quantum physics can be used in phylogenetics to derive distance functions between taxa. The main original result presented was the use of the tangle as a new distance measure between three taxa.\\\indent

561: Entanglement measures can be extended to the cases of $K>2$ and $L>3$ and will be explored in future work. In particular the invariant theory for the case of $K=2,L=4$ is established in the physics literature \cite{vers,luqu} and progress in interpreting the theory in the phylogenetic context is underway.\\\indent The use of invariant functions to distinguish alternate tree branching structures was not achieved in this work, but future work will explore sharpening the group action from $GL(V)^{\times L}$ to the more stringent Markov operator action and it is hoped that this will allow branching structures to be distinguished using the corresponding invariant functions. This will allow results establishing that if and only if the values of invariant functions taken on an arbitrary character distribution satisfy certain relations, then it can be concluded that the distribution was generated from a Markov model on a given branching structure.\\\indent

562: There is the remaining issue that the distance functions are known only up to $\log$ functions of the marginal distributions at the nodes. An avenue of further research would be to determine under what conditions these $\log$ functions can be treated as statistically insignificant. This would involve studying how far the sequences can diverge before the information content of the distribution becomes so small that the calculation of edge weights is misleading. That is, under what circumstances are the edge weights large enough that terms such as $\log{p_0p_1}$ can be treated as a small error? The problem is that the edge weights are a measure of the magnitude of divergence, which if allowed to become large enough means that the information content of the distribution is small, and hence there is no potential to establish phylogenetic relation anyway. It would be fruitful to explore these issues analytically.\\\indent

563:

564: \begin{acknowledgement}

565: PDJ and JGS thank the Department of Physics and Astronomy, and also

566: the Biomathematics Research Centre, University of Canterbury, Christchurch, New

567: Zealand, for hosting a visit during which this work was presented, clarified and expanded upon.

568: We would also like to thank Mike Steel and Jim Bashford for helpful comments. Finally, we would like to thank the anonymous referees for helpful comments regarding the initial draft, which have led to a much improved text in the final version, and also John Rhodes for pointing out an error in the edge weights analysis. This research was supported by the Australian Research Council

569: grant DP0344996 and the Australian Postgraduate Award.

570: \end{acknowledgement}

571:

572: \begin{thebibliography}{99}

573:

574: \bibitem{bashjarv}{Bashford, J. D., P. D. Jarvis, J. G. Sumner and M. A. Steel (2004). "U(1)xU(1)xU(1) symmetry of the Kimura 3ST model and phylogenetic branching processes." Journal of Physics A: Mathematical and General 37: L1-L9.}

575: \bibitem{bern}{Bernevig, B. A. and H. D. Chen (2003). "Geometry of the three-qubit state, entanglement and division algebras." Journal of Physics A: Mathematical and General 36(30): 8325-8339.}

576: \bibitem{chang}{Chang, J. T. (1996). "Full reconstruction of markov models on evolutionary trees: identifiability and consistency." Math Biosci. 137(1): 51-73.}

577: \bibitem{dur}{D\"ur, W., G. Vidal and J. I. Cirac (2000). "Three qubits can be entangled in two inequivalent ways." Physical Review A 62(6): 062314.}

578: \bibitem{erik}{Eriksson, N., K. Ranestad, B. Sturmfels and S. Sullivant (2004). "Phylogenetic Algebraic Geometry."

579: To appear in the proceedings of the conference "Projective Varieties

580: with Unexpected Properties", Siena, Italy, 2004.

581: eprint:  math.AG/0407033}

582: \bibitem{fels}{Felsenstein, J. (2004). Inferring Phylogenies, Sinauer Associates: 196-206, 251.}

583: \bibitem{fels2}{Felsenstein, J. (1981). "Evolutionary trees from DNA sequences: a maximum likelihood approach." Journal of Molecular Evolution 17: 368-376.}

584: \bibitem{fels3}{Felsenstein, J. (1991). "Counting phylogenetic invariants in some simple cases." Journal of Theoretical Biology 152: 357-376.}

585: \bibitem{guhn}{Guhne, O. and P. Hyllus (2003). "Investigating three qubit entanglement with local measurements." International Journal of Theoretical Physics 42: 1001-1013.}

586: \bibitem{hagg}{Haggstrom, O. (2002). Finite Markov Chains and Algorithmic Applications. Cambridge, Cambridge University Press.}

587: \bibitem{jarv}{Jarvis, P. D. and J. D. Bashford (2001). "Quantum field theory and phylogenetic branching." Journal of Physics A: Mathematical and General 34: L703-L707.}

588: \bibitem{lake}{Lake, J. A. (1997). "Phylogenetic inference: how much evolutionary history is knowable?" Molecular Biology and Evolution 14(3): 213-219.}

589: \bibitem{lind}{Linden, N. and S. Popescu (1998). "On multi-particle entanglement." Fortschritte der Physik 46: 567-578.}

590: \bibitem{luqu}{Luque, J.-G. and J.-Y. Thibon (2003). "The polynomial invariants of four qubits." Physical Review A 67: 042303.}

591: \bibitem{miya}{Miyake, A. (2003). "Classification of multiparticle entangled states by multidimensional determinants." Physical Review A 67: 012108.}

592: \bibitem{nei}{Nei, M. and S. Kumar. (2000). Molecular Evolution and Phylogenetics. Oxford, Oxford University Press: 33-43.}

593: \bibitem{nei2}{Nielsen, M. A. and I. L. Chuang (2000). Quantum Computation and Quantum Information. Cambridge, Cambridge University Press.}

594: \bibitem{olver}{Olver, P. J. (2003). Classical Invariant Theory. Cambridge, Cambridge University Press.}

595: \bibitem{pearl}{Pearl, J. and M. Tarsi (1986). "Structuring causal trees." Journal of Complexity 2: 60-77.}

596: \bibitem{rind}{Rindos, A., S. Woolet, I. Viniotis and K. Trivedi (1995). Exact methods for the transient analysis of nonhomogeneous continuous time Markov chains. 2nd International Workshop of Markov Chains. W. J. Stewart, Kluwer Academic.}

597: \bibitem{rodr}{Rodriguez, F., J. L. Oliver, A. Marin and J. R. Medina (1990). "The general stochastic model of nucleotide substitution." Journal of Theoretical Biology 142: 485-501.}

598: \bibitem{steel}{Steel, M., M. D. Hendy and D. Penny (1998). "Reconstructing phylogenies from nucleotide pattern probabilities: A survey and some new results." Discrete Applied Mathematics 88: 367-396.}

599: \bibitem{stee}{Semple, C. and M. Steel (2003). Phylogenetics. Oxford, Oxford University Press: 183-215.}

600: \bibitem{vers}{Verstraete, F., J. Dehaene, B. De Moor and H. Verschelde (2002). "Four qubits can be entangled in nine different ways." Physical Review A 65: 052112.}

601: \bibitem{wern}{Werner, R. F. and M. M. Wolf (2001). "Bell Inequalities and entanglement." Quantum Information and Computation 1(3): 1-25.}

602: \end{thebibliography}

603:

604:

605:

606:

607: \end{document}