1: %\documentclass[a4]{article}
2: %\usepackage{cb}
3:
4: \documentclass[12pt]{article}
5: \usepackage{graphicx}
6:
7: % autolabel: eq:4
8:
9: \author{Christoph Best\footnote{Electronic mail: {\tt
10: christoph.best@ifi.lmu.de}},\ \ Ralf Zimmer, and Joannis Apostolakis\\
11: Institute for Informatics, LMU, \\
12: Amalienstr. 17, 80333 M\"unchen, Germany}
13: \date{June 15, 2004}
14:
15: \title{Probabilistic methods for predicting protein functions
16: in protein-protein interaction networks}
17:
18: \newcommand{\nnm}{\nonumber}
19: \newcommand{\onehalf}{{\scriptstyle \frac{1}{2}}}
20: \newcommand{\const}{{\rm const}}
21:
22: \begin{document}
23: %\selectlanguage{english}
24:
25: \maketitle
26:
27: \begin{abstract}
28: We discuss probabilistic methods for predicting protein functions
29: from protein-protein interaction networks. Previous work based on
Markov Random Fields is extended and compared to a general
31: machine-learning theoretic approach. Using actual protein
32: interaction networks for yeast from the MIPS database and GO-SLIM
33: function assignments, we compare the predictions of the different
34: probabilistic methods and of a standard support vector machine. It
35: turns out that, with the currently available networks, the simple
36: methods based on counting frequencies perform as well as the more
37: sophisticated approaches.
38: \end{abstract}
39:
40: \section{Introduction}
41:
42: Large-scale comprehensive protein-protein interaction data, which have
43: become available recently, open the possibility of deriving new
44: information about proteins from their associations in the interaction
45: graph. In the following, we discuss and compare several probabilistic
46: methods for predicting protein functions from the functions of
47: neighboring proteins in the interaction graph.
48:
49: In particular, we compare two recently published methods that are
50: based on Markov Random Fields \cite{Letovsky,Deng} with a prediction
based on a machine-learning approach using maximum-likelihood
parameter estimation. It turns out that all three approaches can be
considered variants of one another that differ in the
approximations they employ. The main difference between the Markov Random Field (MRF)
and the machine-learning methods is that the former approach
takes a global look at the network, while the latter considers each
network node as an independent training example. However, in the
mean-field approximation required to make the MRF approach numerically
tractable, it is reduced to considering each node independently. The
local-enrichment method considered in \cite{Letovsky} can then be
interpreted as another approximation which enables us to make
predictions directly from observed frequencies, bypassing the
63: numerical minimization step required in the more general
64: machine-learning approach.
65:
66: We also extend these methods by considering a non-linear
67: generalization for the probability distribution in the
68: machine-learning approach, and by taking larger neighborhoods in the
69: network into account. Finally, we compare the performance of these
methods to a standard Support Vector Machine.
71:
72: \section{Methods}
73:
74: We consider a network specified by a graph whose nodes are proteins
and whose undirected edges indicate interactions between the
76: proteins. Each node is assigned one of a set of protein functions. In
77: a machine-learning approach to prediction, this assignment follows a
78: simple probability function depending on the protein functions in the
79: network neighborhood of each node and parametrized by a small set of
80: parameters. The learning problem is to estimate these parameters from
81: a given sample of assignments. The prediction can then be
82: performed by evaluating the probability distribution using these
83: parameters.
84:
85: \subsection{Machine-learning approach}
86: \label{sec:ml}
87:
88: Assume we only consider a single protein function at a time. Node
89: assignments can then be chosen binary, $x\in\{0,1\}$, with $1$
90: indicating that a node has the function under consideration. In the
91: simplest case, the probability that a node $i$ has assignment $x$
depends only on its immediate neighbors, and since all edges of the
graph are equivalent, it can only depend on the number of neighbors $C$,
94: and the number of active neighbors $N$. Borrowing from statistical
95: mechanics, we write the probability using a potential $U(x;C,N)$
96: \begin{equation} \label{eq:2}
97: p(x|C,N) = \frac{e^{-U(x;C,N)}}{Z(C,N)}, \qquad
98: Z(C,N) = \sum_{y=0,1} e^{-U(y;C,N)}
99: \end{equation}
where the partition sum $Z(C,N)$ is a normalizing factor. This equation
expresses that the log-probability of $x$ equals the negative
potential $-U(x;C,N)$ up to a normalization constant. In a lowest-order
approximation, we can choose a linear function for the potential:
104: \begin{equation} \label{eq:1}
105: U(x;C,N;\alpha) = (\alpha_0 + \alpha_1 C + \alpha_2 N)x \quad.
106: \end{equation}
107: Later, we will extend this approach to more general functions.
108:
109: The parameters $\alpha$ can be estimated from a set of training
110: samples $(x_i,C_i,N_i)$ by maximum-likelihood estimation. In this
111: approach, they are chosen to maximize the joint probability
112: \begin{equation}
113: P = \prod_i p(x_i|C_i,N_i)
114: \end{equation}
of the training data, or equivalently, to minimize its negative logarithm
\begin{equation}
-\ln P = \sum_i \left[ \ln Z(C_i,N_i) + U(x_i;C_i,N_i) \right] \quad.
\end{equation}
Taking the partial derivative with respect to a parameter $\alpha$ gives the equation
\begin{equation}
-\frac{\partial \ln P}{\partial \alpha}
= \sum_i \left\{ - \frac{1}{Z(C_i,N_i)} \sum_{y=0,1}
\frac{\partial U(y;C_i,N_i)}{\partial\alpha}
e^{-U(y;C_i,N_i)}
+ \frac{\partial U(x_i;C_i,N_i)}{\partial\alpha} \right\} \quad.
\end{equation}
Up to its sign, the first term in the bracket is the expectation value of $\partial
U/\partial\alpha$ in the neighborhood $(C_i,N_i)$ under the
probability distribution parametrized by $(\alpha,\ldots)$:
\begin{equation}
\left\langle \frac{\partial U(y;C_i,N_i)}{\partial\alpha}
\right\rangle_{C_i,N_i;\alpha,\ldots} =
\frac{1}{Z(C_i,N_i)} \sum_{y=0,1}
\frac{\partial U(y;C_i,N_i)}{\partial\alpha} \,
e^{-U(y;C_i,N_i)} \quad.
\end{equation}
At the extremum, the derivative vanishes and we have the simple relation
\begin{equation}
\sum_i \left\langle \frac{\partial U(y;C_i,N_i)}{\partial\alpha}
\right\rangle
= \sum_i \frac{\partial U(x_i;C_i,N_i)}{\partial\alpha} \quad.
\end{equation}
Thus, in the maximum-likelihood model, the parameters are adjusted so
that the average expectation values of the derivatives of the
potential are equal to the averages observed in the training data.
Using eq.~\ref{eq:1}, this gives the set of three equations
147: \begin{eqnarray}
148: \sum_i \left\{ \begin{array}{l} 1 \\ C_i \\ N_i \end{array} \right\} \,
149: \langle x \rangle
150: &=& \sum_i \left\{ \begin{array}{l} 1 \\ C_i \\ N_i \end{array} \right\} \, x_i
151: \end{eqnarray}
152: where the expectation value of $x$ in the environment $(C_i,N_i)$ and
153: in the model parametrized by $\alpha$ is given by
154: \begin{equation}
155: \langle x \rangle =
156: \langle x \rangle_{\alpha_0,\alpha_1,\alpha_2;C_i,N_i}
= \frac{e^{-(\alpha_0+\alpha_1 C_i+\alpha_2 N_i)}}{1+e^{-(\alpha_0+\alpha_1
C_i+\alpha_2 N_i)}} \quad.
\end{equation}
Only in the simplest case, $\alpha_1 = \alpha_2 = 0$, can this equation
be solved analytically. Setting $\langle x \rangle = \bar x$ gives
$e^{-\alpha_0} = \bar x/(1-\bar x)$, i.e.
\begin{equation} \label{eq:3}
\alpha_0 = -\ln \frac{\bar x}{1-\bar x}, \qquad\mbox{with}\qquad
\bar x = \frac{1}{n} \sum_{i=1}^{n} x_i \quad.
\end{equation}
In the general case, we solve these equations numerically using a
conjugate-gradient method by explicitly minimizing the negative
log-likelihood $-\ln P$.
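
For the linear potential (\ref{eq:1}), the quantities needed by a
gradient-based minimizer can be written out explicitly. The following
is only a sketch: the feature vector $f_i = (1, C_i, N_i)$ with components
$f_{i,k}$, $k=0,1,2$, and the shorthand $\langle x \rangle_i$ for the model
expectation of $x$ in the environment $(C_i,N_i)$ are notation introduced
here for illustration.
\begin{eqnarray}
-\frac{\partial \ln P}{\partial \alpha_k}
&=& \sum_i f_{i,k} \left( x_i - \langle x \rangle_i \right) \quad, \\
-\frac{\partial^2 \ln P}{\partial \alpha_k \, \partial \alpha_l}
&=& \sum_i f_{i,k} \, f_{i,l} \,
\langle x \rangle_i \left( 1 - \langle x \rangle_i \right) \quad.
\end{eqnarray}
The matrix of second derivatives is positive semi-definite, so the
negative log-likelihood is convex in $\alpha$ and any minimum reached by
the iteration is a global one. This is the structure of a logistic
regression in $(C,N)$, with the sign convention that a negative
potential favors $x=1$.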
169:
170: \subsection{Network approach}
171:
172: An alternative approach to prediction starts out from considering a
173: given network and the protein function assignments as a whole and
174: assigning a score based on how well the network and the function
175: assignments agree. In the approach of \cite{Deng}, each link
contributes to this score with a gain $G_0$ or $G_1$ if both nodes at the ends of the
link have the same function, $0$ or $1$ respectively, and a penalty $P$ if they have different
178: function assignments. Assuming again that the log-probabilities are
179: proportional to the scores, this induces a probability
180: distribution over all joint function assignments ${\bf x}$ given by
181: \begin{equation}
182: p({\bf x}) = \frac{1}{Z} e^{-U({\bf x})} \quad, \qquad
183: Z = \sum_{\bf x} e^{-U({\bf x})}
184: \end{equation}
185: where now the normalization factor is calculated by summing over all
186: possible joint function assignments of the nodes.
187:
188: The scoring function $U({\bf x})$ can be expressed as
189: \begin{eqnarray}
190: U({\bf x}) &=& -\frac{G_1}{2} \sum_{i,j:(i,j)\in E} x_i x_j
191: - \frac{G_0}{2} \sum_{i,j:(i,j)\in E} (1-x_i)\, (1- x_j)
192: \\ \nnm
193: &&+ \frac{P}{2} \sum_{i,j:(i,j)\in E} \left( (1-x_i) \, x_j + x_i \, (1-x_j) \right)
194: + \eta_0 \sum_i x_i
195: \\ \nnm
196: &=& \eta_0 \sum_i x_i
197: + \eta_1 \sum_i C_i x_i
198: + \frac{\eta_2}{2} \sum_{i,j: (i,j)\in E} x_i x_j
199: \end{eqnarray}
where the last equality holds up to an additive constant that cancels
against the normalization $Z$, with the parameters
201: \begin{equation}
202: \eta_2 = - G_1 - G_0 - 2P \qquad\mbox{and}\qquad
203: \eta_1 = G_0 + P \quad.
204: \end{equation}
205: In terms of statistical mechanics, this describes a ferromagnetic
206: system where the inverse temperature is determined by $\eta_2$ and an
207: external field by $\eta_0$ and $\eta_1$.
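
The relation to the local model of sec.~\ref{sec:ml} can be made
explicit by conditioning on all nodes except one. The following sketch
assumes that the sums over $(i,j)\in E$ run over ordered pairs, so that
each link is counted twice (which is what the factors $\frac{1}{2}$
suggest), and writes $N_i = \sum_{j:(i,j)\in E} x_j$ for the number of
active neighbors of node $i$. Collecting all terms of $U({\bf x})$ that
contain $x_i$ gives
\begin{equation}
p(x_i \,|\, \{x_j\}_{j\neq i}) =
\frac{e^{-(\eta_0 + \eta_1 C_i + \eta_2 N_i)\, x_i}}{
1 + e^{-(\eta_0 + \eta_1 C_i + \eta_2 N_i)}} \quad.
\end{equation}
Under these assumptions, the conditional distribution of a single node
has exactly the form (\ref{eq:2}) with the linear potential
(\ref{eq:1}), with $(\eta_0,\eta_1,\eta_2)$ playing the role of
$(\alpha_0,\alpha_1,\alpha_2)$.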
208:
Again, maximum-likelihood parameter estimation is performed by finding
a set of parameters, collectively denoted $\alpha = (\eta_0,\eta_1,\eta_2)$, such that the
probability of the $N$ sample configurations ${\bf x}^{(n)}$ is maximized
(here $N$ counts samples, not active neighbors):
\begin{equation}
\alpha = \mathop{\rm argmax}_\alpha \sum_{n=1}^N \ln p({\bf x}^{(n)};\alpha)
= \mathop{\rm argmin}_\alpha \left( \sum_{n=1}^N U({\bf x}^{(n)};\alpha) + N \ln Z(\alpha) \right) \quad.
\end{equation}
216: The logarithm of the partition sum appearing in the second term can
217: be related to the entropy by
218: \begin{eqnarray}
219: S &=& - \sum_x p(x) \, \ln p(x)
220: = \sum_{x} p(x) \, U(x) + \ln Z
221: \\ \Rightarrow\qquad
222: -\ln Z &=& \langle U \rangle- S = F
223: \end{eqnarray}
The quantity $\langle U \rangle - S$ is the thermodynamic free
energy. Maximum-likelihood parameter estimation therefore corresponds
to choosing the parameters such that the energy of the given
configuration is minimized while the free energy of the system as a
whole is maximized:
229: \begin{equation} \label{eq:13}
230: \mathop{\rm argmin}_\alpha \left( U(X;\alpha) - F(\alpha) \right)
231: = \mathop{\rm argmin}_\alpha \left( U(X;\alpha) - \langle U \rangle(\alpha) +
232: S(\alpha) \right)
233: \quad.
234: \end{equation}
235: Unfortunately, this requires the calculation of both the internal
236: energy, $\langle U \rangle(\alpha)$, and the entropy, $S(\alpha)$, of
237: the system and thus more or less a complete solution of the system.
238:
239: This can be avoided by employing the \emph{mean field} approximation, in
240: which the probability distribution $p(x)$ is replaced by a trial
241: distribution $p_{\rm trial}(x)$ as a product of single-variable
242: distributions
243: \begin{equation}
244: p_{\rm trial}(x) = p_1(x_1) \ldots p_n(x_n)
245: \end{equation}
246: which can be completely parametrized by the expectation values $\bar
247: x_i$ using
248: \begin{equation}
249: p_i(x_i) = x_i \bar x_i + (1-x_i) (1-\bar x_i)
250: = \left\{ \begin{array}{ll} 1-\bar x_i & \mbox{if $x_i=0$} \\
251: \bar x_i & \mbox{if $x_i=1$} \end{array}\right.
252: \end{equation}
Optimum values for the parameters $\bar x_i$ can then be estimated by
minimizing the Kullback-Leibler divergence between $p_{\rm trial}(x)$
and the true distribution $p(x)$.
256:
257: Interestingly, this approximation removes the distinguishing feature
258: of the network approach, namely that the neighborhood structure (in
the sense of neighbors of neighbors) is taken into account. The
260: resulting equations are very similar to the machine-learning equations
261: in which neighbors are treated as unrelated.
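
To see how this similarity arises, the mean-field calculation can be
sketched as follows. This is the standard textbook derivation rather
than a result quoted from \cite{Deng}, and it again assumes that sums
over $(i,j)\in E$ count each link twice. For the product distribution,
the Kullback-Leibler divergence reads
\begin{eqnarray}
D &=& \sum_x p_{\rm trial}(x) \, \ln \frac{p_{\rm trial}(x)}{p(x)}
\\ \nnm
&=& \sum_i \left[ \bar x_i \ln \bar x_i + (1-\bar x_i) \ln (1-\bar x_i) \right]
+ \eta_0 \sum_i \bar x_i + \eta_1 \sum_i C_i \bar x_i
\\ \nnm
&& + \frac{\eta_2}{2} \sum_{i,j:(i,j)\in E} \bar x_i \bar x_j + \ln Z \quad.
\end{eqnarray}
Setting $\partial D / \partial \bar x_i = 0$ yields the self-consistency
equations
\begin{equation}
\bar x_i = \frac{1}{1 + \exp \left( \eta_0 + \eta_1 C_i
+ \eta_2 \sum_{j:(i,j)\in E} \bar x_j \right)} \quad,
\end{equation}
which have the form of the single-node expectation value $\langle x
\rangle$ of sec.~\ref{sec:ml}, with the number of active neighbors
$N_i$ replaced by the sum of the neighbors' mean values.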
262:
263: \subsection{Binomial-neighborhood approach}
264: \label{sec:bin}
265:
266: The binomial-neighborhood approach \cite{Letovsky} is a simpler approach in which the
267: probability distribution $p(x|C,N)$ is chosen in such a way that it
268: can be directly derived from observed frequencies without the
269: minimization process typical for maximum-likelihood approaches. It is
270: based on the assumption that the distribution of active neighbors
271: $N_i$ of a
272: node $i$ follows a binomial distribution whose single probability $p$
273: depends on whether the node $i$ is active or not:
274: \begin{equation}
275: p(N_i|C_i,x_i=1) = \left(\begin{array}{c} C_i \\ N_i
276: \end{array} \right) \,
277: p_1^{N_i} (1-p_1)^{C_i - N_i} \quad,
278: \end{equation}
279: and correspondingly for $x_i=0$ using a single probability $p_0$. This
280: is the assumption of \emph{local enrichment}, i.e.~that the
281: probability $p_1$ to find an active node around another active node is
282: larger than the probability $p_0$ to find an active node around an
283: inactive node. Using Bayes' theorem, we can use this to calculate the
284: probability distribution of $x_i$:
285: \begin{equation}
286: p(x_i|C_i,N_i) = \frac{ p(N_i|C_i,x_i) \, p(x_i|C_i) }{
287: p(N_i|C_i)}
288: \end{equation}
where $p(x_i=1|C_i) = \bar x$ is the overall probability of observing an
active node (assumed independent of $C_i$), and
291: \begin{equation}
292: p(N_i|C_i) = \bar x p(N_i|C_i,x_i=1) + (1-\bar x) p(N_i|C_i,x_i=0) \quad.
293: \end{equation}
294: The resulting probability distribution can be written as
295: \begin{equation}
296: p(x_i=1|C_i,N_i) = \frac{\lambda}{1+\lambda} \qquad\mbox{and}\qquad
297: p(x_i=0|C_i,N_i) = \frac{1}{1+\lambda}
298: \end{equation}
299: with
300: \begin{equation}
301: \lambda = \frac{\bar x}{1-\bar x} \, \frac{p_1^{N_i} \, (1-p_1)^{C_i - N_i}}{
302: p_0^{N_i} \, (1-p_0)^{C_i - N_i}} \quad.
303: \end{equation}
304: This can be easily rewritten in the same form as (\ref{eq:2})
305: \begin{equation}
306: p(x_i|C_i,N_i) = \frac{1}{Z} \exp \left[-\left( -\ln \frac{\bar x}{1-\bar x}
307: - \ln \frac{p_1}{p_0} N_i
308: + \ln \frac{1-p_0}{1-p_1} \, (C_i-N_i) \right) \,x_i\right]
309: \end{equation}
The first term in the potential has the same form as (\ref{eq:3}) and
adjusts the overall number of positive sites; the two other terms
constitute a bonus for having positive neighbors (proportional to $N_i$)
and a penalty for having negative neighbors (proportional to $C_i -
N_i$).
315:
This approach evidently gives a conditional probability distribution
$p(x_i|C_i,N_i)$ of the same form as the one in the machine-learning
approach. However, the coefficients in the potential can be directly
calculated from the observed frequencies $\bar x$, $p_0$, and
$p_1$. This is only possible because we made here the assumption that
the probability distribution $p(N_i|C_i,x_i)$ is binomial. The
machine-learning approach is more flexible in that it does not have to
make this assumption and yields a true maximum-likelihood estimate
even for distributions that deviate greatly from binomial form. In
particular, the binomial distribution implies that the neighbors of a
node are statistically independent, which might be violated in a
densely connected network, where we would expect clusters to form.
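
Comparing the binomial potential with the linear form (\ref{eq:1})
term by term gives the coefficients in closed form:
\begin{equation}
\alpha_0 = -\ln \frac{\bar x}{1-\bar x} \quad, \qquad
\alpha_1 = \ln \frac{1-p_0}{1-p_1} \quad, \qquad
\alpha_2 = -\ln \frac{p_1 \, (1-p_0)}{p_0 \, (1-p_1)} \quad.
\end{equation}
As a worked illustration with purely hypothetical frequencies (not
taken from the data used below), let $\bar x = 0.1$, $p_1 = 0.4$ and
$p_0 = 0.1$. Then $\alpha_0 = \ln 9 \approx 2.20$, $\alpha_1 = \ln 1.5
\approx 0.41$ and $\alpha_2 = -\ln 6 \approx -1.79$. For a node with
$C_i = 5$ neighbors, $N_i = 3$ active neighbors give a potential of
about $-1.15$ and hence $p(x_i=1|C_i,N_i) \approx 0.76$, while $N_i =
2$ gives a potential of about $+0.64$ and $p(x_i=1|C_i,N_i) \approx
0.35$. With these assumed values, the 50\% decision boundary for five
neighbors therefore lies between two and three active neighbors.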
328:
329: \section{Results}
330:
331: To compare the different prediction methods, we chose the MIPS
332: protein-protein interaction database for \emph{Saccharomyces
333: cerevisiae} \cite{MIPS,Uetz} and the GO-SLIM database of protein function
334: assignments from the Gene Ontology Consortium \cite{GO}. The latter is a
335: slimmed-down subset of the full gene ontology assignments comprising
336: 32 different processes, 21 functions, and 22 cell compartments. We
337: focused here on the process assignments as these were expected to
338: correspond most closely to the interaction network.
339:
340: We compared four methods:
341: \begin{enumerate}
342: \item the binomial neighborhood enrichment from sec.~\ref{sec:bin},
343: \item the machine-learning maximum-likelihood method from
sec.~\ref{sec:ml} using a linear potential (\ref{eq:1}),
345: \item the same method with an extended non-linear potential, and
346: \item a standard support vector machine \cite{libsvm}.
347: \end{enumerate}
348:
349: For the probabilistic methods, we first looked at the single-function
350: prediction problem in which the system is presented with a binary
351: assignment expressing which proteins are known to have a given
352: function, and then makes a prediction for an unknown protein based on
353: the number of neighbors that have this function.
354:
355: \begin{figure}[htb]
356: \begin{center}
357: \includegraphics[angle=270,width=\hsize]{glyphs-5.eps}
358: \caption{Glyph plot summarizing the probability distribution
359: for a single-function prediction problem.
360: Each box represents a possible situation of a single node,
361: characterized by the total number of neighbors on the $x$-axis,
and the number of neighbors having the function of interest on
the $y$-axis. The numbers indicate the total incidence of the
situation, while the shading expresses how frequently the
central node had the function of interest in that situation.
The lines are the decision boundaries for the binomial method
and the linear and polynomial machine-learning methods. The
shading is the prediction region from the SVM.
369: }
370: \label{fig:1}
371: \end{center}
372: \end{figure}
373:
In this case, the local environment of a node can be described by two
numbers: $n$, the number of neighbors, and $j$, the number of
neighbors that have the function assignment under consideration
(denoted $C$ and $N$ in sec.~\ref{sec:ml}). The
377: content of the training data set can be characterized by a glyph plot
378: such as in fig.~\ref{fig:1}.
379:
After training, the probabilistic method has
inferred a probability distribution that yields, for each pair
$(n,j)$, a probability $p(x_i=1|n,j)$ which is then used for
predictions. The 50\% level of this probability, which determines the
prediction in a binary system, is indicated in fig.~\ref{fig:1} by
green lines.
386:
The three probabilistic predictors in fig.~\ref{fig:1} yield similar
results that rarely differ by more than one box. The main difference
is that the binomial predictor is restricted to a straight line, while
the linear and non-linear maximum-likelihood predictors can accommodate
a slight turn. Linear and non-linear predictors differ only minimally.
392:
393: \begin{figure}[htb]
394: \begin{center}
395: \includegraphics[width=\hsize]{single-spec-2a.eps}
396: \caption{Sensitivity-specificity curve for the three probabilistic
397: prediction methods for a single-function prediction.}
398: \end{center}
399: \label{fig:2}
400: \end{figure}
401:
Finally, the prediction from a support vector machine that was trained
on the same single-function data set is indicated in
fig.~\ref{fig:1} by a shaded area
marking all those $(n,j)$ for which the SVM returned a positive
prediction. The border of this area very closely follows the linear
and non-linear M.L.~predictors.
407:
408: Fig.~\ref{fig:2} shows a sensitivity-specificity curve using five-fold
409: cross validation for single-function prediction using the
410: probabilistic predictors. Again, all three curves follow each other
411: quite closely, with a slight edge for the nonlinear M.L.~predictor.
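
For reference, and assuming the usual definitions (they are not spelled
out in the text above), sensitivity and specificity are computed from
the numbers of true positives (TP), false negatives (FN), true
negatives (TN) and false positives (FP) among the held-out nodes as
\begin{equation}
\mbox{sensitivity} = \frac{\mbox{TP}}{\mbox{TP}+\mbox{FN}} \quad, \qquad
\mbox{specificity} = \frac{\mbox{TN}}{\mbox{TN}+\mbox{FP}} \quad.
\end{equation}
Each point on such a curve then corresponds to one choice of the
probability threshold above which a node is predicted to have the
function.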
412:
413: The preceding discussion applied to the problem of single function
414: prediction. To perform full prediction, we generated each of the three
415: predictors separately for each function and chose, for each protein
416: with an unknown function, the prediction with the largest probability.
417: For simplicity, this approach does not take into account possible
418: correlations between different protein functions. However, such
419: correlations were taken into account for the support vector machine,
420: which generated a full set of cross-predictors (predicting function
421: $i$ with neighbors of type $j$).
422:
423: \begin{figure}[htb]
424: \begin{center}
425: \includegraphics[width=\hsize]{full-spec-2.eps}
426: \caption{Accuracy of multiple-function prediction as a function of
427: the number of predictions made using the three probabilistic prediction methods.}
428: \end{center}
429: \label{fig:3}
430: \end{figure}
431:
432: In the probabilistic case, each predictor does not only provide us
433: with a yes-no decision, but also with a probability for the
434: prediction. We can use the information to restrict the predictions to
435: highly probable ones. Fig.~\ref{fig:3} shows the accuracy of the
436: prediction as a function of how many predictions are made with
different cut-offs in the predicted probability. Again, all three
curves closely follow each other, with perhaps a small but insignificant
edge for the linear M.L.~predictor. The predictions from all predictors
440: including the SVM were similar, and combining them would not have
441: improved predictive accuracy.
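
The full-prediction rule and the probability cut-off described above
can be summarized in one formula. The notation is introduced here only
for illustration: $p_f$ denotes the single-function predictor trained
for function $f$, $N_i^{(f)}$ the number of neighbors of node $i$ that
carry function $f$, and $\theta$ the cut-off.
\begin{equation}
\hat f_i = \mathop{\rm argmax}_f \; p_f(x_i=1|C_i,N_i^{(f)}) \quad,
\qquad \mbox{emitted only if} \quad
\max_f \; p_f(x_i=1|C_i,N_i^{(f)}) \ge \theta \quad.
\end{equation}
Lowering $\theta$ increases the number of predictions made, at the
price of including less certain ones.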
442:
443: \begin{table}[htb]
444: \centering
445: \begin{tabular}{|l|l|l|}
446: \hline
Method & Successes & Accuracy \\
448: \hline
449: binomial classifier & 623 & 31\% \\
450: linear M.L.\ classifier & 655 & 33\% \\
451: nonlinear M.L.\ classifier & 640 & 31.7\% \\
452: linear SVM classifier & 601 & 29.8\% \\
453: \hline
454: randomized network & 101 & 11.4\% \\
455: \hline
456: \hline
457: binomial classifier, process & & 32.5\%\\
458: randomized network & & 8.7\% \\
459: \hline
460: \end{tabular}
461: \caption{Prediction accuracy in five-fold cross validation for the
462: yeast data set.}
463: \label{tab:1}
464: \end{table}
465:
466: Finally, the success rates for all predictors are shown in table
467: \ref{tab:1} using five-fold cross-validation on a data set of 2014
unique function assignments for the yeast proteome. All four methods
perform comparably, with success rates between 30 and
33\%. This compares to the null hypothesis of prediction in a
randomized network, in which we would have a success rate of 11\% for
these data. The protein-protein interaction data therefore roughly
triple the prediction success over a random network. However, all
methods, from the simple, counting-based binomial classifier to the
full support vector machine, perform similarly.
476:
477: We also extended our methods to take larger neighborhoods (second and
478: higher-order neighbors) into account, but failed to substantially
479: improve predictive power.
480:
481: Finally, we also performed protein function prediction on a recently
482: published protein-interaction network for {\em Drosophila
483: melanogaster} \cite{Droso}, with similar results.
484:
485: \section{Discussion}
486: %\vspace*{-12pt}
487:
488: We compared different probabilistic approaches to predicting protein
489: functions in protein interaction networks. Under closer analysis, the
490: different Markov Random Field methods in the literature can be related
491: to a basic machine-learning approach with maximum-likelihood parameter
492: estimation. Using real data, they exhibit similar performance, with
493: simple methods performing as well as more complex ones. This might
494: indicate limits on the functional information contained in
495: protein-protein interaction networks.
496:
A standard support vector machine gave similar results, though it was
equipped with more information, namely the frequencies of all function
classes in the neighborhood. The additional information neither improved nor
harmed predictive performance.
501:
502: %\vspace*{-12pt}
503: \begin{thebibliography}{9999}
504: %\vspace*{-12pt}
505:
506: \bibitem{Letovsky}
507: S. Letovsky, S. Kasif, Bioinformatics {\bf 19}, Suppl. 1, i197 (2003).
508: \bibitem{Deng}
509: M. Deng, T. Chen, F. Sun, in: Proceedings, RECOMB '03,
510: 7th international conference on
511: Research in Computational Molecular Biology,
512: p.~95, ACM Press, New York, NY (2003).
513: \bibitem{Droso}
L. Giot \emph{et al.}, Science {\bf 302}, 1727 (2003).
\bibitem{Uetz}
P. Uetz \emph{et al.}, Nature {\bf 403}, 623 (2000).
\bibitem{MIPS}
H. W. Mewes \emph{et al.}, Nucleic Acids Research {\bf 32}, D41
519: (2004).
520: \bibitem{GO}
521: The Gene Ontology Consortium,
522: Nucleic Acids Res {\bf 32}, D258 (2004).
523: \bibitem{libsvm}
524: C.-C. Chang, C.-J. Lin, LIBSVM : a library for support vector
525: machines, 2001.
Software available at {\tt http://www.csie.ntu.edu.tw/\~{}cjlin/libsvm}.
527: \end{thebibliography}
528:
529: \if0
530: Nature. 2000 Feb 10;403(6770):623-7.
531: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae.
532:
533: Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon
534: D, Narayan V, Srinivasan M, Pochart P, Qureshi-Emili A, Li Y, Godwin
535: B, Conover D, Kalbfleisch T, Vijayadamodar G, Yang M, Johnston M,
536: Fields S, Rothberg JM.
537:
538: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10688190&dopt=Abstract
539: \fi
540:
541: \end{document}
542:
543: %%% Local Variables:
544: %%% mode: latex
545: %%% TeX-master: t
546: %%% End:
547: