1: \documentclass[prl,aps,twocolumn,showpacs,showkeys]{revtex4}
2: \usepackage{epsf,amssymb,amsmath}
3:
4: \begin{document}
5: \title{ \sffamily\bfseries\Large
6: Evolution of the Protein Interaction Network of Budding Yeast: \\
7: Role of the Protein Family Compatibility Constraint\\}
8:
9: \author{\sc K.-I. Goh, B. Kahng, and D. Kim}
10:
11: \affiliation{\mbox{School of Physics and Center for
12: Theoretical Physics, Seoul National University NS50,
13: Seoul 151-747, Korea}}
14: \date{\today}
15:
16: \begin{abstract}
Understanding how the protein interaction networks (PINs) of living
organisms have evolved and are organized can be a first stepping
stone toward unveiling how life works at a fundamental level.
20: Here, we introduce a new {\em in-silico} evolution model of
21: the PIN of budding yeast, {\em Saccharomyces cerevisiae};
22: the model is composed of the PIN and the protein family network.
23: The basic ingredient of the
24: model includes family compatibility which constrains
25: the potential binding ability of a protein,
26: as well as the previously proposed
27: gene duplication, divergence, and mutation.
28: We investigate various structural properties of our model network
29: with parameter values relevant to budding yeast and
30: find that the model successfully reproduces the
31: empirical data.
32: \end{abstract}
33: \pacs{89.75.Hc, 87.15.Aa }
34: \keywords{Protein interaction network, Family compatibility}
35: \maketitle
36:
37: Studying complex systems by means of their network representation
38: has attracted much attention recently \cite{rmp,advphys,siam,saemulli,dslee,han}.
39: The cell, one of the best examples of complex systems, can also
40: be viewed as a network:
41: The cellular components, such as genes, proteins, and other
42: biological molecules, connected by all physiologically
43: relevant interactions, form a full weblike molecular architecture
44: in a cell~\cite{pyramid,network-biology}.
45: Among the various levels, the protein interaction network (PIN)
46: plays a pivotal role as it acts as a basic physical protocol
47: of cooperative functioning in many physiological processes.
48: In the PIN, proteins are viewed
49: as nodes, and two proteins are linked if they physically
50: contact each other.
51: Thanks to recent progress in high-throughput experimental techniques,
52: the data set of protein interactions for budding yeast,
53: {\em Saccharomyces cerevisiae}, has been firmly
54: established in the last few years \cite{uetz,ito,gavin,ho,tong,mips,dip,bind}.
Thus, the yeast PIN offers a good testbed for understanding how such a
network has evolved into its present form from basic evolutionary rules.
57: In this paper, our aim is to introduce a simple evolutionary model
58: to reproduce the structural properties of the PIN of budding yeast,
59: thereby deepening our understanding of the driving force for
60: cellular evolution.
61:
62: At a certain level of abstraction, one may view a protein as
63: an assembly of domains. It is domains that offer structural
64: and functional units. They act as basic units in
65: the interactions between proteins and in the evolution
66: of protein structures. Proteins are grouped into so-called protein
67: families or superfamilies
68: according to the domain structure within them \cite{alberts}.
69: The proteins within a family are monophyletic;
70: that is, they originate from a common ancestor
71: and are fairly well conserved during evolution.
72: The protein family network (PFN) is defined as the one
73: whose nodes are protein families, and two families are connected
74: if any of the domains within them simultaneously
75: occur in a single protein or any proteins within
76: them interact with each other \cite{jpark}.
77: The distributions of the degrees and the sizes of the families in the PFN
78: also follow power laws \cite{jpark,huynen}.
79: Given that the entities of proteins and protein families
80: are not separable but linked via domains as intermediates,
81: it is desirable to unify their evolutions into a single framework.
82:
83: So far, several {\it in-silico} evolution models have been proposed
84: for the yeast PIN \cite{sole,vazquez,berg,kim,chung}.
85: A distinguishing aspect in the evolution of the PIN compared
86: with that of other complex networks is the concept of ``evolution
87: by duplication''~\cite{ohno}:
88: A new protein is thought to be created mainly by gene duplication.
Subsequently, the duplicate protein may lose some of the redundant
interactions inherited from its ancestor,
which is called divergence (or diversification).
92: A protein also gains new interactions with other
93: proteins via mutation. These three processes,
94: duplication--divergence--mutation, have been regarded as the basic
95: ingredients in the evolution of the PIN. While those {\it in-silico}
96: models~\cite{sole,vazquez,kim,chung,berg}
97: were successful in generating a fat-tail or power-law behavior in
98: the degree distribution,
they hardly reproduced other structural properties of the yeast
PIN, such as the clustering coefficient and the assortativity,
which we will specify in more detail shortly.
The model we introduce here, in contrast, reproduces these
structural properties of the yeast PIN as well as the degree distribution.
104: To this end, we introduce the concept of
105: ``family compatibility'' (FC):
106: An interaction between two proteins is possible only when
107: the corresponding families they belong to are compatible,
108: and only those families linked via the PFN are compatible with one another.
109: With this, we realize the effective structural constraint
110: in physical binding between proteins, which is coupled with
111: the evolutionary lineage of proteins through the notion of protein family.
112:
113: \begin{figure}[t]
114: \centerline{\epsfxsize=9cm \epsfbox{fig1.eps}}
115: \caption{Schematic picture of the evolution rule of the model.
116: The elementary steps are composed of i) duplication
117: (light blue protein $\rightarrow$ red protein),
118: ii) divergence (dashed pink links), and
119: iii) mutation (violet link from the pink protein).
120: In addition, the mutation is constrained by family
121: compatibility; for example, the pink protein cannot
122: interact with the black protein because they are not compatible.
123: }
124: \end{figure}
125:
126: \begin{figure*}
127: \centerline{\epsfxsize=15cm \epsfbox{fig2.eps}}
128: \caption{
129: Simulation results ($\bigcirc$) of the model agree well with the
130: empirical data ($\diamond$).
131: Shown are
132: (a) the degree distribution $P(k)$,
133: (b) the hierarchical clustering $C(k)$, and
134: (c) the average neighbor-degree function
135: $\langle k_{\rm nn}\rangle$ for the protein interaction network.
136: The dotted line in (a) is a fit to Eq.~(\ref{pk}).
137: The results of the model without FC ($\Box$), which fail
138: to reproduce the empirical features, are also shown for
139: comparison.
140: }
141: \end{figure*}
142:
143: {\em Model}--- The model can be depicted schematically as in Fig.~1.
144: The whole system is composed of two types of networks,
145: the PIN and the PFN. A number of proteins are grouped, forming
146: a protein family. Protein families link to other protein families,
147: forming the PFN.
148: Two proteins belonging to different protein families can
149: interact only when the respective families are also linked.
150: Each family has a fitness-like parameter, the number of domains
151: within it, $D_f$, which is not fixed, but evolves with the PFN.
152: The evolution takes place in two stages. In the first stage,
153: the protein families are created along with the proteins;
154: thus, the PFN coevolves with the PIN.
155: In the second stage, the PFN is kept fixed, and the evolution of
156: the PIN continues on top of it.
157: A detailed description of the procedure is as follows:
158:
159: \begin{enumerate}
160: \item Initially, there are $n_0$ proteins, each of which constitutes
161: its own protein family. All $n_0$ proteins
162: are interconnected with one another, as are the $n_0$ protein families.
163: We choose $n_0=3$ to be minimal.
164: Each family has $D_f=2$ domains, the number of family-links it has.
165:
166: \item In the first stage, proteins and protein families coevolve:
167: At each step, with rate $\alpha$, a new protein, say $a$, is created
168: by duplicating an existing protein $b$ chosen randomly. The new protein $a$
169: creates its own protein family $F_a$.
170: Each of the inherited interactions of the protein $a$
171: is removed with probability $\delta$, a process called divergence.
172: Through divergence, the degree of the new protein $a$, $k_a$,
173: usually becomes less than that of the mother protein $k_b$.
174: The linkage of the new protein family is determined by that of
175: the protein created. By this process, the newly born family $F_a$
176: consists of a single protein, but has a number of linkages, say $K_{F_a}$,
177: to existing families.
178: The initial number of domains in the family is set to
179: $D_{F_a}=K_{F_a}$. In some cases, the newly created protein is left with no
180: interaction at all $(K_{F_a}=0)$.
181: In this case, we do not let it establish a new
182: family, but regard it as a remnant in the previous family.
183: When this case happens, the population of the family to which
184: the duplicated protein belongs is increased by 1. Note that the
185: remnant can later gain new interactions via mutation described below
186: and join the protein interaction network.
187:
188: With rate $1$, a randomly chosen existing protein $i$ gains a new
189: interaction to another previously unlinked protein $j$, which is
190: chosen among the proteins within compatible families,
191: according to the probability,
192: \begin{equation}
193: \Pi_j= \dfrac{D_{F_j}}{\underset{F_{l}\leftrightarrow F_i}{\displaystyle \sum\nolimits} D_{F_{l}}},
194: \end{equation}
195: where $F_i$ means the family
196: to which the protein $i$ belongs and $X\leftrightarrow Y$ means that
197: the families $X$ and $Y$ are compatible, i.e., linked in the PFN.
198: Eq.~(1), the preferential attachment in the domain abundance
199: constrained by FC, makes our model distinct
200: and successful.
In this process, which we call mutation, the number of domains
in the family $F_i$ increases by 1, but the number of domains in $F_j$
does not.
204: This accounts for the acquisition of a new domain via mutation in
the family $F_i$. This stage lasts until 1,000 proteins have been
made, during which about $500\sim600$ families are created, a number
comparable with the number of superfamilies in yeast~\cite{superfamily}.
208:
209: \item
210: In the second stage, the same protein evolution process as in
211: the first stage occurs, except that the PFN is
212: kept fixed and the daughter protein remains in the same family as
213: its mother in the duplication process.
214: This stage lasts until there are about 6,000 proteins in
215: the network, the approximate size of the yeast proteome.
216: \end{enumerate}
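
As a minimal worked illustration of Eq.~(1) (the numbers here are
hypothetical and chosen only for arithmetic convenience), suppose that
the family $F_i$ of the mutating protein is compatible with three
families $F_1$, $F_2$, and $F_3$ carrying $D_{F_1}=2$, $D_{F_2}=3$, and
$D_{F_3}=5$ domains. Reading $\Pi_j$ as the weight with which a partner
is sought in the family $F_j$, a partner in $F_3$ would then be chosen with
\[
\Pi_{j\in F_3}=\frac{D_{F_3}}{D_{F_1}+D_{F_2}+D_{F_3}}=\frac{5}{2+3+5}=0.5,
\]
while proteins in families not linked to $F_i$ in the PFN have $\Pi_j=0$.
After the new interaction is made, $D_{F_i}$ increases by 1 and
$D_{F_3}$ is left unchanged, as prescribed above.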
217:
218: A few remarks on the model are in order.
219: First, this model is designed to be as simple as possible while
220: implementing FC into the
221: trio of duplication, divergence, and mutation,
222: which we believe to be the most basic processes.
223: Many interesting processes, such as lateral gene transfer
224: and {\it de novo} creation of proteins and protein families,
225: are not covered in this model, however.
Second, we assumed that the PFN evolves only during the first stage
and is frozen thereafter, i.e., that the time scale of
the PFN evolution is strictly separated from that of the later PIN evolution,
which might be an oversimplification.
229: Third, proteins and protein families may become extinct during evolution,
230: followed by the loss of the interactions between them.
231: However, we may view the parameters of the evolution rates,
232: such as $\alpha$ and $\delta$,
233: as {\it effective} ones incorporating all these details.
Also, to minimize the number of free parameters,
we assume that the duplication and the divergence rates of proteins
and protein families are equal, i.e., $\alpha=\alpha_f$ and
$\delta=\delta_f$, although $\alpha$ and $\delta$ could in principle be
tuned separately for any given set of ($\alpha_f$, $\delta_f$) to fit
the empirical data.
240:
241: {\em Structure of the yeast PIN}---
242: Several analyses on the topological properties of the yeast
243: PIN have been performed during recent
244: years \cite{lethal,maslov,wagner}. Since then, however, new
245: protein--protein interactions in yeast have been discovered steadily,
246: so we repeat the analysis by integrating the most up-to-date data
247: from various public resources, such as
248: (i) the database at the Munich Information Center for Protein Sequences \cite{mips},
249: (ii) the database of the interacting proteins \cite{dip},
250: (iii) the biomolecular interaction network database \cite{bind},
251: (iv) the two-hybrid datasets obtained by Uetz {\it et al.}~\cite{uetz},
252: by Ito {\it et al.}~\cite{ito}, and by Tong {\it et al.}~\cite{tong},
253: and (v) the mass spectrometry data (filtered) by Ho {\it et al.}~\cite{ho}.
254: After trimming the synonyms and other redundant entries manually,
the resulting network consists of 15,652 interactions
(excluding self-interactions) between 4,926 nodes (in terms of
257: distinct open reading frames and other biomolecules).
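
As a quick consistency check on these numbers, each interaction
contributes to the degree of two nodes, so the average degree of the
integrated network is
\[
\langle k\rangle=\frac{2\times 15652}{4926}\approx 6.35,
\]
which is the value listed for the yeast PIN in Table \ref{tab1}.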
258:
259: The topological properties of the integrated yeast PIN are shown
260: in Fig.~2:
261:
262: (a) The degree distribution of the PIN fits well to the generalized Pareto
263: distribution (or a generalized power law) \cite{ab,koonin},
264: \begin{equation}
265: p_d(k)\sim (k+k_0)^{-\gamma},
266: \label{pk}
267: \end{equation}
268: with $k_0=8.0$ and $\gamma\simeq3.45$.
Note that functional forms of the degree distribution different from
Eq.~(\ref{pk}) were proposed~\cite{sole,vazquez,berg,wagner,lethal}
on the basis of smaller datasets than the current one.
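
To illustrate the shape of Eq.~(\ref{pk}) (a sketch that ignores the
normalization constant), the two limits are
\[
p_d(k)\sim
\begin{cases}
k_0^{-\gamma}, & k\ll k_0,\\
k^{-\gamma}, & k\gg k_0,
\end{cases}
\]
so that $k_0$ sets the crossover from a nearly flat part to the
power-law tail. With the fitted values quoted above, the distribution at
$k=k_0$ has dropped to $p_d(k_0)/p_d(0)=2^{-\gamma}\approx 2^{-3.45}\approx 0.09$
of its extrapolated value at $k=0$.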
272:
273: (b) The yeast PIN is highly clustered and modular.
274: To quantify this, we measured the local clustering of a protein $i$,
275: $c_i = {2e_i}/{k_i(k_i-1)}$, where $e_i$ is the number of links
276: present between the $k_i$ neighbors of node $i$ out of its maximum
277: possible number $k_i(k_i-1)/2$.
278: The clustering coefficient of a graph, $C$, is the average of
279: $c_i$ over all nodes with $k_i\ge 2$. We obtain $C\approx 0.128$.
$C(k)$ denotes the clustering function, i.e., the average of $c_i$
over the vertices with degree $k$~\cite{vespignani2,ravasz}.
282: $C(k)$ exhibits a plateau for small $k$ while it drops rapidly
283: for large $k$.
284: Such a plateau in the clustering function may reflect the
285: functional module structure within the PIN, inside which the
286: network is denser due to the high cooperativity to perform
287: a given cellular task. Such locally dense modules are interconnected
288: by a few global mediators, which are likely to be the hubs in the PIN \cite{han-vidal}.
289: This feature is what most existing PIN models fail to reproduce.
290: As we will show, the FC constraint that we introduce
291: successfully accounts for the emergence of the plateau in $C(k)$.
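
As a worked illustration of the definition of $c_i$ (with hypothetical
numbers), a protein with $k_i=4$ interaction partners, among which
$e_i=3$ interactions are present, has
\[
c_i=\frac{2e_i}{k_i(k_i-1)}=\frac{2\times 3}{4\times 3}=0.5,
\]
that is, half of the $k_i(k_i-1)/2=6$ possible links between its
neighbors are realized. Nodes with $k_i<2$ have no neighbor pairs and
are therefore left out of the average $C$.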
292:
(c) The yeast PIN shows a disassortative degree correlation.
294: The average neighbor-degree function
295: $\langle k_{\rm nn}\rangle(k)$ \cite{knn} is measured to be
296: $\langle k_{\rm nn} \rangle(k) \sim k^{-\nu}$
297: with $\nu\approx 0.3$, somewhat smaller than the value reported based
298: on a single two-hybrid dataset alone~\cite{maslov}.
299: The assortativity $r$, defined as the Pearson correlation coefficient
300: between the degrees of the two vertices on each side of
301: a link~\cite{assort}, is measured to be $r \approx -0.13$.
302: In Table \ref{tab1}, we summarize our measurements for the topological properties
303: of the integrated yeast PIN.
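
For reference, the two measures used in (c) can be written explicitly
in their standard forms; the symbols $P(k'|k)$, $M$, $j_e$, and $k_e$
are notation introduced here for illustration. The average
neighbor-degree function is
\[
\langle k_{\rm nn}\rangle(k)=\sum_{k'}k'\,P(k'|k),
\]
where $P(k'|k)$ is the conditional probability that a link from a vertex
of degree $k$ leads to a vertex of degree $k'$, and the assortativity is
\[
r=\frac{M^{-1}\sum_{e}j_e k_e-\bigl[M^{-1}\sum_{e}\tfrac{1}{2}(j_e+k_e)\bigr]^2}
{M^{-1}\sum_{e}\tfrac{1}{2}(j_e^2+k_e^2)-\bigl[M^{-1}\sum_{e}\tfrac{1}{2}(j_e+k_e)\bigr]^2},
\]
where the sums run over the $M$ links and $j_e$ and $k_e$ are the
degrees of the two vertices joined by the link $e$. A decreasing
$\langle k_{\rm nn}\rangle(k)$ and $r<0$ both indicate that high-degree
proteins tend to link to low-degree ones.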
304: \begin{table}[b]
305: \caption{Topological quantities of the integrated
306: yeast PIN and the model network.
307: Error bars in the model results are the standard deviations of the
308: quantities from 1000 runs.}
309: \label{tab1}
310: \begin{ruledtabular}
311: \begin{tabular}{lll}
312: item & model & yeast PIN \\
313: \hline
314: total number of nodes $n$\phantom{aaa} & 6000\phantom{aaa} & $\approx$6000 \\
315: number of interacting nodes $N$\phantom{aaa} & 5079$\pm$54 & 4926 \\
316: average degree $\langle k\rangle$\phantom{aaa} & 6.5$\pm$0.3 & 6.35 \\
317: clustering coefficient $C$ & 0.13$\pm$0.02 & 0.128 \\
318: assortativity index $r$ & $-$0.09$\pm$0.04 & $-0.13$ \\
319: size of the largest component $N_1$ & 5051$\pm$53 & 4832 \\
320: \end{tabular}
321: \end{ruledtabular}
322: \end{table}
323:
324: \begin{figure*}
325: \begin{minipage}[!t]{0.5\linewidth}
326: \flushright{\epsfxsize=6.3cm \epsfbox{fig3a.eps}}
327: \end{minipage}\hfill
328: \begin{minipage}[!t]{0.5\linewidth}
329: \flushleft{\epsfxsize=6.3cm \epsfbox{fig3b.eps}}
330: \end{minipage}\hfill
\caption{Comparison between the degree correlation profiles of (a) the
yeast PIN and (b) the model network. The color code denotes the value
333: of $\log_{10}[P(k,k')/P_{\rm random}(k,k')]$. The randomized networks
334: are generated by the switching method \cite{maslov}
335: that conserves the degree sequence.\\
336: }
337: \label{corr}
338: \end{figure*}
339:
{\em Results}--- Now we compare the simulation results of our model with the empirical data.
341: In typical simulations,
342: we employed $\alpha=0.8$ and $\delta=0.7$. The value of $\delta$ was
343: chosen to accommodate the fact that superfamilies exhibit extensive
344: sequence diversity~\cite{todd}. The value of $\alpha$ was set to match
345: the empirical value of the average degree of the PIN,
346: $\langle k\rangle\simeq 6.4$. Also, we matched approximately the numbers
347: of protein families and proteins with those of budding yeast, as we
348: described before.
The results obtained from the model show
good agreement with the empirical data, as shown in Fig.~2 and Table \ref{tab1}.
In Fig.~2, we also show the results of the model without
FC, which is similar to the model of Sol\'e {\it et al.}~\cite{sole}.
353: One can clearly see that without FC, we cannot
354: account for the clustering and the degree correlation characteristics.
355: We also examine the full degree-correlation profile of
356: the joint probability $P(k,k')$ that two proteins with degrees $k$ and
357: $k'$ are connected to each other.
358: The degree-correlation intensity is quantified by $P(k,k')/P_{\rm random}(k,k')$,
359: the ratio with the joint probability in the randomized ensemble of
360: the original network \cite{maslov,sole03}.
361: As shown in Fig.~3, the profile obtained from the model
362: has a pattern that is quite similar to that of the empirical yeast PIN.
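
To give a feeling for the divergence rate used here, consider the
following sketch, which assumes that each inherited interaction is
removed independently. A duplicate of a mother protein with degree
$k_b$ retains on average
\[
\langle k_a\rangle=(1-\delta)\,k_b
\]
interactions and is left with no interaction at all with probability
$\delta^{k_b}$. For $\delta=0.7$ and a hypothetical mother protein with
$k_b=3$, this gives $\langle k_a\rangle=0.9$ and
$\delta^{k_b}=0.7^3\approx 0.34$.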
363:
364: \begin{figure}[t]
365: \centerline{\epsfxsize=\linewidth \epsfbox{fig4.eps}}
366: \caption{Network randomization test with and without FC.
367: (a) Clustering function $C(k)$ and
368: (b) the clustering coefficient $C$ as functions of
369: the number of edge shufflings are shown.
370: Symbols are for the unperturbed model network ($\bigcirc$),
371: the network shuffled with FC ($\diamond$),
372: and the network shuffled without FC ($\Box$).
373: The horizontal line in (b) corresponds to the value of the clustering
374: coefficient in the unperturbed model network.
375: }
376: \end{figure}
377:
378: To get further support for the relevance of the FC constraint,
379: we performed a network randomization test. We randomized the model network
380: by using the conventional edge switching method \cite{maslov}, but with the
381: FC constraint. That is, when we are to switch the interactions
382: between the protein pairs, only the switching attempts that preserve
FC are accepted. In this way, we can isolate the role of
FC. In Fig.~4, we show the results of randomization. We find that the
385: high clustering property of the network is preserved with randomization
386: with FC, but not without FC. Without FC, the clustering coefficient
387: drops as soon as we shuffle the network, as can be seen in Fig.~4(b).
Thus, we conclude that FC indeed plays a crucial role in PIN evolution.
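
One way to state the acceptance rule of this test explicitly is the
following sketch; the labels $a$, $b$, $c$, and $d$ are introduced here
for illustration. A switching attempt picks two links $(a,b)$ and
$(c,d)$ and proposes
\[
(a,b),\,(c,d)\;\longrightarrow\;(a,d),\,(c,b),
\]
which leaves every degree unchanged. In the conventional method, the
attempt is accepted whenever it creates neither a self-interaction nor
a duplicate link. With the FC constraint, the rule can be read as
additionally requiring that both new pairs be allowed by the PFN,
\[
F_a\leftrightarrow F_d \quad\mbox{and}\quad F_c\leftrightarrow F_b ,
\]
where, as an assumption of this sketch, two proteins in the same family
are treated as compatible.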
389:
390: \begin{figure}[t]
391: \centerline{\epsfxsize=9.5cm \epsfbox{fig5.eps}}
392: \caption{Simulation results for the protein family network:
393: (a) The family degree distribution $p_d(k_F)$ and
394: (b) the family size distribution $p_s(s_F)$.
395: The dotted lines in (a) and (b) are fit lines to Eq.~(\ref{pk}).
396: }
397: \end{figure}
398:
399: Finally, we check the properties of the PFN. In Fig.~5, we show the
400: degree distribution of the PFN and the family size distribution
401: generated {\it in silico}. The degree distribution of the PFN follows
a similar form to Eq.~(\ref{pk}), but with a different value of the exponent,
403: $\gamma_f\approx 3$. The family size distribution also follows a power
404: law with an exponent of 3$\sim$4.
405:
406: In summary, we have introduced an {\em in-silico} model for PIN
407: evolution. The model network is composed of the PIN and the PFN.
408: In the early stage of evolution, the PIN and the PFN coevolve,
409: and in the later stage, the PFN becomes fixed.
410: The evolution proceeds by the three major mechanisms
411: previously proposed, duplication, divergence, and mutation.
412: However, it is constrained by FC and
413: follows a modified preferential attachment rule in the domain abundance,
414: which is the new feature of our model.
415: We have checked various structural properties of the model network, finding
that they show good agreement with those of the integrated empirical data
417: of the yeast PIN.
418: Finally, it would be interesting to apply our model to higher eukaryotes,
419: as the data for the protein interactions are accumulating for the
420: multicellular species such as the nematode worm {\em Caenorhabditis elegans}
\cite{vidal} and the fruit fly {\em Drosophila melanogaster} \cite{giot}.
422: \\
423:
424: \begin{acknowledgments}
The authors would like to thank J.~Park for helpful conversations.
426: This work is supported by Korea Science and Engineering Foundation
427: grant No. R14-2002-059-01000-0 in the Advanced Basic Research Laboratory
428: program and Ministry of Science and Technology grant No. M1 03B500000110.
429: \end{acknowledgments}
430:
431: \begin{thebibliography}{99}
432: \bibitem{rmp} R. Albert and A.-L. Barab\'asi, Rev. Mod. Phys. {\bf 74}, 47 (2002).
433: \bibitem{advphys} S. N. Dorogovtsev and J. F. F. Mendes, Adv. Phys. {\bf 51}, 1079 (2002).
434: \bibitem{siam} M. E. J. Newman, SIAM Rev. {\bf 45}, 167 (2003).
435: \bibitem{saemulli} B. Kahng, K.-I. Goh, D.-S. Lee, and D. Kim, Saemulli, New Physics (in Korean) {\bf 48}, 115 (2004).
436: \bibitem{dslee} D.-S. Lee, K.-I. Goh, B. Kahng, and D. Kim, J. Korean Phys. Soc. {\bf 44}, 633 (2004).
437: \bibitem{han} C. N. Yoon, S. K. Han, and H. Y. Kim, J. Korean Phys. Soc. {\bf 44}, 638 (2004).
438: \bibitem{pyramid} Z. N. Oltvai and A.-L. Barab\'asi, {Science} {\bf 298}, 763 (2002).
439: \bibitem{network-biology} A.-L. Barab\'asi and Z. N. Oltvai, Nat. Rev. Genet. {\bf 5}, 101 (2004).
440: \bibitem{uetz} P. Uetz, {\em et al.}, {Nature (London)} {\bf 403}, 623 (2000); B. Schwikowski, P. Uetz, and S. Fields, {Nat. Biotechnol.} {\bf 18}, 1257 (2000).
441: \bibitem{ito} T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, and Y. Sakaki, {Proc. Natl. Acad. Sci.} USA {\bf 98}, 4569 (2001).
442: \bibitem{tong} A. H. Y. Tong, {\em et al.}, {Science} {\bf 295}, 321 (2002).
443: \bibitem{gavin} A.-C. Gavin, {\em et al.}, {Nature (London)} {\bf 415}, 141 (2002).
444: \bibitem{ho} Y. Ho, {\em et al.}, {Nature (London)} {\bf 415}, 180 (2002).
445: \bibitem{mips} H. W. Mewes, {\em et al.}, Nucl. Acids Res. {\bf 32}, D41 (2004).
446: \bibitem{dip} L. Salwinski, C. S. Miller, A. J. Smith, F. K. Pettit, J. U. Bowie, and D. Eisenberg, Nucl. Acids Res. {\bf 32}, D449 (2004).
447: \bibitem{bind} G. D. Bader, D. Betel, and C. W. V. Hogue, Nucl. Acids Res. {\bf 31}, 248 (2003).
\bibitem{alberts} B. Alberts, D. Bray, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, {\it Essential Cell Biology} (Garland, New York, 1998).
449: \bibitem{jpark} J. Park, M. Lappe, and S. A. Teichmann, {J. Mol. Biol.} {\bf 307}, 929 (2001).
\bibitem{huynen} M. A. Huynen and E. van Nimwegen, {Mol. Biol. Evol.} {\bf 15}, 583 (1998).
\bibitem{sole} R. V. Sol\'e, R. Pastor-Satorras, E. Smith, and T. Kepler, {Adv. Compl. Syst.} {\bf 5}, 43 (2002); R. Pastor-Satorras, E. D. Smith, and R. V. Sol\'e, {J. Theor. Biol.} {\bf 222}, 199 (2003).
452: \bibitem{vazquez} A. V\'azquez, A. Flammini, A. Maritan, and A. Vespignani, {ComPlexUs} {\bf 1}, 38 (2003).
\bibitem{kim} J. Kim, P. L. Krapivsky, B. Kahng, and S. Redner, Phys. Rev. E {\bf 66}, 055101(R) (2002).
454: \bibitem{chung} F. Chung, L. Lu, T. G. Dewey, and D. J. Galas, {J. Comput. Biol.} {\bf 18}, 1486 (2003).
455: \bibitem{berg} J. Berg, M. L\"assig, and A. Wagner, BMC Evol. Biol. {\bf 4}, 51 (2004).
456: \bibitem{ohno} S. Ohno, {\it Evolution by Gene Duplication} (Springer-Verlag, Berlin, 1970).
457: \bibitem{superfamily} J. Gough, K. Karplus, R. Hughey, and C. Chothia, J. Mol. Biol. {\bf 313}, 903 (2001).
458: \bibitem{lethal} H. Jeong, S. P. Mason, A.-L. Barab\'asi, and Z. N. Oltvai, {Nature (London)} {\bf 411}, 41 (2001).
459: \bibitem{wagner} A. Wagner, {Mol. Biol. Evol.} {\bf 18}, 1283 (2001).
460: \bibitem{maslov} S. Maslov and K. Sneppen, {Science} {\bf 296}, 910 (2002).
461: \bibitem{ab} R. Albert and A.-L. Barab\'asi, Phys. Rev. Lett. {\bf 85}, 5234 (2000).
\bibitem{koonin} E. V. Koonin, Y. I. Wolf, and G. P. Karev, {Nature (London)} {\bf 420}, 218 (2002).
463: \bibitem{vespignani2} A. V\'azquez, R. Pastor-Satorras, and A. Vespignani, Phys. Rev. E {\bf 65}, 066130 (2002).
\bibitem{ravasz} E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barab\'asi, Science {\bf 297}, 1551 (2002); E. Ravasz and A.-L. Barab\'asi, Phys. Rev. E {\bf 67}, 026112 (2003).
465: \bibitem{han-vidal} J.-D. Han, {\em et al.}, Nature (London) {\bf 430}, 88 (2004).
\bibitem{knn} R. Pastor-Satorras, A. V\'azquez, and A. Vespignani, Phys. Rev. Lett. {\bf 87}, 258701 (2001).
467: \bibitem{assort} M. E. J. Newman, {Phys. Rev. Lett.} {\bf 89}, 208701 (2002).
468: \bibitem{todd} A. E. Todd, C. A. Orengo, and J.~M. Thornton, {J. Mol. Biol.} {\bf 307}, 1113 (2001).
469: \bibitem{sole03} R.~V. Sol\'e and P. Fern\'andez, (arXiv:q-bio.GN/0312032).
470: \bibitem{vidal} S. Li, {\em et al.}, Science {\bf 303}, 540 (2004).
471: \bibitem{giot} L. Giot, {\em et al.}, Science {\bf 302}, 1727 (2003).
472: \end{thebibliography}
473: \end{document}
474:
475: