
\documentclass[prl,aps,twocolumn,showpacs,showkeys]{revtex4}
\usepackage{epsf,amssymb,amsmath}

\begin{document}
\title{ \sffamily\bfseries\Large
Evolution of the Protein Interaction Network of Budding Yeast: \\
Role of the Protein Family Compatibility Constraint\\}

\author{\sc K.-I. Goh, B. Kahng, and D. Kim}

\affiliation{\mbox{School of Physics and Center for
Theoretical Physics, Seoul National University NS50,
Seoul 151-747, Korea}}
\date{\today}

\begin{abstract}
Understanding how protein interaction networks (PIN) of living
organisms have evolved or are organized can be the first stepping
stone in unveiling how life works on a fundamental basis.
Here, we introduce a new {\em in-silico} evolution model of
the PIN of budding yeast, {\em Saccharomyces cerevisiae};
the model is composed of the PIN and the protein family network.
The basic ingredients of the
model include family compatibility, which constrains
the potential binding ability of a protein,
as well as the previously proposed
gene duplication, divergence, and mutation.
We investigate various structural properties of our model network
with parameter values relevant to budding yeast and
find that the model successfully reproduces the
empirical data.
\end{abstract}
\pacs{89.75.Hc, 87.15.Aa }
\keywords{Protein interaction network, Family compatibility}
\maketitle

Studying complex systems by means of their network representation
has attracted much attention recently \cite{rmp,advphys,siam,saemulli,dslee,han}.
The cell, one of the best examples of complex systems, can also
be viewed as a network:
The cellular components, such as genes, proteins, and other
biological molecules, connected by all physiologically
relevant interactions, form a full weblike molecular architecture
in a cell~\cite{pyramid,network-biology}.
Among the various levels, the protein interaction network (PIN)
plays a pivotal role as it acts as a basic physical protocol
of cooperative functioning in many physiological processes.
In the PIN, proteins are viewed
as nodes, and two proteins are linked if they physically
contact each other.
Thanks to recent progress in high-throughput experimental techniques,
the data set of protein interactions for budding yeast,
{\em Saccharomyces cerevisiae}, has been firmly
established in the last few years \cite{uetz,ito,gavin,ho,tong,mips,dip,bind}.
Thus, the yeast PIN offers a good testbed for understanding how such a
network has evolved to its present state from basic evolutionary rules.
In this paper, our aim is to introduce a simple evolutionary model
that reproduces the structural properties of the PIN of budding yeast,
thereby deepening our understanding of the driving force for
cellular evolution.

At a certain level of abstraction, one may view a protein as
an assembly of domains. It is domains that offer structural
and functional units. They act as basic units in
the interactions between proteins and in the evolution
of protein structures. Proteins are grouped into so-called protein
families or superfamilies
according to the domain structure within them \cite{alberts}.
The proteins within a family are monophyletic;
that is, they originate from a common ancestor
and are fairly well conserved during evolution.
The protein family network (PFN) is defined as the one
whose nodes are protein families, and two families are connected
if any of the domains within them simultaneously
occur in a single protein or any proteins within
them interact with each other \cite{jpark}.
The distributions of the degrees and the sizes of the families in the PFN
also follow power laws \cite{jpark,huynen}.
Given that the entities of proteins and protein families
are not separable but linked via domains as intermediates,
it is desirable to unify their evolutions into a single framework.

So far, several {\it in-silico} evolution models have been proposed
for the yeast PIN \cite{sole,vazquez,berg,kim,chung}.
A distinguishing aspect in the evolution of the PIN compared
with that of other complex networks is the concept of ``evolution
by duplication''~\cite{ohno}:
A new protein is thought to be created mainly by gene duplication.
Subsequently, the duplicate protein may lose some of the interactions
inherited from its ancestor, which reduces redundancy;
this process is called divergence (or diversification).
A protein also gains new interactions with other
proteins via mutation. These three processes,
duplication--divergence--mutation, have been regarded as the basic
ingredients in the evolution of the PIN. While those {\it in-silico}
models~\cite{sole,vazquez,kim,chung,berg}
were successful in generating a fat-tail or power-law behavior in
the degree distribution,
they hardly reproduced other structural properties of the yeast
PIN, such as the clustering coefficient, the assortativity,
{\it etc.}, which we will specify in more detail shortly.
The model we introduce here, however, reproduces these other
structural properties of the yeast PIN as well as the degree distribution.
To this end, we introduce the concept of
``family compatibility'' (FC):
An interaction between two proteins is possible only when
the corresponding families they belong to are compatible,
and only those families linked via the PFN are compatible with one another.
With this, we realize the effective structural constraint
in physical binding between proteins, which is coupled with
the evolutionary lineage of proteins through the notion of protein family.

\begin{figure}[t]
\centerline{\epsfxsize=9cm \epsfbox{fig1.eps}}
\caption{Schematic picture of the evolution rule of the model.
The elementary steps are composed of i) duplication
(light blue protein $\rightarrow$ red protein),
ii) divergence (dashed pink links), and
iii) mutation (violet link from the pink protein).
In addition, the mutation is constrained by family
compatibility; for example, the pink protein cannot
interact with the black protein because they are not compatible.
}
\end{figure}

\begin{figure*}
\centerline{\epsfxsize=15cm \epsfbox{fig2.eps}}
\caption{
Simulation results ($\bigcirc$) of the model agree well with the
empirical data ($\diamond$).
Shown are
(a) the degree distribution $P(k)$,
(b) the hierarchical clustering $C(k)$, and
(c) the average neighbor-degree function
$\langle k_{\rm nn}\rangle$ for the protein interaction network.
The dotted line in (a) is a fit to Eq.~(\ref{pk}).
The results of the model without FC ($\Box$), which fail
to reproduce the empirical features, are also shown for
comparison.
}
\end{figure*}

{\em Model}--- The model can be depicted schematically as in Fig.~1.
The whole system is composed of two types of networks,
the PIN and the PFN. A number of proteins are grouped, forming
a protein family. Protein families link to other protein families,
forming the PFN.
Two proteins belonging to different protein families can
interact only when the respective families are also linked.
Each family has a fitness-like parameter, the number of domains
within it, $D_f$, which is not fixed, but evolves with the PFN.
The evolution takes place in two stages. In the first stage,
the protein families are created along with the proteins;
thus, the PFN coevolves with the PIN.
In the second stage, the PFN is kept fixed, and the evolution of
the PIN continues on top of it.
A detailed description of the procedure is as follows:

\begin{enumerate}
\item Initially, there are $n_0$ proteins, each of which constitutes
its own protein family. All $n_0$ proteins
are interconnected with one another, as are the $n_0$ protein families.
We choose $n_0=3$ to be minimal.
Each family has $D_f=2$ domains, equal to the number of family links it has.

\item In the first stage, proteins and protein families coevolve:
At each step, with rate $\alpha$, a new protein, say $a$, is created
by duplicating an existing protein $b$ chosen randomly. The new protein $a$
creates its own protein family $F_a$.
Each of the inherited interactions of the protein $a$
is removed with probability $\delta$, a process called divergence.
Through divergence, the degree of the new protein $a$, $k_a$,
usually becomes less than that of the mother protein, $k_b$.
The linkage of the new protein family is determined by that of
the protein created. By this process, the newly born family $F_a$
consists of a single protein, but has a number of linkages, say $K_{F_a}$,
to existing families.
The initial number of domains in the family is set to
$D_{F_a}=K_{F_a}$. In some cases, the newly created protein is left with no
interaction at all $(K_{F_a}=0)$.
In this case, we do not let it establish a new
family, but regard it as a remnant in the previous family;
the population of the family to which
the duplicated protein belongs is then increased by 1. Note that the
remnant can later gain new interactions via the mutation described below
and join the protein interaction network.

With rate $1$, a randomly chosen existing protein $i$ gains a new
interaction with another previously unlinked protein $j$, which is
chosen among the proteins within compatible families
according to the probability
\begin{equation}
\Pi_j= \dfrac{D_{F_j}}{\underset{F_{l}\leftrightarrow F_i}{\displaystyle \sum\nolimits} D_{F_{l}}},
\end{equation}
where $F_i$ means the family
to which the protein $i$ belongs and $X\leftrightarrow Y$ means that
the families $X$ and $Y$ are compatible, i.e., linked in the PFN.
Eq.~(1), the preferential attachment in the domain abundance
constrained by FC, makes our model distinct
and successful.
In this process, which we call mutation, the number of domains
in the family $F_i$ increases by 1, but the number of domains in $F_j$
does not.
This accounts for the acquisition of a new domain via mutation in
the family $F_i$. This stage lasts until 1,000 proteins have been
created, during which about $500$--$600$ families are created, a number
comparable with the number of superfamilies in yeast~\cite{superfamily}.

\item
In the second stage, the same protein evolution process as in
the first stage occurs, except that the PFN is
kept fixed and the daughter protein remains in the same family as
its mother in the duplication process.
This stage lasts until there are about 6,000 proteins in
the network, the approximate size of the yeast proteome.
\end{enumerate}
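As an illustration of Eq.~(1), consider a hypothetical case (the numbers
are chosen here only for illustration and do not come from the simulations)
in which the family $F_i$ is compatible with exactly three families whose
domain numbers are $2$, $3$, and $5$. Reading Eq.~(1) as the weight assigned
to each compatible family, one obtains
\begin{equation*}
(\Pi_1,\Pi_2,\Pi_3)=\left(\tfrac{2}{10},\tfrac{3}{10},\tfrac{5}{10}\right)
=(0.2,\,0.3,\,0.5),
\end{equation*}
so a partner from the domain-richest family is two and a half times as
likely as one from the domain-poorest family, while families not linked to
$F_i$ in the PFN receive zero weight. How the choice is made among several
proteins of the same family is not fixed by Eq.~(1); a uniform choice within
the selected family would be one natural assumption.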

A few remarks on the model are in order.
First, this model is designed to be as simple as possible while
implementing FC into the
trio of duplication, divergence, and mutation,
which we believe to be the most basic processes.
Many interesting processes, such as lateral gene transfer
and {\it de novo} creation of proteins and protein families,
are not covered in this model, however.
Second, we assumed that the time scale of
the PFN evolution is strictly separated from that of the later PIN
evolution, which might be an oversimplification.
Third, proteins and protein families may become extinct during evolution,
followed by the loss of the interactions between them.
However, we may view the parameters of the evolution rates,
such as $\alpha$ and $\delta$,
as {\it effective} ones incorporating all these details.
Also, for the sake of minimizing the number of free parameters,
we assume that the duplication and the divergence rates of proteins
and protein families are equal, i.e., $\alpha=\alpha_f$ and
$\delta=\delta_f$, although we can fix $\alpha$ and $\delta$ for any
given set of ($\alpha_f$, $\delta_f$) to incorporate the empirical
data.

{\em Structure of the yeast PIN}---
Several analyses of the topological properties of the yeast
PIN have been performed in recent
years \cite{lethal,maslov,wagner}. Since then, however, new
protein--protein interactions in yeast have been discovered steadily,
so we repeat the analysis by integrating the most up-to-date data
from various public resources, such as
(i) the database at the Munich Information Center for Protein Sequences \cite{mips},
(ii) the Database of Interacting Proteins \cite{dip},
(iii) the Biomolecular Interaction Network Database \cite{bind},
(iv) the two-hybrid datasets obtained by Uetz {\it et al.}~\cite{uetz},
by Ito {\it et al.}~\cite{ito}, and by Tong {\it et al.}~\cite{tong},
and (v) the mass spectrometry data (filtered) by Ho {\it et al.}~\cite{ho}.
After trimming the synonyms and other redundant entries manually,
the resulting network consists of 15,652 interactions
(excluding self-interactions) between 4,926 nodes (in terms of
distinct open reading frames and other biomolecules).
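As a quick consistency check on these counts, the mean degree follows
directly from the numbers of links and nodes. Writing $L$ for the number of
interactions and $N$ for the number of interacting nodes (the symbol $L$ is
introduced here only for this check),
\begin{equation*}
\langle k\rangle=\frac{2L}{N}=\frac{2\times 15{,}652}{4{,}926}\approx 6.35,
\end{equation*}
since each interaction contributes to the degree of two nodes.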

The topological properties of the integrated yeast PIN are shown
in Fig.~2:

(a) The degree distribution of the PIN fits well to the generalized Pareto
distribution (or a generalized power law) \cite{ab,koonin},
\begin{equation}
p_d(k)\sim (k+k_0)^{-\gamma},
\label{pk}
\end{equation}
with $k_0=8.0$ and $\gamma\simeq3.45$.
Note that different functional types of the degree distribution from
Eq.~(\ref{pk}) were proposed~\cite{sole,vazquez,berg,wagner,lethal}
based on smaller-scale datasets than the current one.
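The two regimes contained in Eq.~(\ref{pk}) can be made explicit; the
following is a sketch of the limiting behavior rather than an additional
fit. One has
\begin{equation*}
p_d(k)\sim
\begin{cases}
k_0^{-\gamma}\ (\text{nearly constant}), & k\ll k_0,\\[2pt]
k^{-\gamma}, & k\gg k_0,
\end{cases}
\end{equation*}
so, with $k_0=8.0$, the distribution is flat on a log-log plot for
degrees well below about ten and approaches a pure power law with
exponent $\gamma$ only for degrees well above that scale.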

(b) The yeast PIN is highly clustered and modular.
To quantify this, we measured the local clustering of a protein $i$,
$c_i = {2e_i}/{k_i(k_i-1)}$, where $e_i$ is the number of links
present between the $k_i$ neighbors of node $i$ and
$k_i(k_i-1)/2$ is the maximum possible number of such links.
The clustering coefficient of a graph, $C$, is the average of
$c_i$ over all nodes with $k_i\ge 2$. We obtain $C\approx 0.128$.
The clustering function $C(k)$ is the average of $c_i$ over vertices
with degree $k$~\cite{vespignani2,ravasz}.
$C(k)$ exhibits a plateau for small $k$ while it drops rapidly
for large $k$.
Such a plateau in the clustering function may reflect the
functional module structure within the PIN, inside which the
network is denser due to the high cooperativity to perform
a given cellular task. Such locally dense modules are interconnected
by a few global mediators, which are likely to be the hubs in the PIN \cite{han-vidal}.
This feature is what most existing PIN models fail to reproduce.
As we will show, the FC constraint that we introduce
successfully accounts for the emergence of the plateau in $C(k)$.
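For concreteness, consider a hypothetical protein (not taken from the data)
with $k_i=4$ interaction partners, among which $e_i=3$ pairs interact with
each other. Its local clustering is
\begin{equation*}
c_i=\frac{2e_i}{k_i(k_i-1)}=\frac{2\times 3}{4\times 3}=0.5,
\end{equation*}
that is, half of the $k_i(k_i-1)/2=6$ possible links among its neighbors
are present. In the same notation, the clustering function can be written
as
\begin{equation*}
C(k)=\frac{1}{N_k}\sum_{i:\,k_i=k}c_i,
\end{equation*}
where $N_k$, a symbol introduced here for convenience, denotes the number
of nodes with degree $k$.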

(c) The yeast PIN shows a disassortative degree correlation.
The average neighbor-degree function
$\langle k_{\rm nn}\rangle(k)$ \cite{knn} is measured to be
$\langle k_{\rm nn} \rangle(k) \sim k^{-\nu}$
with $\nu\approx 0.3$, somewhat smaller than the value reported based
on a single two-hybrid dataset alone~\cite{maslov}.
The assortativity $r$, defined as the Pearson correlation coefficient
between the degrees of the two vertices on each side of
a link~\cite{assort}, is measured to be $r \approx -0.13$.
In Table \ref{tab1}, we summarize our measurements for the topological properties
of the integrated yeast PIN.
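For reference, the two quantities can be written explicitly in the standard
forms of Refs.~\cite{knn,assort}, which we assume are the ones used in the
measurements. The average neighbor-degree function is
\begin{equation*}
\langle k_{\rm nn}\rangle(k)=\sum_{k'}k'\,P(k'|k),
\end{equation*}
where $P(k'|k)$ is the conditional probability that a link emanating from a
node of degree $k$ leads to a node of degree $k'$. The assortativity is
\begin{equation*}
r=\frac{M^{-1}\sum_{i}j_ik_i-\Big[M^{-1}\sum_{i}\tfrac{1}{2}(j_i+k_i)\Big]^2}
{M^{-1}\sum_{i}\tfrac{1}{2}(j_i^2+k_i^2)-\Big[M^{-1}\sum_{i}\tfrac{1}{2}(j_i+k_i)\Big]^2},
\end{equation*}
where $j_i$ and $k_i$ are the degrees of the two nodes at the ends of the
$i$th link and $M$ is the number of links. A decreasing
$\langle k_{\rm nn}\rangle(k)$ and a negative $r$ both indicate that
high-degree proteins tend to be linked to low-degree ones.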

\begin{table}[b]
\caption{Topological quantities of the integrated
yeast PIN and the model network.
Error bars in the model results are the standard deviations of the
quantities from 1000 runs.}
\label{tab1}
\begin{ruledtabular}
\begin{tabular}{lll}
item & model & yeast PIN \\
\hline
total number of nodes $n$\phantom{aaa} & 6000\phantom{aaa} & $\approx$6000 \\
number of interacting nodes $N$\phantom{aaa} & 5079$\pm$54 & 4926 \\
average degree $\langle k\rangle$\phantom{aaa} & 6.5$\pm$0.3 & 6.35 \\
clustering coefficient $C$ & 0.13$\pm$0.02 & 0.128 \\
assortativity index $r$ & $-$0.09$\pm$0.04 & $-0.13$ \\
size of the largest component $N_1$ & 5051$\pm$53 & 4832 \\
\end{tabular}
\end{ruledtabular}
\end{table}

\begin{figure*}
\begin{minipage}[!t]{0.5\linewidth}
\flushright{\epsfxsize=6.3cm \epsfbox{fig3a.eps}}
\end{minipage}\hfill
\begin{minipage}[!t]{0.5\linewidth}
\flushleft{\epsfxsize=6.3cm \epsfbox{fig3b.eps}}
\end{minipage}\hfill
\caption{Comparison between the degree correlation profiles of (a) the
yeast PIN and (b) the model network. The color code denotes the value
of $\log_{10}[P(k,k')/P_{\rm random}(k,k')]$. The randomized networks
are generated by the switching method \cite{maslov}
that conserves the degree sequence.
}
\label{corr}
\end{figure*}

{\em Results}--- We now compare the simulation results of our model
with the empirical data.
In typical simulations,
we employed $\alpha=0.8$ and $\delta=0.7$. The value of $\delta$ was
chosen to accommodate the fact that superfamilies exhibit extensive
sequence diversity~\cite{todd}. The value of $\alpha$ was set to match
the empirical value of the average degree of the PIN,
$\langle k\rangle\simeq 6.4$. Also, we matched approximately the numbers
of protein families and proteins with those of budding yeast, as
described before.
The results obtained from the model show
good agreement with the empirical data, as shown in Fig.~2 and Table \ref{tab1}.
In Fig.~2, we also show the results of the model without
FC, which is similar to the model of Sol\'e {\it et al.}~\cite{sole}.
One can clearly see that without FC, we cannot
account for the clustering and the degree correlation characteristics.
We also examine the full degree-correlation profile of
the joint probability $P(k,k')$ that two proteins with degrees $k$ and
$k'$ are connected to each other.
The degree-correlation intensity is quantified by $P(k,k')/P_{\rm random}(k,k')$,
the ratio with the joint probability in the randomized ensemble of
the original network \cite{maslov,sole03}.
As shown in Fig.~3, the profile obtained from the model
has a pattern that is quite similar to that of the empirical yeast PIN.

\begin{figure}[t]
\centerline{\epsfxsize=\linewidth \epsfbox{fig4.eps}}
\caption{Network randomization test with and without FC.
(a) Clustering function $C(k)$ and
(b) the clustering coefficient $C$ as functions of
the number of edge shufflings are shown.
Symbols are for the unperturbed model network ($\bigcirc$),
the network shuffled with FC ($\diamond$),
and the network shuffled without FC ($\Box$).
The horizontal line in (b) corresponds to the value of the clustering
coefficient in the unperturbed model network.
}
\end{figure}

To get further support for the relevance of the FC constraint,
we performed a network randomization test. We randomized the model network
by using the conventional edge switching method \cite{maslov}, but with the
FC constraint. That is, when we are to switch the interactions
between the protein pairs, only the switching attempts that preserve
FC are accepted. In this way, we can isolate the role of
FC. In Fig.~4, we show the results of randomization. We find that the
high clustering property of the network is preserved under randomization
with FC, but not without FC. Without FC, the clustering coefficient
drops as soon as we shuffle the network, as can be seen in Fig.~4(b).
Thus, we conclude that FC indeed plays a crucial role in PIN evolution.
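A single move of the constrained switching can be sketched as follows.
This is our reading of the procedure; the acceptance conditions other than
FC are the usual ones of the switching method and are assumed here. Two
links $(a,b)$ and $(c,d)$ are chosen at random and rewired as
\begin{equation*}
(a,b),\ (c,d)\ \longrightarrow\ (a,d),\ (c,b),
\end{equation*}
provided that neither new link is a self-interaction or already present
and, in the FC-constrained version, that the new pairs are compatible,
$F_a\leftrightarrow F_d$ and $F_c\leftrightarrow F_b$. Each accepted move
leaves the degree of every protein unchanged, so any difference between
the two shuffled ensembles can be attributed to the compatibility
condition alone.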

\begin{figure}[t]
\centerline{\epsfxsize=9.5cm \epsfbox{fig5.eps}}
\caption{Simulation results for the protein family network:
(a) The family degree distribution $p_d(k_F)$ and
(b) the family size distribution $p_s(s_F)$.
The dotted lines in (a) and (b) are fit lines to Eq.~(\ref{pk}).
}
\end{figure}

Finally, we check the properties of the PFN. In Fig.~5, we show the
degree distribution of the PFN and the family size distribution
generated {\it in silico}. The degree distribution of the PFN follows
a similar form to Eq.~(\ref{pk}), but with a different value of the exponent,
$\gamma_f\approx 3$. The family size distribution also follows a power
law with an exponent between 3 and 4.

In summary, we have introduced an {\em in-silico} model for PIN
evolution. The model network is composed of the PIN and the PFN.
In the early stage of evolution, the PIN and the PFN coevolve,
and in the later stage, the PFN becomes fixed.
The evolution proceeds by the three major mechanisms
previously proposed: duplication, divergence, and mutation.
However, it is constrained by FC and
follows a modified preferential attachment rule in the domain abundance,
which is the new feature of our model.
We have checked various structural properties of the model network, finding
that they show good agreement with those of the integrated empirical data
of the yeast PIN.
Finally, it would be interesting to apply our model to higher eukaryotes,
as the data for the protein interactions are accumulating for
multicellular species such as the nematode worm {\em Caenorhabditis elegans}
\cite{vidal} and the fruit fly {\em Drosophila melanogaster} \cite{giot}.
\\

\begin{acknowledgments}
The authors would like to thank J.~Park for helpful conversations.
This work is supported by Korea Science and Engineering Foundation
grant No. R14-2002-059-01000-0 in the Advanced Basic Research Laboratory
program and Ministry of Science and Technology grant No. M1 03B500000110.
\end{acknowledgments}

\begin{thebibliography}{99}
\bibitem{rmp} R. Albert and A.-L. Barab\'asi, Rev. Mod. Phys. {\bf 74}, 47 (2002).
\bibitem{advphys} S. N. Dorogovtsev and J. F. F. Mendes, Adv. Phys. {\bf 51}, 1079 (2002).
\bibitem{siam} M. E. J. Newman, SIAM Rev. {\bf 45}, 167 (2003).
\bibitem{saemulli} B. Kahng, K.-I. Goh, D.-S. Lee, and D. Kim, Saemulli, New Physics (in Korean) {\bf 48}, 115 (2004).
\bibitem{dslee} D.-S. Lee, K.-I. Goh, B. Kahng, and D. Kim, J. Korean Phys. Soc. {\bf 44}, 633 (2004).
\bibitem{han} C. N. Yoon, S. K. Han, and H. Y. Kim, J. Korean Phys. Soc. {\bf 44}, 638 (2004).
\bibitem{pyramid} Z. N. Oltvai and A.-L. Barab\'asi, {Science} {\bf 298}, 763 (2002).
\bibitem{network-biology} A.-L. Barab\'asi and Z. N. Oltvai, Nat. Rev. Genet. {\bf 5}, 101 (2004).
\bibitem{uetz} P. Uetz, {\em et al.}, {Nature (London)} {\bf 403}, 623 (2000); B. Schwikowski, P. Uetz, and S. Fields, {Nat. Biotechnol.} {\bf 18}, 1257 (2000).
\bibitem{ito} T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, and Y. Sakaki, {Proc. Natl. Acad. Sci.} USA {\bf 98}, 4569 (2001).
\bibitem{tong} A. H. Y. Tong, {\em et al.}, {Science} {\bf 295}, 321 (2002).
\bibitem{gavin} A.-C. Gavin, {\em et al.}, {Nature (London)} {\bf 415}, 141 (2002).
\bibitem{ho} Y. Ho, {\em et al.}, {Nature (London)} {\bf 415}, 180 (2002).
\bibitem{mips} H. W. Mewes, {\em et al.}, Nucl. Acids Res. {\bf 32}, D41 (2004).
\bibitem{dip} L. Salwinski, C. S. Miller, A. J. Smith, F. K. Pettit, J. U. Bowie, and D. Eisenberg, Nucl. Acids Res. {\bf 32}, D449 (2004).
\bibitem{bind} G. D. Bader, D. Betel, and C. W. V. Hogue, Nucl. Acids Res. {\bf 31}, 248 (2003).
\bibitem{alberts} B. Alberts, D. Bray, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, {\it Essential Cell Biology} (Garland, New York, 1998).
\bibitem{jpark} J. Park, M. Lappe, and S. A. Teichmann, {J. Mol. Biol.} {\bf 307}, 929 (2001).
\bibitem{huynen} M. A. Huynen and E. van Nimwegen, {Mol. Biol. Evol.} {\bf 15}, 583 (1998).

452: \bibitem{vazquez} A. V\'azquez, A. Flammini, A. Maritan, and A. Vespignani, {ComPlexUs} {\bf 1}, 38 (2003).

453: \bibitem{kim} J. Kim, P. L. Krapivsky, B. Kahng, and S. Redner, Phys. Rev. E {\bf 66}, 05510(R) (2002).

454: \bibitem{chung} F. Chung, L. Lu, T. G. Dewey, and D. J. Galas, {J. Comput. Biol.} {\bf 18}, 1486 (2003).

455: \bibitem{berg} J. Berg, M. L\"assig, and A. Wagner, BMC Evol. Biol. {\bf 4}, 51 (2004).

456: \bibitem{ohno} S. Ohno, {\it Evolution by Gene Duplication} (Springer-Verlag, Berlin, 1970).

457: \bibitem{superfamily} J. Gough, K. Karplus, R. Hughey, and C. Chothia, J. Mol. Biol. {\bf 313}, 903 (2001).

458: \bibitem{lethal} H. Jeong, S. P. Mason, A.-L. Barab\'asi, and Z. N. Oltvai, {Nature (London)} {\bf 411}, 41 (2001).

459: \bibitem{wagner} A. Wagner, {Mol. Biol. Evol.} {\bf 18}, 1283 (2001).

460: \bibitem{maslov} S. Maslov and K. Sneppen, {Science} {\bf 296}, 910 (2002).

461: \bibitem{ab} R. Albert and A.-L. Barab\'asi, Phys. Rev. Lett. {\bf 85}, 5234 (2000).

462: \bibitem{koonin} E. V. Koonin, Y. I. Wolf, and G. P. Karev, {Nature} {\bf 420}, 218 (2002).

463: \bibitem{vespignani2} A. V\'azquez, R. Pastor-Satorras, and A. Vespignani, Phys. Rev. E {\bf 65}, 066130 (2002).

464: \bibitem{ravasz} E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barab\'asi, Science {\bf 297,} 1551 (2002); E. Ravasz and A.-L. Barab\'asi, Phys. Rev. E {\bf 67,} 026112 (2003).

465: \bibitem{han-vidal} J.-D. Han, {\em et al.}, Nature (London) {\bf 430}, 88 (2004).

466: \bibitem{knn} R. Pastor-Satorras, A. V\'azquez and A. Vespignani, Phys. Rev. Lett. {\bf 87,} 258701 (2001).

467: \bibitem{assort} M. E. J. Newman, {Phys. Rev. Lett.} {\bf 89}, 208701 (2002).

468: \bibitem{todd} A. E. Todd, C. A. Orengo, and J.~M. Thornton, {J. Mol. Biol.} {\bf 307}, 1113 (2001).

469: \bibitem{sole03} R.~V. Sol\'e and P. Fern\'andez, (arXiv:q-bio.GN/0312032).

470: \bibitem{vidal} S. Li, {\em et al.}, Science {\bf 303}, 540 (2004).

471: \bibitem{giot} L. Giot, {\em et al.}, Science {\bf 302}, 1727 (2003).

472: \end{thebibliography}

473: \end{document}

474:

475: