physics0205002/nw.tex
1: \documentclass[twocolumn,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}
2: \usepackage{graphicx}% Include figure files
3: \usepackage{epsfig}% Include figure files
4: \usepackage{dcolumn}% Align table columns on decimal point
5: \usepackage{bm}% bold math
6: %\documentstyle[aps,prl,epsf,psfig,twocolumn,floats,showpacs]{revtex}
7: %\documentclass[aps,prl,epsf,psfig,twocolumn,floats,showpacs]{revtex}
8: \begin{document} 
9: %\twocolumn[\hsize\textwidth\columnwidth\hsize\csname @twocolumnfalse\endcsname
10: \title{Evolving Networks with Multi-species Nodes \\ 
11: and Spread in the Number of Initial Links}
12: \author{Jong-Won Kim$^1$, Brian Hunt$^2$, and Edward Ott$^{1,3}$}
13: \affiliation{$^1$Department of Physics, and Institute for Research in 
14: Electronics and Applied Physics, \\
15: University of Maryland, College Park, Maryland  20742 \\ 
16: $^2$Department of Mathematics, and Institute for Physical Science and
17: Technology, \\
18: University of Maryland, College Park, Maryland  20742 \\ 
19: $^3$Department of Electrical and Computer Engineering, 
20: University of Maryland, College Park, Maryland  20742} 
21: \date{\today} 
22:  
23: \begin{abstract}
24:   We consider models for growing networks incorporating two effects not
25: previously considered: (i) different species of nodes, with each
26: species having different properties (such as different attachment probabilities 
27: to other node species); and (ii) when a new node is born, its number of links to
28: old nodes is random with a given probability distribution. Our numerical 
29: simulations show good agreement with analytic solutions.
30: As an application of our model, we investigate the movie-actor network with 
31: movies considered as nodes and actors as links. 
32: \end{abstract}
33: \pacs{05.10.-a, 05.45.Pq, 02.50.Cw, 87.23.Ge}
34: \maketitle 
35: %]
36: 
37: \section{Introduction}
38: 
39:   It is known that many evolving network systems, including the world wide web, 
40: as well as social, biological, and communication systems, show power law 
41: distributions. In particular, the number of nodes with $k$ links is often 
42: observed to be $n_k \sim k^{-\nu}$, where $\nu$ typically varies from 2.0 to 3.1
43: \cite{Dorogovtsev1}. The mechanism for power-law network scaling was addressed 
44: in a seminal paper by 
45: Barab\'{a}si and Albert (BA) who proposed \cite{Barabasi1} 
46: a simple growing network model in which the probability of a new 
47: node forming a link with an old node (the ``attachment probability'') is 
48: proportional to the number of links of the old node. This model yields a power 
49: law distribution of links with exponent $\nu = 3$. 
50: Many subsequent works have extended this model. 
51: For example, Krapivsky and Redner \cite{Krapivsky1} provide a comprehensive 
52: description for a model with more general dependence of the attachment 
53: probability on the number $k$ of old node links. 
54: For attachment probability proportional to $A_k = a k + b$ they found that, 
55: depending on $b/a$,
56: the exponent $\nu$ can vary from 2 to $\infty$. Furthermore, 
57: for $A_k \sim k^\alpha$, when $\alpha < 1$, $n_k$ decays faster than a power 
58: law, while when $\alpha > 1$, there emerges a single node which connects to 
59: nearly all other nodes. Other modifications of the model are the introduction
60: of aging of nodes \cite{Dorogovtsev2}, initial attractiveness of nodes 
61: \cite{Dorogovtsev3}, the addition or re-wiring of links \cite{Albert1}, the
62: assignment of weights to links \cite{Yook1}, etc. 
63: 
64:   We have attempted to construct more general growing network models featuring
65: two effects which have not been considered previously: (i) multiple species of
66: nodes [in real network systems, there may be different species of nodes with 
67: each species having different properties (\it e.g.\rm, each species may have 
68: different 
69: probabilities for adding new nodes and may also have different attachment 
70: probabilities to the same node species and to other node species, etc.)]; and 
71: (ii) initial link distributions [\it i.e.\rm, when a new node is born, its 
72: number of links 
73: to old nodes is not necessarily a constant number, but, rather, is 
74: characterized by a given probability distribution $p_k$ of new links]. 
75: 
76:   As an application of our model, we investigate the movie-actor network
77: with movies considered as nodes and actors as links (\it i.e.\rm, if the same 
78: actor appears in two movies there is a link between the two 
79: movies \cite{movie}). Moreover, we consider theatrical movies and 
80: made-for-television movies to constitute two different species. 
81: 
82: \section{Model}
83: 
84:   We construct a growing network model which incorporates multiple species and 
85: initial link probabilities. Given an initial network, we create new nodes at
86: a constant rate. We let the new node belong to species $j$ with 
87: probability $Q^{(j)}$ ($\sum_j Q^{(j)}=1$). We decide 
88: how many links $l$ the new node establishes with already existing nodes by 
89: randomly choosing $l$ from a probability distribution $p^{(j)}_l$. 
90: Then, we randomly attach the new node to $l$ existing nodes with preferential 
91: attachment probability proportional to a factor $A^{(j,i)}_k$, where $k$ is the
92: number of links of the target node of species $i$ to which the new node of 
93: species $j$ may connect. That is, the connection probability between an 
94: existing node and a new node is determined by the number of links of the 
95: existing node and the species of the new node and the target node.
96: 
97:   As for the single species case \cite{Krapivsky1}, the evolution of this model 
98: can be described by rate equations. In our case the rate equations give the
99: evolution of $N^{(i)}_k$, the number of species $i$ nodes that have 
100: $k$ links, 
101: \begin{eqnarray}
102: \frac{dN^{(i)}_k}{dt} &=& \sum^S_{j=1}Q^{(j)}\bar{k}^{(j)}
103: \frac{\left[A^{(j,i)}_{k-1} N^{(i)}_{k-1} - A^{(j,i)}_k N^{(i)}_k \right]}
104: {\sum_m \sum_k A^{(j,m)}_k N^{(m)}_k} \nonumber \\
105: & & + Q^{(i)}p^{(i)}_k,
106: \label{eq:Nk}
107: \end{eqnarray}
108: where $S$ is the total number of species and $\bar{k}^{(j)}=\sum_l l p^{(j)}_l$ 
109: is the average number of new links to a new node of species $j$, and $t$ is 
110: normalized so that the rate of creation of new nodes is one per unit time. 
111: The term proportional to 
112: $A^{(j,i)}_{k-1}N^{(i)}_{k-1}$ accounts for the increase of $N^{(i)}_k$ 
113: due to the addition of a new node of species $j$ that links to a species 
114: $i$ node with $k-1$ connections. The term proportional to 
115: $A^{(j,i)}_k N^{(i)}_k$ accounts for the decrease of $N^{(i)}_k$ due to 
116: linking of a new species $j$ node with an existing species $i$ node with 
117: $k$ connections. The denominator, $\sum_m \sum_k A^{(j,m)}_k N^{(m)}_k$,
118: is a normalization factor. If we add a new node with $l$ initial links, we have 
119: $l$ chances of increasing/decreasing $N^{(i)}_k$. This is accounted for by the 
120: factor $\bar{k}^{(j)} = \sum_l l p^{(j)}_l$ appearing in the summand of 
121: Eq. (\ref{eq:Nk}). The last term, $Q^{(i)}p^{(i)}_k$, accounts for the 
122: introduction of new nodes of species $i$. Since all nodes have at least one 
123: link, $N^{(i)}_0 = 0 $.
124: 
125: \section{Analysis of the Model} 
126: 
127:   Equation (\ref{eq:Nk}) implies that the total number of nodes and the total 
128: number of links increase at fixed rates. The total number of nodes of species $i$ 
129: increases at the rate $Q^{(i)}$. Thus
130: \begin{equation}
131: \sum_kN^{(i)}_k = Q^{(i)}t.
132: \label{eq:Qi}
133: \end{equation}
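As a quick check of Eq. (\ref{eq:Qi}) (a sketch, assuming that 
$A^{(j,i)}_K N^{(i)}_K \to 0$ as $K \to \infty$, that $\sum_k p^{(i)}_k = 1$, 
and that the initial network is negligible at large $t$), we sum 
Eq. (\ref{eq:Nk}) over $k$. The bracketed terms telescope, 
\[
\sum_{k=1}^{K} \left[A^{(j,i)}_{k-1} N^{(i)}_{k-1} - A^{(j,i)}_k N^{(i)}_k \right]
= A^{(j,i)}_0 N^{(i)}_0 - A^{(j,i)}_K N^{(i)}_K \to 0,
\]
since $N^{(i)}_0 = 0$, leaving 
$d\left(\sum_k N^{(i)}_k\right)/dt = Q^{(i)} \sum_k p^{(i)}_k = Q^{(i)}$.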
134: The link summation over all species $\sum_i \sum_k kN^{(i)}_k$ is twice the 
135: total number of links in the network. Thus
136: \begin{equation}
137: \sum^S_{i=1} \sum_k k N^{(i)}_k = 2 \left<\dot{k}\right>t,
138: \label{eq:kbar}
139: \end{equation}
140: where 
141: $\left<\dot{k}\right>=\sum_i\sum_kQ^{(i)}kp^{(i)}_k=\sum_iQ^{(i)}\bar{k}^{(i)}$.
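Equation (\ref{eq:kbar}) can be checked directly from Eq. (\ref{eq:Nk}) (a 
sketch, assuming that boundary terms at large $k$ vanish and that the initial 
network is negligible). Multiplying Eq. (\ref{eq:Nk}) by $k$ and summing over 
$k$, the shifted sum gives 
\[
\sum_k k \left[A^{(j,i)}_{k-1} N^{(i)}_{k-1} - A^{(j,i)}_k N^{(i)}_k \right]
= \sum_k \left[(k+1) - k\right] A^{(j,i)}_k N^{(i)}_k 
= \sum_k A^{(j,i)}_k N^{(i)}_k .
\]
Summing over $i$ then cancels the normalization factor, so that
\[
\frac{d}{dt} \sum_i \sum_k k N^{(i)}_k 
= \sum_j Q^{(j)} \bar{k}^{(j)} + \sum_i Q^{(i)} \bar{k}^{(i)} 
= 2 \left<\dot{k}\right>.
\]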
142: Solutions of (\ref{eq:Nk}) occur in the form (\it cf.\rm\ Ref. \cite{Krapivsky1} for
143: the case of single species nodes),
144: \begin{equation}
145: N^{(i)}_k = n^{(i)}_k t,
146: \label{eq:nkt}
147: \end{equation}
148: where $n^{(i)}_k$ is independent of $t$. Eq. (\ref{eq:Nk}) yields 
149: \begin{equation}
150: n^{(i)}_k = \frac{B^{(i)}_{k-1}n^{(i)}_{k-1} + Q^{(i)}p^{(i)}_k}
151: {(B^{(i)}_k + 1)},
152: \label{eq:sol}
153: \end{equation}
154: where $B^{(i)}_k$ is
155: \begin{equation}
156: B^{(i)}_k = \sum^S_{j=1}Q^{(j)}\bar{k}^{(j)}
157: \frac{A^{(j,i)}_k} {\sum_m \sum_k A^{(j,m)}_k n^{(m)}_k}.
158: \label{eq:eta}
159: \end{equation}
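To make the step from Eq. (\ref{eq:Nk}) to Eq. (\ref{eq:sol}) explicit (a short 
sketch using only the definitions above), substitute 
$N^{(i)}_k = n^{(i)}_k t$ into Eq. (\ref{eq:Nk}). The left-hand side becomes 
$n^{(i)}_k$, and the factors of $t$ cancel between numerator and denominator 
on the right-hand side, giving
\[
n^{(i)}_k = B^{(i)}_{k-1} n^{(i)}_{k-1} - B^{(i)}_k n^{(i)}_k + Q^{(i)} p^{(i)}_k ,
\]
which yields Eq. (\ref{eq:sol}) on solving for $n^{(i)}_k$.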
160: 
161: %\subsection{Single Species with initial link probability} 
162: 
163:   To most simply illustrate the effect of spread in the initial number of 
164: links, we first consider the case of a network with a single species of node
165: and with a simple form for the attachment $A_k = A^{(1,1)}_k$. In particular,
166: we choose \cite{Krapivsky1}, $A_k = k +c$. (Note that by Eq. (\ref{eq:Nk})
167: this is equivalent to $A_k = ak+b$ with $c=b/a$.)
168: Inserting this $A_k$ into Eq. (\ref{eq:eta}), we 
169: obtain $\sum_k (k+c)n_k = 2{\left<\dot{k}\right>} +cQ$ and  
170: $B_k = (k+c)/\eta$, where 
171: $\eta = (2{\left<\dot{k}\right>}+cQ)/{(Q\bar{k})} = 2 +c/{\bar{k}} \ge 2$. 
172: (Note that $\left<\dot{k}\right> = Q \bar{k}$ for the single species case.) 
173: Thus Eq. (\ref{eq:sol}) yields 
174: \begin{equation}
175: \left[(k+c) n_k - (k+c-1)n_{k-1} \right] + \eta n_k =  \eta Q p_k.
176: \label{eq:sol1}
177: \end{equation}
178: Setting $p_k = p_1 (k+c)^{-\beta}$, we can solve Eq. (\ref{eq:sol1}) for large 
179: $k$ by approximating the discrete variable $k$ as continuous, so that
180: \begin{equation}
181: (k+c) n_k - (k+c-1) n_{k-1} \cong \frac{d}{dk}[(k+c)n_k].
182: \label{eq:approx}
183: \end{equation}
184: Solution of the resulting differential equation,
185: \begin{equation}
186: \frac{d}{dk}[(k+c)n_k] + \eta n_k = \eta Q p_1 (k+c)^{-\beta},
187: \label{eq:sol2}
188: \end{equation}
189: for $n_k$ with $\beta \ne \eta+1$ consists of a homogeneous solution 
190: proportional to $(k+c)^{-(\eta+1)}$ plus the particular solution, 
191: $[\eta Q p_1/(\eta +1 -\beta)](k+c)^{-\beta}$. For $\beta = \eta+1$ 
192: the solution is $n_k = \eta Q p_1 (k+c)^{-(\eta +1)} \ln [d(k+c)]$, where $d$ is
193: an arbitrary constant. Hence, for \it sufficiently large \rm $k$ we have
194: $n_k \sim k^{-(\eta + 1)}$ if $\beta > \eta + 1$, and $n_k \sim k^{-\beta}$ 
195: if $\beta < \eta + 1$. Thus the result for $\beta > \eta + 1$ is independent of 
196: $\beta$ and, for $c = 0$, coincides 
197: with that given in Ref. \cite{Barabasi1} ($\eta +1 = 3$ when $c = 0$).
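These solutions can be verified by substitution (a brief check; the constant 
$C$ is introduced here only for this purpose). For the homogeneous equation, 
$(k+c)\,dn_k/dk + (\eta+1) n_k = 0$ gives $n_k \propto (k+c)^{-(\eta+1)}$. 
For the trial particular solution $n_k = C (k+c)^{-\beta}$, 
\[
\frac{d}{dk}\left[C(k+c)^{1-\beta}\right] + \eta C (k+c)^{-\beta} 
= C(\eta + 1 - \beta)(k+c)^{-\beta},
\]
which matches the right-hand side of Eq. (\ref{eq:sol2}) when 
$C = \eta Q p_1/(\eta+1-\beta)$.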
198: Solutions of Eq. (\ref{eq:sol1}) for $n_k$ versus $k$ in the range
199: $1 \le k \le 10^4$ are shown as open circles in Fig. \ref{fig:model}(a)
200: for initial link probabilities of the form
201: \begin{eqnarray}
202: p_k = \left\{ 
203:   \begin{array}{ll} 
204:     p_1 k^{-1} & \text{   for $1 \le k \le 10^2$} \\
205:     p_1 10^{2(\bar{\beta}-1)} k^{-\bar{\beta}} &\text{   for $k > 10^2$,}
206:   \end{array} 
207: \right.
208: \label{eq:pk}
209: \end{eqnarray}
210: which are plotted as solid lines in Fig. \ref{fig:model}(a).
211: The values of $\bar{\beta}$ used for the figure are 
212: $\bar{\beta} = 0.5, 1, 2, 3, 4$, and $\infty$ ($\bar{\beta}=\infty$ corresponds
213: to $p_k \equiv 0$ for $k > 10^2$). For clarity $n_k$ has been 
214: shifted by a constant factor so that $n_1$ coincides with the corresponding 
215: value of $p_1$. Also, to separate the graphs for easier visual inspection,
216: the value of $p_1$ for successive $\bar{\beta}$ values is changed
217: by a constant factor [since (\ref{eq:sol1}) is linear, the form of the solution
218: is not affected].
219: We note from Fig. \ref{fig:model}(a) that $n_k$ follows $p_k$ for $k < 10^2$
220: in all cases. This is as expected, since $p_k$ decreases slower than $k^{-3}$
221: in this range. Furthermore, $n_k$ very closely follows $p_k$ for $k > 10^2$
222: for $\bar{\beta} = 0.5, 1.0, 2.0$. As $\bar{\beta}$ increases deviations of 
223: $n_k$ from $p_k$ in $k > 10^2$ become more evident, and the large $k$ asymptotic
224: $k^{-3}$ dependence is observed. Thus, if $p_k$ decreases sufficiently rapidly,
225: then the behavior of $n_k$ is determined by the growing network dynamics, 
226: while, if $p_k$ decreases slowly, then the behavior of $n_k$ is determined
227: by $p_k$.
228: 
229: \begin{figure}[t]
230: \epsfig{file=model.eps, width=90mm}
231: \caption{(a) $n_k$ and $p_k$ versus $k$ for the single species network
232: model. Solid lines are the initial link probability $p_k$ and circles 
233: are the $n_k$ obtained from Eq. (\ref{eq:sol1}).
234: (b) $n^{(1)}_k$ and $n^{(2)}_k$ versus $k$ for the two species network model. 
235: Circles (species $1$) and crosses (species $2$) are log-binned data from our 
236: numerical simulation. The total number of nodes in our numerical network system 
237: is $10^6$. The dashed lines are solutions obtained from (\ref{eq:sol}) and
238: (\ref{eq:eta2}).}
239: \label{fig:model}
240: \end{figure}
241: 
242: %\subsection{Multi-species without initial link probability} 
243: 
244:   To simply illustrate the effect of multiple species we now consider a 
245: growing two species network with $p_k = \delta_{1,k}$ (\it i.e.\rm,
246: $p_k = 0$ for $k \ge 2$). Then, Eq. (\ref{eq:eta}) becomes
247: \begin{subequations} \label{eq:eta1}
248: \begin{align}
249: B^{(1)}_k &= 
250: \frac{Q^{(1)}A^{(1,1)}_k}{\sum_m \sum_k A^{(1,m)}_k n^{(m)}_k} + 
251: \frac{Q^{(2)}A^{(2,1)}_k}{\sum_m \sum_k A^{(2,m)}_k n^{(m)}_k}, 
252: \label{eq:eta1a} \\
253: B^{(2)}_k &= 
254: \frac{Q^{(1)}A^{(1,2)}_k}{\sum_m \sum_k A^{(1,m)}_k n^{(m)}_k} + 
255: \frac{Q^{(2)}A^{(2,2)}_k}{\sum_m \sum_k A^{(2,m)}_k n^{(m)}_k},
256: \label{eq:eta1b}
257: \end{align}
258: \end{subequations}
259: where $\sum_m$ represents summation over species $1$ and $2$ nodes.
260: 
261:   In order to illustrate the model with our numerical simulations, we specialize
262: to a specific case. We choose attachment coefficients 
263: $A^{(1,1)}_k = ak$, $A^{(1,2)}_k = ak$, $A^{(2,1)}_k = bk$, and $A^{(2,2)}_k=0$.
264: Thus a new species $1$ node connects to either existing 
265: species $1$ nodes or species $2$ nodes with equal probability, while a new 
266: species $2$ node can connect to existing species $1$ nodes only.
267: Therefore, the first summation term in Eq. (\ref{eq:eta1}), 
268: ${\sum_m \sum_k A^{(1,m)}_k n^{(m)}_k}$, becomes 
269: $a\sum_k(kn^{(1)}_k+kn^{(2)}_k)$,
270: which is $a$ times the total increase of links at each time 
271: $a \times 2(Q^{(1)}+Q^{(2)})$. Recall that 
272: $Q^{(1)} + Q^{(2)} = 1$. In order to calculate the second summation term in 
273: Eq. (\ref{eq:eta1}), 
274: ${\sum_m \sum_k A^{(2,m)}_k n^{(m)}_k} = b \sum_k k n^{(1)}_k$, 
275: we define a parameter $\gamma$ that is the 
276: ratio of the total number of links of species $1$ to the total number of links 
277: in the network. Since the probability of linking a new species $1$ node to 
278: existing species $1$ nodes is determined by the total number of links of 
279: species $1$,
280: this probability is exactly the same as $\gamma$. Thus, if we add a new species $1$ 
281: node, the number of links of species $1$ increases by $Q^{(1)}$ due to
282: the new node and by $\gamma Q^{(1)}$ due to the existing species $1$ nodes 
283: that become
284: connected with the new node, while the number of links of species $2$
285: increases by $(1-\gamma)Q^{(1)}$.
286: But, if we add a new species $2$ node, the number of links 
287: increases by $Q^{(2)}$ for both species because a new species $2$ node can 
288: link to species $1$ nodes only. Thus, the increase of species $1$ links is 
289: $(1+\gamma)Q^{(1)} + Q^{(2)}$ and that of species $2$ links is 
290: $(1-\gamma)Q^{(1)} + Q^{(2)}$. Since $\gamma$ is the ratio of the 
291: number of species $1$ links to the total number of links,
292: $\gamma = [(1+\gamma)Q^{(1)} + Q^{(2)}]/2$ or
293: \begin{equation}
294: \gamma = \frac{1}{2-Q^{(1)}}.
295: \label{eq:gamma}
296: \end{equation} 
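Explicitly (spelling out the algebra, and using $Q^{(1)} + Q^{(2)} = 1$), 
\[
2\gamma = (1+\gamma) Q^{(1)} + Q^{(2)} = 1 + \gamma Q^{(1)} 
\quad \Longrightarrow \quad \gamma \left(2 - Q^{(1)}\right) = 1 .
\]
For example, $Q^{(1)} = 0.5$ gives $\gamma = 2/3$.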
297: With this $\gamma$, Eq. (\ref{eq:eta1}) becomes
298: \begin{subequations} \label{eq:eta2}
299: \begin{eqnarray}
300: B^{(1)}_k &=& \frac{Q^{(1)}}{2}k + \frac{Q^{(2)}(2-Q^{(1)})}{2}k 
301: = \frac{k}{\eta^{(1)}}, 
302: \label{eq:eta2a} \\
303: B^{(2)}_k &=& \frac{Q^{(1)}}{2}k = \frac{k}{\eta^{(2)}}. 
304: \label{eq:eta2b}
305: \end{eqnarray}
306: \end{subequations}
307: where $\eta^{(1)} = 2/[Q^{(1)}+Q^{(2)}(2-Q^{(1)})]$ 
308: and $\eta^{(2)} = 2/Q^{(1)}$. 
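The two normalization sums entering Eq. (\ref{eq:eta2}) can be written out as 
follows (a sketch based on the link counting above; recall that 
$\bar{k}^{(j)} = 1$ here, so that $\sum_m \sum_k k n^{(m)}_k = 2$):
\[
\sum_m \sum_k A^{(1,m)}_k n^{(m)}_k = 2a, \qquad
\sum_m \sum_k A^{(2,m)}_k n^{(m)}_k = b \sum_k k n^{(1)}_k = 2 b \gamma 
= \frac{2b}{2 - Q^{(1)}} .
\]
The constants $a$ and $b$ therefore cancel from $B^{(i)}_k$.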
309: 
310:   Proceeding as for the single species case, we approximate (\ref{eq:sol}) by
311: an ordinary differential equation (\it cf.\rm\ Eq. (\ref{eq:sol2})) to obtain
312: $n^{(i)}_k \sim k^{-(1+\eta^{(i)})}$. As an example, we set 
313: $Q^{(1)} = Q^{(2)} = 0.5$, in which case Eqs. (\ref{eq:eta2}) give exponents
314: $1+\eta^{(1)}=2.6$ and $1+\eta^{(2)} = 5$. In Fig. \ref{fig:model}(b) we plot,
315: for this case, the analytic solution obtained from (\ref{eq:sol}) and
316: (\ref{eq:eta2}) as dashed lines, and the results of numerical simulations
317: as open circles and crosses. The simulation results, obtained by histogram
318: binning with uniform bin size in $\log k$, agree with the analytic solutions,
319: and both show the expected large $k$ power law behaviors, 
320: $n^{(1)}_k \sim k^{-2.6}$ and $n^{(2)}_k \sim k^{-5}$. 
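For reference, the arithmetic behind these exponents is 
\[
\eta^{(1)} = \frac{2}{0.5 + 0.5 \times 1.5} = \frac{2}{1.25} = 1.6, 
\qquad 
\eta^{(2)} = \frac{2}{0.5} = 4 ,
\]
so that $1 + \eta^{(1)} = 2.6$ and $1 + \eta^{(2)} = 5$.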
321: 
322: \section{The Movie-Actor Network}
323: 
324:   We now investigate the movie-actor network. We collected data from the 
325: Internet Movie Database (IMDb) web site \cite{Imdb}. The total number of 
326: movies is 285,297 and the total number of actors/actresses is 555,907. Within 
327: this database are 226,325 theatrical movies and 24,865 made for television 
328: movies. The other movies in the database are made for television series, 
329: video, mini series, and video games. In order to get good statistics,
330: we choose only theatrical and television movies made 
331: between 1950 and 2000. Thus we have two species of movies.
332: We also consider only actors/actresses from these movies.
333: We consider two movies to be linked if they have an actor/actress in common.
334: We label the theatrical movies species $1$, and the made for television
335: movies species $2$.
336: 
337:   In order to apply our model, Eq. (\ref{eq:Nk}), we require as input
338: $Q^{(j)}, p^{(j)}_k$ and $A^{(j,i)}_k$ which we obtain from the movie-actor
339: network data. We take $Q^{(1)}$ and $Q^{(2)}$ to be, respectively, 
340: the fractions of theatrical and made for television movies in our data base.
341: We obtain $Q^{(1)} = 0.83$ and $Q^{(2)} = 0.17$. We now consider $p^{(j)}_k$.
342: Suppose a new movie is produced casting $r$ actors. For each actor $s$
343: $(s = 1, 2, ..., r)$ let $l_s$ denote the number of previous movies in which
344: that actor appeared. Then the total number of the
345: initial links of the new movie is $\sum_s l_s$. From histograms of this 
346: number, we obtain (Fig. \ref{fig:pks}) the initial link probability 
347: distributions $p^{(j)}_k$. 
348: 
349: \begin{figure}[t]
350: \epsfig{file=pk.eps,width= 8.5 cm}
351: \caption{The initial link probability distributions $p_k$ of
352: (a) theatrical movies and (b) television movies. These plots are obtained 
353: using bins of equal width in $\log k$ and dividing the number of nodes in
354: each bin by the product of the bin width in $k$ (which varies from bin to bin)
355: and the total number of nodes.}
356: \label{fig:pks}
357: \end{figure}
358: 
359: The attachment $A^{(j,i)}_k$ can be numerically obtained from data via 
360: \begin{equation}
361: A^{(j,i)}_k \sim \frac{\left<\Delta(j;i,k)\right>}{\delta t},
362: \label{eq:ak}
363: \end{equation}
364: where $\Delta(j;i,k)$ is the increase during a time interval $\delta t$  
365: in the number of links between old species $i$ nodes that had $k$ links
366: and new species $j$ nodes, and $\left<\cdots\right>$ is an average over all such species
367: $i$ nodes \cite{Jeong1}. In the movie network, we count all movies and links 
368: from 1950 to 1999, and measure the increments in the number of links for a 
369: $\delta t$ of one year. We obtain attachment coefficients 
370: $A^{(1,1)}_k \sim 0.10k^{0.59}$ and 
371: $A^{(1,2)}_k \sim 0.04k^{0.85}$ for theatrical movies, 
372: and $A^{(2,1)}_k \sim 0.02k^{0.71}$ and $A^{(2,2)}_k \sim 0.04k^{0.77}$ 
373: for television movies. See Fig. \ref{fig:aks}.
374: 
375: \begin{figure}[t]
376: \epsfig{file=aks.eps,width= 8.5 cm}
377: \caption{Attachment coefficients for theatrical movies (a) $A^{(1,1)}_k$ and 
378: (b) $A^{(1,2)}_k$, and for television movies (c) $A^{(2,1)}_k$ and 
379: (d) $A^{(2,2)}_k$. All data are obtained using log-binning without 
380: normalization (see caption to Fig. \ref{fig:pks}).}
381: \label{fig:aks}
382: \end{figure}
383: 
384:   Incorporating these results for $Q^{(i)}$, $p^{(i)}_k$ and $A^{(j,i)}_k$
385: in our multi-species model, Eq. (\ref{eq:Nk}), we carry out numerical 
386: simulations as follows: 
387: (i) We add a new movie at each time step. We randomly designate each new movie
388: as a theatrical movie with probability $Q^{(1)}=0.83$ or a television movie 
389: with probability $Q^{(2)}=0.17$. 
390: (ii) With initial link probability $p^{(j)}_k$,
391: we randomly choose the number of connections to make to old movies.  
392: (iii) We then use the attachment $A^{(j,i)}_k$ to randomly choose connections
393: of a new species $j$ movie to old species $i$ movies. 
394: (iv) We repeat (i)-(iii) adding 100,000 new movies, and finally calculate 
395: the probability distributions of movies with $k$ links. 
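In step (iii), one natural reading (our sketch, following the normalization 
factor in Eq. (\ref{eq:Nk}); the symbol $\Pi$ is introduced here only for 
illustration) is that each link of a new species $j$ movie selects a particular 
old species $i$ movie with $k$ links with probability
\[
\Pi^{(j,i)}_k = \frac{A^{(j,i)}_k}{\sum_m \sum_{k'} A^{(j,m)}_{k'} N^{(m)}_{k'}} .
\]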
396:  
397: \begin{figure}[b]
398: \epsfig{file=nk.eps,width= 8 cm}
399: \caption{The probability distributions $n^{(i)}_k$ of movies that have $k$
400: links; (a) theatrical movies $n^{(1)}_k$ and (b) television movies $n^{(2)}_k$.
401: Dots are $n^{(i)}_k$ obtained from the movie network while circles are 
402: from numerical simulation using $Q^{(j)}$ obtained from our data base, 
403: $p^{(j)}_k$ in Fig. \ref{fig:pks} and $A^{(j,i)}_k$ in Fig. \ref{fig:aks}. 
404: All data are obtained using log-binning (see caption to Fig. \ref{fig:pks}).}
405: \label{fig:nk}
406: \end{figure}
407: 
408:   Figure \ref{fig:nk} shows $n^{(i)}_k$ versus $k$ obtained from our 
409: movie-actor network data base (dots) and from numerical simulations using 
410: Eq. (\ref{eq:Nk}) (open circles) with our empirically obtained 
411: results for $Q^{(j)}$, $p^{(j)}_k$, and $A^{(j,i)}_k$. The results 
412: are roughly consistent with the existence of two scaling regions 
413: \cite{twoscale1}.
414: For small $k$ ($k \lesssim 10^2$) the two species exhibit slow power law
415: decay with different exponents, $n^{(1)}_k \sim k^{-0.5}$, 
416: $n^{(2)}_k \sim k^{-0.2}$, while for large $k$ the probabilities decay
417: much more rapidly. Indeed, the results of \cite{Krapivsky1} suggest that
418: the decay should be exponential for large $k$ since the attachment coefficients
419: $A^{(j,i)}_k$ grow sublinearly with $k$.
420: We showed in Sec. III for the single species 
421: model with a linear attachment $A_k \sim k$ that $n_k$ follows $p_k$ when 
422: $p_k$ decays slowly, while $n_k$ is independent of $p_k$ when $p_k$ decays 
423: sufficiently quickly. As we will later show, this feature is also applicable to 
424: multi-species networks with nonlinear attachments. 
425: As seen in Figs. \ref{fig:pknk}(a) and \ref{fig:pknk}(b), $n^{(i)}_k$ 
426: follows $p^{(i)}_k$ in the small $k$ region. However, it is not clear whether 
427: $n^{(i)}_k$ follows $p^{(i)}_k$ in the large $k$ region.
428: In order to check the behavior of $n^{(i)}_k$ in this region, we carried out 
429: another numerical simulation using an initial link probability 
430: $\bar{p}^{(i)}_k$ which is cut off at $k=50$. 
431: That is, $\bar{p}^{(i)}_k = p^{(i)}_k/\sum_{k' \le 50} p^{(i)}_{k'}$ when 
432: $k \le 50$ and $\bar{p}^{(i)}_k = 0$ when $k > 50$. 
433: Using $\bar{p}^{(i)}_k$ in place of $p^{(i)}_k$, 
434: we obtain from our simulation corresponding data, $\bar{n}^{(i)}_k$ versus
435: $k$, which are shown in Figs. \ref{fig:pknk}(c) and \ref{fig:pknk}(d) as
436: filled circles. For comparison the data for $n^{(i)}_k$ from Figs.
437: \ref{fig:pknk}(a) and \ref{fig:pknk}(b) are plotted in Figs. \ref{fig:pknk}(c)
438: and \ref{fig:pknk}(d) as open circles. It is seen that the cutoff at $k=50$
439: induces a substantial change in the distribution of the number of links
440: for $k>50$. Thus it appears that, in the range tested, the large $k$ behavior 
441: of the movie-actor network is determined by the initial link probability 
442: $p^{(i)}_k$ rather than by the dynamics of the growing network.
443: 
444: %\section{Conclusion}
445: 
446:   In conclusion, in this paper we propose a model for a multi-species network
447:  with variable initial link probabilities. We have investigated the 
448: movie-actor network as an example. We believe that the effect of multiple 
449: species nodes may be important for modeling other complicated networks
450: (\it e.g.\rm, the world wide web can be divided into commercial sites and 
451: educational or personal sites).  We also conjecture that the initial link 
452: probability is a key feature of many growing networks.
453:  
454: \begin{figure}[t]
455: \epsfig{file=pknk.eps,width= 8 cm}
456: \caption{(a) and (b) are $n^{(i)}_k$ (circles) obtained from numerical 
457: simulations using $p^{(i)}_k$ (dashed lines), while (c) and (d) show 
458: $n^{(i)}_k$ from (a) and (b) (open circles) plotted with results denoted
459: $\bar{n}^{(i)}_k$ (filled circles) from simulation using  a cutoff initial
460: link probability $\bar{p}^{(i)}_k$ 
461: (where $\bar{p}^{(i)}_k = p^{(i)}_k/\sum_{k' \le 50} p^{(i)}_{k'}$ when
462: $k \le 50$ and $\bar{p}^{(i)}_k = 0$ when $k > 50$). 
463: All data are obtained using log-binning (see caption to Fig. \ref{fig:pks}).}
464: \label{fig:pknk}
465: \end{figure}
466: 
467: \begin{references}
468: \bibitem{Dorogovtsev1} S.N. Dorogovtsev and J.F.F. Mendes, 
469: ArXiv:cond-mat/0106144 v1 7 Jun 2001. They summarize values of $\gamma$
470: for several network systems in Table I.
471: \bibitem{Barabasi1} A.-L. Barab\'{a}si and R. Albert, Science 
472: \bf 286\rm, 509(1999).
473: \bibitem{Krapivsky1} P.L. Krapivsky and S. Redner,  
474: Phys. Rev. E \bf 63\rm, 066123(2001); see also P.L. Krapivsky, S. Redner, 
475: and F. Leyvraz, Phys. Rev. Lett. \bf 85\rm, 4629(2000).
476: \bibitem{Dorogovtsev2} S.N. Dorogovtsev, J.F.F. Mendes, and A.N. Samukhin, 
477: Phys. Rev. Lett. \bf 85\rm, 4633(2000).
478: \bibitem{Dorogovtsev3} S.N. Dorogovtsev and J.F.F. Mendes, Phys. Rev. E 
479: \bf 62\rm, 1842(2000).
480: \bibitem{Albert1} R. Albert and A.-L. Barab\'{a}si, Phys. Rev. Lett. 
481: \bf 85\rm, 5234(2000).
482: \bibitem{Yook1} S.H. Yook, H. Jeong, and A.-L. Barab\'{a}si, Phys. Rev. Lett. 
483: \bf 86\rm, 5835(2001).
484: \bibitem{movie} Barab\'{a}si and Albert also investigated the movie-actor 
485: network. However, they consider actors as nodes that are linked if
486: they are cast in the same movie. See Ref. \cite{Barabasi1} and Ref.
487: \cite{Albert1}. 
488: \bibitem{Imdb} The Internet Movie Database, http://www.imdb.com
489: \bibitem{Jeong1} The technique we use for obtaining $A^{(j,i)}_k$ is similar
490: to that used by H. Jeong \it et al\rm. who presume single species situations
491: (in which case the superscripts $j$, $i$ do not apply). [H. Jeong, Z. N\'{e}da,
492: and A.-L. Barab\'{a}si, ArXiv:cond-mat/0104131 v1 7 Apr 2001.] 
493: \bibitem{twoscale1} Similar observations suggesting two scaling regions have
494: also recently been reported in other cases of growing networks.
495: Barab\'{a}si \it et al\rm. investigated the scientific 
496: collaboration network [A.-L. Barab\'{a}si, \it et al\rm.,
497: ArXiv:cond-mat/0104162 v1 10 Apr 2001]. They argue that a model in which
498: links are continuously created between existing nodes explains the existence 
499: of two scaling regions in their data.
500: Vazquez investigated the citation network of papers (nodes) and authors (links)
501: for Phys. Rev. D and found two scalings in its in-degree distribution. 
502: See A. Vazquez, ArXiv:cond-mat/0105031 v1 2 May 2001. 
503: \end{references}
504: 
505: \end{document}
506: