1: \documentclass[11pt]{article}
2: \textwidth 6.5in\textheight 8.9in\oddsidemargin 0in\topmargin -.35in
3: \usepackage{epsfig,fancyheadings}\usepackage{redner}\pagestyle{plain}
4:
5: \makeatletter
6: \renewcommand\theequation{\thesection.\arabic{equation}}
7: %\renewcommand{\baselinestretch}{2} %double spacing
8: \@addtoreset{equation}{section}
9: \makeatother
10:
11: \begin{document}
12: \title{A Statistical Physics Perspective on Web Growth}
13: \author{P.~L.~Krapivsky and S.~Redner\\
14: Center for BioDynamics, Center for Polymer Studies,\\
15: and Department of Physics, Boston University, Boston, MA 02215, USA}
16: \maketitle
17: \begin{abstract}
18: Approaches from statistical physics are applied to investigate the
19: structure of network models whose growth rules mimic aspects of the
20: evolution of the world-wide web. We first determine the degree
21: distribution of a growing network in which nodes are introduced one at a
22: time and attach to an earlier node of degree $k$ with rate $A_k\sim
23: k^\gamma$. Very different behaviors arise for $\gamma<1$, $\gamma=1$, and
24: $\gamma>1$. We also analyze the degree distribution of a heterogeneous
25: network, the joint age-degree distribution, the correlation between degrees
26: of neighboring nodes, as well as global network properties. An extension
27: to directed networks is then presented. By tuning model parameters to
28: reasonable values, we obtain distinct power-law forms for the in-degree and
29: out-degree distributions with exponents that are in good agreement with
30: current data for the web. Finally, a general growth process with
31: independent introduction of nodes and links is investigated. This leads to
32: independently growing sub-networks that may coalesce with other
33: sub-networks. General results for both the size distribution of
34: sub-networks and the degree distribution are obtained.
35:
36: \end{abstract}
37:
38: \section{Introduction}
39:
40: With the recent appearance of the Internet and the world-wide web,
41: understanding the properties of growing networks with popularity-based
42: construction rules has become an active and fruitful research area
43: \cite{review}. In such models, newly-introduced nodes preferentially attach
44: to pre-existing nodes of the network that are already ``popular''. This
45: leads to graphs whose structure is quite different from the well-known {\em
46: random graph} \cite{bol,jan} in which links are created at random between
47: nodes without regard to their popularity. This discovery of a new class of
48: graph theory problems has fueled much effort to characterize their
49: properties.
50:
One basic measure of the structure of such networks is the {\em node degree
distribution} $N_k$, defined as the number of nodes in the network that are
linked to $k$ other nodes. In the case of the random graph, the node degree
distribution is simply Poisson. In contrast, many popularity-driven growing networks
55: have much broader degree distributions with a stretched exponential or a
56: power-law tail. The latter form means that there is no characteristic scale
57: for the node degree, a feature that typifies many networked systems
58: \cite{review}.
59:
60: Power laws, or more generally, distributions with highly skewed tails,
61: characterize the degree distributions of many man-made and naturally
62: occurring networks \cite{review}. For example, the degree distributions at
63: the level of autonomous systems and at the router level exhibit highly skewed
64: tails \cite{fff,matta,as}. Other important Internet-based graphs, such as
65: the hyperlink graph of the world-wide web also appear to have a degree
66: distribution with a power-law tail \cite{kum,BA,www1,www2,www3}. These
67: observations have spurred a flurry of recent work to understand the
68: underlying mechanisms for these phenomena.
69:
A related example, of interest to anyone who publishes, is the distribution
of scientific citations \cite{lotka,LS,redner}. Here one treats publications
as nodes and citations as links in a citation graph. Currently available
data suggest that the citation distribution has a power-law tail with an
associated exponent close to $-3$ \cite{redner}. As we shall see, this
75: exponent emerges naturally in the {\it Growing Network} (GN) model where the
76: relative probability of linking from a new node to a previous node
77: (equivalent to citing an earlier paper) is strictly proportional to the
78: popularity of the target node.
79:
80: In this paper, we apply tools from statistical physics, especially the rate
81: equation approach, to quantify the structure of growing networks and to
82: elucidate the types of geometrical features that arise in networks with
83: physically-motivated growth rules. The utility of the rate equations has
84: been demonstrated in a diverse range of phenomena in non-equilibrium
85: statistical physics, such as aggregation \cite{agg}, coarsening
\cite{coarse}, and epitaxial surface growth \cite{surf}. We will attempt to
convince the reader that the rate equations are also a simple yet powerful
tool for analyzing growing network systems. In addition to providing
89: comprehensive information about the node degree distribution, the rate
90: equations can be easily adapted to analyze both heterogeneous and directed
91: networks, the age distribution of nodes, correlations between node degrees,
92: various global network properties, as well as the cluster size distribution
in models that give rise to independently evolving sub-networks. Thus the
rate equation method appears to be better suited for probing the structure of
growing networks than the classical approaches for analyzing random
96: graphs, such as probabilistic \cite{bol} or generating function \cite{jan}
97: techniques.
98:
99: In the next section, we introduce three basic models that will be the focus
100: of this review. In the following three sections, we then present rate
101: equation analyses to determine basic geometrical properties of these
102: networks. We close with a brief summary.
103:
104: \section{Models}
105:
106: The models we study appear to embody many of the basic growth processes in
107: web graphs and related systems. These include:
108:
109: \begin{itemize}
110:
111: \item The {\em Growing Network} (GN) \cite{BA,simon}. Nodes are added one at
112: a time and a single link is established between the new node and a
113: pre-existing node according to an attachment probability $A_k$ that depends
114: only on the degree of the ``target'' node (Fig.~\ref{network}).
115:
116: \begin{figure}[ht]
117: \begin{center}
118: \includegraphics[width=0.3\textwidth]{network.eps}
119: \caption{Growing network. Nodes are added sequentially and
120: a single link joins a new node to an earlier node. Node 1 has (total)
121: degree 5, node 2 has degree 3, nodes 4 and 6 have degree 2, and the
122: remaining nodes have degree 1.}~\label{network}
123: \end{center}
124: \end{figure}
125:
126: \item The {\em Web Graph} (WG). This represents an extension of the GN to
127: incorporate link directionality \cite{KRR} and leads to independent,
128: dynamically generated in-degree and out-degree distributions. The network
129: growth occurs by two distinct processes \cite{gen} that are meant to mimic
130: how hyperlinks are created in the web (Fig.~\ref{io-growth}):
131:
132: \begin{itemize}
133: \item[(i)] With probability $p$, a new node is introduced and it immediately
134: attaches to an earlier target node. The attachment probability depends
135: only on the in-degree of the target.
136: \item[(ii)] With probability $q=1-p$, a new link is created between already
137: existing nodes. The choices of the originating and target nodes depend on
138: the out-degree of the former and the in-degree of the latter.
139: \end{itemize}
140:
141: \begin{figure}[ht]
142: \begin{center}
143: \includegraphics[width=0.35\textwidth]{io-growth.eps}
144: \caption{Growth processes in the web graph model:
145: (i) node creation and immediate attachment, and (ii) link creation. In (i)
146: the new node is shaded, while in both (i) and (ii) the new link is dashed.}
147: \label{io-growth}
148: \end{center}
149: \end{figure}
150:
151: \item The {\em Multicomponent Graph} (MG). Nodes and links are introduced
152: {\em independently} \cite{clusters}. (i) With probability $p$, a new {\em
153: unlinked} node is introduced, while (ii) with probability $q=1-p$, a new
154: link is created between existing nodes. As in the WG, the choices of the
155: originating and target nodes depend on the out-degree of the former and the
156: in-degree of the latter. Step (i) allows for the formation of many
157: clusters.
158:
159: \end{itemize}
160:
161: \section{Structure of the Growing Network}
162:
163: Because of its simplicity, we first study the structure of the GN
164: \cite{BA,simon}. The basic approaches developed in this section will then be
165: extended to the WG and MG models.
166:
167: \subsection{Degree Distribution of a Homogeneous Network}
168:
169: We first focus on the node degree distribution $N_k$. To determine its
170: evolution, we shall write the rate equations that account for the change in
171: the degree distribution after each node addition event. These equations
172: contain complete information about the node degree, from which any measure of
173: node degree (such as moments) can be easily extracted. For the GN growth
174: process in which nodes are introduced one at a time, the rate equations for
175: the degree distribution $N_k(t)$ are \cite{KRL}
176: \begin{equation}
177: \label{Nk}
178: {d N_k\over dt}=
179: {A_{k-1} N_{k-1}-A_k N_k\over A}+\delta_{k1}.
180: \end{equation}
181: The first term on the right, $A_{k-1}N_{k-1}/A$, accounts for processes in
182: which a node with $k-1$ links is connected to the new node, thus increasing
183: $N_k$ by one. Since there are $N_{k-1}$ nodes of degree $k-1$, the rate at
184: which such processes occur is proportional to $A_{k-1}N_{k-1}$, and the
185: factor $A(t)=\sum_{j\geq 1} A_jN_j(t)$ converts this rate into a normalized
186: probability. A corresponding role is played by the second (loss) term on the
187: right-hand side; $A_kN_k/A$ is the probability that a node with $k$ links is
188: connected to the new node, thus leading to a loss in $N_k$. The last term
189: accounts for the introduction of new nodes with no incoming links.
190:
191: We start by solving for the time dependence of the moments of the degree
192: distribution defined via $M_n(t)=\sum_{j\geq 1} j^n N_j(t)$. This is a
193: standard method of analysis of rate equations by which one can gain partial,
194: but valuable, information about the time dependence of the system with
195: minimal effort. By explicitly summing Eqs.~(\ref{Nk}) over all $k$, we
196: easily obtain $\dot M_0(t)=1$, whose solution is $M_0(t)= M_0(0)+t$. Notice
197: that by definition $M_0(t)=\sum_k N_k$ is just the total number of nodes in
198: the network. It is clear by the nature of the growth process that this
199: quantity simply grows as $t$. In a similar fashion, the first moment of the
200: degree distribution obeys $\dot M_1(t)=2$ with solution $M_1(t)= M_1(0)+2t$.
201: This time evolution for $M_1$ can be understood either by explicitly summing
202: the rate equations, or by observing that this first moment simply equals the
203: total number of link endpoints. Clearly, this quantity must grow as $2t$
204: since the introduction of a single node introduces two link endpoints. Thus
205: we find the simple result that the first two moments are {\em independent\/}
206: of the attachment kernel $A_k$ and grow {\em linearly} with time. On the
207: other hand, higher moments and the degree distribution itself do depend in an
208: essential way on the kernel $A_k$.
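
As a check on these two results, both moment equations can be obtained from
Eqs.~(\ref{Nk}) in one line each. The only assumptions in this sketch are
that the sums converge and that $A_0N_0=0$, since the GN contains no nodes of
degree zero. Shifting the summation index in the gain term then gives
\begin{eqnarray*}
\dot M_0&=&{1\over A}\sum_{k\geq 1}\left(A_{k-1}N_{k-1}-A_kN_k\right)
+\sum_{k\geq 1}\delta_{k1}=0+1=1,\\
\dot M_1&=&{1\over A}\sum_{k\geq 1}k\left(A_{k-1}N_{k-1}-A_kN_k\right)
+\sum_{k\geq 1}k\,\delta_{k1}\\
&=&{1\over A}\sum_{k\geq 1}\left[(k+1)-k\right]A_kN_k+1=2.
\end{eqnarray*}
The kernel drops out of $\dot M_1$ because the remaining sum is just the
normalization factor $A$ itself.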
209:
210: As a preview to the general behavior for the degree distribution, consider
211: the strictly linear kernel \cite{BA,KRL,DMS}, for which $A(t)$ coincides with
212: $M_1(t)$. In this case, we can solve Eqs.~(\ref{Nk}) for an arbitrary
213: initial condition. However, since the long-time behavior is most
214: interesting, we limit ourselves to the asymptotic regime ($t\to\infty$) where
215: the initial condition is irrelevant. Using therefore $M_1=2t$, we solve the
216: first few of Eqs.~(\ref{Nk}) directly and obtain $N_1=2t/3$, $N_2=t/6$, {\it
etc}. Thus each of the $N_k$ grows linearly with time. Accordingly, we
218: substitute $N_k(t)=t\,n_k$ in Eqs.~(\ref{Nk}) to yield the simple recursion
219: relation $n_k=n_{k-1} (k-1)/(k+2)$. Solving for $n_k$ gives
220: \begin{equation}
221: \label{nk1}
222: n_k={4\over k(k+1)(k+2)}.
223: \end{equation}
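
One way to carry out this last step, shown here as a consistency check, is to
iterate the recursion starting from $n_1=2/3$:
\begin{eqnarray*}
n_k&=&n_1\prod_{j=2}^{k}{j-1\over j+2}
={2\over 3}\,{3!\,(k-1)!\over (k+2)!}
={4\over k(k+1)(k+2)}.
\end{eqnarray*}
Two sum rules follow from elementary telescoping sums and agree with the
moments found above,
\begin{eqnarray*}
\sum_{k\geq 1}n_k&=&4\sum_{k\geq 1}{1\over k(k+1)(k+2)}=4\cdot{1\over 4}=1,\\
\sum_{k\geq 1}k\,n_k&=&4\sum_{k\geq 1}{1\over (k+1)(k+2)}=4\cdot{1\over 2}=2,
\end{eqnarray*}
that is, $M_0=t$ and $M_1=2t$ in the asymptotic regime.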
224:
225: Returning to the case of general attachment kernels, let us assume that the
226: degree distribution and $A(t)$ both grow linearly with time. This hypothesis
227: can be easily verified numerically for attachment kernels that do not grow
228: faster than linearly with $k$. Then substituting $N_k(t)=t\,n_k$ and
229: $A(t)=\mu t$ into Eqs.~(\ref{Nk}) we obtain the recursion relation
230: $n_k=n_{k-1} A_{k-1}/(\mu+A_k)$ and $n_1=\mu/(\mu+A_1)$. Finally, solving
231: for $n_k$, we obtain the formal expression
232: \begin{equation}
233: \label{Nkgen}
234: n_k={\mu\over A_k}\prod_{j=1}^{k}
235: \left(1+{\mu\over A_j}\right)^{-1}.
236: \end{equation}
237: To complete the solution, we need the amplitude $\mu$. Using the definition
238: $\mu=\sum_{j\geq 1}A_jn_j$ in Eq.~(\ref{Nkgen}), we obtain the implicit
239: relation
240: \begin{equation}
241: \label{mugen}
242: \sum_{k=1}^\infty \prod_{j=1}^{k}
243: \left(1+{\mu\over A_j}\right)^{-1}=1
244: \end{equation}
245: which shows that the amplitude $\mu$ depends on the entire attachment kernel.
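
As a simple illustration of Eqs.~(\ref{Nkgen}) and (\ref{mugen}), consider the
constant kernel $A_k=1$, chosen here only because every step can be done in
closed form. The product reduces to $(1+\mu)^{-k}$, so that
\begin{eqnarray*}
\sum_{k=1}^\infty (1+\mu)^{-k}={1\over \mu}=1
\quad\Longrightarrow\quad \mu=1,
\qquad n_k=\mu\,(1+\mu)^{-k}=2^{-k}.
\end{eqnarray*}
The value $\mu=1$ agrees with the direct evaluation
$A(t)=\sum_j N_j(t)=M_0(t)=t$ for this kernel.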
246:
247: For the generic case $A_k\sim k^\gamma$, we substitute this form into
248: Eq.~(\ref{Nkgen}) and then rewrite the product as the exponential of a sum of
249: a logarithm. In the continuum limit, we convert this sum to an integral,
250: expand the logarithm to lowest order, and then evaluate the integral to yield
251: the following basic results:
252: \begin{eqnarray}
253: \label{cases}
254: n_k\sim\cases{
255: k^{-\gamma}\exp
256: \left[-\mu\left({{k^{1-\gamma}-2^{1-\gamma}}\over 1-\gamma}\right)\right],
257: &$0\leq\gamma<1$;\cr
258: k^{-\nu}, \quad \nu>2,
259: & $\gamma=1$;\cr
{\rm best\ seller} & $1<\gamma\leq 2$;\cr
261: {\rm bible} & $2<\gamma$.}
262: \end{eqnarray}
263:
264: Thus the degree distribution decays exponentially for $\gamma=0$, as in the
265: case of the random graph, while for all $0<\gamma<1$, the distribution
266: exhibits robust stretched exponential behavior. The linear kernel is the
267: case that has garnered much of the current research interest. As shown
268: above, $n_k={4/[k(k+1)(k+2)]}$ for the strictly linear kernel $A_k=k$. One
269: might anticipate that $n_k\propto k^{-3}$ holds for all {\em asymptotically}
270: linear kernels, $A_k\sim k$. However, the situation is more delicate and the
271: degree distribution exponent depends on microscopic details of $A_k$. {}From
272: Eq.~(\ref{Nkgen}), we obtain $n_k\sim k^{-\nu}$, where the exponent
273: $\nu=1+\mu$ can be tuned to {\em any} value larger than 2 \cite{KRL,KR}.
274: This non-universal behavior shows that one must be cautious in drawing
275: general conclusions from the GN with a linear attachment kernel.
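
The continuum-limit steps that lead to Eq.~(\ref{cases}) can be sketched as
follows, taking $A_j=j^\gamma$ with unit amplitude for illustration. To
lowest order in $\mu j^{-\gamma}$,
\begin{eqnarray*}
\ln\prod_{j=1}^{k}\left(1+{\mu\over j^{\gamma}}\right)^{-1}
=-\sum_{j=1}^{k}\ln\left(1+\mu\,j^{-\gamma}\right)
\simeq-\mu\int^{k}\!dj\,j^{-\gamma}.
\end{eqnarray*}
The integral equals $k^{1-\gamma}/(1-\gamma)$ for $0\leq\gamma<1$ and $\ln k$
for $\gamma=1$, up to an additive constant that is set by the lower limit and
affects only the overall amplitude. Multiplying the exponential of this
result by the prefactor $\mu/A_k=\mu\,k^{-\gamma}$ of Eq.~(\ref{Nkgen}) gives
$n_k\sim k^{-\gamma}\exp[-\mu\,k^{1-\gamma}/(1-\gamma)]$ for $\gamma<1$ and
$n_k\sim k^{-1}\,k^{-\mu}=k^{-(1+\mu)}$ for $\gamma=1$. The neglected
second-order term, $\mu^2j^{-2\gamma}/2$, sums to a contribution that is
subdominant to the leading term in the exponent.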
276:
277: % Another important message of this derivation is that the rate equation method
278: % provides a complete and satisfying way to the degree distribution for many
279: % different types of attachment kernels.
280:
281: \begin{figure}[ht]
282: \begin{center}
283: \includegraphics[width=0.25\textwidth]{degrees.eps}
284: \caption{A node with in-degree $i=4$, out-degree $j=5$, and total
285: degree 9.}~\label{degrees}
286: \end{center}
287: \end{figure}
288:
289: As an illustrative example of the vagaries of asymptotically linear kernels,
290: consider the shifted linear kernel $A_k=k+w$. One way to motivate this
291: kernel is to explicitly keep track of link directionality. In particular,
292: the node degree for an undirected graph naturally generalizes to the
293: in-degree and out-degree for a directed graph, the number of incoming and
294: outgoing links at a node, respectively. Thus the total degree $k$ in a
295: directed graph is the sum of the in-degree $i$ and out-degree $j$
296: (Fig.~\ref{degrees}). (More details on this model are given in the next
297: section.)~ The most general linear attachment kernel for a directed graph has
298: the form $A_{ij}=ai+bj$. The GN corresponds to the case where the out-degree
299: of any node equals one; thus $j=1$ and $k=i+1$. For this example the general
300: linear attachment kernel reduces to $A_k=a(k-1)+b$. Since the overall scale
301: is irrelevant, we can re-write $A_k$ as the shifted linear kernel $A_k=k+w$,
302: with $w=-1+b/a$ that can vary over the range $-1<w<\infty$.
303:
304: To determine the degree distribution for the shifted linear kernel, note that
305: $A(t)=\sum_jA_jN_j(t)$ simply equals \hbox{$A(t)=M_1(t)+wM_0(t)$}. {}Using
306: $A=\mu t$, $M_0=t$ and $M_1=2t$, we get $\mu=2+w$ and hence the relation
307: $\nu=1+\mu$ from the previous paragraph becomes $\nu=3+w$. Thus a simple
308: additive shift in the attachment kernel profoundly affects the asymptotic
309: degree distribution. Furthermore, from Eq.~(\ref{Nkgen}) we determine the
310: entire degree distribution to be
311: \begin{equation}
312: \label{nkw}
313: n_k=(2+w)\,{\Gamma(3+2w)\over \Gamma(1+w)}\,
314: {\Gamma(k+w)\over \Gamma(k+3+2w)}.
315: \end{equation}
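
Equation~(\ref{nkw}) can be verified by iterating the general recursion with
$A_k=k+w$ and $\mu=2+w$. The first element is
$n_1=\mu/(\mu+A_1)=(2+w)/(3+2w)$, and for $k\geq 2$
\begin{eqnarray*}
n_k&=&n_1\prod_{j=2}^{k}{j-1+w\over j+2+2w}
={2+w\over 3+2w}\,{\Gamma(k+w)\over \Gamma(1+w)}\,
{\Gamma(4+2w)\over \Gamma(k+3+2w)},
\end{eqnarray*}
which coincides with Eq.~(\ref{nkw}) because
$\Gamma(4+2w)=(3+2w)\,\Gamma(3+2w)$. Two limits serve as checks. For $w=0$
the expression reduces to $2\,\Gamma(3)\,\Gamma(k)/\Gamma(k+3)=4/[k(k+1)(k+2)]$,
the result (\ref{nk1}) for the strictly linear kernel. For $k\to\infty$, the
asymptotic form $\Gamma(k+a)/\Gamma(k+b)\to k^{a-b}$ gives
$n_k\sim k^{-(3+w)}$, consistent with $\nu=3+w$.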
316:
317: Finally, we outline the intriguing behavior for super-linear kernels. In
318: this case, there is a ``runaway'' or gelation-like phenomenon in which one
319: node links to almost every other node. For $\gamma>2$, all but a finite
320: number of nodes are linked to a {\em single} node that has the rest of the
321: links. We term such an overwhelmingly popular node as a ``bible''. For
$1<\gamma\leq 2$, the number of nodes with just a few links is no longer
323: finite, but grows slower than linearly in time, and the remainder of the
324: nodes are linked to an extremely popular node that we now term ``best
325: seller''. Full details about this runaway behavior are given in \cite{KRL}.
326:
327: As a final parenthetical note, when the attachment kernel has the form
328: $A_k\propto k^\gamma$, with $\gamma<0$, there is preferential attachment to
329: poorly-connected sites. Here, the degree distribution exhibits faster than
exponential decay, $n_k\propto k^{-|\gamma|(k-1)}$. When $\gamma< -2$, the
331: propensity for avoiding popularity is so strong that there is a finite
332: probability of forming a ``worm'' graph in which each node attaches only to
333: its immediate predecessor.
334:
335: \subsection{Degree Distribution of a Heterogeneous Network}
336:
337: A practically-relevant generalization of the GN is to endow each node with an
338: intrinsic and permanently defined ``attractiveness'' \cite{BiA}. This
339: accounts for the obvious fact that not all nodes are equivalent, but that
340: some are clearly more attractive than others at their inception. Thus the
341: subsequent attachment rate to a node should be a function of both its degree
342: and its intrinsic attractiveness. For this generalization, the rate equation
343: approach yields complete results with minimal additional effort beyond that
344: needed to solve the homogeneous network.
345:
346: Let us assign each node an attractiveness parameter $\eta>0$, with arbitrary
347: distribution, at its inception. This attractiveness modifies the node
348: attachment rate as follows: for a node with degree $k$ and attractiveness
349: $\eta$, the attachment rate is simply $A_k(\eta)$. Now we need to
350: characterize nodes both by their degree and their attractiveness -- thus
351: $N_k(\eta)$ is the number of nodes with degree $k$ and attractiveness $\eta$.
352: This joint degree-attractiveness distribution obeys the rate equation,
353: \begin{equation}
354: \label{Nk-het}
355: {d N_k(\eta)\over dt}=
356: {A_{k-1}(\eta) N_{k-1}(\eta)-A_k(\eta) N_k(\eta)\over A}+p_0(\eta)\delta_{k1}.
357: \end{equation}
358: Here $p_0(\eta)$ is the probability that a newly-introduced node has
359: attractiveness $\eta$, and the normalization factor $A=\int d\eta
360: \sum_{k}A_k(\eta)N_k(\eta)$.
361:
Following the same approach as that used to analyze Eq.~(\ref{Nk}), we
substitute $A=\mu t$ and $N_k(\eta)=t\,n_k(\eta)$ into Eq.~(\ref{Nk-het}) and
solve the resulting recursion relation to obtain
365: \begin{equation}
366: \label{Nkgen-het}
367: n_k(\eta)=p_0(\eta){\mu\over A_k(\eta)}\prod_{j=1}^{k}
368: \left(1+{\mu\over A_j(\eta)}\right)^{-1}.
369: \end{equation}
370:
371: For concreteness, consider the linear attachment kernel $A_k(\eta)=\eta k$.
372: Then applying the same analysis as in the homogeneous network, we find
373: \begin{equation}
374: \label{nk-het}
375: n_k(\eta)= {\mu\,p_0(\eta)\over \eta}\,
376: {\Gamma(k)\, \Gamma\left(1+{\mu\over \eta}\right)\over
377: \Gamma\left(k+1+{\mu\over \eta}\right)}.
378: \end{equation}
379: To determine the amplitude $\mu$ we substitute (\ref{nk-het}) into the
380: definition $\mu=\int d\eta\, \sum_{k\geq 1}A_k(\eta)\,n_k(\eta)$ and use the
381: identity \cite{knuth}
382: \begin{eqnarray*}
383: \label{identity}
384: \sum_{k=1}^\infty {\Gamma(k+u)\over \Gamma(k+v)}
385: ={\Gamma(u+1)\over (v-u-1)\,\Gamma(v)}
386: \end{eqnarray*}
387: to simplify the sum. This yields the implicit relation
388: \begin{equation}
389: \label{mu-het}
390: 1=\int d\eta\, p_0(\eta)\,\left({\mu\over \eta}-1\right)^{-1}.
391: \end{equation}
392: This condition on $\mu$ leads to two alternatives: If the support of $\eta$
393: is unbounded, then the integral diverges and there is no solution for $\mu$.
394: In this limit, the most attractive node is connected to a finite fraction of
395: all links. Conversely, if the support of $\eta$ is bounded, the resulting
396: degree distribution is similar to that of the homogeneous network. For fixed
397: $\eta$, $n_k(\eta)\sim k^{-\nu(\eta)}$ with an attractiveness-dependent decay
398: exponent $\nu(\eta)=1+\mu/\eta$. Amusingly, the total degree distribution
399: $n_k=\int d\eta\,n_k(\eta)$ is no longer a strict power law \cite{BiA}.
400: Rather, the asymptotic behavior is governed by properties of the initial
401: attractiveness distribution near the upper cutoff. In particular, if
402: $p_0(\eta)\sim (\eta_{\rm max}-\eta)^{\omega-1}$ (with $\omega>0$ to ensure
403: normalization), the total degree distribution exhibits a logarithmic
404: correction
405: \begin{equation}
406: \label{nk-asymp-het}
407: n_k\sim k^{-(1+\mu/\eta_{\rm max})}\,(\ln k)^{-\omega}.
408: \end{equation}
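
The step from Eq.~(\ref{nk-het}) to Eq.~(\ref{mu-het}) can be made explicit.
Applying the identity quoted above with $u=1$ and $v=1+\mu/\eta$, which
requires $\mu>\eta$ for convergence, gives
\begin{eqnarray*}
\sum_{k\geq 1}\eta k\,n_k(\eta)
&=&\mu\,p_0(\eta)\,\Gamma\left(1+{\mu\over \eta}\right)
\sum_{k\geq 1}{\Gamma(k+1)\over \Gamma\left(k+1+{\mu\over \eta}\right)}
=\mu\,p_0(\eta)\left({\mu\over \eta}-1\right)^{-1},
\end{eqnarray*}
and integrating over $\eta$ and dividing by $\mu$ yields
Eq.~(\ref{mu-het}). As a consistency check, suppose that all nodes share a
single attractiveness $\eta_0$, so that $p_0(\eta)=\delta(\eta-\eta_0)$; this
choice is made here purely for illustration. Then Eq.~(\ref{mu-het}) gives
$\mu=2\eta_0$, the exponent is $\nu=1+\mu/\eta_0=3$, and Eq.~(\ref{nk-het})
reduces to $2\,\Gamma(3)\,\Gamma(k)/\Gamma(k+3)=4/[k(k+1)(k+2)]$, which is
the homogeneous result (\ref{nk1}).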
409:
410: \subsection{Age Distribution}
411:
412: In addition to the degree distribution, we determine {\em when} connections
413: occur. Naively, we expect that older nodes will be better connected. We
414: study this feature by resolving each node both by its degree and its age to
415: provide a more complete understanding of the network evolution. Thus define
416: $c_k(t,a)$ to be the average number of nodes of age $a$ that have $k-1$
417: incoming links at time $t$. Here age $a$ means that the node was introduced
418: at time $t-a$. The original degree distribution may be recovered from the
419: joint age-degree distribution through $N_k(t)=\int_0^t da\,c_k(t,a)$.
420:
421: For simplicity, we consider only the case of the strictly linear kernel; more
422: general kernels were considered in Ref.~\cite{KR}. The joint age-degree
423: distribution evolves according to the rate equation
424: \begin{equation}
425: \label{ck1}
426: \left({\partial \over \partial t}+{\partial \over \partial a}\right)c_k
427: ={A_{k-1}c_{k-1}-A_k c_k\over 2t}
428: +\delta_{k1}\delta(a).
429: \end{equation}
430: The second term on the left accounts for the aging of nodes. We assume here
431: that the probability of linking to a given node again depends only on its
432: degree and not on its age. Finally, we again have used $A(t)\equiv
433: M_1(t)\simeq 2t$ for the linear attachment kernel in the long-time limit.
434:
The homogeneous form of this equation implies that the solution should be
436: self-similar. Thus we seek a solution as a function of the {\em single}
437: variable $a/t$ rather than two separate variables. Writing
438: $c_k(t,a)=f_k(x)$ with $x=1-{a\over t}$, we convert Eq.~(\ref{ck1}) into the
439: ordinary differential equation
440: \begin{equation}
441: \label{fk1}
442: -2x\,{df_k\over dx}=(k-1) f_{k-1}-k f_k.
443: \end{equation}
444: We omit the delta function term, since it merely provides the boundary
445: condition $c_k(t,a=0)=\delta_{k1}$, or $f_k(1)=\delta_{k1}$.
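
The change of variables can be made explicit with the chain rule. With
$x=1-a/t$ one has $\partial x/\partial t=a/t^2=(1-x)/t$ and
$\partial x/\partial a=-1/t$, so that
\begin{eqnarray*}
\left({\partial \over \partial t}+{\partial \over \partial a}\right)f_k(x)
=\left[{1-x\over t}-{1\over t}\right]{df_k\over dx}
=-{x\over t}\,{df_k\over dx}.
\end{eqnarray*}
Equating this to the right-hand side of Eq.~(\ref{ck1}) with $A_k=k$, namely
$[(k-1)f_{k-1}-kf_k]/(2t)$, and multiplying by $2t$ reproduces
Eq.~(\ref{fk1}). All explicit dependence on $t$ cancels, which is the
self-similarity anticipated above.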
446:
447:
448: The solution to this boundary-value problem may be simplified by assuming the
449: exponential solution $f_k=\Phi\varphi^{k-1}$; this is consistent with the
450: boundary condition, provided that $\Phi(1)=1$ and $\varphi(1)=0$. This
ansatz reduces the infinite set of rate equations (\ref{fk1}) to two
452: elementary differential equations for $\varphi(x)$ and $\Phi(x)$ whose
453: solutions are $\varphi(x)=1-\sqrt{x}$ and $\Phi(x)=\sqrt{x}$. In terms of
454: the original variables of $a$ and $t$, the joint age-degree distribution is
455: then
456: \begin{eqnarray}
457: \label{ck1all}
458: c_k(t,a)=\sqrt{1-{a\over t}}\left\{1-\sqrt{1-{a\over t}}\right\}^{k-1}.
459: \end{eqnarray}
460:
461: Thus the degree distribution for fixed-age nodes decays {\em exponentially},
462: with a characteristic degree that diverges as $\langle k\rangle\sim
463: (1-a/t)^{-1/2}$ for $a\to t$. As expected, young nodes (those with $a/t\to
464: 0$) typically have a small degree while old nodes have large degree
465: (Fig.~\ref{age}). It is the large characteristic degree of old nodes that
466: ultimately leads to a {\em power-law} total degree distribution when the
467: joint age-degree distribution is integrated over all ages.
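
This integration can be carried out directly as a check on
Eq.~(\ref{ck1all}). With $x=1-a/t$ and then $u=\sqrt{x}$,
\begin{eqnarray*}
N_k(t)&=&\int_0^t da\,c_k(t,a)
=t\int_0^1 dx\,\sqrt{x}\,\left(1-\sqrt{x}\right)^{k-1}
=2t\int_0^1 du\,u^2(1-u)^{k-1}\\
&=&2t\,{\Gamma(3)\,\Gamma(k)\over \Gamma(k+3)}
={4t\over k(k+1)(k+2)},
\end{eqnarray*}
in agreement with Eq.~(\ref{nk1}). Similarly, at fixed age the geometric
sums $\sum_{k\geq 1}f_k=\Phi/(1-\varphi)=1$ and
$\sum_{k\geq 1}k\,f_k=\Phi/(1-\varphi)^2=x^{-1/2}$ confirm that nodes are
created at unit rate and that their mean degree is $(1-a/t)^{-1/2}$.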
468:
469: \begin{figure}[ht]
470: \begin{center}
471: \includegraphics[width=0.4\textwidth]{age.eps}
472: \caption{Age-dependent degree distribution for the GN for the linear
473: attachment kernel. Low-degree nodes tend to be relatively young while
474: high-degree nodes are old. The inset shows detail for $a/t\geq 0.98$.}
475: \label{age}
476: \end{center}
477: \end{figure}
478:
479: \subsection{Node Degree Correlations}
480:
481: The rate equation approach is sufficiently versatile that we can also obtain
482: much deeper geometrical properties of growing networks. One such property is
483: the correlation between degrees of connected nodes \cite{KR}. These develop
484: naturally because a node with large degree is likely to be old. Thus its
485: ancestor is also old and hence also has a large degree. In the context of
the web, this correlation merely expresses the obvious fact that it is more
487: likely that popular web sites have hyperlinks among each other rather than to
488: marginal sites.
489:
490: To quantify the node degree correlation, we define $C_{kl}(t)$ as the number
491: of nodes of degree $k$ that attach to an ancestor node of degree $l$
492: (Fig.~\ref{corr-def}). For example, in the network of Fig.~\ref{network},
493: there are $N_1=6$ nodes of degree 1, with $C_{12}=C_{13}=C_{15}=2$. There
are also $N_2=2$ nodes of degree 2, with $C_{25}=2$, and $N_3=1$ node of
495: degree 3, with $C_{35}=1$.
496:
497: \begin{figure}[ht]
498: \begin{center}
499: \includegraphics[width=0.2\textwidth]{corr-def.eps}
500: \caption{Definition of the node degree correlation $C_{kl}$ for the case
501: $k=3$ and $l=4$.}
502: \label{corr-def}
503: \end{center}
504: \end{figure}
505:
506: For simplicity, we again specialize to the case of the strictly linear
507: attachment kernel. More general kernels can also be treated within our
508: general framework \cite{KR}. For the linear attachment kernel, the degree
509: correlation $C_{kl}(t)$ evolves according to the rate equation
510: \begin{eqnarray}
511: \label{Nkl}
512: M_1\,{d C_{kl}\over dt}=(k-1) C_{k-1,l}-kC_{kl}+
(l-1) C_{k,l-1}-l C_{kl}+(l-1)N_{l-1}\,\delta_{k1}.
514: \end{eqnarray}
The processes that give rise to each term in this equation are illustrated in
516: Fig.~\ref{corr-RE}. The first two terms on the right account for the change
517: in $C_{kl}$ due to the addition of a link onto a node of degree $k-1$ (gain)
518: or $k$ (loss) respectively, while the second set of terms gives the change in
519: $C_{kl}$ due to the addition of a link onto the ancestor node. Finally, the
520: last term accounts for the gain in $C_{1l}$ due to the addition of a new
521: node.
522:
523: \begin{figure}[ht]
524: \begin{center}
525: \includegraphics[width=0.7\textwidth]{corr-RE.eps}
526: \caption{The processes that contribute ((i)--(v) in order)
527: to the various terms in the rate equation (\ref{Nkl}). The newly-added
528: node and link are shown dashed.}
529: \label{corr-RE}
530: \end{center}
531: \end{figure}
532:
533: As in the case of the node degree, the time dependence can be separated as
534: $C_{kl}= tc_{kl}$. This reduces Eqs.~(\ref{Nkl}) to the time-independent
535: recursion relation,
536: \begin{eqnarray}
537: \label{nkl}
538: (k+l+2)c_{kl}=(k-1) c_{k-1,l}+(l-1) c_{k,l-1}
+(l-1)n_{l-1}\,\delta_{k1}.
540: \end{eqnarray}
541: This can be further reduced to a constant-coefficient inhomogeneous recursion
542: relation by the substitution
543: \begin{eqnarray*}
544: \label{Akl}
545: c_{kl}={\Gamma(k)\,\Gamma(l)\over \Gamma(k+l+3)}\,\,d_{kl}
546: \end{eqnarray*}
547: to yield
548: \begin{equation}
549: \label{A}
550: d_{kl}=d_{k-1,l}+d_{k,l-1}+4(l+2)\delta_{k1}.
551: \end{equation}
552: Solving Eqs.~(\ref{A}) for the first few $k$ yields the pattern of dependence
553: on $k$ and $l$ from which one can then infer the solution
554: \begin{equation}
555: \label{A-sol}
556: d_{kl}=4\,{\Gamma(k+l)\over \Gamma(k+2)\,\Gamma(l-1)}
557: +12\,{\Gamma(k+l-1)\over \Gamma(k+1)\,\Gamma(l-1)},
558: \end{equation}
559: from which we ultimately obtain
560: \begin{eqnarray}
561: \label{nkl-sol}
562: c_{kl}={4(l-1)\over k(k+l)(k+l+1)(k+l+2)}\left[{1\over k+1}
563: +{3\over k+l-1}\right].
564: \end{eqnarray}
565: The important feature of this result is that the joint distribution does not
566: factorize, that is, $c_{kl}\ne n_kn_{l}$. This correlation between the
567: degrees of connected nodes is an important distinction between the GN and
568: classical random graphs.
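
The inhomogeneous term in Eq.~(\ref{A}) provides a useful check on this
derivation. A new node, which has degree $k=1$, attaches to one of the
$N_{l-1}=t\,n_{l-1}$ nodes of degree $l-1$ at a rate proportional to
$(l-1)N_{l-1}$, and with $n_{l-1}=4/[(l-1)\,l\,(l+1)]$ from Eq.~(\ref{nk1})
this source equals $4/[l(l+1)]$ in scaled form. After the substitution,
every term of Eq.~(\ref{nkl}) carries the common prefactor
$\Gamma(k)\,\Gamma(l)/\Gamma(k+l+2)$, so that for $k=1$ the source becomes
\begin{eqnarray*}
{4\over l(l+1)}\,{\Gamma(l+3)\over \Gamma(l)}
={4\,(l+2)(l+1)\,l\over l(l+1)}=4(l+2).
\end{eqnarray*}
Likewise, for $k=1$ and $l\geq 2$ the solution (\ref{A-sol}) gives
$d_{1l}=2l(l-1)+12(l-1)=2(l-1)(l+6)$. Assuming $d_{0l}=0$, since there are
no nodes of degree zero, one finds $d_{1l}-d_{1,l-1}=4(l+2)$, as
Eq.~(\ref{A}) requires.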
569:
570: While the solution of Eq.~(\ref{nkl-sol}) is unwieldy, it greatly simplifies
571: in the scaling regime, $k\to\infty$ and $l\to\infty$ with $y=l/k$ finite.
572: The scaled form of the solution is
573: \begin{eqnarray}
574: \label{nkl-scal}
575: c_{kl}=k^{-4}\,{4y(y+4)\over (1+y)^4}.
576: \end{eqnarray}
577: For fixed large $k$, the distribution $c_{kl}$ has a single maximum at
578: $y^*=(\sqrt{33}-5)/2 \cong 0.372$. Thus a node whose degree $k$ is large is
579: typically linked to another node whose degree is also large; the typical
580: degree of the ancestor is 37\% that of the daughter node. In general, when
581: $k$ and $l$ are both large and their ratio is different from one, the
582: limiting behaviors of $c_{kl}$ are
583: \begin{equation}
584: \label{nklext}
585: c_{kl}\to\cases{16\,(l/k^5) & $l\ll k$,\cr
586: 4/(k^2\,l^2) & $l\gg k$.\cr}
587: \end{equation}
588: Here we explicitly see the absence of factorization in the degree
589: correlation: $c_{kl}\ne n_kn_{l}\propto (k\,l)^{-3}$.
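
The scaled form (\ref{nkl-scal}) and its maximum can be recovered in a few
lines. For $k,l\gg 1$ the shifts by one or two in Eq.~(\ref{nkl-sol}) are
negligible, so that
\begin{eqnarray*}
c_{kl}&\simeq&{4l\over k(k+l)^3}\left[{1\over k}+{3\over k+l}\right]
={4l\,(4k+l)\over k^2(k+l)^4}
=k^{-4}\,{4y(y+4)\over (1+y)^4},\qquad y={l\over k}.
\end{eqnarray*}
At fixed $k$, the maximum follows from
\begin{eqnarray*}
{d\over dy}\left[{y(y+4)\over (1+y)^4}\right]
={(2y+4)(1+y)-4y(y+4)\over (1+y)^5}
=-{2\,(y^2+5y-2)\over (1+y)^5}=0,
\end{eqnarray*}
whose positive root is $y^*=(\sqrt{33}-5)/2$. The limits $y\to 0$ and
$y\to\infty$ of the scaled form give $16\,y\,k^{-4}=16\,l/k^5$ and
$4\,y^{-2}k^{-4}=4/(k^2l^2)$, respectively, which are the two entries of
Eq.~(\ref{nklext}).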
590:
591: \subsection{Global Properties}
592:
593: In addition to elucidating the degree distribution and degree correlations,
594: the rate equations can be applied to determine global properties. One useful
595: example is the {\em out-component\/} with respect to a given node {\bf x} --
596: this is the set of nodes that can be reached by following directed links that
597: emanate from {\bf x} (Fig.~\ref{in-out}). In the context of the web, this is
598: the set of nodes that are reached by following hyperlinks that emanate from a
599: fixed node to target nodes, and then iteratively following target nodes ad
600: infinitum. In a similar vein, one may enumerate all nodes that refer to a
fixed node, plus all nodes that refer to these daughter nodes, {\it etc}. This
602: progeny comprises the in-component to node {\bf x} -- the set from which {\bf
603: x} can be reached by following a path of directed links.
604:
605: \begin{figure}[ht]
606: \begin{center}
607: \includegraphics[width=0.35\textwidth]{in-out.eps}
\caption{In-component and out-component of node {\bf x}.}~\label{in-out}
609: \end{center}
610: \end{figure}
611:
612: \subsubsection{The In-Component}
613:
614: For simplicity, we study the in-component size distribution for the GN with a
615: constant attachment kernel, $A_k=1$. We consider this kernel because many
616: results about network components are {\it independent\/} of the form of the
617: kernel and thus it suffices to consider the simplest situation; the extension
618: to more general attachment kernels is discussed in \cite{KR}.
619:
620: For the constant attachment kernel, the number $I_s(t)$ of in-components with
621: $s$ nodes satisfies the rate equation
622: \begin{equation}
623: \label{Ik}
624: {d I_s\over dt}={(s-1)I_{s-1}-sI_s\over A}+\delta_{s1}.
625: \end{equation}
626: The loss term accounts for processes in which the attachment of a new node to
627: an in-component of size $s$ increases its size by one. This gives a loss
628: rate that is proportional to $s$. If there is more than one in-component of
629: size $s$ they must be disjoint, so that the total loss rate for $I_s(t)$ is
630: simply $sI_s(t)$. A similar argument applies for the gain term. Finally,
631: dividing by $A(t)=\sum_j A_j N_j(t)$ converts these rates to normalized
632: probabilities. For the constant attachment kernel, $A(t)=M_0(t)$, so
633: asymptotically $A=t$. Interestingly, Eq.~(\ref{Ik}) is almost identical to
634: the rate equations for the degree distribution for the GN with linear
635: attachment kernel, except that the prefactor equals $t^{-1}$ rather than
636: $(2t)^{-1}$. This change in the normalization factor is responsible for
637: shifting the exponent of the resulting distribution from $-3$ to $-2$.
638:
639: To determine $I_s(t)$, we again note, by explicitly solving the first few of
640: the rate equations, that each $I_s$ grows linearly in time. Thus we
641: substitute $I_s(t)=ti_s$ into Eqs.~(\ref{Ik}) to obtain $i_1=1/2$ and
642: $i_s=i_{s-1}(s-1)/(s+1)$. This immediately gives
643: \begin{equation}
644: \label{is}
645: i_s={1\over s(s+1)}.
646: \end{equation}
647: This $s^{-2}$ tail for the in-component distribution is a robust feature,
648: {\em independent\/} of the form of the attachment kernel \cite{KR}. This
649: $s^{-2}$ tail also agrees with recent measurements of the web \cite{www2}.
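
The closed form in Eq.~(\ref{is}) can be checked by iterating the recursion
directly; the following lines are only a worked verification of the step
above. Starting from $i_1=1/2$,
\begin{eqnarray*}
i_s=i_1\prod_{n=2}^{s}{n-1\over n+1}
={1\over 2}\,{2\,(s-1)!\over (s+1)!}={1\over s(s+1)}.
\end{eqnarray*}
Since $i_s=1/s-1/(s+1)$, the sum telescopes to $\sum_{s\geq 1}i_s=1$. This is
consistent with each node being the root of exactly one in-component, so that
$\sum_s I_s(t)=t$ asymptotically.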
650:
651: \subsubsection{The Out-Component}
652:
653: The complementary out-component from each node can be determined by
654: constructing a mapping between the out-component and an underlying network
655: ``genealogy''. We build a genealogical tree for the GN by taking generation
656: $g=0$ to be the initial node. Nodes that attach to those in generation $g$
657: form generation $g+1$; the node index does not matter in this
658: characterization. For example, in the network of Fig.~\ref{network}, node 1
659: is the ``ancestor'' of 6, while 10 is the ``descendant'' of 6 and there are 5
660: nodes in generation $g=1$ and 4 in $g=2$. This leads to the genealogical
661: tree of Fig.~\ref{genealogy}.
662:
663: \begin{figure}[ht]
664: \begin{center}
665: \includegraphics[width=0.35\textwidth]{genealogy.eps}
666: \caption{Genealogy of the network in Fig.~\ref{network}.
The node indices indicate when each is introduced. The nodes are also
668: arranged according to generation number.}~\label{genealogy}
669: \end{center}
670: \end{figure}
671:
672: The genealogical tree provides a convenient way to characterize the
673: out-component distribution. As one can directly verify from
674: Fig.~\ref{genealogy}, the number $O_s$ of out-components with $s$ nodes
675: equals $L_{s-1}$, the number of nodes in generation $s-1$ in the genealogical
676: tree. We therefore compute $L_g(t)$, the size of generation $g$ at time $t$.
677: For this discussion, we again treat only the constant attachment kernel and
678: refer the reader to Ref.~\cite{KR} for more general attachment kernels. We
679: determine $L_g(t)$ by noting that $L_g(t)$ increases when a new node attaches
680: to a node in generation $g-1$. This occurs with rate $L_{g-1}/M_0$, where
$M_0(t)=1+t$ is the number of nodes. This gives the differential equation
$\dot L_g(t)=L_{g-1}/(1+t)$, with solution $L_g(\tau)={\tau^g/g!}$, where
$\tau=\ln(1+t)$. Thus the number $O_s$ of out-components with $s$ nodes
684: equals
685: \begin{equation}
686: \label{Rk}
687: O_s(\tau)={\tau^{s-1}/ (s-1)!}.
688: \end{equation}
689: Note that the generation size $L_g(t)$ grows with $g$, when $g<\tau$, and
690: then decreases and becomes of order 1 when $g=e\tau$. The genealogical tree
691: therefore contains approximately $e\tau$ generations at time $t$. This
692: result allows us to determine the diameter of the network, since the maximum
693: distance between any pair of nodes is twice the distance from the root to the
694: last generation. Therefore the diameter of the network scales as
695: $2e\tau\approx 2e\ln N$; this is the same dependence on $N$ as in the random
696: graph \cite{bol,jan}. More importantly, this result shows that the diameter
697: of the GN is always small -- ranging from the order of $\ln N$ for a constant
698: attachment kernel, to the order of one for super-linear attachment kernels.
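
The solution for $L_g$ can be verified in a few lines. In terms of
$\tau=\ln(1+t)$ one has $d\tau=dt/(1+t)$, so the rate equation becomes
$dL_g/d\tau=L_{g-1}$. Assuming the initial condition of a single root node,
that is, $L_0=1$ and $L_g(0)=0$ for $g\geq 1$, induction on $g$ gives
\begin{eqnarray*}
L_g(\tau)=\int_0^\tau {u^{g-1}\over (g-1)!}\,du={\tau^g\over g!},
\qquad \sum_{g\geq 0}L_g(\tau)=e^\tau=1+t=M_0(t).
\end{eqnarray*}
The estimate for the number of generations follows from Stirling's formula,
$g!\approx\sqrt{2\pi g}\,(g/e)^g$, which gives $L_g\approx (e\tau/g)^g/\sqrt{2\pi
g}$; this expression is no longer large once $g$ approaches $e\tau$.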
699:
700: \section{The Web Graph}
701:
702: In the world-wide web, link directionality is clearly relevant, as hyperlinks
703: go {\em from} an issuing website {\em to} a target website but not vice
704: versa. Thus to characterize the local graph structure more fully, the node
705: degree should be resolved into the {\em in-degree} -- the number of incoming
706: links to a node, and the complementary {\em out-degree} (Fig.~\ref{degrees}).
707: Measurements on the web indicate that these distributions are power laws with
708: different exponents \cite{www3}. These properties can be accounted for by
709: the web graph (WG) model (Fig.~\ref{io-growth}) and the rate equations
710: provide an extremely convenient analysis tool.
711:
712: \subsection{Average Degrees}
713:
714: Let us first determine the average node degrees (in-degree, out-degree, and
715: total degree) of the WG. Let $N(t)$ be the total number of nodes, and $I(t)$
716: and $J(t)$ the in-degree and out-degree of the entire network, respectively.
717: According to the elemental growth steps of the model, these degrees evolve by
718: one of the following two possibilities:
719: \begin{eqnarray*}
720: (N,I,J)\to \cases{(N+1,I+1,J+1) & with probability $p$,\cr
721: (N,I+1,J+1) & with probability $q$.}
722: \end{eqnarray*}
723: That is, with probability $p$ a new node and new directed link are created
724: (Fig.~\ref{io-growth}) so that the number of nodes and both the total in- and
725: out-degrees increase by one. Conversely, with probability $q$ a new directed
726: link is created and the in- and out-degrees each increase by one, while the
727: total number of nodes is unchanged. As a result, $N(t)=pt$, and
728: $I(t)=J(t)=t$. Thus the average in- and out-degrees, ${\cal D}_{\rm in}\equiv
729: I(t)/N(t)$ and ${\cal D}_{\rm out}\equiv J(t)/N(t)$, are both equal to $1/p$.
730:
731: \subsection{Degree Distributions}
732:
733: To determine the degree distributions, we need to specify: (i) the {\em
734: attachment rate} $A(i,j)$, defined as the probability that a
735: newly-introduced node links to an existing node with $i$ incoming and $j$
736: outgoing links, and (ii) the {\em creation rate} $C(i_1,j_1|i_2,j_2)$,
737: defined as the probability of adding a new link from a $(i_1,j_1)$ node to a
738: $(i_2,j_2)$ node. We will use rates that are expected to occur in
739: the web. Clearly, the attachment and creation rates should be non-decreasing
740: in $i$ and $j$. Moreover, it seems intuitively plausible that the attachment
741: rate depends only on the in-degree of the target node, $A(i,j)=A_i$; {\it
i.e.}, a website designer decides to create a link to a target based only on
743: the popularity of the latter. In the same spirit, we take the link creation
744: rate to depend only on the out-degree of the issuing node and the in-degree
745: of the target node, $C(i_1,j_1|i_2,j_2)= C(j_1,i_2)$. The former property
746: reflects the fact that the development rate of a site depends only on the
747: number of outgoing links.
748:
749: The interesting situation of power-law degree distributions arises for
750: asymptotically linear rates, and we therefore consider
751: \begin{equation}
752: \label{AC}
753: A_i=i+\lambda_{\rm in} \qquad{\rm and}\qquad C(j,i)=(i+\lambda_{\rm
in})(j+\lambda_{\rm out}).
755: \end{equation}
756: The parameters $\lambda_{\rm in}$ and $\lambda_{\rm out}$ must satisfy the
757: constraint $\lambda_{\rm in}>0$ and $\lambda_{\rm out}>-1$ to ensure that the
758: rates are positive for all attainable in- and out-degree values, $i\geq 0$
759: and $j\geq 1$.
760:
761: With these rates, the joint degree distribution, $N_{ij}(t)$, defined as the
762: average number of nodes with $i$ incoming and $j$ outgoing links, evolves
763: according to
764: \begin{eqnarray}
765: \label{Nij}
766: {d N_{ij}\over dt}&=&
767: (p+q)\left[{(i-1+\lambda_{\rm in})N_{i-1,j}
768: -(i+\lambda_{\rm in})N_{ij}\over I+\lambda_{\rm in} N}\right]\\
769: &&\hskip 0.285in
770: +q\left[{(j-1+\lambda_{\rm out})N_{i,j-1}
771: -(j+\lambda_{\rm out})N_{ij}\over J+\lambda_{\rm out} N}\right]
772: +p\,\delta_{i0}\delta_{j1}.\nonumber
773: \end{eqnarray}
774: The first group of terms on the right accounts for the changes in the
775: in-degree of target nodes by simultaneous creation of a new node and link
776: (probability $p$) or by creation of a new link only (probability $q$). For
777: example, the creation of a link to a node with in-degree $i$ leads to a loss
778: in the number of such nodes. This occurs with rate $(p+q)(i+\lambda_{\rm
779: in})N_{ij}$, divided by the appropriate normalization factor
780: $\sum_{i,j}(i+\lambda_{\rm in})N_{ij}= I+\lambda_{\rm in} N$. The factor
781: $p+q=1$ in Eq.~(\ref{Nij}) is explicitly written to make clear these two
types of processes. Similarly, the second group of terms accounts for
783: out-degree changes. These occur due to the creation of new links between
784: already existing nodes -- hence the prefactor $q$. The last term accounts
785: for the introduction of new nodes with no incoming links and one outgoing
786: link. As a useful consistency check, one may verify that the total number of
787: nodes, $N=\sum_{i,j} N_{ij}$, grows according to $\dot N=p$, while the total
788: in- and out-degrees, $I=\sum_{i,j} iN_{ij}$ and $J=\sum_{i,j} jN_{ij}$, obey
789: $\dot I=\dot J=1$.
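
The consistency check just mentioned can be made explicit. Summing
Eq.~(\ref{Nij}) over all $i$ and $j$, the gain and loss terms in each bracket
cancel pairwise after shifting the summation index, which leaves only the
source term:
\begin{eqnarray*}
\dot N=\sum_{i,j}p\,\delta_{i0}\delta_{j1}=p.
\end{eqnarray*}
Multiplying by $i$ before summing, the first bracket contributes
$\sum_{i,j}[(i+1)-i](i+\lambda_{\rm in})N_{ij}=I+\lambda_{\rm in}N$, which
cancels the normalization factor, while the second bracket and the source term
do not contribute:
\begin{eqnarray*}
\dot I=(p+q)\,{I+\lambda_{\rm in}N\over I+\lambda_{\rm in}N}=p+q=1.
\end{eqnarray*}
Multiplying instead by $j$, the second bracket gives $q$ in the same way and
the source term gives $p$, because each new node carries one outgoing link, so
that $\dot J=q+p=1$.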
790:
791: By solving the first few of Eqs.~(\ref{Nij}), it is again clear that the
792: $N_{ij}$ grow linearly with time. Accordingly, we substitute
793: $N_{ij}(t)=t\,n_{ij}$, as well as $N=pt$ and $I=J=t$, into Eqs.~(\ref{Nij})
794: to yield a recursion relation for $n_{ij}$. Using the shorthand notations,
795: \begin{eqnarray*}
796: a=q\,{1+p\lambda_{\rm in}\over 1+p\lambda_{\rm out}}\quad {\rm and}\quad
797: b=1+(1+p)\lambda_{\rm in},
798: \end{eqnarray*}
799: the recursion relation for $n_{ij}$ is
800: \begin{eqnarray}
801: \label{nij}
802: [i+a(j+\lambda_{\rm out})+b]n_{ij}
803: =(i-1+\lambda_{\rm in})n_{i-1,j}+a(j-1+\lambda_{\rm out})n_{i,j-1}
804: +p(1+p\lambda_{\rm in})\delta_{i0}\delta_{j1}.
805: \end{eqnarray}
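
The shorthands $a$ and $b$ arise as follows; this is only a worked version of
the substitution described above. With $N_{ij}=t\,n_{ij}$, $N=pt$ and
$I=J=t$, the normalization factors become $I+\lambda_{\rm
in}N=(1+p\lambda_{\rm in})\,t$ and $J+\lambda_{\rm out}N=(1+p\lambda_{\rm
out})\,t$, and Eq.~(\ref{Nij}) reduces to
\begin{eqnarray*}
n_{ij}={(i-1+\lambda_{\rm in})n_{i-1,j}-(i+\lambda_{\rm in})n_{ij}\over
1+p\lambda_{\rm in}}
+q\,{(j-1+\lambda_{\rm out})n_{i,j-1}-(j+\lambda_{\rm out})n_{ij}\over
1+p\lambda_{\rm out}}+p\,\delta_{i0}\delta_{j1}.
\end{eqnarray*}
Multiplying by $1+p\lambda_{\rm in}$ and collecting the terms proportional to
$n_{ij}$ on the left gives the coefficient $i+\lambda_{\rm in}+(1+p\lambda_{\rm
in})+a(j+\lambda_{\rm out})$, that is, $i+a(j+\lambda_{\rm out})+b$ with
$b=1+(1+p)\lambda_{\rm in}$.
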
806: The in-degree and out-degree distributions are straightforwardly expressed
807: through the joint distribution: ${\cal I}_i(t)
808: =\sum_j N_{ij}(t)$ and ${\cal O}_j(t)=\sum_i N_{ij}(t)$. Because of the
809: linear time dependence of the node degrees, we write ${\cal I}_i(t)=t\,I_i$
810: and ${\cal O}_j(t)=t\,O_j$. The densities $I_i$ and $O_j$ satisfy
811: \begin{subeqnarray}
812: \label{Ii}
813: (i+b)I_{i} &=&(i-1+\lambda_{\rm in})I_{i-1}
814: +p(1+p\lambda_{\rm in})\delta_{i0},\\
815: \left(j+{1\over q}+{\lambda_{\rm out}\over q}\right)O_j
816: &=&(j-1+\lambda_{\rm out})O_{j-1}+p{1+p\lambda_{\rm out}\over q}\delta_{j1},
817: \end{subeqnarray}
818: respectively. The solution to these recursion formulae may be expressed
819: in terms of the following ratios of gamma functions
820: \begin{subeqnarray}
821: \label{I-sol}
822: I_{i}&=&I_0\,{\Gamma(i+\lambda_{\rm in})\,\Gamma(b+1)\over
823: \Gamma(i+b+1)\,\Gamma(\lambda_{\rm in})},\\
824: \label{O-sol}
825: O_{j}&=&O_1\,{\Gamma(j+\lambda_{\rm out})\,\,
826: \Gamma(2+q^{-1}+\lambda_{\rm out} q^{-1})\over
827: \Gamma(j+1+q^{-1}+\lambda_{\rm out} q^{-1})\,\Gamma(1+\lambda_{\rm out})},
828: \end{subeqnarray}
829: with $I_0=p(1+p\lambda_{\rm in})/b$ and
830: $O_1=p(1+p\lambda_{\rm out})/(1+q+\lambda_{\rm out})$.
831:
{}From the asymptotics of the gamma function, the in- and out-degree
distributions have the distinct power-law forms \cite{KRR},
834: \begin{subeqnarray}
835: \label{in}
836: I_i\sim i^{-\nu_{\rm in}},~~~\qquad \nu_{\rm in}&=&2+p\lambda_{\rm in},\\
837: \hskip 0.7in O_j\sim j^{-\nu_{\rm out}}, \qquad
838: \nu_{\rm out}&=&1+q^{-1}+\lambda_{\rm out}\, pq^{-1},
839: \end{subeqnarray}
840: with $\nu_{\rm in}$ and $\nu_{\rm out}$ both necessarily greater than 2. Let
841: us now compare these predictions with current data for the web \cite{www3}.
842: First, the value of $p$ is fixed by noting that $p^{-1}$ equals the average
degree of the entire network. Current data for the web gives ${\cal D}_{\rm
in}={\cal D}_{\rm out}\approx 7.5$, and thus we set $p^{-1}=7.5$.
Now Eqs.~(\ref{in}) contain two free parameters and by choosing them to be
$\lambda_{\rm in}=0.75$ and $\lambda_{\rm out}=3.55$ we reproduce the
847: observed exponents for the degree distributions of the web, $\nu_{\rm
848: in}\approx 2.1$ and $\nu_{\rm out}\approx 2.7$, respectively. The fact
849: that the parameters $\lambda_{\rm in}$ and $\lambda_{\rm out}$ are of the
850: order of one indicates that the model with linear rates of node attachment
851: and bilinear rates of link creation is a viable description of the web.
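
The quoted parameter values can be checked by direct arithmetic in
Eqs.~(\ref{in}). Taking the average degree $7.5$ for $p^{-1}$, so that
$p=2/15$, $q=1-p=13/15$, $q^{-1}=15/13$ and $pq^{-1}=2/13$,
\begin{eqnarray*}
\nu_{\rm in}&=&2+{0.75\over 7.5}=2.1,\\
\nu_{\rm out}&=&1+{15\over 13}+3.55\times{2\over 13}=1+{22.1\over 13}=2.7.
\end{eqnarray*}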
852:
853: \section{Multicomponent Graph}
854:
855: In addition to the degree distributions, current measurements indicate that
856: the web consists of a ``giant'' component that contains approximately 91\% of
857: all nodes, and a large number of finite components \cite{www3}. The models
858: discussed thus far are unsuited to describe the number and size distribution
859: of these components, since the growth rules necessarily produce only a single
860: connected component. In this section, we outline a simple modification of
861: the WG, the multicomponent graph (MG), that naturally produces many
862: components. In this example, the rate equations now provide a comprehensive
863: characterization for the size distribution of the components.
864:
865: In the MG model, we simply separate node and link creation steps. Namely,
866: when a node is introduced it does not immediately attach to an earlier node,
867: but rather, a new node begins its existence as isolated and joins the network
868: only when a link creation event reaches the new node. For the average
869: network degrees, this small modification already has a significant effect.
870: The number of nodes and the total in- and out-degrees of the network, $N,I,J$
now increase with time as $N=pt$ and $I=J=qt$. Thus the average in- and
out-degrees per node are time independent and equal to $qp^{-1}$, while the
average total degree is ${\cal D}=2q/p$.
874:
875: As in the case of the WG model, we study the case of a bilinear link creation
876: rate given in Eq.~(\ref{AC}), with now $\lambda_{\rm in},\lambda_{\rm out}>0$
877: to ensure that $C(j,i)>0$ for all permissible in- and out-degrees, $i\geq 0$
878: and $j\geq 0$.
879:
880: \subsection{Local Properties}
881:
882: We study local characteristics by employing the same approach as in the WG
883: model. We find that results differ only in minute details, {\it e.g.}, the
884: in- and out-degree densities $I_i$ and $O_j$ are again the ratios of gamma
885: functions, and the respective exponents are
886: \begin{equation}
887: \label{inout}
888: \nu_{\rm in}=2\left(1+{\lambda_{\rm in}\over {\cal D}}\right),\qquad
889: \nu_{\rm out}=2\left(1+{\lambda_{\rm out}\over {\cal D}}\right).
890: \end{equation}
891: Notice the decoupling -- the in-degree exponent is independent of
892: $\lambda_{\rm out}$, while $\nu_{\rm out}$ is independent of $\lambda_{\rm
893: in}$. The expressions (\ref{inout}) are neater than their WG counterparts,
894: reflecting the fact that the governing rules of the MG model are more
895: symmetric.
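
A brief sketch of where Eq.~(\ref{inout}) comes from, under the assumption
that in the MG the in-degree changes only through link creation events
(probability $q$) with rate proportional to $i+\lambda_{\rm in}$: the
normalization factor is then $I+\lambda_{\rm in}N=(q+p\lambda_{\rm in})\,t$,
and the in-degree density satisfies
\begin{eqnarray*}
\left(i+\lambda_{\rm in}+1+{p\lambda_{\rm in}\over q}\right)I_i
=(i-1+\lambda_{\rm in})I_{i-1}
+p\left(1+{p\lambda_{\rm in}\over q}\right)\delta_{i0}.
\end{eqnarray*}
The solution is again a ratio of gamma functions, $I_i\propto
\Gamma(i+\lambda_{\rm in})/\Gamma(i+\lambda_{\rm in}+2+p\lambda_{\rm in}/q)$,
whose tail decays as $i^{-(2+p\lambda_{\rm in}/q)}$. Since ${\cal D}=2q/p$,
this exponent equals $2(1+\lambda_{\rm in}/{\cal D})$; the out-degree exponent
follows by exchanging the roles of $i$ and $j$.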
896:
897: To complement our discussion, we now outline the asymptotic behavior of the
898: joint in- and out-degree distribution. Although this distribution defies
899: general analysis, we can obtain partial and useful information by fixing one
900: index and letting the other index vary. An elementary but cumbersome
analysis yields the following limiting behaviors
902: \begin{equation}
903: \label{extreme}
904: n_{ij}\sim\cases{i^{-\xi_{\rm in}}, & $1\ll i$;\cr
905: j^{-\xi_{\rm out}}, & $1\ll j$;}
906: \end{equation}
907: with
908: \begin{eqnarray*}
909: \xi_{\rm in} &=&\nu_{\rm in}+{{\cal D}\over 2}\,
910: {(\nu_{\rm in}-1)(\nu_{\rm out}-2)\over \nu_{\rm out}-1}\\
911: \xi_{\rm out} &=&\nu_{\rm out}+{{\cal D}\over 2}\,
912: {(\nu_{\rm out}-1)(\nu_{\rm in}-2)\over \nu_{\rm in}-1}.
913: \end{eqnarray*}
914:
915: We also can determine the joint degree distribution analytically in the
916: subset of the parameter space where $\nu_{\rm in}=\nu_{\rm out}$, {\it i.e.},
917: $\lambda_{\rm in}=\lambda_{\rm out}$. In what follows, we therefore denote
918: $\lambda_{\rm in}=\lambda_{\rm out}\equiv \lambda$. The resulting recursion
919: equation for the joint degree distribution is
920: \begin{eqnarray}
921: \label{nij*}
922: (i+j+1+\lambda+\lambda q^{-1})n_{ij}=(i-1+\lambda)n_{i-1,j}
923: +(j-1+\lambda)n_{i,j-1}+c\,\delta_{i,0}\,\delta_{j,0},
924: \end{eqnarray}
925: with $c=p(1+2\lambda/{\cal D})$. Because the degrees $i$ and $j$ appear in
926: Eq.~(\ref{nij*}) with equal prefactors, the substitution
927: \begin{eqnarray*}
928: \label{mij}
929: n_{ij}={\Gamma(i+\lambda)\,\Gamma(j+\lambda)\over
930: \Gamma(i+j+2+\lambda+\lambda q^{-1})}\,\,m_{ij}
931: \end{eqnarray*}
reduces Eq.~(\ref{nij*}) to the constant-coefficient recursion relation
933: \begin{equation}
934: \label{m}
m_{ij}=m_{i-1,j}+m_{i,j-1}+\mu\,\delta_{i,0}\,\delta_{j,0}, \qquad
936: {\rm with}\quad \mu=c\,{\Gamma(1+\lambda+\lambda q^{-1})\over
937: \Gamma^2(\lambda)}.
938: \end{equation}
939: We solve Eq.~(\ref{m}) by employing the generating function technique.
940: Multiplying Eq.~(\ref{m}) by $x^iy^j$ and summing over all $i,j\geq 0$, we
941: find that the generating function ${\mathcal M}(x,y)=\sum_{i,j\geq
942: 0}m_{ij}x^iy^j$ equals $\mu/(1-x-y)$. Expanding ${\mathcal M}(x,y)$ in $x$
943: yields $\mu \sum x^i/(1-y)^{i+1}$ which we then expand in $y$ by employing
944: the identity $(1-y)^{-i-1}=\sum_{j\geq 0} {i+j\choose i}y^{j}$. Finally, we
945: arrive at
946: \begin{equation}
947: \label{mij-sol}
948: m_{ij}=\mu\,\,{\Gamma(i+j+1)\over \Gamma(i+1)\,\Gamma(j+1)},
949: \end{equation}
950: from which the joint degree distribution is
951: \begin{equation}
952: \label{nij-sol}
953: n_{ij}={\mu\,\Gamma(i+\lambda)\,\Gamma(j+\lambda)\,\Gamma(i+j+1)\over
954: \Gamma(i+1)\,\Gamma(j+1)\,\Gamma(i+j+2+\lambda+\lambda q^{-1})}
955: \longrightarrow \mu\,
956: {(ij)^{\lambda-1}\over (i+j)^{1+\lambda+\lambda/q}},
957: \quad{\rm as}\quad i,j\to\infty.
958: \end{equation}
959: Thus again, the in- and out-degrees of a node are correlated: $n_{ij}\ne
960: I_iO_j\sim i^{-\nu}j^{-\nu}$.
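
The generating function step can be spelled out as follows, taking the source
in Eq.~(\ref{m}) to sit at $i=j=0$, as it does in Eq.~(\ref{nij*}), and using
$m_{-1,j}=m_{i,-1}=0$. Multiplying by $x^iy^j$ and summing, the shifted terms
reproduce ${\mathcal M}$ up to a factor of $x$ or $y$:
\begin{eqnarray*}
{\mathcal M}(x,y)=x\,{\mathcal M}(x,y)+y\,{\mathcal M}(x,y)+\mu
\quad\Longrightarrow\quad
{\mathcal M}(x,y)={\mu\over 1-x-y}
={\mu\over 1-y}\sum_{i\geq 0}\left({x\over 1-y}\right)^{i}.
\end{eqnarray*}
As a quick check of Eq.~(\ref{mij-sol}), the first few values are
$m_{00}=\mu$, $m_{10}=m_{01}=\mu$ and $m_{11}=2\mu$, in agreement with
$m_{11}=m_{01}+m_{10}$.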
961:
962:
963: \subsection{Global Properties}
964:
Let us now turn to the distribution of connected components (clusters,
966: for brevity). For simplicity, we consider models with undirected links. Let
967: us first estimate the total number of clusters ${\cal N}$. At each time
968: step, ${\cal N}\to {\cal N}+1$ with probability $p$, or ${\cal N}\to {\cal
969: N}-1$ with probability $q$. This implies
970: \begin{equation}
971: \label{N}
972: {\cal N}=(p-q)t.
973: \end{equation}
974: The gain rate of ${\cal N}$ is exactly equal to $p$, while in the loss term
975: we ignore self-connections and tacitly assume that links are always created
976: between different clusters. In the long-time limit, self-connections should
977: be asymptotically negligible when the total number of clusters grows with
978: time and no macroscopic clusters ({\it i.e.}, components that contain a
979: finite fraction of all nodes) arise.
980:
981: This assumption of no self-connections greatly simplifies the description of
982: the cluster merging process. Consider two clusters (labeled by $\alpha=1,2$)
983: with total in-degrees $i_\alpha$, out-degrees $j_\alpha$, and number of nodes
984: $k_\alpha$. When these clusters merge, the combined cluster is characterized
985: by
986: \begin{eqnarray*}
987: \label{12}
988: i=i_1+i_2+1,\qquad
989: j=j_1+j_2+1,\qquad
990: k=k_1+k_2.
991: \end{eqnarray*}
992: Thus starting with single-node clusters with $(i,j,k)=(0,0,1)$, the above
993: merging rule leads to clusters that always satisfy the constraint $i=j=k-1$.
994: Thus the size $k$ characterizes both the in-degree and out-degree of
995: clusters.
996:
997: To simplify formulae without sacrificing generality, we consider the link
998: creation rate of Eq.~(\ref{AC}), with $\lambda_{\rm in}=\lambda_{\rm out}=1$.
999: Then the merging rate $W(k_1,k_2)$ of the two clusters is proportional to
1000: $(i_1+k_1)(j_2+k_2)+(i_2+k_2)(j_1+k_1)$, or
1001: \begin{eqnarray*}
1002: \label{w}
1003: W(k_1,k_2)=(2k_1-1)(2k_2-1).
1004: \end{eqnarray*}
Let $C(k,t)$ denote the number of clusters of mass $k$. This distribution
1006: evolves according to
1007: \begin{eqnarray}
1008: \label{comp}
1009: {dC(k,t)\over dt}=
1010: {q\over t^2}\sum_{k_1+k_2=k} (2k_1-1)(2k_2-1)\,C(k_1,t)C(k_2,t)
-{2q\over t}\,(2k-1)C(k,t)+p\,\delta_{k,1}.
1012: \end{eqnarray}
The first set of terms accounts for the gain in $C(k,t)$ due to the
1014: coalescence of clusters of size $k_1$ and $k_2$, with $k_1+k_2=k$.
1015: Similarly, the second set of terms accounts for the loss in $C(k,t)$ due to
1016: the coalescence of a cluster of size $k$ with any other cluster. The last
1017: term accounts for the input of unit-size clusters. These rate equations are
1018: similar to those of irreversible aggregation with product kernel \cite{agg}.
1019: The primary difference is that we explicitly treat the number of clusters as
1020: finite.
1021:
1022: One can verify that the total number of nodes $N(t)=\sum k\,C(k,t)$ grows
1023: with rate $p$ and that the total number of clusters ${\cal N}(t)=\sum C(k,t)$
1024: grows with rate $p-q$, in agreement with Eq.~(\ref{N}). Solving the first
few of Eqs.~(\ref{comp}) shows again that each $C(k,t)$ grows linearly with time.
1026: Accordingly, we substitute $C(k,t)=t\,c_k$ into Eqs.~(\ref{comp}) to yield
1027: the time-independent recursion relation
1028: \begin{eqnarray}
1029: \label{mk}
1030: c_k=q\sum_{k_1+k_2=k} (2k_1-1)(2k_2-1)\,c_{k_1}c_{k_2}
1031: -2q(2k-1)c_k+p\,\delta_{k,1}.
1032: \end{eqnarray}
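
Two sum rules provide a quick check of Eq.~(\ref{mk}). They use only
$\sum_k c_k=p-q$ and $\sum_k k\,c_k=p$, which restate the growth rates of
${\cal N}$ and $N$ quoted above. First, the normalization of the merging rate
is
\begin{eqnarray*}
\sum_{k\geq 1}(2k-1)\,c_k=2p-(p-q)=p+q=1,
\end{eqnarray*}
which is why the factors $t^{-2}$ and $t^{-1}$ in Eq.~(\ref{comp}) need no
further constant. Second, summing Eq.~(\ref{mk}) over $k$ and using this
normalization gives
\begin{eqnarray*}
\sum_{k\geq 1}c_k=q\Bigl[\sum_{k\geq 1}(2k-1)c_k\Bigr]^2
-2q\sum_{k\geq 1}(2k-1)c_k+p=q-2q+p=p-q,
\end{eqnarray*}
consistent with Eq.~(\ref{N}).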
1033:
1034: A giant component, {\it i.e.}, a cluster that contains a finite fraction of
1035: all the nodes, emerges when the link creation rate exceeds a threshold value.
1036: To determine this threshold, we study the moments of the cluster size
1037: distribution ${\cal M}_n=\sum_{k\geq 1} k^n\,c_k$. We already know that the
1038: first two moments are ${\cal M}_0=p-q$ and ${\cal M}_1=p$. We can obtain an
1039: equation for the second moment by multiplying Eq.~(\ref{mk}) by $k^2$ and
1040: summing over $k\geq 1$ to give ${\cal M}_2 =2q(2{\cal M}_2-{\cal M}_1)^2+p$.
1041: When this equation has a real solution, ${\cal M}_2$ is finite. The
1042: solution is
1043: \begin{equation}
1044: \label{M2}
1045: {\cal M}_2={1+8pq-\sqrt{1-16pq}\over 16 q}
1046: \end{equation}
and yields, when $1-16pq=0$, the threshold value $p_c=(2+\sqrt{3})/4$. For
$1-16pq\geq 0$ ($p\geq p_c$) all clusters have finite size and the second moment
is finite.
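
For completeness, the solution in Eq.~(\ref{M2}) can be obtained as follows.
With ${\cal M}_1=p$ and the auxiliary variable $x=2{\cal M}_2-p$ (introduced
here only for convenience), the moment equation becomes a quadratic,
\begin{eqnarray*}
{x+p\over 2}=2qx^2+p
\quad\Longrightarrow\quad
4qx^2-x+p=0
\quad\Longrightarrow\quad
x={1-\sqrt{1-16pq}\over 8q}.
\end{eqnarray*}
The root with the minus sign is the one that remains finite as $q\to 0$, where
it reduces to $x=p$, the value for isolated nodes. Then ${\cal
M}_2=(x+p)/2$ reproduces Eq.~(\ref{M2}). The condition $16pq=1$ with $q=1-p$
reads $16p^2-16p+1=0$, whose roots are $p=(2\pm\sqrt{3})/4$; the larger root,
$p_c\cong 0.933$, is the threshold quoted in the text.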
1050:
1051: In this steady-state regime, we can obtain the cluster size distribution by
1052: introducing the generating function ${\cal C}(z)=\sum_{k=1}^\infty c_k z^k$
1053: to convert Eq.~(\ref{mk}) into the differential equation
1054: \begin{equation}
1055: \label{Cz}
1056: 2z{\cal C}'(z)-{\cal C}(z)=1-\sqrt{1-[pz-{\cal C}(z)]/q}.
1057: \end{equation}
1058: The asymptotic behavior of the cluster size distribution can now be read off
1059: from the behavior of the generating function in the $z\to 1$ limit. In
1060: particular, the power-law behavior
1061: \begin{equation}
1062: \label{asym}
1063: c_k\sim {B\over k^\tau}\quad{\rm as} \quad k\to\infty
1064: \end{equation}
1065: implies that the corresponding generating function has the form
1066: \begin{equation}
1067: \label{gen}
1068: {\cal C}(z)={\cal M}_0+{\cal M}_1(z-1)
1069: +{{\cal M}_2-{\cal M}_1\over 2}\,(z-1)^2+
1070: B\Gamma(1-\tau)(1-z)^{\tau-1}+\ldots.
1071: \end{equation}
1072: Here the asymptotic behavior is controlled by the dominant singular term
1073: $(1-z)^{\tau-1}$. However, there are also subdominant singular terms and
1074: regular terms in the generating function. In Eq.~(\ref{gen}) we explicitly
1075: included the three regular terms which ensure that the first three moments of
1076: the cluster-size distribution are correctly reproduced, namely, ${\cal
1077: C}(1)={\cal M}_0$, ${\cal C}'(1)={\cal M}_1$, and ${\cal C}''(1)={\cal
1078: M}_2-{\cal M}_1$.
1079:
1080: Finally, substituting Eq.~(\ref{gen}) into Eq.~(\ref{Cz}) we find that the
1081: dominant singular terms are of the order of $(1-z)^{\tau-2}$. Balancing all
1082: contributions of this order in the equation determines the exponent of the
1083: cluster size distribution to be
1084: \begin{equation}
1085: \label{tau}
1086: \tau=1+{2\over 1-\sqrt{1-16pq}}.
1087: \end{equation}
1088: This exponent satisfies the bound $\tau>3$ and thus justifies using the
1089: behavior of the second moment of the size distribution as the criterion to
1090: find the threshold value $p_c$.
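
The balance that leads to Eq.~(\ref{tau}) can be sketched as follows; the
shorthand $s\equiv 2{\cal M}_2-{\cal M}_1$ is introduced here only for this
sketch. Substituting Eq.~(\ref{gen}) into the two sides of Eq.~(\ref{Cz}),
and using ${\cal M}_0=p-q$ and ${\cal M}_1=p$, gives near $z=1$
\begin{eqnarray*}
2z{\cal C}'-{\cal C}&=&1-s\,(1-z)
-2(\tau-1)B\Gamma(1-\tau)(1-z)^{\tau-2}+\ldots,\\
1-{pz-{\cal C}\over q}&=&{{\cal M}_2-{\cal M}_1\over 2q}\,(1-z)^2
+{B\Gamma(1-\tau)\over q}\,(1-z)^{\tau-1}+\ldots.
\end{eqnarray*}
Taking the square root of the second line and matching the terms linear in
$1-z$ reproduces the moment equation in the form ${\cal M}_2-{\cal
M}_1=2qs^2$. Matching the terms of order $(1-z)^{\tau-2}$ then gives
\begin{eqnarray*}
2(\tau-1)={s\over {\cal M}_2-{\cal M}_1}={1\over 2qs},
\qquad{\rm so~that}\qquad
\tau=1+{1\over 4qs}=1+{2\over 1-\sqrt{1-16pq}},
\end{eqnarray*}
where the last step uses $s=(1-\sqrt{1-16pq})/(8q)$ from Eq.~(\ref{M2}).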
1091:
1092: For $p\geq p_c$ there is no giant cluster and the cluster size distribution
1093: has a power-law tail with $\tau$ given by Eq.~(\ref{tau}). Intriguingly, the
1094: power-law form holds for any value $p>p_c$. This is in stark contrast to
1095: all other percolation-type phenomena, where away from the threshold, there is
1096: an exponential tail in cluster size distributions \cite{percolation}. Thus
1097: in contrast to ordinary critical phenomena, the entire range $p>p_c$ is
1098: critical.
1099:
1100: As a corollary to the power-law tail of the cluster size distribution for
1101: $p>p_c$, we can estimate the size of the largest cluster $k_{\rm max}$ to see
1102: how ``finite'' it really is. Using the extreme statistics criterion
1103: $\sum_{k\geq k_{\rm max}}N\,c_k=1$ we obtain $k_{\rm max}\sim N^{1/(\tau-1)}$,
1104: or
1105: \begin{equation}
1106: \label{kmax}
1107: k_{\rm max}\sim N^{(1-\sqrt{1-16pq})/2}.
1108: \end{equation}
1109: This is very different from the corresponding behavior on the random graph,
1110: where below the percolation threshold the largest component scales
1111: logarithmically with the number of nodes. Thus for the random graph, the
dependence of $k_{\rm max}(N)$ changes from $\ln N$ just below to $N$ just
above the percolation threshold; for the MG, the change is much more gentle:
1114: from $N^{1/2}$ to $N$.
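
The extreme statistics estimate can be made explicit. Approximating the sum
by an integral with the tail of Eq.~(\ref{asym}),
\begin{eqnarray*}
\sum_{k\geq k_{\rm max}}N\,c_k\approx NB\int_{k_{\rm max}}^\infty {dk\over
k^{\tau}} ={NB\over \tau-1}\,k_{\rm max}^{1-\tau}=1
\quad\Longrightarrow\quad
k_{\rm max}\sim N^{1/(\tau-1)},
\end{eqnarray*}
and Eq.~(\ref{tau}) gives $1/(\tau-1)=(1-\sqrt{1-16pq})/2$. At the threshold,
$16pq=1$ and the exponent equals $1/2$; as $p\to 1$ (so that $q\to 0$) the
exponent decreases toward zero.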
1115:
1116: These considerations suggest that the phase transition in the MG is
1117: dramatically different from the percolation transition. Very recently,
1118: simplified versions of the MG were studied
1119: \cite{clusters,kk,sam,kkkr,french}. Numerical \cite{clusters} and analytical
\cite{sam,kkkr,french} evidence suggests that the size of the giant component
1121: $G(p)$ near the threshold scales as
1122: \begin{equation}
1123: \label{giant}
1124: G(p)\propto \exp\left(-\,{{\rm const.}\over\sqrt{p_c-p}}\right).
1125: \end{equation}
1126: Therefore, the phase transition of this dynamically grown network is of
1127: infinite order since all derivatives of $G(p)$ vanish as $p\to p_c$. In
1128: contrast, static random graphs with any desired degree distribution
1129: \cite{reed} exhibit a standard percolation transition
1130: \cite{clusters,reed,chung,dani}.
1131:
1132: \section{Summary}
1133:
1134: In this paper, we have presented a statistical physics viewpoint on growing
1135: network problems. This perspective is strongly influenced by the phenomenon
1136: of aggregation kinetics, where the rate equation approach has proved
1137: extremely useful. From the wide range of results that we were able to obtain
1138: for evolving networks, we hope that the reader appreciates both the
1139: simplicity and the power of the rate equation method for characterizing
1140: evolving networks. We quantified the degree distribution of the growing
1141: network model and found a diverse range of phenomenology that depends on the
1142: form of the attachment kernel. At the qualitative level, a stretched
1143: exponential form for the degree distribution should be regarded as
1144: ``generic'', since it occurs for an attachment kernel that is sub-linear in
1145: node degree ({\it e.g.}, $A_k\sim k^\gamma$ with $\gamma<1$). On the other
hand, a power-law degree distribution arises only for asymptotically linear
attachment kernels, $A_k\sim k$. However, this result is ``non-generic'' as the degree
1148: distribution exponent now depends on the detailed form of the attachment
1149: kernel.
1150:
1151: We investigated extensions of the basic growing network to incorporate
processes that naturally occur in the development of the web. In particular,
by allowing for link directionality, the full degree distribution naturally
resolves into separate in-degree and out-degree distributions. When the
1155: rates at which links are created are linear functions of the in- and
1156: out-degrees of the terminal nodes of the link, the in- and out-degree
1157: distributions are power laws with different exponents, $\nu_{\rm in}$ and
$\nu_{\rm out}$, that match current measurements on the web with
1159: reasonable values for the model parameters. We also considered a model with
1160: independent node and link creation rates. This leads to a network with many
1161: independent components and now the size distribution of these components is
1162: an important characteristic. We have characterized basic aspects of this
1163: process by the rate equation approach and showed that the network is in a
1164: critical state even away from the percolation threshold. The rate equation
1165: approach also provides evidence of an unusual, infinite-order percolation
1166: transition.
1167:
1168: While statistical physics tools have fueled much progress in elucidating the
1169: structure of growing networks, there are still many open questions. One set
1170: is associated with understanding dynamical processes in such networks. For
1171: example, what is the nature of information transmission? What governs the
1172: formation of traffic jams on the web? Another set is concerned with growth
1173: mechanisms. While we can make much progress in characterizing networks with
1174: idealized growth rules, it is important to understand the actual rules that
1175: govern the growth of the Internet. These issues appear to be fruitful
1176: challenges for future research.
1177:
1178: \section{Acknowledgements}
1179:
1180: It is a pleasure to thank Francois Leyvraz and Geoff Rodgers for
1181: collaborations that led to some of the work reported here. We also thank
1182: John Byers and Mark Crovella for numerous informative discussions. Finally,
1183: we are grateful to NSF grants INT9600232 and DMR9978902 for financial
1184: support.
1185:
1186:
1187: \begin{thebibliography}{99}
1188:
1189: \bibitem{review}
1190: Recent reviews from the physicist's perspective include:
1191: S.~H.~Strogatz, Nature {\bf 410}, 268 (2001);
1192: R.~Albert and A.-L.~Barab\'asi, Rev.\ Mod.\ Phys.\ {\bf 74}, 47 (2002);
1193: S.~N.~Dorogovtsev and J.~F.~F.~Mendes, Adv.\ Phys. {\bf xx}, xxx (2002).
1194:
1195: \bibitem{bol}
1196: B.~Bollob\'as, {\it Random Graphs} (Academic Press, London, 1985).
1197:
1198: \bibitem{jan}
S.~Janson, T.~{\L}uczak, and A.~Ruci\'nski,
1200: {\it Random Graphs} (Wiley, New York, 2000).
1201:
1202: \bibitem{fff}
1203: M.~Faloutsos, P.~Faloutsos, and C.~Faloutsos,
1204: Comp.\ Commun.\ Rev.\ {\bf 29}(4), 251 (1999).
1205:
1206: \bibitem{matta}
1207: A.~Medina, I.~Matta, and J.~Byers, Comp.\ Commun.\ Rev.\
1208: {\bf 30}(2), 18 (2000).
1209:
1210: \bibitem{as}
1211: H.~Tangmunarunkit, J.~Doyle, R.~Govindan, S.~Jamin, S.~Shenker,
1212: and W.~Willinger, Comp.\ Commun.\ Rev.\ {\bf 31}, 7 (2001).
1213:
1214: \bibitem{kum}
S.~R.~Kumar, P.~Raghavan, S.~Rajagopalan, and A.~Tomkins,
in: {\it Proc. 8th WWW Conf.} (1999);
S.~R.~Kumar, P.~Raghavan, S.~Rajagopalan, and A.~Tomkins,
1218: in: {\it Proc. 25th VLDB Conf.} (1999);
1219: J.~Kleinberg, R.~Kumar, P.~Raghavan, S.~Rajagopalan, and
1220: A.~Tomkins, in: {\it Proceedings of the International Conference on
1221: Combinatorics and Computing}, Lecture Notes in Computer Science,
1222: Vol.~1627 (Springer-Verlag, Berlin, 1999).
1223:
1224: \bibitem{BA}
1225: A.-L.~Barab\'asi and R.~Albert, Science {\bf 286}, 509 (1999);
1226: R.~Albert, H.~Jeong, and A.-L.~Barab\'asi, Nature {\bf 401}, 130
1227: (1999).
1228:
1229: \bibitem{www1}
1230: B.~A.~Huberman, P.~L.~T.~Pirolli, J.~E.~Pitkow, and R.~Lukose,
1231: Science {\bf 280}, 95 (1998);
1232: B.~A.~Huberman and L.~A.~Adamic, Nature {\bf 401}, 131 (1999).
1233:
1234: \bibitem{www2}
1235: G.~Caldarelli, R.~Marchetti, and L.~Pietronero, Europhys.\ Lett.
{\bf 52}, 386 (2000).
1237:
1238: \bibitem{www3} A.~Broder, R.~Kumar,
1239: F.~Maghoul, P.~Raghavan, S.~Rajagopalan, R.~Stata, A.~Tomkins, and
1240: J.~Wiener, Computer Networks {\bf 33}, 309 (2000).
1241:
1242: \bibitem{lotka}
1243: A.~J.~Lotka, J. Washington Acad. Sci.\ {\bf 16}, 317 (1926);
1244: W.~Shockley, Proc.\ IRE {\bf 45}, 279 (1957);
1245: E.~Garfield, Science {\bf 178}, 471 (1972).
1246:
1247: \bibitem{LS}
1248: J. Laherr\`ere and D. Sornette, Eur.\ Phys.\ J. B {\bf 2}, 525 (1998).
1249:
1250: \bibitem{redner}
1251: S.~Redner, Eur.\ Phys.\ J. B {\bf 4}, 131 (1998).
1252:
1253: \bibitem{agg}
1254: M.~H.~Ernst, in: {\it Fractals in Physics}, edited by L.~Pietronero
1255: and E.~Tosatti (Elsevier, Amsterdam, 1986), p.~289.
1256:
1257: \bibitem{coarse}
1258: A.~J.~Bray, Adv. Phys. {\bf 43}, 357 (1994).
1259:
1260: \bibitem{surf}
1261: A.~Pimpinelli and J.~Villain,
1262: {\em Physics of Crystal Growth} (Cambridge University Press, Cambridge,
1263: 1998).
1264:
1265: \bibitem{simon}
1266: The earliest growing network model was proposed to describe word
frequency: H.~A.~Simon, Biometrika {\bf 42}, 425 (1955);
1268: H.~A.~Simon, {\em Models of Man} (Wiley, New York, 1957).
1269:
1270: \bibitem{KRR}
1271: P.~L.~Krapivsky, G.~J.~Rodgers, and S.~Redner, Phys.\ Rev.\
1272: Lett.\ {\bf 86}, 5401 (2001).
1273:
1274: \bibitem{gen}
1275: R.~Albert and A.-L.~Barab\'asi, Phys.\ Rev.\ Lett.\ {\bf 85}, 5234 (2000);
1276: S.~N.~Dorogovtsev and J.~F.~F.~Mendes, Europhys.\ Lett.\ {\bf 52}, 33
1277: (2000).
1278:
1279: \bibitem{clusters}
1280: D.~S.~Callaway, J.~E.~Hopcroft, J.~M.~Kleinberg, M.~E.~J.~Newman,
1281: and S.~H.~Strogatz, Phys.\ Rev.\ E {\bf 64},
1282: 041902 (2001).
1283:
1284: \bibitem{KRL}
1285: P.~L.~Krapivsky, S.~Redner, and F.~Leyvraz, Phys.\ Rev.\ Lett.\
1286: {\bf 85}, 4629 (2000).
1287:
1288: \bibitem{DMS}
1289: S.~N.~Dorogovtsev, J.~F.~F.~Mendes, and A.~N.~Samukhin, Phys.\ Rev.\
1290: Lett.\ {\bf 85}, 4633 (2000).
1291:
1292: \bibitem{KR}
1293: P.~L.~Krapivsky and S.~Redner, Phys.\ Rev.\ E {\bf 63}, 066123
1294: (2001).
1295:
1296: \bibitem{BiA}
G.~Bianconi and A.-L.~Barab\'asi, Europhys.\ Lett.\ {\bf 54}, 436 (2001).
1298:
1299: \bibitem{knuth}
R.~L.~Graham, D.~E.~Knuth, and O.~Patashnik,
{\em Concrete Mathematics: A Foundation for Computer Science}
(Addison-Wesley, Reading, MA, 1989).
1303:
1304: \bibitem{percolation}
See, {\it e.g.}, D.~Stauffer and A.~Aharony,
{\it Introduction to Percolation Theory} (Taylor \& Francis, London, 1992).
1307:
1308: \bibitem{kk}
1309: L.~Kullmann and J.~Kert\'esz, Phys.\ Rev.\ E {\bf 63},
1310: 051112 (2001); D.~Lancaster, {\it cond-mat}/0110111.
1311:
1312: \bibitem{sam}
1313: S.~N.~Dorogovtsev, J.~F.~F.~Mendes, and A.~N.~Samukhin,
1314: Phys.\ Rev.\ E {\bf 64}, 066110 (2001).
1315:
1316: \bibitem{kkkr}
1317: J.~Kim, P.~L.~Krapivsky, B.~Kahng, and S.~Redner,
1318: {\it cond-mat}/0203167.
1319:
1320: \bibitem{french}
1321: M.~Bauer and D.~Bernard, {\it cond-mat}/0203232.
1322:
1323: \bibitem{reed}
1324: M.~Molloy and B.~Reed, Random Struct.\ Alg.\ {\bf 6}, 161 (1995);
1325: Combin.\ Probab.\ Comput.\ {\bf 7}, 295 (1998).
1326:
1327: \bibitem{chung}
1328: W.~Aiello, F.~Chung, and L.~Lu,
1329: in: {\it Proc. 32nd ACM Symposium on Theory of Computing} (2000).
1330:
1331: \bibitem{dani}
1332: R.~Cohen, K.~Erez, D.~ben-Avraham, and S.~Havlin, Phys.\ Rev.\ Lett.\
1333: {\bf 85}, 4626 (2000);
1334: M.~E.~J.~Newman, S.~H.~Strogatz, and D.~J.~Watts,
1335: Phys.\ Rev.\ E {\bf 64}, 026118 (2001).
1336:
1337: \end{thebibliography}
1338: \end{document}
1339:
1340: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1383:
1384: Let us fix one degree, e.g., the in-degree $i$ (we consider large $i$) and
1385: vary the out-degree $j$. The average out-degree always scales linearly with
the in-degree, $\langle j\rangle=iq$, implying that popular nodes on average
have large out-degrees. However, the maximum of the joint degree
1388: distribution scales linearly with the in-degree, $j=i(\lambda-1)/(2+\lambda
1389: q^{-1})$, only when $\lambda>1$. Thus popular nodes typically have small
1390: out-degrees when $\lambda\leq 1$.
1391:
1392: An interesting feature of this distribution becomes evident if we fix the
1393: in-degree $i$ and vary the out-degree $j$. If $\lambda>1$, the degree
1394: distribution reaches a maximum when $j=i(\lambda-1)/(2+\lambda q^{-1})$ (here
1395: we consider large $i$). The average out-degree always scales linearly with
1396: the in-degree, $\langle j\rangle=iq$. Thus, popular nodes tend to have large
1397: out-degrees. The dual property holds as well: Nodes with large out-degree
1398: tend to be popular.
1399:
For completeness, the analytical forms of the degree distributions for
$\lambda_{\rm in}=\lambda_{\rm out}=1$ are
1402: \begin{subeqnarray}
1403: \label{IOd}
1404: I_{i}&=&{2\over {\cal D}}\,{\Gamma(i+1)\,\Gamma(\nu)\over
1405: \Gamma(i+1+\nu)}=O_i,\\
1406: n_{d}&=&{2\over {\cal D}}\,{\Gamma(d+2)\,\Gamma(1+\nu)\over
1407: \Gamma(d+2+\nu)},\\
1408: n_{ij}&=&{2\over {\cal D}}\,{\Gamma(d+1)\,\Gamma(1+\nu)\over
1409: \Gamma(d+2+\nu)},
1410: \end{subeqnarray}
where $d=i+j$ in the expression for $n_{ij}$, $n_{d}=\sum_{i+j=d} n_{ij}$ is
the density of nodes with total degree
$d$, and $\nu=2(1+{\cal D}^{-1})$. Thus for the kernel $K(j,i)=(i+1)(j+1)$,
1413: all distributions are algebraic over the entire range of the corresponding
1414: degrees and have an especially neat form.
1415:
1416: