0206:cs0206011/web.tex

1: \documentclass[11pt]{article}

2: \textwidth 6.5in\textheight 8.9in\oddsidemargin 0in\topmargin -.35in

3: \usepackage{epsfig,fancyheadings}\usepackage{redner}\pagestyle{plain}

4:

5: \makeatletter

6: \renewcommand\theequation{\thesection.\arabic{equation}}

7: %\renewcommand{\baselinestretch}{2}  %double spacing

8: \@addtoreset{equation}{section}

9: \makeatother

10:

11: \begin{document}

12: \title{A Statistical Physics Perspective on Web Growth}

13: \author{P.~L.~Krapivsky and S.~Redner\\

14: Center for BioDynamics, Center for Polymer Studies,\\

15: and Department of Physics, Boston University, Boston, MA 02215, USA}

16: \maketitle

17: \begin{abstract}

18:   Approaches from statistical physics are applied to investigate the

19:   structure of network models whose growth rules mimic aspects of the

20:   evolution of the world-wide web.  We first determine the degree

21:   distribution of a growing network in which nodes are introduced one at a

22:   time and attach to an earlier node of degree $k$ with rate $A_k\sim

23:   k^\gamma$.  Very different behaviors arise for $\gamma<1$, $\gamma=1$, and

24:   $\gamma>1$.  We also analyze the degree distribution of a heterogeneous

25:   network, the joint age-degree distribution, the correlation between degrees

26:   of neighboring nodes, as well as global network properties.  An extension

27:   to directed networks is then presented.  By tuning model parameters to

28:   reasonable values, we obtain distinct power-law forms for the in-degree and

29:   out-degree distributions with exponents that are in good agreement with

30:   current data for the web.  Finally, a general growth process with

31:   independent introduction of nodes and links is investigated.  This leads to

32:   independently growing sub-networks that may coalesce with other

33:   sub-networks.  General results for both the size distribution of

34:   sub-networks and the degree distribution are obtained.

35:

36: \end{abstract}

37:

38: \section{Introduction}

39:

40: With the recent appearance of the Internet and the world-wide web,

41: understanding the properties of growing networks with popularity-based

42: construction rules has become an active and fruitful research area

43: \cite{review}.  In such models, newly-introduced nodes preferentially attach

44: to pre-existing nodes of the network that are already ``popular''.  This

45: leads to graphs whose structure is quite different from the well-known {\em

46:   random graph} \cite{bol,jan} in which links are created at random between

47: nodes without regard to their popularity.  This discovery of a new class of

48: graph theory problems has fueled much effort to characterize their

49: properties.

50:

51: One basic measure of the structure of such networks is the {\em node degree distribution}

52: $N_k$, defined as the number of nodes in the network that are linked to $k$

53: other nodes.  In the case of the random graph, the degree distribution is simply

54: Poisson.  In contrast, many popularity-driven growing networks

55: have much broader degree distributions with a stretched exponential or a

56: power-law tail.  The latter form means that there is no characteristic scale

57: for the node degree, a feature that typifies many networked systems

58: \cite{review}.

59:

60: Power laws, or more generally, distributions with highly skewed tails,

61: characterize the degree distributions of many man-made and naturally

62: occurring networks \cite{review}.  For example, the degree distributions at

63: the level of autonomous systems and at the router level exhibit highly skewed

64: tails \cite{fff,matta,as}.  Other important Internet-based graphs, such as

65: the hyperlink graph of the world-wide web also appear to have a degree

66: distribution with a power-law tail \cite{kum,BA,www1,www2,www3}.  These

67: observations have spurred a flurry of recent work to understand the

68: underlying mechanisms for these phenomena.

69:

70: A related example, of interest to anyone who publishes, is the distribution

71: of scientific citations \cite{lotka,LS,redner}.  Here one treats publications

72: as nodes and citations as links in a citation graph.  Currently-available

73: data suggests that the citation distribution has a power-law tail with an

74: associated exponent close to $-3$ \cite{redner}.  As we shall see, this

75: exponent emerges naturally in the {\it Growing Network} (GN) model where the

76: relative probability of linking from a new node to a previous node

77: (equivalent to citing an earlier paper) is strictly proportional to the

78: popularity of the target node.

79:

80: In this paper, we apply tools from statistical physics, especially the rate

81: equation approach, to quantify the structure of growing networks and to

82: elucidate the types of geometrical features that arise in networks with

83: physically-motivated growth rules.  The utility of the rate equations has

84: been demonstrated in a diverse range of phenomena in non-equilibrium

85: statistical physics, such as aggregation \cite{agg}, coarsening

86: \cite{coarse}, and epitaxial surface growth \cite{surf}.  We will attempt to

87: convince the reader that the rate equations are also a simple yet powerful

88: tool for analyzing growing network systems.  In addition to providing

89: comprehensive information about the node degree distribution, the rate

90: equations can be easily adapted to analyze both heterogeneous and directed

91: networks, the age distribution of nodes, correlations between node degrees,

92: various global network properties, as well as the cluster size distribution

93: in models that give rise to independently evolving sub-networks.  Thus the

94: rate equation method appears to be better suited for probing the structure of

95: growing networks than the classical approaches for analyzing random

96: graphs, such as probabilistic \cite{bol} or generating function \cite{jan}

97: techniques.

98:

99: In the next section, we introduce three basic models that will be the focus

100: of this review.  In the following three sections, we then present rate

101: equation analyses to determine basic geometrical properties of these

102: networks.  We close with a brief summary.

103:

104: \section{Models}

105:

106: The models we study appear to embody many of the basic growth processes in

107: web graphs and related systems.  These include:

108:

109: \begin{itemize}

110:

111: \item The {\em Growing Network} (GN) \cite{BA,simon}.  Nodes are added one at

112:   a time and a single link is established between the new node and a

113:   pre-existing node according to an attachment probability $A_k$ that depends

114:   only on the degree of the ``target'' node (Fig.~\ref{network}).

115:

116: \begin{figure}[ht]

117:   \begin{center}

118:     \includegraphics[width=0.3\textwidth]{network.eps}

119:  \caption{Growing network.  Nodes are added sequentially and

120:    a single link joins a new node to an earlier node.  Node 1 has (total)

121:    degree 5, node 2 has degree 3, nodes 4 and 6 have degree 2, and the

122:    remaining nodes have degree 1.}~\label{network}

123:   \end{center}

124: \end{figure}

125:

126: \item The {\em Web Graph} (WG).  This represents an extension of the GN to

127:   incorporate link directionality \cite{KRR} and leads to independent,

128:   dynamically generated in-degree and out-degree distributions.  The network

129:   growth occurs by two distinct processes \cite{gen} that are meant to mimic

130:   how hyperlinks are created in the web (Fig.~\ref{io-growth}):

131:

132: \begin{itemize}

133: \item[(i)] With probability $p$, a new node is introduced and it immediately

134:   attaches to an earlier target node.  The attachment probability depends

135:   only on the in-degree of the target.

136: \item[(ii)] With probability $q=1-p$, a new link is created between already

137:   existing nodes.  The choices of the originating and target nodes depend on

138:   the out-degree of the former and the in-degree of the latter.

139: \end{itemize}

140:

141: \begin{figure}[ht]

142:   \begin{center}

143:     \includegraphics[width=0.35\textwidth]{io-growth.eps}

144: \caption{Growth processes in the web graph model:

145:   (i) node creation and immediate attachment, and (ii) link creation.  In (i)

146:   the new node is shaded, while in both (i) and (ii) the new link is dashed.}

147: \label{io-growth}

148: \end{center}

149: \end{figure}

150:

151: \item The {\em Multicomponent Graph} (MG).  Nodes and links are introduced

152:   {\em independently} \cite{clusters}.  (i) With probability $p$, a new {\em

153:     unlinked} node is introduced, while (ii) with probability $q=1-p$, a new

154:   link is created between existing nodes.  As in the WG, the choices of the

155:   originating and target nodes depend on the out-degree of the former and the

156:   in-degree of the latter.  Step (i) allows for the formation of many

157:   clusters.

158:

159: \end{itemize}

160:

161: \section{Structure of the Growing Network}

162:

163: Because of its simplicity, we first study the structure of the GN

164: \cite{BA,simon}.  The basic approaches developed in this section will then be

165: extended to the WG and MG models.

166:

167: \subsection{Degree Distribution of a Homogeneous Network}

168:

169: We first focus on the node degree distribution $N_k$.  To determine its

170: evolution, we shall write the rate equations that account for the change in

171: the degree distribution after each node addition event.  These equations

172: contain complete information about the node degree, from which any measure of

173: node degree (such as moments) can be easily extracted.  For the GN growth

174: process in which nodes are introduced one at a time, the rate equations for

175: the degree distribution $N_k(t)$ are \cite{KRL}

176: \begin{equation}

177: \label{Nk}

178: {d N_k\over dt}=

179: {A_{k-1} N_{k-1}-A_k N_k\over A}+\delta_{k1}.

180: \end{equation}

181: The first term on the right, $A_{k-1}N_{k-1}/A$, accounts for processes in

182: which a node with $k-1$ links is connected to the new node, thus increasing

183: $N_k$ by one.  Since there are $N_{k-1}$ nodes of degree $k-1$, the rate at

184: which such processes occur is proportional to $A_{k-1}N_{k-1}$, and the

185: factor $A(t)=\sum_{j\geq 1} A_jN_j(t)$ converts this rate into a normalized

186: probability.  A corresponding role is played by the second (loss) term on the

187: right-hand side; $A_kN_k/A$ is the probability that a node with $k$ links is

188: connected to the new node, thus leading to a loss in $N_k$.  The last term

189: accounts for the introduction of new nodes, each of which enters with degree one (a single outgoing link and no incoming links).

190:

191: We start by solving for the time dependence of the moments of the degree

192: distribution defined via $M_n(t)=\sum_{j\geq 1} j^n N_j(t)$.  This is a

193: standard method of analysis of rate equations by which one can gain partial,

194: but valuable, information about the time dependence of the system with

195: minimal effort.  By explicitly summing Eqs.~(\ref{Nk}) over all $k$, we

196: easily obtain $\dot M_0(t)=1$, whose solution is $M_0(t)= M_0(0)+t$.  Notice

197: that by definition $M_0(t)=\sum_k N_k$ is just the total number of nodes in

198: the network.  It is clear by the nature of the growth process that this

199: quantity simply grows as $t$.  In a similar fashion, the first moment of the

200: degree distribution obeys $\dot M_1(t)=2$ with solution $M_1(t)= M_1(0)+2t$.

201: This time evolution for $M_1$ can be understood either by explicitly summing

202: the rate equations, or by observing that this first moment simply equals the

203: total number of link endpoints.  Clearly, this quantity must grow as $2t$

204: since the introduction of a single node introduces two link endpoints.  Thus

205: we find the simple result that the first two moments are {\em independent\/}

206: of the attachment kernel $A_k$ and grow {\em linearly} with time.  On the

207: other hand, higher moments and the degree distribution itself do depend in an

208: essential way on the kernel $A_k$.
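
The summation can be made explicit as follows; this is a short sketch that
uses only Eq.~(\ref{Nk}), the convention $N_0\equiv 0$, and the fact that
$N_k=0$ for sufficiently large $k$ at any finite time.  Summing
Eq.~(\ref{Nk}) over $k$, first with weight $1$ and then with weight $k$, and
shifting the index in the gain term gives
\begin{eqnarray*}
\dot M_0&=&{1\over A}\sum_{k\geq 1}\left(A_{k-1}N_{k-1}-A_kN_k\right)+1=1,\\
\dot M_1&=&{1\over A}\sum_{k\geq 1}\left[(k+1)-k\right]A_kN_k+1
={1\over A}\sum_{k\geq 1}A_kN_k+1=2.
\end{eqnarray*}
The kernel enters $\dot M_1$ only through the ratio $\sum_k A_kN_k/A=1$, which
is why the first two moments do not depend on $A_k$.  For $n\geq 2$ the
analogous sum leaves terms such as $\sum_k kA_kN_k/A$ that do not cancel, so
the higher moments retain their dependence on the kernel.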

209:

210: As a preview to the general behavior for the degree distribution, consider

211: the strictly linear kernel \cite{BA,KRL,DMS}, for which $A(t)$ coincides with

212: $M_1(t)$.  In this case, we can solve Eqs.~(\ref{Nk}) for an arbitrary

213: initial condition.  However, since the long-time behavior is most

214: interesting, we limit ourselves to the asymptotic regime ($t\to\infty$) where

215: the initial condition is irrelevant.  Using therefore $M_1=2t$, we solve the

216: first few of Eqs.~(\ref{Nk}) directly and obtain $N_1=2t/3$, $N_2=t/6$, {\it

217:   etc}.  Thus each of the $N_k$ grows linearly with time.  Accordingly, we

218: substitute $N_k(t)=t\,n_k$ in Eqs.~(\ref{Nk}) to yield the simple recursion

219: relation $n_k=n_{k-1} (k-1)/(k+2)$.  Solving for $n_k$ gives

220: \begin{equation}

221: \label{nk1}

222: n_k={4\over k(k+1)(k+2)}.

223: \end{equation}
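
As a check on Eq.~(\ref{nk1}), the recursion can be iterated explicitly from
$n_1=2/3$:
\begin{eqnarray*}
n_k=n_1\prod_{j=2}^{k}{j-1\over j+2}
={2\over 3}\,{3!\,(k-1)!\over (k+2)!}={4\over k(k+1)(k+2)}.
\end{eqnarray*}
This form is consistent with the moments found above.  Using
$1/[k(k+1)(k+2)]={1\over 2}\{1/[k(k+1)]-1/[(k+1)(k+2)]\}$, both sums
telescope:
\begin{eqnarray*}
\sum_{k\geq 1}n_k=4\cdot{1\over 4}=1,\qquad
\sum_{k\geq 1}k\,n_k=\sum_{k\geq 1}{4\over (k+1)(k+2)}=2,
\end{eqnarray*}
in agreement with $M_0\simeq t$ and $M_1\simeq 2t$.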

224:

225: Returning to the case of general attachment kernels, let us assume that the

226: degree distribution and $A(t)$ both grow linearly with time.  This hypothesis

227: can be easily verified numerically for attachment kernels that do not grow

228: faster than linearly with $k$.  Then substituting $N_k(t)=t\,n_k$ and

229: $A(t)=\mu t$ into Eqs.~(\ref{Nk}) we obtain the recursion relation

230: $n_k=n_{k-1} A_{k-1}/(\mu+A_k)$ and $n_1=\mu/(\mu+A_1)$.  Finally, solving

231: for $n_k$, we obtain the formal expression

232: \begin{equation}

233: \label{Nkgen}

234: n_k={\mu\over A_k}\prod_{j=1}^{k}

235: \left(1+{\mu\over A_j}\right)^{-1}.

236: \end{equation}

237: To complete the solution, we need the amplitude $\mu$.  Using the definition

238: $\mu=\sum_{j\geq 1}A_jn_j$ in Eq.~(\ref{Nkgen}), we obtain the implicit

239: relation

240: \begin{equation}

241: \label{mugen}

242: \sum_{k=1}^\infty \prod_{j=1}^{k}

243: \left(1+{\mu\over A_j}\right)^{-1}=1

244: \end{equation}

245: which shows that the amplitude $\mu$ depends on the entire attachment kernel.

246:

247: For the generic case $A_k\sim k^\gamma$, we substitute this form into

248: Eq.~(\ref{Nkgen}) and then rewrite the product as the exponential of a sum of

249: a logarithm.  In the continuum limit, we convert this sum to an integral,

250: expand the logarithm to lowest order, and then evaluate the integral to yield

251: the following basic results:

252: \begin{eqnarray}

253: \label{cases}

254: n_k\sim\cases{

255: k^{-\gamma}\exp

256: \left[-\mu\left({{k^{1-\gamma}-2^{1-\gamma}}\over 1-\gamma}\right)\right],

257: &$0\leq\gamma<1$;\cr

258: k^{-\nu}, \quad \nu>2,

259: & $\gamma=1$;\cr

260: {\rm best\ seller} & $1<\gamma<2$;\cr

261: {\rm bible} & $2<\gamma$.}

262: \end{eqnarray}

263:

264: Thus the degree distribution decays exponentially for $\gamma=0$, as in the

265: case of the random graph, while for all $0<\gamma<1$, the distribution

266: exhibits robust stretched exponential behavior.  The linear kernel is the

267: case that has garnered much of the current research interest.  As shown

268: above, $n_k={4/[k(k+1)(k+2)]}$ for the strictly linear kernel $A_k=k$.  One

269: might anticipate that $n_k\propto k^{-3}$ holds for all {\em asymptotically}

270: linear kernels, $A_k\sim k$.  However, the situation is more delicate and the

271: degree distribution exponent depends on microscopic details of $A_k$.  {}From

272: Eq.~(\ref{Nkgen}), we obtain $n_k\sim k^{-\nu}$, where the exponent

273: $\nu=1+\mu$ can be tuned to {\em any} value larger than 2 \cite{KRL,KR}.

274: This non-universal behavior shows that one must be cautious in drawing

275: general conclusions from the GN with a linear attachment kernel.
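
The asymptotic forms for $\gamma\leq 1$ in Eq.~(\ref{cases}) can be recovered
with a short calculation.  The following is a sketch that assumes, for
illustration, the pure power-law kernel $A_j=j^\gamma$ and treats $\mu$ as
given by Eq.~(\ref{mugen}).  Writing the product in Eq.~(\ref{Nkgen}) as the
exponential of a sum and keeping the lowest-order term of the logarithm,
\begin{eqnarray*}
\ln\prod_{j=1}^{k}\left(1+{\mu\over j^{\gamma}}\right)^{-1}
=-\sum_{j=1}^{k}\ln\left(1+{\mu\over j^{\gamma}}\right)
\approx-\mu\int^{k}{dj\over j^{\gamma}}
=\cases{-\mu\,k^{1-\gamma}/(1-\gamma)+{\rm const}, & $\gamma<1$;\cr
-\mu\ln k+{\rm const}, & $\gamma=1$.}
\end{eqnarray*}
Multiplying by the prefactor $\mu/A_k\sim k^{-\gamma}$ gives the stretched
exponential for $\gamma<1$ and $n_k\sim k^{-(1+\mu)}$ for $\gamma=1$, which is
the origin of the relation $\nu=1+\mu$.  The lower limit of the integral only
affects the constant and hence the overall amplitude.  The lowest-order
expansion of the logarithm controls the full $k$ dependence only if the
neglected sum $\sum_j j^{-2\gamma}$ converges, that is, for $\gamma>1/2$; for
smaller $\gamma$ it should be read as giving the leading term in the exponent.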

276:

277: % Another important message of this derivation is that the rate equation method

278: % provides a complete and satisfying way to the degree distribution for many

279: % different types of attachment kernels.

280:

281: \begin{figure}[ht]

282:   \begin{center}

283:     \includegraphics[width=0.25\textwidth]{degrees.eps}

284:  \caption{A node with in-degree $i=4$, out-degree $j=5$, and total

285:   degree 9.}~\label{degrees}

286:   \end{center}

287: \end{figure}

288:

289: As an illustrative example of the vagaries of asymptotically linear kernels,

290: consider the shifted linear kernel $A_k=k+w$.  One way to motivate this

291: kernel is to explicitly keep track of link directionality.  In particular,

292: the node degree for an undirected graph naturally generalizes to the

293: in-degree and out-degree for a directed graph, the number of incoming and

294: outgoing links at a node, respectively.  Thus the total degree $k$ in a

295: directed graph is the sum of the in-degree $i$ and out-degree $j$

296: (Fig.~\ref{degrees}).  (More details on this model are given in the next

297: section.)~ The most general linear attachment kernel for a directed graph has

298: the form $A_{ij}=ai+bj$.  The GN corresponds to the case where the out-degree

299: of any node equals one; thus $j=1$ and $k=i+1$.  For this example the general

300: linear attachment kernel reduces to $A_k=a(k-1)+b$.  Since the overall scale

301: is irrelevant, we can re-write $A_k$ as the shifted linear kernel $A_k=k+w$,

302: with $w=-1+b/a$ that can vary over the range $-1<w<\infty$.

303:

304: To determine the degree distribution for the shifted linear kernel, note that

305: $A(t)=\sum_jA_jN_j(t)$ simply equals \hbox{$A(t)=M_1(t)+wM_0(t)$}.  {}Using

306: $A=\mu t$, $M_0=t$ and $M_1=2t$, we get $\mu=2+w$ and hence the relation

307: $\nu=1+\mu$ from the previous paragraph becomes $\nu=3+w$.  Thus a simple

308: additive shift in the attachment kernel profoundly affects the asymptotic

309: degree distribution.  Furthermore, from Eq.~(\ref{Nkgen}) we determine the

310: entire degree distribution to be

311: \begin{equation}

312: \label{nkw}

313: n_k=(2+w)\,{\Gamma(3+2w)\over \Gamma(1+w)}\,

314: {\Gamma(k+w)\over \Gamma(k+3+2w)}.

315: \end{equation}
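
Equation~(\ref{nkw}) follows directly from Eq.~(\ref{Nkgen}); the intermediate
steps, sketched here, use only $A_j=j+w$, $\mu=2+w$, and
$\Gamma(z+1)=z\,\Gamma(z)$:
\begin{eqnarray*}
n_k={\mu\over k+w}\prod_{j=1}^{k}{j+w\over j+2+2w}
={2+w\over k+w}\,{\Gamma(k+1+w)\over \Gamma(1+w)}\,
{\Gamma(3+2w)\over \Gamma(k+3+2w)},
\end{eqnarray*}
and $\Gamma(k+1+w)=(k+w)\,\Gamma(k+w)$ cancels the factor $k+w$.  Two checks
are immediate.  For $w=0$ the result reduces to
$n_k=2\,\Gamma(3)\,\Gamma(k)/\Gamma(k+3)=4/[k(k+1)(k+2)]$, which is
Eq.~(\ref{nk1}).  For large $k$, the ratio
$\Gamma(k+w)/\Gamma(k+3+2w)\simeq k^{-(3+w)}$ reproduces the exponent
$\nu=3+w$.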

316:

317: Finally, we outline the intriguing behavior for super-linear kernels.  In

318: this case, there is a ``runaway'' or gelation-like phenomenon in which one

319: node links to almost every other node.  For $\gamma>2$, all but a finite

320: number of nodes are linked to a {\em single} node that has the rest of the

321: links.  We term such an overwhelmingly popular node as a ``bible''.  For

322: $1<\gamma\leq 2$, the number of nodes with just a few links is no longer

323: finite, but grows slower than linearly in time, and the remainder of the

324: nodes are linked to an extremely popular node that we now term ``best

325: seller''.  Full details about this runaway behavior are given in \cite{KRL}.

326:

327: As a final parenthetical note, when the attachment kernel has the form

328: $A_k\propto k^\gamma$, with $\gamma<0$, there is preferential attachment to

329: poorly-connected sites.  Here, the degree distribution exhibits faster than

330: exponential decay, $n_k\propto k^{-|\gamma|(k-1)}$.  When $\gamma< -2$, the

331: propensity for avoiding popularity is so strong that there is a finite

332: probability of forming a ``worm'' graph in which each node attaches only to

333: its immediate predecessor.

334:

335: \subsection{Degree Distribution of a Heterogeneous Network}

336:

337: A practically-relevant generalization of the GN is to endow each node with an

338: intrinsic and permanently defined ``attractiveness'' \cite{BiA}.  This

339: accounts for the obvious fact that not all nodes are equivalent, but that

340: some are clearly more attractive than others at their inception.  Thus the

341: subsequent attachment rate to a node should be a function of both its degree

342: and its intrinsic attractiveness.  For this generalization, the rate equation

343: approach yields complete results with minimal additional effort beyond that

344: needed to solve the homogeneous network.

345:

346: Let us assign each node an attractiveness parameter $\eta>0$, with arbitrary

347: distribution, at its inception.  This attractiveness modifies the node

348: attachment rate as follows: for a node with degree $k$ and attractiveness

349: $\eta$, the attachment rate is simply $A_k(\eta)$.  Now we need to

350: characterize nodes both by their degree and their attractiveness -- thus

351: $N_k(\eta)$ is the number of nodes with degree $k$ and attractiveness $\eta$.

352: This joint degree-attractiveness distribution obeys the rate equation,

353: \begin{equation}

354: \label{Nk-het}

355: {d N_k(\eta)\over dt}=

356: {A_{k-1}(\eta) N_{k-1}(\eta)-A_k(\eta) N_k(\eta)\over A}+p_0(\eta)\delta_{k1}.

357: \end{equation}

358: Here $p_0(\eta)$ is the probability that a newly-introduced node has

359: attractiveness $\eta$, and the normalization factor $A=\int d\eta

360: \sum_{k}A_k(\eta)N_k(\eta)$.

361:

362: Following the same approach as that used to analyze Eq.~(\ref{Nk}), we

363: substitute $A=\mu t$ and $N_k(\eta)=t\,n_k(\eta)$ into Eq.~(\ref{Nk-het}) and

364: solve the resulting recursion relation to obtain

365: \begin{equation}

366: \label{Nkgen-het}

367: n_k(\eta)=p_0(\eta){\mu\over A_k(\eta)}\prod_{j=1}^{k}

368: \left(1+{\mu\over A_j(\eta)}\right)^{-1}.

369: \end{equation}

370:

371: For concreteness, consider the linear attachment kernel $A_k(\eta)=\eta k$.

372: Then applying the same analysis as in the homogeneous network, we find

373: \begin{equation}

374: \label{nk-het}

375: n_k(\eta)= {\mu\,p_0(\eta)\over \eta}\,

376: {\Gamma(k)\, \Gamma\left(1+{\mu\over \eta}\right)\over

377: \Gamma\left(k+1+{\mu\over \eta}\right)}.

378: \end{equation}

379: To determine the amplitude $\mu$ we substitute (\ref{nk-het}) into the

380: definition $\mu=\int d\eta\, \sum_{k\geq 1}A_k(\eta)\,n_k(\eta)$ and use the

381: identity \cite{knuth}

382: \begin{eqnarray*}

383: \label{identity}

384: \sum_{k=1}^\infty {\Gamma(k+u)\over \Gamma(k+v)}

385: ={\Gamma(u+1)\over (v-u-1)\,\Gamma(v)}

386: \end{eqnarray*}

387: to simplify the sum.  This yields the implicit relation

388: \begin{equation}

389: \label{mu-het}

390: 1=\int d\eta\, p_0(\eta)\,\left({\mu\over \eta}-1\right)^{-1}.

391: \end{equation}

392: This condition on $\mu$ leads to two alternatives: If the support of $\eta$

393: is unbounded, then the integral diverges and there is no solution for $\mu$.

394: In this limit, the most attractive node is connected to a finite fraction of

395: all links.  Conversely, if the support of $\eta$ is bounded, the resulting

396: degree distribution is similar to that of the homogeneous network.  For fixed

397: $\eta$, $n_k(\eta)\sim k^{-\nu(\eta)}$ with an attractiveness-dependent decay

398: exponent $\nu(\eta)=1+\mu/\eta$.  Amusingly, the total degree distribution

399: $n_k=\int d\eta\,n_k(\eta)$ is no longer a strict power law \cite{BiA}.

400: Rather, the asymptotic behavior is governed by properties of the initial

401: attractiveness distribution near the upper cutoff.  In particular, if

402: $p_0(\eta)\sim (\eta_{\rm max}-\eta)^{\omega-1}$ (with $\omega>0$ to ensure

403: normalization), the total degree distribution exhibits a logarithmic

404: correction

405: \begin{equation}

406: \label{nk-asymp-het}

407: n_k\sim k^{-(1+\mu/\eta_{\rm max})}\,(\ln k)^{-\omega}.

408: \end{equation}
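
The steps leading to Eq.~(\ref{mu-het}) can be spelled out as follows (a
sketch based only on Eq.~(\ref{nk-het}) and the identity quoted above).  Since
$A_k(\eta)=\eta k$ and $k\,\Gamma(k)=\Gamma(k+1)$,
\begin{eqnarray*}
\sum_{k\geq 1}A_k(\eta)\,n_k(\eta)
=\mu\,p_0(\eta)\,\Gamma\left(1+{\mu\over \eta}\right)
\sum_{k\geq 1}{\Gamma(k+1)\over \Gamma\left(k+1+{\mu\over \eta}\right)}
=\mu\,p_0(\eta)\left({\mu\over \eta}-1\right)^{-1},
\end{eqnarray*}
where the identity has been applied with $u=1$ and $v=1+\mu/\eta$.  The sum
converges only for $\mu>\eta$, because its terms decay as $k^{-\mu/\eta}$.
Integrating over $\eta$ and dividing by $\mu$ gives Eq.~(\ref{mu-het}).  As a
check, a homogeneous network with $p_0(\eta)=\delta(\eta-1)$ gives
$1=(\mu-1)^{-1}$, that is, $\mu=2$ and $\nu=1+\mu=3$, in agreement with
Eq.~(\ref{nk1}).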

409:

410: \subsection{Age Distribution}

411:

412: In addition to the degree distribution, we determine {\em when} connections

413: occur.  Naively, we expect that older nodes will be better connected.  We

414: study this feature by resolving each node both by its degree and its age to

415: provide a more complete understanding of the network evolution.  Thus define

416: $c_k(t,a)$ to be the average number of nodes of age $a$ that have $k-1$

417: incoming links at time $t$.  Here age $a$ means that the node was introduced

418: at time $t-a$.  The original degree distribution may be recovered from the

419: joint age-degree distribution through $N_k(t)=\int_0^t da\,c_k(t,a)$.

420:

421: For simplicity, we consider only the case of the strictly linear kernel; more

422: general kernels were considered in Ref.~\cite{KR}.  The joint age-degree

423: distribution evolves according to the rate equation

424: \begin{equation}

425: \label{ck1}

426: \left({\partial \over \partial t}+{\partial \over \partial a}\right)c_k

427: ={A_{k-1}c_{k-1}-A_k c_k\over 2t}

428: +\delta_{k1}\delta(a).

429: \end{equation}

430: The second term on the left accounts for the aging of nodes.  We assume here

431: that the probability of linking to a given node again depends only on its

432: degree and not on its age.  Finally, we again have used $A(t)\equiv

433: M_1(t)\simeq 2t$ for the linear attachment kernel in the long-time limit.

434:

435: The homogeneous form of this equation implies that the solution should be

436: self-similar.  Thus we seek a solution as a function of the {\em single}

437: variable $a/t$ rather than two separate variables.  Writing

438: $c_k(t,a)=f_k(x)$ with $x=1-{a\over t}$, we convert Eq.~(\ref{ck1}) into the

439: ordinary differential equation

440: \begin{equation}

441: \label{fk1}

442: -2x\,{df_k\over dx}=(k-1) f_{k-1}-k f_k.

443: \end{equation}

444: We omit the delta function term, since it merely provides the boundary

445: condition $c_k(t,a=0)=\delta_{k1}$, or $f_k(1)=\delta_{k1}$.
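
The reduction to Eq.~(\ref{fk1}) is a direct application of the chain rule.
With $x=1-a/t$ one has $\partial x/\partial t=a/t^2$ and
$\partial x/\partial a=-1/t$, so that
\begin{eqnarray*}
\left({\partial \over \partial t}+{\partial \over \partial a}\right)f_k(x)
=\left({a\over t^2}-{1\over t}\right){df_k\over dx}
=-{x\over t}\,{df_k\over dx}.
\end{eqnarray*}
For $a>0$ the right-hand side of Eq.~(\ref{ck1}) with $A_k=k$ is
$[(k-1)f_{k-1}-kf_k]/(2t)$; multiplying both sides by $2t$ removes all explicit
time dependence and yields Eq.~(\ref{fk1}).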

446:

447:

448: The solution to this boundary-value problem may be simplified by assuming the

449: exponential solution $f_k=\Phi\varphi^{k-1}$; this is consistent with the

450: boundary condition, provided that $\Phi(1)=1$ and $\varphi(1)=0$.  This

451: ansatz reduces the infinite set of rate equations (\ref{fk1}) into two

452: elementary differential equations for $\varphi(x)$ and $\Phi(x)$ whose

453: solutions are $\varphi(x)=1-\sqrt{x}$ and $\Phi(x)=\sqrt{x}$.  In terms of

454: the original variables of $a$ and $t$, the joint age-degree distribution is

455: then

456: \begin{eqnarray}

457: \label{ck1all}

458: c_k(t,a)=\sqrt{1-{a\over t}}\left\{1-\sqrt{1-{a\over t}}\right\}^{k-1}.

459: \end{eqnarray}

460:

461: Thus the degree distribution for fixed-age nodes decays {\em exponentially},

462: with a characteristic degree that diverges as $\langle k\rangle\sim

463: (1-a/t)^{-1/2}$ for $a\to t$.  As expected, young nodes (those with $a/t\to

464: 0$) typically have a small degree while old nodes have large degree

465: (Fig.~\ref{age}).  It is the large characteristic degree of old nodes that

466: ultimately leads to a {\em power-law} total degree distribution when the

467: joint age-degree distribution is integrated over all ages.
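
Both the ansatz and the final statement can be checked explicitly; the
following sketch uses only Eqs.~(\ref{fk1}) and (\ref{ck1all}), with primes
denoting derivatives with respect to $x$.  Substituting
$f_k=\Phi\varphi^{k-1}$ into Eq.~(\ref{fk1}), the $k=1$ equation gives
$2x\,\Phi'=\Phi$, and for $k\geq 2$ the remaining terms, divided by
$(k-1)\Phi\varphi^{k-2}$, give $2x\,\varphi'=-(1-\varphi)$.  With $\Phi(1)=1$
and $\varphi(1)=0$ these are solved by $\Phi=\sqrt{x}$ and
$\varphi=1-\sqrt{x}$.  Integrating over all ages, with $x=1-a/t$ and then
$s=\sqrt{x}$,
\begin{eqnarray*}
{N_k(t)\over t}=\int_0^1 dx\,\sqrt{x}\left(1-\sqrt{x}\right)^{k-1}
=2\int_0^1 ds\,s^2(1-s)^{k-1}
=2\,{\Gamma(3)\,\Gamma(k)\over \Gamma(k+3)}={4\over k(k+1)(k+2)},
\end{eqnarray*}
which is Eq.~(\ref{nk1}).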

468:

469: \begin{figure}[ht]

470:   \begin{center}

471:     \includegraphics[width=0.4\textwidth]{age.eps}

472: \caption{Age-dependent degree distribution for the GN for the linear

473:   attachment kernel.  Low-degree nodes tend to be relatively young while

474:   high-degree nodes are old.  The inset shows detail for $a/t\geq 0.98$.}

475: \label{age}

476: \end{center}

477: \end{figure}

478:

479: \subsection{Node Degree Correlations}

480:

481: The rate equation approach is sufficiently versatile that we can also obtain

482: much deeper geometrical properties of growing networks.  One such property is

483: the correlation between degrees of connected nodes \cite{KR}.  These develop

484: naturally because a node with large degree is likely to be old.  Thus its

485: ancestor is also old and hence also has a large degree.  In the context of

486: the web, this correlation merely expresses the obvious fact that it is more

487: likely that popular web sites have hyperlinks among each other rather than to

488: marginal sites.

489:

490: To quantify the node degree correlation, we define $C_{kl}(t)$ as the number

491: of nodes of degree $k$ that attach to an ancestor node of degree $l$

492: (Fig.~\ref{corr-def}).  For example, in the network of Fig.~\ref{network},

493: there are $N_1=6$ nodes of degree 1, with $C_{12}=C_{13}=C_{15}=2$.  There

494: are also $N_2=2$ nodes of degree 2, with $C_{25}=2$, and $N_3=1$ node of

495: degree 3, with $C_{35}=1$.

496:

497: \begin{figure}[ht]

498:   \begin{center}

499:     \includegraphics[width=0.2\textwidth]{corr-def.eps}

500: \caption{Definition of the node degree correlation $C_{kl}$ for the case

501:   $k=3$ and $l=4$.}

502: \label{corr-def}

503: \end{center}

504: \end{figure}

505:

506: For simplicity, we again specialize to the case of the strictly linear

507: attachment kernel.  More general kernels can also be treated within our

508: general framework \cite{KR}.  For the linear attachment kernel, the degree

509: correlation $C_{kl}(t)$ evolves according to the rate equation

510: \begin{eqnarray}

511: \label{Nkl}

512: M_1\,{d C_{kl}\over dt}=(k-1) C_{k-1,l}-kC_{kl}+

513: (l-1) C_{k,l-1}-l C_{kl}+(l-1)N_{l-1}\,\delta_{k1}.

514: \end{eqnarray}

515: The processes that give rise to each term in this equation are illustrated in

516: Fig.~\ref{corr-RE}.  The first two terms on the right account for the change

517: in $C_{kl}$ due to the addition of a link onto a node of degree $k-1$ (gain)

518: or $k$ (loss) respectively, while the second set of terms gives the change in

519: $C_{kl}$ due to the addition of a link onto the ancestor node.  Finally, the

520: last term accounts for the gain in $C_{1l}$ due to the addition of a new

521: node that attaches to one of the $N_{l-1}$ nodes of degree $l-1$.

522:

523: \begin{figure}[ht]

524:   \begin{center}

525:     \includegraphics[width=0.7\textwidth]{corr-RE.eps}

526: \caption{The processes that contribute ((i)--(v) in order)

527:   to the various terms in the rate equation (\ref{Nkl}).  The newly-added

528:   node and link are shown dashed.}

529: \label{corr-RE}

530: \end{center}

531: \end{figure}

532:

533: As in the case of the node degree, the time dependence can be separated as

534: $C_{kl}= tc_{kl}$.  This reduces Eqs.~(\ref{Nkl}) to the time-independent

535: recursion relation,

536: \begin{eqnarray}

537: \label{nkl}

538: (k+l+2)c_{kl}=(k-1) c_{k-1,l}+(l-1) c_{k,l-1}

539: +(l-1)n_{l-1}\,\delta_{k1}.

540: \end{eqnarray}

541: This can be further reduced to a constant-coefficient inhomogeneous recursion

542: relation by the substitution

543: \begin{eqnarray*}

544: \label{Akl}

545: c_{kl}={\Gamma(k)\,\Gamma(l)\over \Gamma(k+l+3)}\,\,d_{kl}

546: \end{eqnarray*}

547: to yield

548: \begin{equation}

549: \label{A}

550: d_{kl}=d_{k-1,l}+d_{k,l-1}+4(l+2)\delta_{k1}.

551: \end{equation}

552: Solving Eqs.~(\ref{A}) for the first few $k$ yields the pattern of dependence

553: on $k$ and $l$ from which one can then infer the solution

554: \begin{equation}

555: \label{A-sol}

556: d_{kl}=4\,{\Gamma(k+l)\over \Gamma(k+2)\,\Gamma(l-1)}

557: +12\,{\Gamma(k+l-1)\over \Gamma(k+1)\,\Gamma(l-1)},

558: \end{equation}

559: from which we ultimately obtain

560: \begin{eqnarray}

561: \label{nkl-sol}

562: c_{kl}={4(l-1)\over k(k+l)(k+l+1)(k+l+2)}\left[{1\over k+1}

563: +{3\over k+l-1}\right].

564: \end{eqnarray}

565: The important feature of this result is that the joint distribution does not

566: factorize, that is, $c_{kl}\ne n_kn_{l}$.  This correlation between the

567: degrees of connected nodes is an important distinction between the GN and

568: classical random graphs.
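
The source term in Eq.~(\ref{A}) can be traced as follows (a sketch, in which
the single-index quantity multiplying $\delta_{k1}$ in Eq.~(\ref{nkl}) is
taken to be the degree distribution $n_{l-1}$ of Eq.~(\ref{nk1})).  The factor
$\Gamma(k)\,\Gamma(l)/\Gamma(k+l+2)$ is common to $(k+l+2)c_{kl}$,
$(k-1)c_{k-1,l}$ and $(l-1)c_{k,l-1}$ after the substitution, so dividing
Eq.~(\ref{nkl}) by it leaves, for $k=1$,
\begin{eqnarray*}
(l-1)\,n_{l-1}\,{\Gamma(l+3)\over \Gamma(l)}
={4\over l(l+1)}\;l(l+1)(l+2)=4(l+2).
\end{eqnarray*}
A further check on Eq.~(\ref{nkl-sol}) is the sum rule
$\sum_l c_{kl}=n_k$, which expresses that every node apart from the initial
one has exactly one ancestor.  For $k=1$,
\begin{eqnarray*}
c_{1l}={2(l-1)(l+6)\over l(l+1)(l+2)(l+3)}
=-{2\over l}+{10\over l+1}-{12\over l+2}+{4\over l+3},
\end{eqnarray*}
and the sum over $l\geq 2$ telescopes to $-1+{8\over 3}-1={2\over 3}=n_1$.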

569:

570: While the solution of Eq.~(\ref{nkl-sol}) is unwieldy, it greatly simplifies

571: in the scaling regime, $k\to\infty$ and $l\to\infty$ with $y=l/k$ finite.

572: The scaled form of the solution is

573: \begin{eqnarray}

574: \label{nkl-scal}

575: c_{kl}=k^{-4}\,{4y(y+4)\over (1+y)^4}.

576: \end{eqnarray}

577: For fixed large $k$, the distribution $c_{kl}$ has a single maximum at

578: $y^*=(\sqrt{33}-5)/2 \cong 0.372$.  Thus a node whose degree $k$ is large is

579: typically linked to another node whose degree is also large; the typical

580: degree of the ancestor is 37\% that of the daughter node.  In general, when

581: $k$ and $l$ are both large and their ratio is different from one, the

582: limiting behaviors of $c_{kl}$ are

583: \begin{equation}

584: \label{nklext}

585: c_{kl}\to\cases{16\,(l/k^5)    & $l\ll k$,\cr

586:                 4/(k^2\,l^2)   & $l\gg k$.\cr}

587: \end{equation}

588: Here we explicitly see the absence of factorization in the degree

589: correlation: $c_{kl}\ne n_kn_{l}\propto (k\,l)^{-3}$.
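The location of the maximum and the two limiting forms follow from a short calculation, sketched here for completeness.  Writing the scaled form as $c_{kl}=k^{-4}f(y)$, with $f(y)=4y(y+4)/(1+y)^4$ (the symbol $f$ is introduced only for this sketch), one has
\begin{eqnarray*}
{df\over dy}={4\,[(2y+4)(1+y)-4y(y+4)]\over (1+y)^5}
={-8\,(y^2+5y-2)\over (1+y)^5},
\end{eqnarray*}
which vanishes for $y>0$ only at $y^*=(\sqrt{33}-5)/2$.  The same function gives $f\to 16y$ for $y\ll 1$ and $f\to 4/y^2$ for $y\gg 1$, which reproduce the two limits of Eq.~(\ref{nklext}) after setting $y=l/k$.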

590:

591: \subsection{Global Properties}

592:

593: In addition to elucidating the degree distribution and degree correlations,

594: the rate equations can be applied to determine global properties.  One useful

595: example is the {\em out-component\/} with respect to a given node {\bf x} --

596: this is the set of nodes that can be reached by following directed links that

597: emanate from {\bf x} (Fig.~\ref{in-out}).  In the context of the web, this is

598: the set of nodes that are reached by following hyperlinks that emanate from a

599: fixed node to target nodes, and then iteratively following target nodes ad

600: infinitum.  In a similar vein, one may enumerate all nodes that refer to a

601: fixed node, plus all nodes that refer to these daughter nodes, {\it etc}.  This

602: progeny comprises the in-component to node {\bf x} -- the set from which {\bf

603:   x} can be reached by following a path of directed links.

604:

605: \begin{figure}[ht]

606:   \begin{center}

607:     \includegraphics[width=0.35\textwidth]{in-out.eps}

608:  \caption{In-component and out-component of node {\bf x}.}~\label{in-out}

609:   \end{center}

610: \end{figure}

611:

612: \subsubsection{The In-Component}

613:

614: For simplicity, we study the in-component size distribution for the GN with a

615: constant attachment kernel, $A_k=1$.  We consider this kernel because many

616: results about network components are {\it independent\/} of the form of the

617: kernel and thus it suffices to consider the simplest situation; the extension

618: to more general attachment kernels is discussed in \cite{KR}.

619:

620: For the constant attachment kernel, the number $I_s(t)$ of in-components with

621: $s$ nodes satisfies the rate equation

622: \begin{equation}

623: \label{Ik}

624: {d I_s\over dt}={(s-1)I_{s-1}-sI_s\over A}+\delta_{s1}.

625: \end{equation}

626: The loss term accounts for processes in which the attachment of a new node to

627: an in-component of size $s$ increases its size by one.  This gives a loss

628: rate that is proportional to $s$.  If there is more than one in-component of

629: size $s$ they must be disjoint, so that the total loss rate for $I_s(t)$ is

630: simply $sI_s(t)$.  A similar argument applies for the gain term.  Finally,

631: dividing by $A(t)=\sum_j A_j N_j(t)$ converts these rates to normalized

632: probabilities.  For the constant attachment kernel, $A(t)=M_0(t)$, so

633: asymptotically $A=t$.  Interestingly, Eq.~(\ref{Ik}) is almost identical to

634: the rate equations for the degree distribution for the GN with linear

635: attachment kernel, except that the prefactor equals $t^{-1}$ rather than

636: $(2t)^{-1}$.  This change in the normalization factor is responsible for

637: shifting the exponent of the resulting distribution from $-3$ to $-2$.

638:

639: To determine $I_s(t)$, we again note, by explicitly solving the first few of

640: the rate equations, that each $I_s$ grows linearly in time.  Thus we

641: substitute $I_s(t)=ti_s$ into Eqs.~(\ref{Ik}) to obtain $i_1=1/2$ and

642: $i_s=i_{s-1}(s-1)/(s+1)$.  This immediately gives

643: \begin{equation}

644: \label{is}

645: i_s={1\over s(s+1)}.

646: \end{equation}

647: This $s^{-2}$ tail for the in-component distribution is a robust feature,

648: {\em independent\/} of the form of the attachment kernel \cite{KR}.  This

649: $s^{-2}$ tail also agrees with recent measurements of the web \cite{www2}.
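For completeness, the closed form follows by iterating the recursion (a short sketch of the telescoping product):
\begin{eqnarray*}
i_s=i_1\prod_{m=2}^{s}{m-1\over m+1}
={1\over 2}\,{2\,(s-1)!\over (s+1)!}={1\over s(s+1)}.
\end{eqnarray*}
As a consistency check, $\sum_{s\geq 1} i_s=\sum_{s\geq 1}[1/s-1/(s+1)]=1$, as it must be, since every node defines exactly one in-component and the number of nodes grows as $t$.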

650:

651: \subsubsection{The Out-Component}

652:

653: The complementary out-component from each node can be determined by

654: constructing a mapping between the out-component and an underlying network

655: ``genealogy''.  We build a genealogical tree for the GN by taking generation

656: $g=0$ to be the initial node.  Nodes that attach to those in generation $g$

657: form generation $g+1$; the node index does not matter in this

658: characterization.  For example, in the network of Fig.~\ref{network}, node 1

659: is the ``ancestor'' of 6, while 10 is the ``descendant'' of 6 and there are 5

660: nodes in generation $g=1$ and 4 in $g=2$.  This leads to the genealogical

661: tree of Fig.~\ref{genealogy}.

662:

663: \begin{figure}[ht]

664:   \begin{center}

665:     \includegraphics[width=0.35\textwidth]{genealogy.eps}

666:  \caption{Genealogy of the network in Fig.~\ref{network}.

667:    The node indices indicate when each node was introduced.  The nodes are also

668:    arranged according to generation number.}~\label{genealogy}

669:   \end{center}

670: \end{figure}

671:

672: The genealogical tree provides a convenient way to characterize the

673: out-component distribution.  As one can directly verify from

674: Fig.~\ref{genealogy}, the number $O_s$ of out-components with $s$ nodes

675: equals $L_{s-1}$, the number of nodes in generation $s-1$ in the genealogical

676: tree.  We therefore compute $L_g(t)$, the size of generation $g$ at time $t$.

677: For this discussion, we again treat only the constant attachment kernel and

678: refer the reader to Ref.~\cite{KR} for more general attachment kernels.  We

679: determine $L_g(t)$ by noting that $L_g(t)$ increases when a new node attaches

680: to a node in generation $g-1$.  This occurs with rate $L_{g-1}/M_0$, where

681: $M_0(t)=1+t$ is the number of nodes.  This gives the differential equation

682: $\dot L_g(t)=L_{g-1}/(1+t)$, with solution $L_g(\tau)={\tau^g/g!}$, where

683: $\tau=\ln(1+t)$.  Thus the number $O_s$ of out-components with $s$ nodes

684: equals

685: \begin{equation}

686: \label{Rk}

687: O_s(\tau)={\tau^{s-1}/ (s-1)!}.

688: \end{equation}

689: Note that the generation size $L_g(t)$ grows with $g$, when $g<\tau$, and

690: then decreases and becomes of order 1 when $g=e\tau$.  The genealogical tree

691: therefore contains approximately $e\tau$ generations at time $t$.  This

692: result allows us to determine the diameter of the network, since the maximum

693: distance between any pair of nodes is twice the distance from the root to the

694: last generation.  Therefore the diameter of the network scales as

695: $2e\tau\approx 2e\ln N$; this is the same dependence on $N$ as in the random

696: graph \cite{bol,jan}.  More importantly, this result shows that the diameter

697: of the GN is always small -- ranging from the order of $\ln N$ for a constant

698: attachment kernel, to the order of one for super-linear attachment kernels.
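The solution for $L_g$ and the estimate of the number of generations can be checked as follows (a sketch that relies on Stirling's approximation, $g!\simeq \sqrt{2\pi g}\,(g/e)^g$).  In terms of $\tau=\ln(1+t)$, the rate equation becomes $dL_g/d\tau=L_{g-1}$, with $L_0=1$ and $L_g(0)=0$ for $g\geq 1$, so that $L_g=\tau^g/g!$ follows by induction on $g$.  For large $g$,
\begin{eqnarray*}
L_g(\tau)={\tau^g\over g!}\simeq {1\over\sqrt{2\pi g}}\left({e\tau\over g}\right)^g ,
\end{eqnarray*}
which increases with $g$ as long as $g<\tau$ (since $L_g/L_{g-1}=\tau/g$) and falls below unity once $g$ exceeds approximately $e\tau$.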

699:

700: \section{The Web Graph}

701:

702: In the world-wide web, link directionality is clearly relevant, as hyperlinks

703: go {\em from} an issuing website {\em to} a target website but not vice

704: versa.  Thus to characterize the local graph structure more fully, the node

705: degree should be resolved into the {\em in-degree} -- the number of incoming

706: links to a node, and the complementary {\em out-degree} (Fig.~\ref{degrees}).

707: Measurements on the web indicate that these distributions are power laws with

708: different exponents \cite{www3}.  These properties can be accounted for by

709: the web graph (WG) model (Fig.~\ref{io-growth}) and the rate equations

710: provide an extremely convenient analysis tool.

711:

712: \subsection{Average Degrees}

713:

714: Let us first determine the average node degrees (in-degree, out-degree, and

715: total degree) of the WG.  Let $N(t)$ be the total number of nodes, and $I(t)$

716: and $J(t)$ the in-degree and out-degree of the entire network, respectively.

717: According to the elemental growth steps of the model, these degrees evolve by

718: one of the following two possibilities:

719: \begin{eqnarray*}

720: (N,I,J)\to \cases{(N+1,I+1,J+1)  & with probability $p$,\cr

721:                   (N,I+1,J+1)    & with probability $q$.}

722: \end{eqnarray*}

723: That is, with probability $p$ a new node and new directed link are created

724: (Fig.~\ref{io-growth}) so that the number of nodes and both the total in- and

725: out-degrees increase by one.  Conversely, with probability $q$ a new directed

726: link is created and the in- and out-degrees each increase by one, while the

727: total number of nodes is unchanged.  As a result, $N(t)=pt$, and

728: $I(t)=J(t)=t$.  Thus the average in- and out-degrees, ${\cal D}_{\rm in}\equiv

729: I(t)/N(t)$ and ${\cal D}_{\rm out}\equiv J(t)/N(t)$, are both equal to $1/p$.

730:

731: \subsection{Degree Distributions}

732:

733: To determine the degree distributions, we need to specify: (i) the {\em

734:   attachment rate} $A(i,j)$, defined as the probability that a

735: newly-introduced node links to an existing node with $i$ incoming and $j$

736: outgoing links, and (ii) the {\em creation rate} $C(i_1,j_1|i_2,j_2)$,

737: defined as the probability of adding a new link from a $(i_1,j_1)$ node to a

738: $(i_2,j_2)$ node.  We will use rates that are expected to occur in

739: the web.  Clearly, the attachment and creation rates should be non-decreasing

740: in $i$ and $j$.  Moreover, it seems intuitively plausible that the attachment

741: rate depends only on the in-degree of the target node, $A(i,j)=A_i$; {\it

742:   i.e.}, a website designer decides to create a link to a target based only on

743: the popularity of the latter.  In the same spirit, we take the link creation

744: rate to depend only on the out-degree of the issuing node and the in-degree

745: of the target node, $C(i_1,j_1|i_2,j_2)= C(j_1,i_2)$.  The former property

746: reflects the fact that the development rate of a site depends only on the

747: number of outgoing links.

748:

749: The interesting situation of power-law degree distributions arises for

750: asymptotically linear rates, and we therefore consider

751: \begin{equation}

752: \label{AC}

753: A_i=i+\lambda_{\rm in} \qquad{\rm and}\qquad C(j,i)=(i+\lambda_{\rm

754:   in})(j+\lambda_{\rm out}).

755: \end{equation}

756: The parameters $\lambda_{\rm in}$ and $\lambda_{\rm out}$ must satisfy the

757: constraint $\lambda_{\rm in}>0$ and $\lambda_{\rm out}>-1$ to ensure that the

758: rates are positive for all attainable in- and out-degree values, $i\geq 0$

759: and $j\geq 1$.

760:

761: With these rates, the joint degree distribution, $N_{ij}(t)$, defined as the

762: average number of nodes with $i$ incoming and $j$ outgoing links, evolves

763: according to

764: \begin{eqnarray}

765: \label{Nij}

766: {d N_{ij}\over dt}&=&

767: (p+q)\left[{(i-1+\lambda_{\rm in})N_{i-1,j}

768: -(i+\lambda_{\rm in})N_{ij}\over I+\lambda_{\rm in} N}\right]\\

769:  &&\hskip 0.285in

770: +q\left[{(j-1+\lambda_{\rm out})N_{i,j-1}

771: -(j+\lambda_{\rm out})N_{ij}\over J+\lambda_{\rm out} N}\right]

772: +p\,\delta_{i0}\delta_{j1}.\nonumber

773: \end{eqnarray}

774: The first group of terms on the right accounts for the changes in the

775: in-degree of target nodes by simultaneous creation of a new node and link

776: (probability $p$) or by creation of a new link only (probability $q$).  For

777: example, the creation of a link to a node with in-degree $i$ leads to a loss

778: in the number of such nodes.  This occurs with rate $(p+q)(i+\lambda_{\rm

779:   in})N_{ij}$, divided by the appropriate normalization factor

780: $\sum_{i,j}(i+\lambda_{\rm in})N_{ij}= I+\lambda_{\rm in} N$.  The factor

781: $p+q=1$ in Eq.~(\ref{Nij}) is explicitly written to make clear these two

782: types of processes.  Similarly, the second group of terms accounts for

783: out-degree changes.  These occur due to the creation of new links between

784: already existing nodes -- hence the prefactor $q$.  The last term accounts

785: for the introduction of new nodes with no incoming links and one outgoing

786: link.  As a useful consistency check, one may verify that the total number of

787: nodes, $N=\sum_{i,j} N_{ij}$, grows according to $\dot N=p$, while the total

788: in- and out-degrees, $I=\sum_{i,j} iN_{ij}$ and $J=\sum_{i,j} jN_{ij}$, obey

789: $\dot I=\dot J=1$.
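This consistency check can be carried out explicitly (a sketch; the sums run over all $i\geq 0$ and $j\geq 1$, with $N_{-1,j}=N_{i,0}=0$).  Summing Eq.~(\ref{Nij}) over $i$ and $j$, each bracket telescopes to zero and only the source term survives, giving $\dot N=p$.  Multiplying by $i$ before summing, the second bracket still telescopes, while the numerator of the first gives
\begin{eqnarray*}
\sum_{i,j} i\left[(i-1+\lambda_{\rm in})N_{i-1,j}-(i+\lambda_{\rm in})N_{ij}\right]
=\sum_{i,j}(i+\lambda_{\rm in})N_{ij}=I+\lambda_{\rm in}N,
\end{eqnarray*}
which cancels the normalization factor, so that $\dot I=p+q=1$.  Multiplying instead by $j$, the first bracket drops out, the second contributes $q$, and the source term contributes $p$, so that $\dot J=q+p=1$.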

790:

791: By solving the first few of Eqs.~(\ref{Nij}), it is again clear that the

792: $N_{ij}$ grow linearly with time.  Accordingly, we substitute

793: $N_{ij}(t)=t\,n_{ij}$, as well as $N=pt$ and $I=J=t$, into Eqs.~(\ref{Nij})

794: to yield a recursion relation for $n_{ij}$.  Using the shorthand notations,

795: \begin{eqnarray*}

796: a=q\,{1+p\lambda_{\rm in}\over 1+p\lambda_{\rm out}}\quad {\rm and}\quad

797: b=1+(1+p)\lambda_{\rm in},

798: \end{eqnarray*}

799: the recursion relation for $n_{ij}$ is

800: \begin{eqnarray}

801: \label{nij}

802: [i+a(j+\lambda_{\rm out})+b]n_{ij}

803: =(i-1+\lambda_{\rm in})n_{i-1,j}+a(j-1+\lambda_{\rm out})n_{i,j-1}

804: +p(1+p\lambda_{\rm in})\delta_{i0}\delta_{j1}.

805: \end{eqnarray}

806: The in-degree and out-degree distributions are straightforwardly expressed

807: through the joint distribution: ${\cal I}_i(t)

808: =\sum_j N_{ij}(t)$ and ${\cal O}_j(t)=\sum_i N_{ij}(t)$.  Because of the

809: linear time dependence of the node degrees, we write ${\cal I}_i(t)=t\,I_i$

810: and ${\cal O}_j(t)=t\,O_j$.  The densities $I_i$ and $O_j$ satisfy

811: \begin{subeqnarray}

812: \label{Ii}

813: (i+b)I_{i} &=&(i-1+\lambda_{\rm in})I_{i-1}

814: +p(1+p\lambda_{\rm in})\delta_{i0},\\

815: \left(j+{1\over q}+{\lambda_{\rm out}\over q}\right)O_j

816: &=&(j-1+\lambda_{\rm out})O_{j-1}+p{1+p\lambda_{\rm out}\over q}\delta_{j1},

817: \end{subeqnarray}

818: respectively.  The solution to these recursion formulae may be expressed

819: in terms of the following ratios of gamma functions

820: \begin{subeqnarray}

821: \label{I-sol}

822: I_{i}&=&I_0\,{\Gamma(i+\lambda_{\rm in})\,\Gamma(b+1)\over

823: \Gamma(i+b+1)\,\Gamma(\lambda_{\rm in})},\\

824: \label{O-sol}

825: O_{j}&=&O_1\,{\Gamma(j+\lambda_{\rm out})\,\,

826: \Gamma(2+q^{-1}+\lambda_{\rm out} q^{-1})\over

827: \Gamma(j+1+q^{-1}+\lambda_{\rm out} q^{-1})\,\Gamma(1+\lambda_{\rm out})},

828: \end{subeqnarray}

829: with $I_0=p(1+p\lambda_{\rm in})/b$ and

830: $O_1=p(1+p\lambda_{\rm out})/(1+q+\lambda_{\rm out})$.

831:

832: {}From the asymptotics of the gamma function, the in- and out-degree

833: distributions have the distinct power-law forms \cite{KRR},

834: \begin{subeqnarray}

835: \label{in}

836: I_i\sim i^{-\nu_{\rm in}},~~~\qquad \nu_{\rm in}&=&2+p\lambda_{\rm in},\\

837: \hskip 0.7in O_j\sim j^{-\nu_{\rm out}}, \qquad

838: \nu_{\rm out}&=&1+q^{-1}+\lambda_{\rm out}\, pq^{-1},

839: \end{subeqnarray}

840: with $\nu_{\rm in}$ and $\nu_{\rm out}$ both necessarily greater than 2.  Let

841: us now compare these predictions with current data for the web \cite{www3}.

842: First, the value of $p$ is fixed by noting that $p^{-1}$ equals the average

843: degree of the entire network.  Current data for the web gives ${\cal D}_{\rm

844:   in}\equiv {\cal D}_{\rm out}\approx 7.5$, and thus we set $p^{-1}=7.5$.

845: Now Eqs.~(\ref{in}) contain two free parameters and by choosing them to be

846: $\lambda_{\rm in}=0.75$ and $\lambda_{\rm out}=3.55$ we reproduce the

847: observed exponents for the degree distributions of the web, $\nu_{\rm

848:   in}\approx 2.1$ and $\nu_{\rm out}\approx 2.7$, respectively.  The fact

849: that the parameters $\lambda_{\rm in}$ and $\lambda_{\rm out}$ are of the

850: order of one indicates that the model with linear rates of node attachment

851: and bilinear rates of link creation is a viable description of the web.
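The arithmetic behind this fit is easily reproduced, using only the values quoted above.  An average degree ${\cal D}_{\rm in}=1/p=7.5$ corresponds to $p=2/15$ and $q=1-p=13/15$, so that Eqs.~(\ref{in}) give
\begin{eqnarray*}
\nu_{\rm in}&=&2+{2\over 15}\times 0.75=2.1,\\
\nu_{\rm out}&=&1+{15\over 13}+3.55\times{2\over 13}={35.1\over 13}=2.7.
\end{eqnarray*}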

852:

853: \section{Multicomponent Graph}

854:

855: In addition to the degree distributions, current measurements indicate that

856: the web consists of a ``giant'' component that contains approximately 91\% of

857: all nodes, and a large number of finite components \cite{www3}.  The models

858: discussed thus far are unsuited to describe the number and size distribution

859: of these components, since the growth rules necessarily produce only a single

860: connected component.  In this section, we outline a simple modification of

861: the WG, the multicomponent graph (MG), that naturally produces many

862: components.  In this example, the rate equations now provide a comprehensive

863: characterization for the size distribution of the components.

864:

865: In the MG model, we simply separate node and link creation steps.  Namely,

866: when a node is introduced it does not immediately attach to an earlier node,

867: but rather, a new node begins its existence as isolated and joins the network

868: only when a link creation event reaches the new node.  For the average

869: network degrees, this small modification already has a significant effect.

870: The number of nodes and the total in- and out-degrees of the network, $N,I,J$

871: now increase with time as $N=pt$ and $I=J=qt$.  Thus the average in- and

872: out-degrees per node are time independent and equal to $qp^{-1}$, while the

873: average total degree is ${\cal D}=2q/p$.

874:

875: As in the case of the WG model, we study the case of a bilinear link creation

876: rate given in Eq.~(\ref{AC}), with now $\lambda_{\rm in},\lambda_{\rm out}>0$

877: to ensure that $C(j,i)>0$ for all permissible in- and out-degrees, $i\geq 0$

878: and $j\geq 0$.

879:

880: \subsection{Local Properties}

881:

882: We study local characteristics by employing the same approach as in the WG

883: model.  We find that results differ only in minute details, {\it e.g.}, the

884: in- and out-degree densities $I_i$ and $O_j$ are again the ratios of gamma

885: functions, and the respective exponents are

886: \begin{equation}

887: \label{inout}

888: \nu_{\rm in}=2\left(1+{\lambda_{\rm in}\over {\cal D}}\right),\qquad

889: \nu_{\rm out}=2\left(1+{\lambda_{\rm out}\over {\cal D}}\right).

890: \end{equation}

891: Notice the decoupling -- the in-degree exponent is independent of

892: $\lambda_{\rm out}$, while $\nu_{\rm out}$ is independent of $\lambda_{\rm

893:   in}$. The expressions (\ref{inout}) are neater than their WG counterparts,

894: reflecting the fact that the governing rules of the MG model are more

895: symmetric.
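The form of these exponents can be understood from a short sketch of the in-degree calculation (our reconstruction, which assumes that the WG rate equation carries over with the factor $p+q$ replaced by $q$, because new nodes no longer attach upon creation).  With $N=pt$, $I=qt$, and ${\cal I}_i=t\,I_i$, the in-degree density obeys, for $i\geq 1$,
\begin{eqnarray*}
\left(i+\lambda_{\rm in}+1+{p\lambda_{\rm in}\over q}\right)I_i
=(i-1+\lambda_{\rm in})\,I_{i-1},
\end{eqnarray*}
so that $I_i\sim i^{-\nu_{\rm in}}$ with $\nu_{\rm in}=2+p\lambda_{\rm in}/q=2(1+\lambda_{\rm in}/{\cal D})$, where ${\cal D}=2q/p$.  The out-degree exponent follows by interchanging the roles of the in- and out-degrees.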

896:

897: To complement our discussion, we now outline the asymptotic behavior of the

898: joint in- and out-degree distribution.  Although this distribution defies

899: general analysis, we can obtain partial and useful information by fixing one

900: index and letting the other index vary.  An elementary but cumbersome

901: analysis yields the following limiting behaviors

902: \begin{equation}

903: \label{extreme}

904: n_{ij}\sim\cases{i^{-\xi_{\rm in}}, & $1\ll i$;\cr

905:                 j^{-\xi_{\rm out}}, & $1\ll j$;}

906: \end{equation}

907: with

908: \begin{eqnarray*}

909: \xi_{\rm in} &=&\nu_{\rm in}+{{\cal D}\over 2}\,

910: {(\nu_{\rm in}-1)(\nu_{\rm out}-2)\over \nu_{\rm out}-1}\\

911: \xi_{\rm out} &=&\nu_{\rm out}+{{\cal D}\over 2}\,

912: {(\nu_{\rm out}-1)(\nu_{\rm in}-2)\over \nu_{\rm in}-1}.

913: \end{eqnarray*}

914:

915: We also can determine the joint degree distribution analytically in the

916: subset of the parameter space where $\nu_{\rm in}=\nu_{\rm out}$, {\it i.e.},

917: $\lambda_{\rm in}=\lambda_{\rm out}$.  In what follows, we therefore denote

918: $\lambda_{\rm in}=\lambda_{\rm out}\equiv \lambda$.  The resulting recursion

919: equation for the joint degree distribution is

920: \begin{eqnarray}

921: \label{nij*}

922: (i+j+1+\lambda+\lambda q^{-1})n_{ij}=(i-1+\lambda)n_{i-1,j}

923: +(j-1+\lambda)n_{i,j-1}+c\,\delta_{i,0}\,\delta_{j,0},

924: \end{eqnarray}

925: with $c=p(1+2\lambda/{\cal D})$.  Because the degrees $i$ and $j$ appear in

926: Eq.~(\ref{nij*}) with equal prefactors, the substitution

927: \begin{eqnarray*}

928: \label{mij}

929: n_{ij}={\Gamma(i+\lambda)\,\Gamma(j+\lambda)\over

930: \Gamma(i+j+2+\lambda+\lambda q^{-1})}\,\,m_{ij}

931: \end{eqnarray*}

932: reduces Eqs.~(\ref{nij*}) into the constant-coefficient recursion relation

933: \begin{equation}

934: \label{m}

935: m_{ij}=m_{i-1,j}+m_{i,j-1}+\mu\,\delta_{i,0}\,\delta_{j,0},  \qquad

936: {\rm with}\quad \mu=c\,{\Gamma(1+\lambda+\lambda q^{-1})\over

937: \Gamma^2(\lambda)}.

938: \end{equation}

939: We solve Eq.~(\ref{m}) by employing the generating function technique.

940: Multiplying Eq.~(\ref{m}) by $x^iy^j$ and summing over all $i,j\geq 0$, we

941: find that the generating function ${\mathcal M}(x,y)=\sum_{i,j\geq

942:   0}m_{ij}x^iy^j$ equals $\mu/(1-x-y)$.  Expanding ${\mathcal M}(x,y)$ in $x$

943: yields $\mu \sum x^i/(1-y)^{i+1}$ which we then expand in $y$ by employing

944: the identity $(1-y)^{-i-1}=\sum_{j\geq 0} {i+j\choose i}y^{j}$. Finally, we

945: arrive at

946: \begin{equation}

947: \label{mij-sol}

948: m_{ij}=\mu\,\,{\Gamma(i+j+1)\over \Gamma(i+1)\,\Gamma(j+1)},

949: \end{equation}

950: from which the joint degree distribution is

951: \begin{equation}

952: \label{nij-sol}

953: n_{ij}={\mu\,\Gamma(i+\lambda)\,\Gamma(j+\lambda)\,\Gamma(i+j+1)\over

954: \Gamma(i+1)\,\Gamma(j+1)\,\Gamma(i+j+2+\lambda+\lambda q^{-1})}

955: \longrightarrow \mu\,

956: {(ij)^{\lambda-1}\over (i+j)^{1+\lambda+\lambda/q}},

957: \quad{\rm as}\quad i,j\to\infty.

958: \end{equation}

959: Thus again, the in- and out-degrees of a node are correlated: $n_{ij}\ne

960: I_iO_j\sim i^{-\nu}j^{-\nu}$.
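It is straightforward to confirm Eq.~(\ref{mij-sol}) directly (a brief check, taking the source term to act at $i=j=0$, consistent with the generating function $\mu/(1-x-y)$).  Writing $m_{ij}=\mu{i+j\choose i}$, Pascal's rule
\begin{eqnarray*}
{i+j\choose i}={i+j-1\choose i-1}+{i+j-1\choose i}
\end{eqnarray*}
reproduces $m_{ij}=m_{i-1,j}+m_{i,j-1}$ for $i+j\geq 1$, while $m_{00}=\mu$ accounts for the source term.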

961:

962:

963: \subsection{Global Properties}

964:

965: Let us now turn to the distribution of connected components (clusters,

966: for brevity).  For simplicity, we consider models with undirected links.  Let

967: us first estimate the total number of clusters ${\cal N}$.  At each time

968: step, ${\cal N}\to {\cal N}+1$ with probability $p$, or ${\cal N}\to {\cal

969:   N}-1$ with probability $q$.  This implies

970: \begin{equation}

971: \label{N}

972: {\cal N}=(p-q)t.

973: \end{equation}

974: The gain rate of ${\cal N}$ is exactly equal to $p$, while in the loss term

975: we ignore self-connections and tacitly assume that links are always created

976: between different clusters.  In the long-time limit, self-connections should

977: be asymptotically negligible when the total number of clusters grows with

978: time and no macroscopic clusters ({\it i.e.}, components that contain a

979: finite fraction of all nodes) arise.

980:

981: This assumption of no self-connections greatly simplifies the description of

982: the cluster merging process.  Consider two clusters (labeled by $\alpha=1,2$)

983: with total in-degrees $i_\alpha$, out-degrees $j_\alpha$, and number of nodes

984: $k_\alpha$.  When these clusters merge, the combined cluster is characterized

985: by

986: \begin{eqnarray*}

987: \label{12}

988: i=i_1+i_2+1,\qquad

989: j=j_1+j_2+1,\qquad

990: k=k_1+k_2.

991: \end{eqnarray*}

992: Thus starting with single-node clusters with $(i,j,k)=(0,0,1)$, the above

993: merging rule leads to clusters that always satisfy the constraint $i=j=k-1$.

994: Thus the size $k$ characterizes both the in-degree and out-degree of

995: clusters.

996:

997: To simplify formulae without sacrificing generality, we consider the link

998: creation rate of Eq.~(\ref{AC}), with $\lambda_{\rm in}=\lambda_{\rm out}=1$.

999: Then the merging rate $W(k_1,k_2)$ of the two clusters is proportional to

1000: $(i_1+k_1)(j_2+k_2)+(i_2+k_2)(j_1+k_1)$, or

1001: \begin{eqnarray*}

1002: \label{w}

1003: W(k_1,k_2)=(2k_1-1)(2k_2-1).

1004: \end{eqnarray*}

1005: Let $C(k,t)$ denote the number of clusters of mass $k$.  This distribution

1006: evolves according to

1007: \begin{eqnarray}

1008: \label{comp}

1009: {dC(k,t)\over dt}=

1010: {q\over t^2}\sum_{k_1+k_2=k} (2k_1-1)(2k_2-1)\,C(k_1,t)C(k_2,t)

1011: -{2q\over t}\,(2k-1)C(k,t)+p\,\delta_{k,1}.

1012: \end{eqnarray}

1013: The first set of terms accounts for the gain in $C(k,t)$ due to the

1014: coalescence of clusters of size $k_1$ and $k_2$, with $k_1+k_2=k$.

1015: Similarly, the second set of terms accounts for the loss in $C(k,t)$ due to

1016: the coalescence of a cluster of size $k$ with any other cluster.  The last

1017: term accounts for the input of unit-size clusters.  These rate equations are

1018: similar to those of irreversible aggregation with product kernel \cite{agg}.

1019: The primary difference is that we explicitly treat the number of clusters as

1020: finite.
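The normalization factors in Eq.~(\ref{comp}) can be checked as follows (a short sketch that uses $i=j=k-1$ and $\lambda_{\rm in}=\lambda_{\rm out}=1$).  The total weight with which one end of a new link selects a cluster is
\begin{eqnarray*}
\sum_{k\geq 1}(2k-1)\,C(k,t)=2N-{\cal N}=2pt-(p-q)t=t,
\end{eqnarray*}
so that a link creation event joins a given ordered pair of clusters of sizes $k_1$ and $k_2$ with probability $(2k_1-1)(2k_2-1)/t^2$.  This accounts for the factor $t^{-2}$ in the gain term and, after summing over the partner cluster and over the two orderings, for the factor $2/t$ in the loss term.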

1021:

1022: One can verify that the total number of nodes $N(t)=\sum k\,C(k,t)$ grows

1023: with rate $p$ and that the total number of clusters ${\cal N}(t)=\sum C(k,t)$

1024: grows with rate $p-q$, in agreement with Eq.~(\ref{N}).  Solving the first

1025: few of Eqs.~(\ref{comp}) shows again that the $C(k,t)$ grow linearly with time.

1026: Accordingly, we substitute $C(k,t)=t\,c_k$ into Eqs.~(\ref{comp}) to yield

1027: the time-independent recursion relation

1028: \begin{eqnarray}

1029: \label{mk}

1030: c_k=q\sum_{k_1+k_2=k} (2k_1-1)(2k_2-1)\,c_{k_1}c_{k_2}

1031: -2q(2k-1)c_k+p\,\delta_{k,1}.

1032: \end{eqnarray}

1033:

1034: A giant component, {\it i.e.}, a cluster that contains a finite fraction of

1035: all the nodes, emerges when the link creation rate exceeds a threshold value.

1036: To determine this threshold, we study the moments of the cluster size

1037: distribution ${\cal M}_n=\sum_{k\geq 1} k^n\,c_k$.  We already know that the

1038: first two moments are ${\cal M}_0=p-q$ and ${\cal M}_1=p$.  We can obtain an

1039: equation for the second moment by multiplying Eq.~(\ref{mk}) by $k^2$ and

1040: summing over $k\geq 1$ to give ${\cal M}_2 =2q(2{\cal M}_2-{\cal M}_1)^2+p$.

1041: When this equation has a real solution, ${\cal M}_2$ is finite.  The

1042: solution is

1043: \begin{equation}

1044: \label{M2}

1045: {\cal M}_2={1+8pq-\sqrt{1-16pq}\over 16 q}

1046: \end{equation}

1047: and gives rise, when $1-16pq=0$, to the threshold value $p_c=(2+\sqrt{3})/4$.  For

1048: $1-16pq\geq 0$ ($p\geq p_c$) all clusters have finite size and the second moment

1049: is finite.
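The steps that lead to Eq.~(\ref{M2}) can be sketched as follows.  In the gain term we write $k^2=(k_1+k_2)^2$ and use $\sum_k(2k-1)c_k=2{\cal M}_1-{\cal M}_0=1$.  The contributions that involve the third moment then cancel between the gain and loss terms, leaving ${\cal M}_2=2q(2{\cal M}_2-{\cal M}_1)^2+p$.  In terms of $x\equiv 2{\cal M}_2-{\cal M}_1$ (a shorthand used only here), and with ${\cal M}_1=p$, this reads
\begin{eqnarray*}
4q\,x^2-x+p=0, \qquad x={1-\sqrt{1-16pq}\over 8q},
\end{eqnarray*}
where the smaller root is chosen so that $x\to p$ as $q\to 0$.  Then ${\cal M}_2=(x+p)/2$ reproduces Eq.~(\ref{M2}), and the condition $16pq=1$ with $q=1-p$ gives $p^2-p+1/16=0$, whose larger root is $p_c=(2+\sqrt{3})/4\approx 0.933$.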

1050:

1051: In this steady-state regime, we can obtain the cluster size distribution by

1052: introducing the generating function ${\cal C}(z)=\sum_{k=1}^\infty c_k z^k$

1053: to convert Eq.~(\ref{mk}) into the differential equation

1054: \begin{equation}

1055: \label{Cz}

1056: 2z{\cal C}'(z)-{\cal C}(z)=1-\sqrt{1-[pz-{\cal C}(z)]/q}.

1057: \end{equation}

1058: The asymptotic behavior of the cluster size distribution can now be read off

1059: from the behavior of the generating function in the $z\to 1$ limit.  In

1060: particular, the power-law behavior

1061: \begin{equation}

1062: \label{asym}

1063: c_k\sim {B\over k^\tau}\quad{\rm as} \quad k\to\infty

1064: \end{equation}

1065: implies that the corresponding generating function has the form

1066: \begin{equation}

1067: \label{gen}

1068: {\cal C}(z)={\cal M}_0+{\cal M}_1(z-1)

1069: +{{\cal M}_2-{\cal M}_1\over 2}\,(z-1)^2+

1070: B\Gamma(1-\tau)(1-z)^{\tau-1}+\ldots.

1071: \end{equation}

1072: Here the asymptotic behavior is controlled by the dominant singular term

1073: $(1-z)^{\tau-1}$.  However, there are also subdominant singular terms and

1074: regular terms in the generating function.  In Eq.~(\ref{gen}) we explicitly

1075: included the three regular terms which ensure that the first three moments of

1076: the cluster-size distribution are correctly reproduced, namely, ${\cal

1077:   C}(1)={\cal M}_0$, ${\cal C}'(1)={\cal M}_1$, and ${\cal C}''(1)={\cal

1078:   M}_2-{\cal M}_1$.
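Equation~(\ref{Cz}) itself follows from Eq.~(\ref{mk}) in two steps (a sketch; the auxiliary function $Y(z)$ is introduced only for this purpose).  Multiplying Eq.~(\ref{mk}) by $z^k$, summing over $k\geq 1$, and defining $Y(z)\equiv\sum_{k\geq 1}(2k-1)c_kz^k=2z{\cal C}'(z)-{\cal C}(z)$, the convolution becomes a product:
\begin{eqnarray*}
{\cal C}=q\,Y^2-2q\,Y+pz .
\end{eqnarray*}
Solving this quadratic for $Y$ and choosing the root that vanishes as $z\to 0$ gives $Y=1-\sqrt{1-[pz-{\cal C}]/q}$, which is Eq.~(\ref{Cz}).  At $z=1$ the argument of the square root vanishes, consistent with $Y(1)=2{\cal M}_1-{\cal M}_0=1$.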

1079:

1080: Finally, substituting Eq.~(\ref{gen}) into Eq.~(\ref{Cz}) we find that the

1081: dominant singular terms are of the order of $(1-z)^{\tau-2}$.  Balancing all

1082: contributions of this order in the equation determines the exponent of the

1083: cluster size distribution to be

1084: \begin{equation}

1085: \label{tau}

1086: \tau=1+{2\over 1-\sqrt{1-16pq}}.

1087: \end{equation}

1088: This exponent satisfies the bound $\tau>3$ and thus justifies using the

1089: behavior of the second moment of the size distribution as the criterion to

1090: find the threshold value $p_c$.
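The balance that leads to Eq.~(\ref{tau}) can be outlined as follows (our sketch, with the shorthands $\epsilon\equiv 1-z$ and $S\equiv B\,\Gamma(1-\tau)$ introduced for brevity, and assuming $\tau>3$).  From Eq.~(\ref{gen}), with ${\cal M}_0=p-q$ and ${\cal M}_1=p$,
\begin{eqnarray*}
2z{\cal C}'-{\cal C}&=&1-(2{\cal M}_2-{\cal M}_1)\,\epsilon
-2(\tau-1)S\,\epsilon^{\tau-2}+\ldots,\\
1-{pz-{\cal C}\over q}&=&{{\cal M}_2-{\cal M}_1\over 2q}\,\epsilon^2
+{S\over q}\,\epsilon^{\tau-1}+\ldots.
\end{eqnarray*}
Taking the square root of the second line and comparing terms in Eq.~(\ref{Cz}), the terms of order $\epsilon$ reproduce the second-moment relation, $2q(2{\cal M}_2-{\cal M}_1)^2={\cal M}_2-{\cal M}_1$, while the terms of order $\epsilon^{\tau-2}$ give
\begin{eqnarray*}
2(\tau-1)={1\over 2q\,(2{\cal M}_2-{\cal M}_1)}={4\over 1-\sqrt{1-16pq}},
\end{eqnarray*}
which is Eq.~(\ref{tau}).  Note that the amplitude $B$ drops out of this balance.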

1091:

1092: For $p\geq p_c$ there is no giant cluster and the cluster size distribution

1093: has a power-law tail with $\tau$ given by Eq.~(\ref{tau}).  Intriguingly, the

1094: power-law form holds for any value $p>p_c$.  This is in stark contrast to

1095: all other percolation-type phenomena, where away from the threshold, there is

1096: an exponential tail in cluster size distributions \cite{percolation}.  Thus

1097: in contrast to ordinary critical phenomena, the entire range $p>p_c$ is

1098: critical.

1099:

1100: As a corollary to the power-law tail of the cluster size distribution for

1101: $p>p_c$, we can estimate the size of the largest cluster $k_{\rm max}$ to see

1102: how ``finite'' it really is.  Using the extreme statistics criterion

1103: $\sum_{k\geq k_{\rm max}}N\,c_k=1$ we obtain $k_{\rm max}\sim N^{1/(\tau-1)}$,

1104: or

1105: \begin{equation}

1106: \label{kmax}

1107: k_{\rm max}\sim N^{(1-\sqrt{1-16pq})/2}.

1108: \end{equation}

1109: This is very different from the corresponding behavior on the random graph,

1110: where below the percolation threshold the largest component scales

1111: logarithmically with the number of nodes.  Thus for the random graph, the

1112: dependence of $k_{\rm max}(N)$ changes from $\ln N$ just below, to $N$, just

1113: above the percolation threshold; for the MG, the change is much more gentle:

1114: from $N^{1/2}$ to $N$.
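The estimate of $k_{\rm max}$ follows from the tail sum (a sketch that replaces the sum by an integral and uses the power-law form of Eq.~(\ref{asym})):
\begin{eqnarray*}
\sum_{k\geq k_{\rm max}}N\,c_k\simeq N\int_{k_{\rm max}}^\infty {B\,dk\over k^\tau}
={N B\over (\tau-1)\,k_{\rm max}^{\tau-1}}=1,
\end{eqnarray*}
so that $k_{\rm max}\sim N^{1/(\tau-1)}$.  With $1/(\tau-1)=(1-\sqrt{1-16pq})/2$ from Eq.~(\ref{tau}), this gives Eq.~(\ref{kmax}); the exponent equals $1/2$ at the threshold, where $16pq=1$, and decreases toward zero as $q\to 0$.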

1115:

1116: These considerations suggest that the phase transition in the MG is

1117: dramatically different from the percolation transition.  Very recently,

1118: simplified versions of the MG were studied

1119: \cite{clusters,kk,sam,kkkr,french}.  Numerical \cite{clusters} and analytical

1120: \cite{sam,kkkr,french} evidence suggests that the size of the giant component

1121: $G(p)$ near the threshold scales as

1122: \begin{equation}

1123: \label{giant}

1124: G(p)\propto \exp\left(-\,{{\rm const.}\over\sqrt{p_c-p}}\right).

1125: \end{equation}

1126: Therefore, the phase transition of this dynamically grown network is of

1127: infinite order since all derivatives of $G(p)$ vanish as $p\to p_c$.  In

1128: contrast, static random graphs with any desired degree distribution

1129: \cite{reed} exhibit a standard percolation transition

1130: \cite{clusters,reed,chung,dani}.

1131:

1132: \section{Summary}

1133:

1134: In this paper, we have presented a statistical physics viewpoint on growing

1135: network problems.  This perspective is strongly influenced by the phenomenon

1136: of aggregation kinetics, where the rate equation approach has proved

1137: extremely useful.  From the wide range of results that we were able to obtain

1138: for evolving networks, we hope that the reader appreciates both the

1139: simplicity and the power of the rate equation method for characterizing

1140: evolving networks.  We quantified the degree distribution of the growing

1141: network model and found a diverse range of phenomenology that depends on the

1142: form of the attachment kernel.  At the qualitative level, a stretched

1143: exponential form for the degree distribution should be regarded as

1144: ``generic'', since it occurs for an attachment kernel that is sub-linear in

1145: node degree ({\it e.g.}, $A_k\sim k^\gamma$ with $\gamma<1$).  On the other

1146: hand, a power-law degree distribution arises only for linear attachment

1147: kernels, $A_k\sim k$.  However, this result is ``non-generic'' as the degree

1148: distribution exponent now depends on the detailed form of the attachment

1149: kernel.

1150:

1151: We investigated extensions of the basic growing network to incorporate

1152: processes that naturally occur in the development of the web.  In particular,

1153: by allowing for link directionality, the full degree distribution naturally

1154: resolves into independent in-degree and out-degree distributions.  When the

1155: rates at which links are created are linear functions of the in- and

1156: out-degrees of the terminal nodes of the link, the in- and out-degree

1157: distributions are power laws with different exponents, $\nu_{\rm in}$ and

1158: $\nu_{\rm out}$, that match current measurements on the web with

1159: reasonable values for the model parameters.  We also considered a model with

1160: independent node and link creation rates.  This leads to a network with many

1161: independent components and now the size distribution of these components is

1162: an important characteristic.  We have characterized basic aspects of this

1163: process by the rate equation approach and shown that the network is in a

1164: critical state even away from the percolation threshold.  The rate equation

1165: approach also provides evidence of an unusual, infinite-order percolation

1166: transition.

1167:

1168: While statistical physics tools have fueled much progress in elucidating the

1169: structure of growing networks, there are still many open questions.  One set

1170: is associated with understanding dynamical processes in such networks.  For

1171: example, what is the nature of information transmission?  What governs the

1172: formation of traffic jams on the web?  Another set is concerned with growth

1173: mechanisms.  While we can make much progress in characterizing networks with

1174: idealized growth rules, it is important to understand the actual rules that

1175: govern the growth of the Internet.  These issues appear to be fruitful

1176: challenges for future research.

1177:

1178: \section{Acknowledgements}

1179:

1180: It is a pleasure to thank Francois Leyvraz and Geoff Rodgers for

1181: collaborations that led to some of the work reported here.  We also thank

1182: John Byers and Mark Crovella for numerous informative discussions.  Finally,

1183: we are grateful to NSF grants INT9600232 and DMR9978902 for financial

1184: support.

1185:

1186:

1187: \begin{thebibliography}{99}

1188:

1189: \bibitem{review}

1190:    Recent reviews from the physicist's perspective include:

1191:    S.~H.~Strogatz, Nature {\bf 410}, 268 (2001);

1192:    R.~Albert and A.-L.~Barab\'asi, Rev.\ Mod.\ Phys.\ {\bf 74}, 47 (2002);

1193:    S.~N.~Dorogovtsev and J.~F.~F.~Mendes, Adv.\ Phys. {\bf xx}, xxx (2002).

1194:

1195: \bibitem{bol}

1196:    B.~Bollob\'as, {\it Random Graphs} (Academic Press, London, 1985).

1197:

1198: \bibitem{jan}

1199:    S.~Janson, T.~Luczak, and A.~Rucinski,

1200:    {\it Random Graphs} (Wiley, New York, 2000).

1201:

1202: \bibitem{fff}

1203:    M.~Faloutsos, P.~Faloutsos, and C.~Faloutsos,

1204:    Comp.\ Commun.\ Rev.\ {\bf 29}(4), 251 (1999).

1205:

1206: \bibitem{matta}

1207:    A.~Medina, I.~Matta, and J.~Byers, Comp.\ Commun.\ Rev.\

1208:    {\bf 30}(2), 18 (2000).

1209:

1210: \bibitem{as}

1211:    H.~Tangmunarunkit, J.~Doyle, R.~Govindan, S.~Jamin, S.~Shenker,

1212:    and W.~Willinger, Comp.\ Commun.\ Rev.\ {\bf 31}, 7 (2001).

1213:

1214: \bibitem{kum}

1215:    S.~R.~Kumar, P.~Raghavan, S.~Rajagopalan, and A.~Tomkins,

1216:    in: {\it Proc. 8th WWW Conf.} (1999);

1217:    S.~R.~Kumar, P.~Raghavan, S.~Rajagopalan, and A.~Tomkins,

1218:    in: {\it Proc. 25th VLDB Conf.} (1999);

1219:    J.~Kleinberg, R.~Kumar, P.~Raghavan, S.~Rajagopalan, and

1220:    A.~Tomkins, in: {\it Proceedings of the International Conference on

1221:      Combinatorics and Computing}, Lecture Notes in Computer Science,

1222:    Vol.~1627 (Springer-Verlag, Berlin, 1999).

1223:

1224: \bibitem{BA}

1225:    A.-L.~Barab\'asi and R.~Albert, Science {\bf 286}, 509 (1999);

1226:     R.~Albert, H.~Jeong, and A.-L.~Barab\'asi,  Nature {\bf 401}, 130

1227:    (1999).

1228:

1229: \bibitem{www1}

1230:    B.~A.~Huberman, P.~L.~T.~Pirolli, J.~E.~Pitkow, and R.~Lukose,

1231:    Science {\bf 280}, 95 (1998);

1232:    B.~A.~Huberman and L.~A.~Adamic, Nature {\bf 401}, 131 (1999).

1233:

1234: \bibitem{www2}

1235:    G.~Caldarelli, R.~Marchetti, and L.~Pietronero, Europhys.\ Lett.

1236:    {\bf 52}, 386 (2000).

1237:

1238: \bibitem{www3} A.~Broder, R.~Kumar,

1239:    F.~Maghoul, P.~Raghavan, S.~Rajagopalan, R.~Stata, A.~Tomkins, and

1240:    J.~Wiener, Computer Networks {\bf 33}, 309 (2000).

1241:

1242: \bibitem{lotka}

1243:    A.~J.~Lotka, J. Washington Acad. Sci.\ {\bf 16}, 317 (1926);

1244:    W.~Shockley, Proc.\ IRE {\bf 45}, 279 (1957);

1245:    E.~Garfield, Science {\bf 178}, 471 (1972).

1246:

1247: \bibitem{LS}

1248:    J. Laherr\`ere and D. Sornette, Eur.\ Phys.\ J. B {\bf 2}, 525 (1998).

1249:

1250: \bibitem{redner}

1251:    S.~Redner, Eur.\ Phys.\ J. B {\bf 4}, 131 (1998).

1252:

1253: \bibitem{agg}

1254:    M.~H.~Ernst, in: {\it Fractals in Physics}, edited by L.~Pietronero

1255:    and E.~Tosatti (Elsevier, Amsterdam, 1986), p.~289.

1256:

1257: \bibitem{coarse}

1258:    A.~J.~Bray, Adv.\ Phys.\ {\bf 43}, 357 (1994).

1259:

1260: \bibitem{surf}

1261:    A.~Pimpinelli and J.~Villain,

1262:    {\em Physics of Crystal Growth} (Cambridge University Press, Cambridge,

1263:    1998).

1264:

1265: \bibitem{simon}

1266:    The earliest growing network model was proposed to describe word

1267:    frequency:  H.~A.~Simon, Biometrika {\bf 42}, 425 (1955);

1268:    H.~A.~Simon, {\em Models of Man} (Wiley, New York, 1957).

1269:

1270: \bibitem{KRR}

1271:    P.~L.~Krapivsky, G.~J.~Rodgers, and S.~Redner, Phys.\ Rev.\

1272:    Lett.\ {\bf 86}, 5401 (2001).

1273:

1274: \bibitem{gen}

1275:    R.~Albert and A.-L.~Barab\'asi, Phys.\ Rev.\ Lett.\ {\bf 85}, 5234 (2000);

1276:    S.~N.~Dorogovtsev and J.~F.~F.~Mendes, Europhys.\ Lett.\ {\bf 52}, 33

1277:    (2000).

1278:

1279: \bibitem{clusters}

1280:    D.~S.~Callaway, J.~E.~Hopcroft, J.~M.~Kleinberg, M.~E.~J.~Newman,

1281:    and S.~H.~Strogatz, Phys.\ Rev.\ E {\bf 64},

1282:    041902 (2001).

1283:

1284: \bibitem{KRL}

1285:    P.~L.~Krapivsky, S.~Redner, and F.~Leyvraz, Phys.\ Rev.\ Lett.\

1286:    {\bf 85}, 4629 (2000).

1287:

1288: \bibitem{DMS}

1289:    S.~N.~Dorogovtsev, J.~F.~F.~Mendes, and A.~N.~Samukhin, Phys.\ Rev.\

1290:    Lett.\ {\bf 85}, 4633 (2000).

1291:

1292: \bibitem{KR}

1293:    P.~L.~Krapivsky and S.~Redner, Phys.\ Rev.\ E {\bf 63}, 066123

1294:    (2001).

1295:

1296: \bibitem{BiA}

1297:    G.~Bianconi and A.-L.~Barab\'asi, Europhys.\ Lett.\ {\bf 54}, 436 (2000).

1298:

1299: \bibitem{knuth}

1300:    R.~L.~Graham, D.~E.~Knuth, and O.~Patashnik,

1301:    {\em Concrete Mathematics: A Foundation for Computer Science}

1302:    (Addison-Wesley, Reading, MA, 1989).

1303:

1304: \bibitem{percolation}

1305:    See {\it e.g.}, D. Stauffer and A. Aharony,

1306:    {\it Introduction to Percolation Theory} (Taylor \& Francis, London, 1992).

1307:

1308: \bibitem{kk}

1309:    L.~Kullmann and J.~Kert\'esz, Phys.\ Rev.\ E {\bf 63},

1310:    051112 (2001);  D.~Lancaster, {\it cond-mat}/0110111.

1311:

1312: \bibitem{sam}

1313:    S.~N.~Dorogovtsev, J.~F.~F.~Mendes,  and A.~N.~Samukhin,

1314:    Phys.\ Rev.\ E {\bf 64}, 066110 (2001).

1315:

1316: \bibitem{kkkr}

1317:    J.~Kim, P.~L.~Krapivsky, B.~Kahng, and S.~Redner,

1318:    {\it cond-mat}/0203167.

1319:

1320: \bibitem{french}

1321:    M.~Bauer and D.~Bernard, {\it cond-mat}/0203232.

1322:

1323: \bibitem{reed}

1324:    M.~Molloy and B.~Reed, Random Struct.\ Alg.\ {\bf 6}, 161 (1995);

1325:    Combin.\ Probab.\ Comput.\ {\bf 7}, 295 (1998).

1326:

1327: \bibitem{chung}

1328:    W.~Aiello, F.~Chung, and L.~Lu,

1329:    in: {\it Proc. 32nd ACM Symposium on Theory of Computing} (2000).

1330:

1331: \bibitem{dani}

1332:    R.~Cohen, K.~Erez, D.~ben-Avraham, and S.~Havlin, Phys.\ Rev.\ Lett.\

1333:    {\bf 85}, 4626 (2000);

1334:    M.~E.~J.~Newman, S.~H.~Strogatz, and D.~J.~Watts,

1335:    Phys.\ Rev.\ E {\bf 64}, 026118 (2001).

1336:

1337: \end{thebibliography}

1338: \end{document}

1339:


1384: Let us fix one degree, e.g., the in-degree $i$ (we consider large $i$) and

1385: vary the out-degree $j$.  The average out-degree always scales linearly with

1386: the in-degree, $\langle j\rangle=iq$, implying that popular nodes on average

1387: have large out-degrees.  However, the maximum of the joint degree

1388: distribution scales linearly with the in-degree, $j=i(\lambda-1)/(2+\lambda

1389: q^{-1})$, only when $\lambda>1$. Thus popular nodes typically have small

1390: out-degrees when $\lambda\leq 1$.

1391:

1392: An interesting feature of this distribution becomes evident if we fix the

1393: in-degree $i$ and vary the out-degree $j$.  If $\lambda>1$, the degree

1394: distribution reaches a maximum when $j=i(\lambda-1)/(2+\lambda q^{-1})$ (here

1395: we consider large $i$).  The average out-degree always scales linearly with

1396: the in-degree, $\langle j\rangle=iq$.  Thus, popular nodes tend to have large

1397: out-degrees.  The dual property holds as well: Nodes with large out-degree

1398: tend to be popular.

1399:

1400: For completeness, the analytical forms of the degree distributions for

1401: $\lambda_{\rm in}=\lambda_{\rm out}=1$ are

1402: \begin{subeqnarray}

1403: \label{IOd}

1404: I_{i}&=&{2\over {\cal D}}\,{\Gamma(i+1)\,\Gamma(\nu)\over

1405: \Gamma(i+1+\nu)}=O_i,\\

1406: n_{d}&=&{2\over {\cal D}}\,{\Gamma(d+2)\,\Gamma(1+\nu)\over

1407: \Gamma(d+2+\nu)},\\

1408: n_{ij}&=&{2\over {\cal D}}\,{\Gamma(d+1)\,\Gamma(1+\nu)\over

1409: \Gamma(d+2+\nu)},

1410: \end{subeqnarray}

1411: where $n_{d}=\sum_{i+j=d} n_{ij}$ is the density of nodes with total degree

1412: $d$ and $\nu=2(1+{\cal D}^{-1})$.  Thus for the kernel $K(j,i)=(i+1)(j+1)$,

1413: all distributions are algebraic over the entire range of the corresponding

1414: degrees and have an especially neat form.

1415:

1416: