16: \documentclass{elsart}
17: \usepackage[T1]{fontenc}
18: \usepackage{amsmath,amssymb,mathrsfs,epsfig}
19: \newcommand{\mean}[1]{\langle #1\rangle}
20: % command for typesetting mean value of some quantity
21: \def\dd{\mathrm{d}}
22: % command for differential 'd' (\d mentioned in
23: % the file instrau.ps doesn't work for some reason)
24: % -------------------------------------------------
25: \begin{document}
26: \begin{frontmatter}
27: \journal{Physica A}
28: \title{Distance-dependent connectivity:
29: Yet Another Approach to the Small World Phenomenon}
30: \author{Mat\' u\v s Medo}
31: \address{Department of Theoretical Physics and Physics
32: Education,\\
33: Mlynsk\' a dolina, 842 48 Bratislava, Slovak Republic}
34: \ead{medo@fmph.uniba.sk}
35: \begin{abstract}
36: We investigate a relationship network of humans located
37: in a metric space where relationships are drawn according
38: to a distance-dependent probability density. The obtained
39: spatial graph allows us to calculate the average separation
40: of people in a very simple manner. The acquired results agree
41: with the well-known presence of the small-world phenomenon
42: in human relationships. They indicate that this feature can
43: be understood merely as a consequence of the probability
44: composition. We also examine how this phenomenon evolves
45: with the development of human society.
46: \end{abstract}
47: \begin{keyword}
48: Small-world phenomenon, random networks, spatial graphs, convolution.
49: \end{keyword}
50: \end{frontmatter}
51:
52: \section{Introduction}
In the 1960s, the American social psychologist Stanley Milgram
examined how people know each other and introduced a quantity
named the degree of separation $D$. It is the number of people
needed to link two chosen persons via a chain of acquaintances.
For example, if persons $A$ and $B$ do not know each other but
have a common friend $C$, their degree of separation is
$D(A,B)=1$. Milgram measured the mean degree of separation
between people in the USA and found a surprisingly small value,
$\mean{D}=6$. This gave the phenomenon another name, ``six
degrees of separation''.
63:
Nowadays, if a network has a small average distance between its
vertices together with a large average clustering coefficient,
we say that the small-world phenomenon (SWP) is present, and such
a network is called a small-world network (SWN). We often
encounter the SWP in random networks.
69:
Clearly, ``to be an acquaintance'' is a somewhat vague statement.
There are various possible definitions, e.g.\ shaking the
other's hand, talking to each other for at least one hour, etc.
Fortunately, the results do not depend significantly on the specific
choice: the SWP was observed in all of those cases. However, the
number six in the name of the phenomenon cannot be taken
literally. It merely stands for a number that is very small
compared to the size of the investigated population, which is
taken to be $6\,400$ million (the approximate number of people
on the Earth) in this article.
80:
The SWP is a well-known feature of various natural and
artificial random graphs \cite{Watts}. Citation networks, the
World Wide Web, neural networks and other examples exhibit this
feature \cite{Dorogo-Mendes,Albert-Bar}.
85:
There are many ways to construct an SWN. Some models are rather
mathematical and do not examine the mechanism by which a
network originates. Instead, they impose heuristic rules (e.g.\
\cite{Erd-Ren1,Erd-Ren2,Watts-Strog}). Other models look
for the reasons behind the introduced rules, which is much more
satisfactory from the physicist's point of view. The first such
model is known as ``preferential linking'' \cite{Bar-Albert}. It
is quite reasonable for cases like the growth of the WWW, where
sites with many incoming links are well known and will
presumably attract more links in the future than poorly linked pages.
96:
In this work we focus on the random network of human
relationships. It evolves in a very complicated manner, so
it is very hard to formulate well-accepted rules for its
growth. Hence, we do not study the time evolution of human
acquaintances. Instead, we examine a static case in which the
random network is already developed.
103:
If $A$ and $B$ are acquainted, we link them
with an edge. In this way we obtain the random graph of human
relationships. We can introduce a metric to this network by
assigning a fixed position in the plane to every person. In
order to obtain analytical results, we assume a constant
population density. In particular, we suppose that
people (vertices) are distributed regularly and form a square
lattice in the plane. With proper rescaling, the edges of the unit
squares in this lattice have length $1$. We further assume that
the probability that two people know each other depends only on
their distance, through some distribution function. This model
should retain some basic features of the real random network of
human relationships.
117:
118: \section{The Mathematical Model}
Consider an infinite square lattice whose squares have sides
equal to $1$, with one person at every vertex. We denote by
$Q(d)$ the probability that two people at distance $d$ know each
other. We assume homogeneity of the population, so this
probability function is the same for every pair.
124:
Summing $Q(d)$ over all vertices gives the average
number of acquaintances of any person, which we denote
$N_{\mathrm A}$. Next, we assume that the function $Q(d)$
varies slowly on the scale of $1$. Therefore, we can replace
the summation by an integration and obtain
130: \begin{equation}
131: \label{normalization}
132: N_{\mathrm A}=\int_{-\infty}^{\infty}\dd x
133: \int_{-\infty}^{\infty}\dd y\,Q\big(\sqrt{x^2+y^2}\big).
134: \end{equation}
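Because $Q$ depends only on the distance, the same normalization
can be rewritten in polar coordinates. This is only a restatement
of (\ref{normalization}), but it is often the more convenient form
when fixing the free parameter of a particular $Q(d)$:
\[
N_{\mathrm A}=2\pi\int\limits_0^{\infty}r\,Q(r)\,\dd r.
\]
For instance, for a Gaussian profile $Q(d)=\exp[-ad^2]$ this gives
$N_{\mathrm A}=\pi/a$.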
Our aim is to quantify the average degree of separation
$\mean{D}$ for pairs of people separated by the same geometrical
distance $b$. To achieve this, we choose two such people and label
them $A$ and $B$, with positions $\vec{r}_A=[0,0]$ and
$\vec{r}_B=[b,0]$ (this particular choice will not affect our
results substantially).
141:
142: \section{An Analytical Solution}
Every person in the lattice can be located by their coordinates
$[x,y]$. We denote the distance between $X$ and $Y$ by
$d_{XY}$. Let us introduce the symbol $\sim$ for the relation of
acquaintance. This is a binary relation which is symmetric but
not transitive. The probability that $X$ knows $Y$ is then
$P(X\sim Y)=Q(d_{XY})\equiv Q_{XY}$.
149:
We denote by $P(D)$ the probability that the degree of separation of
$A$ and $B$ at distance $b$ is equal to $D$. If we
want to find the average degree of separation in our
network, $\mean{D}$, we need to know the probabilities $P(D)$
for all values of $D$. At the moment, only $P(0)$ is
known, since clearly $P(0)=Q(d_{AB})=Q(b)$.
156: \[
157: \includegraphics[scale=1.2]{sw_figs.1}
158: \]
Let us examine the degree of separation $D=2$. This means that
there are two other persons on the path between $A$ and $B$.
We denote their positions by $\vec{r}_1=(x_1,y_1)$ and
$\vec{r}_2=(x_2,y_2)$. For such a path to be the shortest one, edges
$A1$, $12$ and $2B$ must be present while edges $A2$, $1B$ and
$AB$ must be missing (see the figure above). Since the presence of
each edge is independent of the others, we have
166: \begin{eqnarray}
167: \label{first}
168: P(2)&=&\sum_{1,2}Q_{A1}Q_{12}Q_{2B}\big(1-Q_{A2}\big)
169: \big(1-Q_{1B}\big)\big(1-Q_{AB}\big)\approx\notag\\
170: &\approx&\iint\limits_{1,2}Q_{A1}Q_{12}Q_{2B}\big(1-Q_{A2}\big)
\big(1-Q_{1B}\big)\big(1-Q_{AB}\big)\,\dd\vec{r}_1\dd\vec{r}_2,
\end{eqnarray}
where the summation runs over all placements of persons
$1$ and $2$. The summation can be replaced by an integration
because $Q(d)$ varies slowly on the scale of
$1$.
177:
Here we used the fact that in the addition rule for probabilities,
$P(A\cup B)=P(A)+P(B)-P(A\cap B)$, we can neglect the last term,
since the probabilities $P(A)$ and $P(B)$ are small and $P(A\cap B)$ is
of higher order of smallness. Unfortunately, because of this
approximation we reach ``probabilities'' $P(D)$ greater
than $1$ for large enough $D$. Nevertheless, values of
$P(D)$ that are small with respect to $1$ can be considered accurate.
This implies that the obtained results cannot be used to evaluate
the exact value of the average degree of separation of nodes
$A$ and $B$, because such a calculation would require the value of
$P(D)$ for every $D$. Still, from the growth of $P(D)$ we can
easily see for which $D^*$ it reaches relevant values, e.g.\
$P(D^*)=1/3$. This $D^*$ then characterizes the mean degree of
separation of $A$ and $B$.
192:
193: We can compute the first approximation to (\ref{first}), getting
194: \begin{equation}
195: \label{1approx}
196: P(2)^{(0)}=\iint\limits_{1,2}Q_{A1}Q_{12}Q_{2B}
197: \,\dd\vec{r}_1\,\dd\vec{r}_2.
198: \end{equation}
As $Q_{A1}=Q(x_1-0,y_1-0)$, $Q_{12}=Q(x_2-x_1,y_2-y_1)$, and
$Q_{2B}=Q(b-x_2,0-y_2)$, we notice that (\ref{1approx}) is a double
convolution of the function $Q$ evaluated at the point $(b,0)$.
Thus we can write
\[
P(2)^{(0)}=\big[Q\ast Q\ast Q\big](b,0)\implies
P(D)^{(0)}=Q^{[D]}(b,0),
\]
where $Q^{[D]}$ denotes the $D$-fold convolution of $Q$ with
itself, which contains $D+1$ factors of $Q$ (one for each edge of
the path). For the Fourier transformation of the convolution, the
following equation holds:
\[
\mathscr{F}\big\{Q^{[D]}\big\}=\big(\mathscr{F}\{Q\}\big)^{D+1}.
\]
Using this formula we can write $P(D)^{(0)}$ in the form
\begin{equation}
\label{outcome}
P(D)^{(0)}=\mathscr{F}^{-1}
\Big\{\big(\mathscr{F}[Q]\big)^{D+1}\Big\}(b,0).
\end{equation}
218:
The mean clustering coefficient $\mean{C}$ is the probability
that two acquaintances of $A$ know each other. It can be
evaluated in a way very similar to the calculation of
$P(D)$; the corresponding graph is shown in the figure below.
223: \[
224: \includegraphics[scale=1.2]{sw_figs.2}
225: \]
To write down an expression for $\mean{C}$ it is
straightforward to adapt (\ref{first}). The resulting integration
gives the number of connected triples $A12$ with node $A$ fixed.
We just have to avoid double counting of every path
(interchanging the positions of $1$ and $2$), which brings an
additional factor of $1/2$. The average number of acquaintances
of every vertex is $N_{\mathrm A}$, therefore the average number
of possible triples is
$N_{\mathrm A}(N_{\mathrm A}-1)/2\approx N_{\mathrm A}^2/2$.
The mean clustering coefficient is the ratio of the average
number of triples to the average number of possible triples.
That is,
238: \begin{eqnarray}
239: \label{cc}
240: \mean{C}&=&\frac1{N_{\mathrm A}^2}
241: \iint\limits_{1,2}Q_{A1}Q_{12}Q_{2A}
242: \,\dd\vec{r}_1\,\dd\vec{r}_2=
243: \frac1{N_{\mathrm A}^2}\big[Q\ast Q\ast Q\big](0,0)=\notag\\
244: &=&\frac1{N_{\mathrm A}^2}\mathscr{F}^{-1}
245: \Big\{\big(\mathscr{F}[Q]\big)^3\Big\}(0,0).
246: \end{eqnarray}
247:
Equations (\ref{outcome}) and (\ref{cc}) are solutions of the
problem. Unfortunately, the relevant functions $Q(d)$ (see the
next section) do not have forward and inverse Fourier
transforms in closed form. Therefore we have to
calculate the values of $\mean{C}$ and $P(D)$ numerically.
Equation (\ref{outcome}) requires very high numerical
precision. This makes the evaluation of $P(D)$ very slow, and
even with some clever treatment (see Appendix A) it is in
practice impossible for large values of $b$. This is precisely our
case, because we are interested in $b=50\,000$. Thus some other
(approximate) approach is needed. First we have to find out more
about the nature of the function $Q(d)$.
260:
\section{Empirical Inputs}
At present there are approximately $6\,400$ million people
on the Earth. This means that the side of the assumed square
lattice has length $2L=\sqrt{6.4\cdot10^9}=80\,000$. In order to
obtain numerical results we
choose $b=50\,000$ and the average number of acquaintances
$N_{\mathrm A}=1\,000$.
267:
To gain some insight into the distribution $Q(d)$, some analysis is
needed. First, it is clear that $Q(d)$ should decrease with
$d$. Moreover, people living close together know each other almost
certainly. That is,
272: \begin{equation}
273: \label{limit}
274: \lim_{d\to0} Q(d)=1.
275: \end{equation}
Together with (\ref{normalization}) we now have two requirements
for $Q(d)$. There are many functions satisfying them;
e.g.\ we can choose $Q(d)=C\exp[-d/a]$ with $C=1$.
279:
The last quantity we can compute is the average number of
{\em distant people\/} every person in the lattice knows,
$N_{\mathrm d}$. Here {\em distant\/} means that a person's
distance from the chosen fixed person (node) is greater than
$L/2$. This is a simple analogue of the number of people we know
on the other side of the Earth. So we have
286: \begin{equation}
287: N_{\mathrm d}=N_{\mathrm A}-2\pi\int\limits_0^{L/2}rQ(r)\,\dd r.
288: \end{equation}
If the exponential distribution discussed above satisfies
(\ref{normalization}) and (\ref{limit}), it follows that
$N_{\mathrm d}\approx10^{-13}$. This clearly contradicts
the fact that there are people who have very distant friends.
We can increase $N_{\mathrm d}$ by using a stretched exponential
$Q(d)=\exp[-K\,d^a]$ with exponent $a$ between $0.2$ and
$0.3$.\footnote{The approximate solution presented in the next section
can also be used for this distribution.}
However, if we check $Q(1)$ (the probability of knowing our closest neighbor),
it is well below $0.3$. Stretched exponentials therefore satisfy
condition (\ref{limit}) only formally and we will not
discuss them further. Moreover, the mean clustering coefficient is then
very small (from $2\cdot 10^{-4}$ to $3\cdot10^{-3}$).
302:
It is now clear that the distribution $Q(d)$ cannot decrease as fast as
exponential functions; wide tails are inevitable in our model.
This leads us to power-law distributions $1/d^a$. In accordance with
(\ref{limit}) we take
\begin{equation}
\label{powerlaw}
Q_a(d)=\frac1{1+\beta d^a},
\end{equation}
where the coefficient $\beta$ is fixed by (\ref{normalization}). The
number of distant acquaintances now ranges from
$N_{\mathrm d}\approx0.01$ ($a=3.5$)
to $N_{\mathrm d}\approx 17$ ($a=2.5$). This range of exponents
gives us a reasonable range of values of $N_{\mathrm d}$.
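
As a rough cross-check of these orders of magnitude, the
normalization can be carried out in closed form. The following is
only a sketch: it assumes the infinite-lattice integral, denotes
the coefficient in (\ref{powerlaw}) by $\beta$ (to keep it distinct
from the distance $b$), uses the standard integral
$\int_0^\infty u/(1+u^a)\,\dd u=(\pi/a)/\sin(2\pi/a)$ valid for
$a>2$, and keeps only the leading power-law tail in $N_{\mathrm d}$:
\begin{align*}
N_{\mathrm A}&=2\pi\int\limits_0^{\infty}\frac{r\,\dd r}{1+\beta r^a}
=2\pi\,\beta^{-2/a}\,\frac{\pi/a}{\sin(2\pi/a)},\\
N_{\mathrm d}&\approx2\pi\int\limits_{L/2}^{\infty}
\frac{r^{1-a}}{\beta}\,\dd r
=\frac{2\pi}{\beta\,(a-2)}\Big(\frac{L}{2}\Big)^{2-a}.
\end{align*}
For $a=3$, $N_{\mathrm A}=1\,000$ and $L/2=20\,000$ this gives
$1/\beta\approx1.5\cdot10^3$ and $N_{\mathrm d}\approx0.5$. The same
estimate yields roughly $0.01$ for $a=3.5$ and roughly $19$ for
$a=2.5$, which is of the same order as the values quoted above.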
315:
In this article we also show results for the normal distribution
$Q_{\mathrm n}(d)=\exp[-ad^2]$ ($N_{\mathrm d}\approx0$)
and the uniform distribution within a fixed radius,
$Q_{\mathrm u}(d)=\vartheta(R_{\mathrm A}-d)$ ($N_{\mathrm d}=0$).
320:
Since all the distributions $Q(d)$ used here
approach zero for large values of $d$, it is almost certain
that the shortest chain of acquaintances between the chosen $A$ and
$B$ does not leave the examined lattice with side $80\,000$.
Therefore it does not matter whether the integration (summation)
bounds are at infinity or at $\pm L=\pm 40\,000$. This allows us to
use all results derived for the infinite lattice in the real case
of a finite lattice.
329:
330: \section{An Approximate Solution for Power-law Distributions}
To demonstrate the calculation we again take $P(2)$ as an example.
In the previous section we found that power-law
distributions are especially important in our model. Their joint
probability $Q(r_1)Q(b-r_1)$ has a sharp maximum at $r_1=0$
and a low minimum at $r_1=b/2$. The ratio of the minimum to the
maximum is
336: \[
337: \frac{Q(b/2)^2}{Q(b)Q(0)}\approx\Big(\frac{4}{b}\Big)^a
338: \]
where $a$ is the exponent in (\ref{powerlaw}). This implies that
in (\ref{first}) we can restrict the summation to
$r_{A1},r_{A2}\ll b$ or $r_{B2},r_{B1}\ll b$ or
$r_{A1},r_{B2}\ll b$ (see the figure below).
\[
\includegraphics[scale=1.2]{sw_figs.3}
\]
We thus obtain three different diagrams. Let us examine the first
one in detail.
348:
Since edges $AB$ and $1B$ are long, the factors $1-Q_{AB}$ and
$1-Q_{1B}$ are close to $1$ and we can write
350: \[
351: P(2)\approx\iint\limits_{1{,}2}
352: Q_{A1}Q_{12}Q_{2B}\big(1-Q_{A2}\big)
353: \,\dd\vec{r}_1\,\dd\vec{r}_2.
354: \]
355: It is easy to show that for power-law distributions
356: $Q(b-r_1)\approx Q(b)\equiv Q_{AB}$ when $r_1\ll b$.
Thus we have (for the corresponding diagram see the figure below)
358: \begin{align*}
359: P(2)&\approx\iint\limits_{1{,}2}
360: Q_{A1}Q_{12}Q_{AB}\big(1-Q_{A2}\big)
361: \,\dd\vec{r}_1\,\dd\vec{r}_2=\\
362: &=Q(b)\iint\limits_{1{,}2}
363: Q_{A1}Q_{12}\,\dd\vec{r}_1\,\dd\vec{r}_2-
364: Q(b)\iint\limits_{1{,}2}
365: Q_{A1}Q_{12}Q_{A2}\,\dd\vec{r}_1\,\dd\vec{r}_2.
366: \end{align*}
367: \[
368: \includegraphics[scale=1.2]{sw_figs.4}
369: \]
Both integrals are easy to compute. The second one brings the average
clustering coefficient $\mean{C}$ into play. The result is
372: \[
373: P(2)\approx Q(b)N_{\mathrm A}^2\big(1-\mean{C}\big).
374: \]
The remaining two diagrams for $P(2)$ can be evaluated in the same
way.
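
As a sketch of how the three contributions combine (this is a
reading of the diagrams above, not an additional result): the
diagram with both intermediate persons close to $B$ is the mirror
image of the one just evaluated and gives the same value. In the
diagram with $1$ close to $A$ and $2$ close to $B$, the edges $A2$,
$1B$ and $AB$ are all long, so every factor $1-Q$ is close to $1$
and the only long edge present is $12$, with $Q_{12}\approx Q(b)$.
Summing the three diagrams,
\begin{align*}
P(2)&\approx Q(b)N_{\mathrm A}^2\big(1-\mean{C}\big)
+Q(b)N_{\mathrm A}^2\big(1-\mean{C}\big)
+Q(b)N_{\mathrm A}^2=\\
&=Q(b)N_{\mathrm A}^2\big(3-2\mean{C}\big),
\end{align*}
which agrees with the expression for $P(2)$ listed in
(\ref{final}), where $C_2=\mean{C}$.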
377:
In the computation of $P(D)$ for higher values of $D$ we
encounter products of the kind $(1-Q_{13})(1-Q_{24})\ldots$ even
after neglecting the probabilities $Q_{ij}$ for long edges $ij$.
Here we can make the first-order approximation
\[
(1-Q_{13})(1-Q_{24})\approx 1-Q_{13}-Q_{24},
\]
which is valid almost everywhere except in a small spatial region
that does not contribute substantially (see the section Results and
Discussion). Moreover, the second-order approximation, which keeps the
terms $Q_{13}Q_{24}$, would increase the evaluated probabilities.
Therefore the first-order results are lower-bound
estimates of $P(D)$.
391:
Higher values of $D$ introduce long closed loops of the kind
$A12\ldots nA$ ($n\leq D$). The corresponding integrals can be
carried out in the same way as in the
derivation of (\ref{cc}). Finally we obtain
\begin{equation}
\label{cn}
C_n\equiv\frac{1}{N_{\mathrm A}^n}\int\limits_{1,\ldots,n}
Q_{A1}Q_{12}\cdots Q_{nA}\,\dd\vec{r}_1\cdots\dd\vec{r}_n=
\frac{1}{N_{\mathrm A}^n}\mathscr{F}^{-1}
\Big\{\big(\mathscr{F}[Q]\big)^{n+1}\Big\}(0,0).
\end{equation}
This allows us to find the values of $C_n$ for any $n$. Clearly
$C_2=\mean{C}$. Using these closed-loop integrals
we can write
406: \begin{equation}
407: \label{final}
408: \begin{aligned}
409: P(0)&=Q(b),\\
P(1)&=2Q(b)N_{\mathrm A},\\
411: P(2)&=Q(b)N_{\mathrm A}^2\big(3-2C_2\big),\\
412: P(3)&=Q(b)N_{\mathrm A}^3\big(4-6C_2-2C_3\big),\\
413: P(4)&=Q(b)N_{\mathrm A}^4\big(5-12C_2-6C_3-2C_4\big),\\
414: P(5)&=Q(b)N_{\mathrm A}^5
415: \big(6-20C_2-12C_3-6C_4-2C_5\big),\,\ldots
416: \end{aligned}
417: \end{equation}
418:
419: \section{Results and Discussion}
In this section we summarize the results for various distributions
$Q(d)$, ranging from the uniform $Q_{\mathrm u}$ and normal
$Q_{\mathrm n}$ to the power-law distributions $Q_{3.5}$--$Q_{2.5}$
(see (\ref{powerlaw})) and the flat distribution $Q_{\mathrm{ER}}$.
This list is sorted according to the number of long shortcuts
in such networks of relationships.
426:
427: \subsection*{Flat Distribution}
For the flat distribution $Q_{\mathrm{ER}}$, every pair of
vertices is connected with the same probability $p$. It is shown
in \cite{Erd-Ren1} that in a network consisting of $N$ vertices
the mean shortest-path length $\mean{l}$ satisfies
\[
\mean{l}\approx\frac{\ln N}{\ln pN}.
\]
Here $pN$ is the average number of acquaintances of a person in
the network, $pN=N_{\mathrm A}$. We have $N_{\mathrm A}=1\,000$
and $N=6\,400$ million, thus
$D^*_{\mathrm{ER}}=\mean{l}-1\approx 2.3$ and $\mean{C}\approx0$.
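
Spelling out the arithmetic behind this estimate (natural
logarithms, rounded values):
\[
\mean{l}\approx\frac{\ln(6.4\cdot10^9)}{\ln 10^3}
\approx\frac{22.6}{6.91}\approx3.3,\qquad
\mean{l}-1\approx2.3.
\]
The subtraction of $1$ converts the number of edges on a path into
the number of intermediate persons, in line with the definition of
$D$ in the Introduction.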
439:
440: \subsection*{Uniform Distribution Within Fixed Radius}
We can also discuss the case where every person knows just the
$N_{\mathrm A}$ closest neighbors. This leads us to the distribution
$Q_{\mathrm u}(d)=\vartheta(R_{\mathrm A}-d)$, where the distance
$R_{\mathrm A}$ is fixed by (\ref{normalization}). It gives us
$R_{\mathrm A}=\sqrt{N_{\mathrm A}/\pi}$ and therefore
446: \[
447: D^*_{\mathrm u}\approx b\,\sqrt{\frac{\pi}{N_{\mathrm A}}}.
448: \]
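For orientation, with the values used in this article
($N_{\mathrm A}=1\,000$, $b=50\,000$) these expressions evaluate to
\[
R_{\mathrm A}=\sqrt{1\,000/\pi}\approx17.8,\qquad
b\,\sqrt{\pi/N_{\mathrm A}}=\frac{b}{R_{\mathrm A}}
\approx2.8\cdot10^3.
\]
The estimate simply counts how many steps of maximal length
$R_{\mathrm A}$ are needed to cover the distance $b$; at this size
the difference between the number of steps and the number of
intermediate persons is negligible.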
It is worth noting that there is no randomness in this
model, thus
$D^*_{\mathrm u}=\mean{D_{\mathrm u}}$.\footnote{Randomness
can be introduced by random placement of the vertices. We then
obtain
the so-called random geometric graphs discussed in \cite{Penrose}. This
approach is complementary to the one presented here, where the vertex
placement is fixed but the connections are drawn from a probability distribution.}
457:
458: \subsection*{Normal Distribution}
The only distribution which allows us to evaluate (\ref{outcome})
analytically is the normal distribution $Q_{\mathrm n}$. The
result is
462: \[
463: P(D)=\frac{N_{\mathrm A}^D}{D+1}\exp
464: \bigg[-\frac{\pi b^2}{N_{\mathrm A}(D+1)}\bigg].
465: \]
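A brief sketch of where this expression comes from, assuming the
two-dimensional transform convention
$\mathscr{F}[Q](\vec{k})=\int Q(\vec{r})\,
\mathrm{e}^{-\mathrm{i}\vec{k}\cdot\vec{r}}\,\dd\vec{r}$
and a path with $D$ intermediate persons, i.e. $D+1$ edges, each
contributing one factor of $Q$. For $Q_{\mathrm n}(d)=\exp[-ad^2]$
the normalization (\ref{normalization}) gives
$a=\pi/N_{\mathrm A}$, and
\begin{align*}
\mathscr{F}[Q_{\mathrm n}](\vec{k})&=\frac{\pi}{a}\,
\exp\Big[-\frac{k^2}{4a}\Big]=
N_{\mathrm A}\exp\Big[-\frac{N_{\mathrm A}k^2}{4\pi}\Big],\\
\big(\mathscr{F}[Q_{\mathrm n}](\vec{k})\big)^{D+1}&=
N_{\mathrm A}^{D+1}
\exp\Big[-\frac{(D+1)N_{\mathrm A}k^2}{4\pi}\Big],\\
\mathscr{F}^{-1}\Big\{\big(\mathscr{F}[Q_{\mathrm n}]\big)^{D+1}
\Big\}(b,0)&=N_{\mathrm A}^{D+1}\,
\frac{1}{(D+1)N_{\mathrm A}}\,
\exp\Big[-\frac{\pi b^2}{N_{\mathrm A}(D+1)}\Big].
\end{align*}
The last line reproduces the formula above.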
It was argued above that the solution of the equation
$P(D^*_{\mathrm n})=1/3$ characterizes the value of the mean degree
of separation $\mean{D}_{\mathrm n}$. For $N_{\mathrm A}=1\,000$
and $b=50\,000$ we can use some approximations which lead us to
470: \[
471: D^*_{\mathrm n}\approx n_{\mathrm n}\approx
472: b\,\sqrt{\frac{\pi}{N_{\mathrm A}\ln N_{\mathrm A}}}=
473: \frac{D^*_{\mathrm u}}{\sqrt{\ln N_{\mathrm A}}}.
474: \]
The actual value of $D^*_{\mathrm n}$ is about one third of
$D^*_{\mathrm u}$ (this is due to the existence of some
longer connections in the network, although they are strongly
suppressed by the exponential decay). Note that both
$D^*_{\mathrm u}$ and $D^*_{\mathrm n}$ scale linearly with $b$. This
clearly differs from the $\ln b$ scaling of the Erd\H{o}s--R\'enyi
model. The clustering coefficient $\mean{C}$ can be evaluated
easily for both the normal and the uniform distribution. We obtain high
values of $\mean{C}$ (see Fig.~\ref{fig:graf}) in both cases. This
agrees with our expectations.
485:
486: \subsection*{Power-law Distributions}
Numerical computation of the coefficients $C_2,\ldots,C_5$
from (\ref{cn}) is rather fast; their values are shown
in the table below.
490: \begin{center}
491: \begin{tabular}{|c|c|c|c|}
492: \hline
493: & $a=2.5$ & $a=3.0$ & $a=3.5$\\
494: \hline
495: $C_2$ & 0.068 & 0.154 & 0.233\\
496: $C_3$ & 0.030 & 0.090 & 0.153\\
497: $C_4$ & 0.016 & 0.059 & 0.109\\
498: $C_5$ & 0.009 & 0.042 & 0.084\\
499: \hline
500: \end{tabular}
501: \end{center}
Substituting these values into (\ref{final}) leads to the
values of the mean degree of separation which are marked in
Fig.~\ref{fig:graf}. We see that power-law distributions
$Q(d)$ result in high values of the mean clustering coefficient
$\mean{C}=C_2$ together with small values of $D^*$ (from $6$
to $4$). Thus the small-world phenomenon is clearly present in
these networks.
509: \begin{figure}[b]
510: \begin{center}
511: \includegraphics[scale=1.1]{sw_figs.5}
512: \qquad
513: \includegraphics[scale=1.1]{sw_figs.6}
514: \end{center}
515: \caption{Graphs of mean degree of separation and clustering
516: coefficients for various distribution functions.}
517: \label{fig:graf}
518: \end{figure}
519:
One can also ask for a comparison with the well-known
Barab\'asi--Albert model. The mean vertex degree is then
$\mean{k}=2m$ and the mean clustering coefficient is
$\mean{C}=(m-1)\ln^2N/8N$ (here $m$ is the degree of newly
added vertices, see \cite{FFH}). With our choice
$N_{\mathrm A}=1\,000$, $N=6.4\cdot 10^9$ it follows that
$m=500$ and $\mean{C}\approx 5\cdot10^{-6}$. This is certainly
an unrealistic value; our model gives a better estimate
of $\mean{C}$.
529:
For the presented values of the coefficients $C_i$, the expressions in
parentheses in (\ref{final}) do not fall close to zero for
a quite wide range of values of $D$. Therefore we can (very
approximately) write
534: \[
535: P(D)\approx Q(b)DN_{\mathrm A}^D.
536: \]
The solution of the equation $P(D^*)=1/3$ is approximately
$D^*\approx a\ln b/\ln N_{\mathrm A}$, which scales with $b$ as
$\ln b$. This is very different from the linear scaling of $\mean{D}$
for the uniform and normal distributions. Such a scaling is similar
to the scaling in the Erd\H{o}s--R\'enyi model, though the values of
the clustering coefficient are kept high, as we demanded in the
introduction.
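
The logarithmic scaling can be made explicit with a short sketch.
Writing $\beta$ for the coefficient in (\ref{powerlaw}), so that
$Q(b)\approx1/(\beta b^a)$ for large $b$, and taking the logarithm
of $Q(b)\,D^*N_{\mathrm A}^{D^*}=1/3$, we get
\begin{align*}
D^*\ln N_{\mathrm A}&=a\ln b+\ln\beta-\ln D^*-\ln 3,\\
D^*&\approx\frac{a\ln b}{\ln N_{\mathrm A}}
\quad\text{(keeping only the term that grows with $b$).}
\end{align*}
With $b=50\,000$ and $N_{\mathrm A}=1\,000$ we have
$\ln b/\ln N_{\mathrm A}\approx1.57$, so this leading term is about
$3.9$, $4.7$ and $5.5$ for $a=2.5$, $3.0$ and $3.5$. The dropped
terms $\ln\beta$, $\ln D^*$ and $\ln 3$ are not negligible at these
parameter values, so the estimate should be read as capturing the
scaling with $b$ rather than precise values of $D^*$.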
544:
The probability $P(2)$ can also be evaluated by straightforward
summation in accordance with (\ref{first}), although this takes a
huge amount of computer time. The obtained values agree very well
with the results presented above for all examined exponents except
$2.5$; this case requires more computer time than was
available. Computation of $P(3)$ in the same way exceeds our
computing capacity for every exponent, but we do not
regard it as necessary.
553:
554: \subsection*{Time Evolution and Some Limitations}
Human relationships in the modern world are much more widespread
than they were in the past. One can think of the
exponent of the power-law distribution function $Q(d)$ as slowly changing from large
values to smaller ones (perhaps resulting in an almost flat distribution
in the future, since the internet helps to bridge distances). According
to Fig.~\ref{fig:graf}, this would affect the exact
value of the clustering coefficient. However, it would remain high
enough for a wide range of exponents. Similarly, changes of the mean
degree of separation are not important at all: it remains very
small compared to the size of the human population.
565:
Finally, it has to be noted that the described model does not
consider the presence of organized hierarchical structures in
human society. For example, the head of a firm knows his employees, but
he also knows other heads who know their employees, etc.
The number of people involved in a hierarchical tree grows
exponentially with the number of its levels. Such an arrangement
therefore introduces an additional way for people to know each other with
a small resulting degree of separation. In the presented calculation we
did not include this effect. Yet there is one important insight:
since we have shown the degree of separation to be small without
considering hierarchies, their presence would decrease it even
further.
578:
579: \section{Conclusion}
580: We have examined the mean degree of separation and the
581: clustering coefficient for a random network of human
582: relationships in this article. We were able to compute these
583: quantities in our model. For a power-law decay of probability
584: $Q(d)$, we obtained a small mean degree of separation compared
585: to the size of the network, along with a large value of the mean
586: clustering coefficient. Both of these features are typical for
587: small world networks. Thus we have shown that the small world
588: phenomenon can be understood as a simple consequence of
589: additivity of probabilities.
590:
We saw that the style of calculation depends on the
distribution $Q(d)$ used. The computation was completed analytically
for some special cases. In other cases, thanks to some
approximations, we exploited the advantage of (\ref{cc}), where $b$
does not enter the inverse Fourier transformation, making it easy
to evaluate numerically.
597:
It is worth noting that the model solved herein is similar to the
Watts and Strogatz model \cite{Watts-Strog}, where long shortcuts
were introduced by a random rewiring procedure. In our model long
shortcuts are present thanks to the wide tails of power-law
distributions. This model brings two basic advantages. First,
the derivation and the resulting relations for $C$ and $D^*$ are
simpler. Second, our model has more realistic foundations.
Nevertheless, the typical behavior of this model is the same as
in previous models. The introduction of long shortcuts to the system
decreases the average degree of separation rapidly, but also keeps
the clustering coefficient high enough for the so-called small-world
phenomenon to appear.
609:
610: \appendix
611: \section{Numerics of the Fourier Transformation}
The Fourier integrals encountered in the solution of the presented
problem cannot be solved analytically, thus numerical techniques
have to be used. For the inverse Fourier transform this is
especially awkward because we meet the rapidly oscillating term
$\exp[\mathrm{i}\,bu]$. Here $b$ is the distance
between the chosen persons $A$ and $B$, by assumption a large number
($b=50\,000$). Therefore we have to compute the Fourier
transformation of $f(d)$ very accurately. In order to make the
computation less demanding on computer time, it is convenient
to find some approximation for computing the inverse
Fourier transformation. We will carry out this derivation in
the one-dimensional case for the sake of simplicity.
624:
The Fourier transformation of an even function $f(x)$ is an even
real function. According to (\ref{outcome}) we are looking
for the inverse Fourier transformation of an integer power of it,
which we denote $\hat{g}(u)$. It is also an even real function.
Therefore its inverse Fourier transformation is a real function
(sine-proportional terms vanish). Thus
631: \[
632: g(b)=\frac1{2\pi}\int\limits_{-\infty}^{\infty}\hat{g}(u)
633: \mathrm{e}^{\mathrm{i}bu}\,\dd u=\frac1{2\pi}
634: \int\limits_{-\infty}^{\infty}\hat{g}(u)\cos[bu]\,\dd u.
635: \]
This integral can be expressed as the sum of the
contributions from all periods of the $\cos[bu]$ function,
$I_n=\big[2\pi n/b,\,2\pi(n+1)/b\big]$ (here $n\in\mathbb{Z}$),
639: \[
640: g(b)=\sum_{n=-\infty}^{\infty} S_n(b),\quad
641: S_n(b)=\frac1{2\pi}\int\limits_{I_n}\hat{g}(u)\cos[bu]\,\dd u.
642: \]
In the integrand of the previous equation we can make a Taylor
expansion of $\hat{g}(u)$ around $\xi_n=2\pi(n+1/2)/b$.
Terms of the kind $u^m\cos[bu]$ then emerge ($m\in\mathbb{N}$).
Such integrals are easy to compute; the first two terms of the
resulting expansion are
648: \[
649: S_n(b)=\frac1{b^3}
650: \frac{\dd^2\hat{g}}{\dd u^2}\bigg\rvert_{\xi_n}+
651: \frac{\pi^2-6}{6b^5}
652: \frac{\dd^4\hat{g}}{\dd u^4}\bigg\rvert_{\xi_n}.
653: \]
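For completeness, a sketch of how the leading coefficient arises.
Substituting $u=\xi_n+t$ with $|t|\leq\pi/b$ gives
$\cos[bu]=-\cos[bt]$. The zeroth-order term of the Taylor expansion
integrates to zero over a full period and the odd-order terms
vanish by symmetry, so the first contribution comes from the second
derivative:
\begin{align*}
S_n(b)&\approx-\frac1{2\pi}\,\frac12\,
\frac{\dd^2\hat{g}}{\dd u^2}\bigg\rvert_{\xi_n}
\int\limits_{-\pi/b}^{\pi/b}t^2\cos[bt]\,\dd t=\\
&=-\frac1{4\pi}\,
\frac{\dd^2\hat{g}}{\dd u^2}\bigg\rvert_{\xi_n}
\Big(-\frac{4\pi}{b^3}\Big)=
\frac1{b^3}\frac{\dd^2\hat{g}}{\dd u^2}\bigg\rvert_{\xi_n}.
\end{align*}
Here $\int_{-\pi}^{\pi}s^2\cos s\,\dd s=-4\pi$ was used after the
substitution $s=bt$. The fourth-order coefficient follows in the
same way from $\int_{-\pi}^{\pi}s^4\cos s\,\dd s=48\pi-8\pi^3$.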
654: Finally we have
655: \begin{equation}
656: \label{priblizenie}
657: g(b)=\frac1{b^3}\sum_{n=-\infty}^{\infty}
658: \frac{\dd^2\hat{g}}{\dd u^2}\bigg\rvert_{\xi_n}+
659: \frac{\pi^2-6}{6b^5}\sum_{n=-\infty}^{\infty}
660: \frac{\dd^4\hat{g}}{\dd u^4}\bigg\rvert_{\xi_n}.
661: \end{equation}
This helps us to speed up the inverse Fourier transformation, since we
do not need to know so many values of $\hat{g}(u)$. For every
interval $I_n$, evaluating $\hat{g}(u)$ at three points (for the
numerical calculation of the second derivative in the leading term of
(\ref{priblizenie})) is enough. We just have to keep in mind that
these points have to be close enough (with respect to $2\pi/b$),
otherwise we can obtain evidently incorrect results (e.g.\
$g(b)=0$ when the outer points are a distance $2\pi/b$ apart).
670:
671: \begin{ack}
The author would like to thank the staff of his department,
especially Martin Moj\v zi\v s and Vladim\' ir \v Cern\' y,
for valuable conversations and Mari\' an Klein for computer
time. Thanks are also due to J\' an Bo\v da for an
introduction to the field, Mi\v ska Sonlajtnerov\' a for her
enthusiastic encouragement and the author's parents for their support.
678: \end{ack}
679:
680: \begin{thebibliography}{9}
681: \bibitem{Watts}
682: Watts, D. J.,
683: {\em Small worlds: The Dynamics of Networks between Order
684: and Randomness\/}, Princeton University Press (2003).
685:
686: \bibitem{Dorogo-Mendes}
687: Dorogovtsev, S. N. and Mendes, J. F. F.,
688: Evolution of networks,
689: {\em Adv. Phys.\/} {\bf 51} (2002), 1079.
690:
691: \bibitem{Albert-Bar}
Albert, R. and Barab\'asi, A.-L.,
Statistical mechanics of complex networks,
{\em Rev. Mod. Phys.\/} {\bf 74} (2002), 47.
695:
696: \bibitem{Erd-Ren1}
Erd\H{o}s, P. and R\'enyi, A.,
On random graphs,
{\em Publicationes Mathematicae\/} {\bf 6} (1959), 290.
700:
701: \bibitem{Erd-Ren2}
Erd\H{o}s, P. and R\'enyi, A.,
On the evolution of random graphs,
{\em Publ. Math. Inst. Hung. Acad. Sci.\/} {\bf 5} (1960), 17.
705:
706: \bibitem{Watts-Strog}
707: Watts, D. J. and Strogatz, S.,
708: Collective dynamics of small-world networks,
709: {\em Nature\/} {\bf 393} (1998), 440.
710:
711: \bibitem{Bar-Albert}
712: Barab\'asi, A. L. and Albert, R.,
713: Emergence of scaling in random networks,
714: {\em Science\/} {\bf 286} (1999), 509.
715:
716: \bibitem{Penrose}
717: Penrose, M.,
718: {\em Random Geometric Graphs\/},
719: Oxford University Press (2003).
720:
721: \bibitem{FFH}
Fronczak, A., Fronczak, P., and Ho{\l}yst, J. A.,
Mean-field theory for clustering coefficients in Barab\'asi--Albert networks,
724: {\em Phys. Rev. E\/} {\bf 68} (2003), 046126.
725: \end{thebibliography}
726: \end{document}