cond-mat0210514/pok.tex
1: \documentclass[twocolumn,rmp,aps]{revtex4}
2: 
3: \usepackage{dcolumn,graphicx,amsmath,amssymb,pxfonts}
4: 
5: \begin{document}
6: 
7: \title{Structure and Time-Evolution of an Internet Dating Community}
8: 
9: \author{Petter \surname{Holme}}
10: \email{holme@tp.umu.se}
11: \affiliation{Department of Physics, Ume{\aa} University, 
12:   901~87 Ume{\aa}, Sweden}
13: 
14: \author{Christofer R.\ \surname{Edling}}
15: \affiliation{Department of Sociology, Stockholm University, 106~91
16:   Stockholm, Sweden}
17: 
18: \author{Fredrik \surname{Liljeros}}
19: \affiliation{Department of Sociology, Stockholm University, 106~91
20:   Stockholm, Sweden}
21: \affiliation{Department of Medical
22:     Epidemiology and Biostatistics, Karolinska Institutet S-171 77
23:     Solna, Sweden}
24: 
25: \begin{abstract}
26: We present statistics for the structure and time-evolution of a
27: network constructed from user activity in an Internet community. The
28: vastness and precise time resolution of an Internet community offer
29: unique possibilities to monitor social network formation and
30: dynamics. Time evolution of well-known quantities, such as clustering,
31: mixing (degree-degree correlations), average geodesic length, degree,
32: and reciprocity is studied. In contrast to earlier analyses of
33: scientific collaboration networks, mixing by degree between vertices
34: is found to be disassortative. Furthermore, both the evolutionary
35: trajectories of the average geodesic length and of the clustering
36: coefficients are found to have minima.
37: \end{abstract}
38: 
39: \maketitle
40: 
41: \footnotesize
42: 
43: We thank Christian Wollter and Michael Lokner at pussokram.com, Stefan
44: Praszalowicz at nioki.com, and Niklas Angemyr and Reginald Smith for
45: granting and helping us get access to the data. We thank Mark Newman
46: for comments on assortative mixing, and the editor and anonymous
47: reviewer for helpful comments. PH is partially supported by the
48: Swedish Research Council through contract no.\ 2002-4135. CRE is supported
49: by the Bank of Sweden Tercentenary Foundation. FL is supported by the
50: National Institute of Public Health.
51: 
52: \normalsize
53: 
54: \section{Introduction}
55: With the growing interest in social network analysis from the physics
56: community, a new research area is emerging in the intersection between
57: statistical physics and sociology (Albert and Barab\'{a}si 2002;
58: Dorogovtsev and Mendes 2002; Newman 2003). Sociologists have been
59: interested in network analysis for at least half a century, and with
60: mathematicians and statisticians they have developed a set of tools to
61: analyze positions, structures, and processes of social networks
62: (Wasserman and Faust 1994; Butts 2001). Although there are exceptions
63: (Fararo and Sunshine 1964; Skvoretz 1990), most sociological and
64: anthropological studies of networks have focused on small-group
65: interaction or cognitive networks. In one respect this is quite
66: natural as most groups and formal organizations are of small
67: size. Also, a pragmatic reason for this is that data collection of
68: large social networks, behavioral or cognitive, is cumbersome and
69: often practically impossible to carry through. Therefore, although
70: recent analyses (Watts and Strogatz 1998; Watts 1999; Newman 2001)
71: have brought new attention to comparative analysis of large-scale
72: social networks, the statistical physics method, emphasizing the limit
73: of large system sizes (Albert and Barab\'{a}si 2002), has been of
74: limited utility. However, the extended use of database technology
75: provides new possibilities for constructing real-world networks for the
76: analysis of e.g.\ movie-actor networks (Watts and Strogatz 1998) and
77: co-authorship in science (Newman 2001). Surely, these networks reflect
78: social interaction, but they are also heavily constrained by the logic
79: of a particular industry or a particular professional activity. Thus,
80: to allow for exploration of the possible universal properties of
81: social networks in general, there is still an urgent need to analyze
82: other types of large empirical social networks. In this paper we
83: report on an investigation of a large social network, aiming to give a
84: phenomenological description that will hopefully shed some new light
85: on the processes forming the structure of social networks. To put
86: results in context, we try to compare our findings to other studies
87: whenever possible, and to contrast parameters to what would be
88: expected from a random network with similar characteristics.
89: 
90: To construct network data and large graphs based on more spontaneous
91: patterns of human interaction than e.g.\ co-authorship and
92: co-actorship, one can consider data from e-mail exchange (Ebel,
93: Mielsch et al.\ 2002) or user activity in Internet communities
94: (Rothaermel and Sugiyama 2001; Smith 2002). The present work belongs
95: to the latter category, with a strong focus on the dynamics of the
96: network. In contrast to previous studies of Internet communities
97: (Smith 2002), we use down-to-the-second timing of the communication to
98: investigate time evolution and obtain steady state estimates of
99: well-known measures of graph structure. We use data from a Swedish
100: Internet community called pussokram.com (roughly ``kiss'n'hug'' in
101: English) that is primarily targeted at adolescents and young
102: adults. The community provides an arena for flirting, dating, and
103: other romantic communication; as well as communication for
104: non-romantic friendship.
105: 
106: Studies suggest that online interaction is driven by the same needs as
107: face-to-face interaction, and should not be regarded as a separate
108: arena but as an integrated part of modern social life (Wellman and
109: Haythornthwaite 2002). Thus communicative actions taken by members of
110: the community can be expected to share many features with the web of
111: human acquaintances and romances in the social off-line world. Indeed,
112: for many people in contemporary Western societies, interaction on the
113: Internet is as real as any other interaction (Wellman 2001). Internet
114: communities are interesting by and for themselves, but this suggests
115: that the formation and dynamics of social networks in an Internet
116: community can share the same generic properties as all social
117: acquaintance networks, and that the study of Internet communities can
118: provide important information for enhancing our understanding of
119: social networks in general.
120: 
121: The paper is divided into four sections. In the next section we give a
122: detailed description of the functions of the Internet community in
123: focus. The third section contains statistical analyses and
124: presentation of results that we summarize and discuss in the fourth
125: and concluding section.
126: 
127: \section{The Internet community pussokram.com\label{sec:pok}}
128: 
129: Pussokram.com is a Swedish Internet community primarily intended for
130: romantic communication and targeted at adolescents and young
131: adults. The community had around 30$\,$000 active users during the spring
132: and summer 2002, the mean user age is 21 years, and approximately 70
133: percent of the users are women (therefore, and to simplify, we will
134: use the female gender when referring to users in this paper). Both age
135: and sex are self reported. It is possible to have multiple accounts on
136: the community. A crude check on the number of accounts linked to every
137: unique e-mail address indicates that this is not very common (more
138: than 99.7\% of the membership accounts are associated with a unique
139: e-mail address and no e-mail address is associated with more than 5
140: accounts).\footnote{Of course it is possible to use a unique e-mail
141:   address for every account, but since this information is not
142:   revealed it is hard to see why one would go through the extra
143:   effort of doing so.} Our data consists of all the user activities on
144: pussokram.com logged for 512 days from 13:39:25 on February 13, 2001
145: ($t = 0$) to 13:28:19 on July 10, 2002. The smallest time-unit on the
146: log is 1 second. We analyze the activity of all users registered at
147: time $t = 0$, as well as the activity of any new users during this time
148: span.\footnote{Personal integrity is of course an issue here. For the
149:   analysis, we study the anonymized data to prevent any intrusion of
150:   privacy, and we do not have access to specific message
151:   contents. Like everyone else, we can read the guest books, but still
152:   we cannot link a user (and her guest book) to the vertices of the
153:   network. Thus, we cannot identify any specific individual person in
154:   the data. We do not even have data that can be cross-examined with
155:   other databases (like computer IP-addresses) to detect users'
156:   identities.} Time $t = 0$ defines the start-up day for this particular
157: community. However, prior to $t = 0$ there was a mail server for
158: sending anonymous love messages on the Internet. Registered users of
159: this service had their accounts automatically transferred to
160: pussokram.com. We only study activity on the community; nevertheless,
161: this recruitment might induce higher initial growth of active users.
162: 
163: \begin{figure*}
164:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{pok.eps}}}
165:   \caption{Screenshot of a typical user homepage at
166:     pussokram.com. ``User A'', ``User B'', etc.\ symbolize user names. (The
167:     translation is due to the authors. Italics denote a description
168:     rather than a translation.)
169: }
170:   \label{fig:pok}
171: \end{figure*}
172: 
173: Pussokram.com has a pronounced romantic profile, where:
174: \begin{itemize}
175: \item Users are encouraged to send messages to others that they are
176:   secretly in love with.
177: \item The provider answers questions related to love and sex posed by
178:   the users under the pseudonym Dr.\ Love.
179: \item The design of the HTML-pages makes use of a romantic iconography
180:   well known to the targeted users (with Valentine's hearts, deep red
181:   colors, etc., see Fig.~\ref{fig:pok}). Nevertheless, a quick glance
182:   through some of the public guest books reveals that many of the
183:   contacts taken are also non-romantic.
184: \end{itemize}
185: 
186: \subsection{Types of contacts in pussokram.com}
187: 
188: There are four major modes of communication at pussokram.com. We study
189: each of the networks generated by these four types of contacts
190: separately and we also study the union of these networks generated by
191: any of these contacts. A brief description of the four types of
192: contacts follows:
193: \begin{itemize}
194: \item The \textbf{Messages} are in effect intra-community e-mails. These
195: are private in the sense that no one in the community, except the
196: sender and receiver, can access them. Not even information on how many
197: messages other users have received is retrievable by other users.
198: \item In \textbf{Guest book} signing, each user has a guest book that
199:   every community member is free to write in.
200: \item \textbf{Flirt} or ``friendship request:'' User A can ask user B to
201:   be her friend. If user B accepts user A's request then they can both
202:   easily see if the other is online whenever they are logged onto
203:   pussokram.com. Information on the friends of a specific user is
204:   private to the user only.
205: \item \textbf{Friendship}: A friendship relation is established after
206:   acceptance of a friendship request, as described above. The
207:   friendship network is thus bi-directional. A friendship can be
208:   canceled by any of the friends.
209: \end{itemize}
210: 
211: \subsection{Ways to receive attention and search users}
212: 
213: Unless engaged in peer-to-peer contact of some sort, users at
214: pussokram.com are relatively anonymous towards each other. There is
215: reason to believe that knowledge about the prior interactive behavior
216: of other individuals structures the present interactive behavior of a
217: given individual (the so-called imitation factor). The only
218: information about a user's interaction history available to other
219: users is her guest book. But there are several ways for a user to draw attention to
220: herself (i.e.\ to direct other users to her community homepage), and
221: for users to find information about others. Here we summarize various
222: ways that can be used to receive attention, search for other users,
223: and promote oneself at pussokram.com. The following information is
224: displayed when a logged-on user browses the pussokram.com website:
225: \begin{itemize}
226: \item The username of the most recently registered community member.
227: \item The name of the most recently edited diary (each user has space
228:   open for others to read, intended as a diary).
229: \item The names of the most recent users to browse a specific user's
230:   homepage.
231: \item The names of similar users are displayed on a specific user's
232:   homepage. Similarity is assessed through self-reported background
233:   variables.
234: \item A long interview with the ``user of the week'' (although updated
235:   less often than weekly). This is an epithet that users can apply for.
236: \item Photographs of 10--20 users are displayed at the login page.
237: \end{itemize}
238: 
239: A user can search out other users with a search engine (the
240: ``s\"{o}kofinder''---in English ``search'n'finder''---in
241: Fig.~\ref{fig:pok}) that handles the following
242: criteria: Sub-string of the username, gender, age, place of residence,
243: online status, and if a user has provided a photograph of
244: herself. Presumably, these are the characteristics that drive user
245: activity, but because it is hard to assess their validity, and because
246: we are only interested in structural properties, we do not conduct any
247: analysis on them.
248: 
249: \subsection{Comparisons with other empirical and statistical networks}
250: 
251: For comparison we also use networks formed by instant messaging at the French
252: Internet community nioki.com and scientific collaboration (or, rather,
253: co-authorship) networks. nioki.com and pussokram.com are rather
254: similar, both in terms of content and design, but compared to
255: pussokram.com, nioki.com is even more youth oriented and not as
256: focused on romantic relations as pussokram.com. Besides the
257: possibility of searching for user names, nioki.com has two search
258: procedures \textit{recherche l'amiti\'{e}} (search for friendship) and
259: \textit{recherche l'amour} (search for love), where one can fill out
260: questionnaires to find other users that match one's preferences. In the
261: nioki.com network, an arc connects user A to user B if user B is in
262: user A's list of contacts (for details see Smith 2002). In the
263: scientific collaboration networks (Newman 2001) the vertices are
264: scientists who have uploaded manuscripts to the Los Alamos preprint
265: repository arXiv.org; arcs are added between scientists who have
266: co-authored a paper. In contrast to the pussokram.com and nioki.com
267: networks, ties in the scientific collaboration network are
268: bi-directional. Note that the pussokram.com networks are dynamic,
269: while we only have access to snapshot data of nioki.com and scientific
270: collaboration networks. For this reason we can only make comparisons
271: between the static properties of these networks.
272: 
273: In addition, following (Anderson, Butts et al.\ 1999; Pattison,
274: Wasserman et al.\ 2000; Shen-Orr, Milo et al.\ 2002), we compare some
275: observed quantities to the corresponding average values from
276: randomized networks with the same degree-sequence as the original. By
277: this approach, we examine how aspects of the structure other than the
278: degree sequence influence the quantities. Every known real social
279: network deviates from the average randomized network to a greater or
280: lesser extent, depending on the social forces structuring the
281: interaction. For example, with regard to the present case, we believe
282: that an Internet community network will be closer to the average
283: randomized network than several other types of social networks,
284: because time and space constraints are much less pressing than in,
285: e.g., a kinship network. These randomized networks are generated by
286: sequentially going through all directed arcs A-B, for every such
287: arc randomly selecting another arc, C-D, and then rewiring so that A-D
288: forms one arc, and C-B forms another. The choice of C-D is done with
289: uniform randomness among all arcs that would not introduce a loop or a
290: multiple arc. We use this algorithm to generate $\sim 3000$ networks and the
291: quantities are averaged over these networks. This procedure is inspired
292: by Roberts (2000). However, it differs from Roberts in the sense that
293: we use sweeps over all arcs (where each arc is rewired at least once)
294: as the unit of iterations of the algorithm.\footnote{To be precise, our
295:   algorithm runs as follows: We go sequentially through the arc
296:   set $A$ (see Sect.~\ref{sec:stat}). For every arc $(v,w)$ we
297:   construct a set $A'$ of arcs such that if a member $(v',w')$ of $A'$
298:   is rewired with $(v,w)$ (i.e.\ so that $(v,w)$ and $(v',w')$
299:   are replaced by $(v,w')$ and $(v',w)$) then no loops or multiple
300:   arcs are formed. Then we choose one of the arcs in $A'$ with uniform
301:   randomness and rewire that arc with $(v,w)$.}
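
As an illustration of the rewiring step, the set of admissible partner arcs for a given arc $(v,w)$ can be written out explicitly. The set-builder form below is only a sketch of the conditions stated in the footnote, in notation introduced here for illustration; it is not taken from Roberts (2000):
\begin{equation*}
A'(v,w)=\bigl\{(v',w')\in A \;:\; v\neq w',\ v'\neq w,\ (v,w')\notin A,\ (v',w)\notin A\bigr\}.
\end{equation*}
The first two conditions exclude loops and the last two exclude multiple arcs. Since the swap replaces $(v,w)$ and $(v',w')$ by $(v,w')$ and $(v',w)$, every vertex keeps both its in-degree and its out-degree, which is why the degree sequence of the randomized network equals that of the original.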
302: 
303: \section{Statistical analysis\label{sec:stat}}
304: 
305: The pussokram.com network consists of all registered users and the
306: communication flow between these users as described
307: above. Communication is conceived of as directed links between
308: users. This is translated into a graph of vertices (users) and arcs
309: (ties). Vertices are added to the network the first time a registered
310: user is active, i.e.\ the first time the user sends or receives a
311: message, signs a guest book, or sends or accepts a friendship request
312: as described above. Each of these interactions defines a unique
313: network, and by adding an arc for any activity one gets a total
314: network of online activities. We thus study five networks, and for
315: each of them the vertex set is empty at $t = 0$. We represent the
316: network as a directed graph, $G = (V, A)$, where $V$ is the vertex set
317: and $A$ is the set of arcs, or ordered pairs of vertices. $N = |V|$
318: denotes the order (number of vertices) of $G$, and $M = |A|$
319: represents the number of arcs. Sometimes we study properties of the
320: undirected graph obtained by taking the symmetric closure of
321: $G$.\footnote{I.e.\ the graph obtained if, for every $(u,v)\in A$ with
322:   $(v,u)\notin A$, the arc $(v,u)$ is added to $A$.}
323: 
324: \begin{figure}
325:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{len.eps}}}
326:   \caption{The number of vertices (a) and the average
327:     degree (b) as functions of time.}
328:   \label{fig:len}
329: \end{figure}
330: 
331: \subsection{Decreasing growth rate of network size and convergence of
332:   average degree}
333: 
334: For each network, the number of vertices, $N$, as a
335: function of time during the sampling is displayed in
336: Fig.~\ref{fig:len}(a), and the average degree, i.e.\ the average
337: number of arcs per vertex, $M / N$, is displayed in
338: Fig.~\ref{fig:len}(b). As can be seen, both the number of vertices and
339: the average degree are increasing as a function of time, but at a
340: decreasing growth rate. The average degree appears to converge to a
341: constant, but for $t < 100$, it increases as a power function. The
342: more rapid growth rate in the beginning of the period is explained by
343: the fact that old users log on for the first time during our sampling
344: period (see discussion in Section~\ref{sec:pok}). The decreasing
345: growth, and apparent approach to equilibrium, stand in contrast to the
346: accelerated growth of the Internet and the World Wide Web (Dorogovtsev
347: and Mendes 2002), as well as the linear growth of scientific
348: co-authorship networks extracted from article databases (Newman 2001;
349: Newman 2001; Barab\'{a}si, Jeong et al.\ 2002). However, in social
350: networks, the average degree cannot increase without bound, and
351: this goes for scientific collaboration networks too. We believe the
352: difference stems from a wider effective sampling time frame---due to
353: the much more rapid dynamics of an Internet community (compared to
354: scientific collaborations) we are, relatively speaking, able to follow
355: the process for a much longer period. In the sense that $G$ is a
356: steadily growing dynamic network, we deal with a non-equilibrium
357: representation of the social situation. When we speak of the network
358: ``reaching equilibrium,'' we refer to when all quantities that are
359: bounded as a function of $N$ (such as the average degree) are reaching
360: their constant limits.
361: 
362: \begin{figure}
363:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{ass.eps}}}
364:   \caption{Reciprocity $R$ (a) and assortative mixing coefficient
365:     $r_\mathrm{dir}$ (b) as functions of time.}
366:   \label{fig:ass}
367: \end{figure}
368: 
369: \subsection{Reciprocity varies between networks}
370: 
371: Various types of social relations differ in direction, intensity, and
372: frequency (Granovetter 1973). Messages between agents with different
373: social status for example, tend to be unevenly distributed (Gould
374: 2002). In the present analysis, we can investigate the reciprocity of
375: communicative action by looking at the direction of the communication
376: flow between any two users. For example, if user A sends a friendship
377: request to user B, we observe a link between user A and user B, and
378: note an arc between the two vertices. But it makes quite a difference
379: whether user B accepts the invitation or not, i.e.\ whether we note one
380: or two arcs between the vertices. We define reciprocity $R$ as the
381: fraction of mutual dyads, i.e.\ the ratio between the number of
382: vertex-pairs $\{v,w\}$ that occur in two arcs ($(v,w)$ and $(w,v)$) and
383: the number of vertex-pairs that occur in at least one arc. In terms of arc counts:
384: \begin{equation}
385: R=\frac{2M}{M_2}-1~,\label{eq:rec}
386: \end{equation}
387: where $M_2$ is the number of arcs in the symmetric closure of $G$ (the graph in which every arc is accompanied by its reverse). $R$
388: lies in the interval $[0,1]$: $R = 0$ means that no arc $(u,v)$ has a
389: reciprocal arc $(v,u)$, and $R = 1$ means that every arc $(u,v)$ has a
390: reciprocal arc $(v,u)$.
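
To see why the arc-count expression for $R$ agrees with the verbal definition as the fraction of mutual dyads, the following short derivation may help. The symbols $m_\leftrightarrow$ (number of vertex-pairs joined by arcs in both directions) and $m_\rightarrow$ (number of vertex-pairs joined by a single arc) are introduced here for illustration only:
\begin{align*}
M &= 2m_\leftrightarrow + m_\rightarrow, &
M_2 &= 2\,(m_\leftrightarrow + m_\rightarrow),\\
\frac{2M}{M_2}-1 &= \frac{2m_\leftrightarrow + m_\rightarrow}{m_\leftrightarrow + m_\rightarrow}-1
 = \frac{m_\leftrightarrow}{m_\leftrightarrow + m_\rightarrow}. &&
\end{align*}
As a small check, take the hypothetical arc set $A=\{(a,b),(b,a),(a,c)\}$. Then $M=3$ and $M_2=4$, so $R = 6/4-1 = 1/2$, which matches one mutual dyad out of two connected vertex-pairs.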
391: 
392: \begin{table*}
393:   \caption{Assortative mixing coefficients, $r$, for five
394:     pussokram.com networks, and for nioki.com and arXiv.org
395:     networks. Statistics for corresponding randomized networks are
396:     within square brackets. Differences between the various mixing
397:     coefficients are discussed in the text. Double hyphens indicate
398:     missing data. Note: * $p\leq0.01$; nioki.com and arXiv.org data are
399:     not tested for significance.}
400: \label{tab:ass}
401: \begin{ruledtabular}
402: \begin{tabular}{l|ccccccc}
403: \hline
404: network & $N$ & $r$ & $r_\mathrm{dir}$ & $r_\mathrm{in\: in}$ &
405: $r_\mathrm{in\: out}$ & $r_\mathrm{out\: in}$ & $r_\mathrm{out\:
406:   out}$\\
407: all contacts & 29{$\,$}341& {--}0.048* & {--}0.059* &  {--}0.063*&{--}0.046* & {--}0.071*& {--}0.050* \\
408:  & & [{--}0.043]& [{--}0.041]& [{--}0.028]&[{--}0.021] & [{--}0.049]& [{--}0.035]\\
409: messages & 21{$\,$}545 & {--}0.055* & {--}0.083* & {--}0.054* & {--}0.056* & {--}0.076* & {--}0.087* \\
410:  & & [{--}0.053] & [{--}0.061] & [{--}0.013] & [{--}0.011] & [{--}0.058] & [{--}0.057] \\
411: guest book & 20{$\,$}691 & {--}0.073* & {--}0.085* & {--}0.097* & {--}0.043* & {--}0.088* & {--}0.053* \\
412:  & & [{--}0.049] & [{--}0.038] & [{--}0.024] & [{--}0.015] & [{--}0.042] & [{--}0.026] \\
413: friends
414:  & 14{$\,$}278 & {--}0.042* & - - & - - & - - & - - & - - \\
415:  & & [{--}0.031] & - - & - - & - - & - - & - - \\
416: flirts & 8{$\,$}186 & {--}0.12* & {--}0.12* & {--}0.006 & {--}0.022 & {--}0.12* & {--}0.042* \\
417:  & & [{--}0.12] & [{--}0.10] & [0.016] & [{--}0.002] & [{--}0.10] & [{--}0.013] \\
418: nioki.com & 50{$\,$}259 & {--}0.13 & {--}0.10 & {--}0.088 & {--}0.084 & {--}0.10 & {--}0.095 \\
419:  & & [{--}0.034] & [{--}0.014] & [{--}0.018] & [{--}0.014] & [{--}0.020] & [{--}0.016] \\
420: arXiv.org & 52{$\,$}909 & 0.36 & - - & - - & - - & - - & - - \\
421:  & & [{--}0.034] & - - & - - & - - & - - & - - \\
422: \end{tabular}
423: \end{ruledtabular}
424: \end{table*}
425: 
426: The time evolution of the reciprocity can be seen in Fig.~\ref{fig:ass}a. As is
427: evident from the figure, reciprocity levels differ little between the
428: different networks. By definition, the friendship network has
429: reciprocity of 1. And by the same token, the flirt network has a
430: reciprocity equal to zero. For the other networks, the curves
431: converge to values around 0.4 for the guest book and messages
432: networks, and 0.5 for the all contacts network (see Table~\ref{tab:ass}).
433: It is hard to judge whether these are high or low values of
434: reciprocity. They are however compatible with data for the French
435: Internet community nioki.com. We normally assume acquaintance networks
436: to have a high degree of reciprocity, but one reason to expect a lower
437: value for online interaction is that an actor feels less social
438: pressure to respond to a communicative act over the Internet than in a
439: face-to-face or telephone encounter, for example.
440: 
441: \subsection{Disassortative mixing coefficients of the pussokram.com networks}
442: 
443: Together with the degree distribution, the degree-degree correlation
444: is considered to govern much of the network's robustness towards
445: disturbances as well as the information flow. In other contexts the
446: discussion is usually phrased in terms of resilience against epidemics
447: and attack. A positive degree-degree correlation is also referred to
448: as assortative mixing by degree, and it means that vertices of 
449: high degree preferentially attach to each other, and vice versa. For
450: example, assortative mixing makes the networks more vulnerable to
451: outbreaks of diseases, and more robust against strategic attack
452: (Newman 2002), because if people with many contacts are connected to
453: other people with many contacts, the epidemic threshold will be
454: lowered. Disassortative mixing, on the other hand, gives rise to
455: larger epidemics (Morris and Kretzschmar 1995).
456: 
457: 
458: We measure assortative mixing by calculating Pearson's correlation
459: coefficient $r$ for the degrees at either side of an edge as suggested
460: by Newman (2002):
461: \begin{equation}\label{eq:r}
462: r=\frac{\langle k_\mathrm{to} k_\mathrm{from}\rangle -\langle
463:   k_\mathrm{to}\rangle\langle k_\mathrm{from}\rangle} {\sqrt{\langle
464:   k_\mathrm{to}^2\rangle-\langle k_\mathrm{to}\rangle^2}\sqrt{\langle
465:   k_\mathrm{from}^2\rangle-\langle k_\mathrm{from}\rangle^2}}~.
466: \end{equation}
467: 
468: In Eq.~(\ref{eq:r}), $\langle\cdots\rangle$ denotes the average
469: over arcs, $k_\mathrm{from}$ is some (in-, out-, or total) degree of
470: the vertex that the arc starts from, and $k_\mathrm{to}$ is some degree of the
471: vertex that the arc leads to. We look at $r$ for total degree of both
472: bi-directional (where the symmetric closure has been taken if the
473: network is not bi-directional by definition) and directed graphs
474: ($r_\mathrm{dir}$). Furthermore, we measure the four combinations of in-
475: and out-degree correlations; e.g.\ the out-in correlation coefficient
476: indicates whether users that have many contacts (high out-degree)
477: prefer to communicate with those users that themselves receive
478: communication from many users (high in-degree).
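
A small worked example may make the sign of $r$ concrete. The graph below is hypothetical and chosen only for illustration: a bi-directional star in which a centre vertex $c$ is connected to three leaves, so that the arc set contains $(c,i)$ and $(i,c)$ for $i=1,2,3$. Taking $k$ as the number of neighbours, the centre has $k=3$ and each leaf has $k=1$, so every arc joins a vertex of degree 3 to a vertex of degree 1. Averaging over the six arcs gives
\begin{align*}
\langle k_\mathrm{to} k_\mathrm{from}\rangle &= 3, &
\langle k_\mathrm{to}\rangle = \langle k_\mathrm{from}\rangle &= \tfrac{1}{6}(3\cdot 3 + 3\cdot 1) = 2, \\
\langle k_\mathrm{to}^2\rangle = \langle k_\mathrm{from}^2\rangle &= \tfrac{1}{6}(3\cdot 9 + 3\cdot 1) = 5, &
r &= \frac{3-2\cdot 2}{\sqrt{5-4}\,\sqrt{5-4}} = -1 .
\end{align*}
The star is thus maximally disassortative: high-degree vertices attach only to low-degree ones. Values of $r$ close to zero, as observed for the networks studied here, indicate a much weaker tendency of the same kind.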
479: 
480: The values for pussokram.com and other networks are displayed in
481: Table~\ref{tab:ass}. Interestingly enough, all the pussokram.com networks,
482: as well as the nioki.com network, display a significant disassortative
483: mixing for all types of degree-degree correlations. This is in
484: contrast to what has been measured for (scientific-, actor-, and
485: business-) collaboration networks (Newman 2002). To set these results
486: in perspective we also measure $r$ for a scientific collaboration
487: network, which clearly displays a positive assortative mixing
488: coefficient. Perhaps assortative mixing is characteristic only of
489: interaction in competitive areas, such as professional collaborations
490: (where only already big names are likely to be successful in
491: collaborating with other big names). This result relates to research
492: on exchange networks that claims that negative mixing is optimal when
493: actors are substitutable, as for example in friendship and dating
494: networks (Cook, Emerson et al.\ 1983). In contrast, mixing in professional
495: collaboration is positive because both knowledge and already
496: established channels for cooperation screen off potential alternative
497: collaborators. Another issue is the skewness of the degree
498: distribution. Intuitively, a large spread in the degree distribution
499: will increase the likelihood of observing negative mixing. And as can
500: be seen from the randomized networks in Table~\ref{tab:ass}, given the degree
501: distribution we would expect a negative mixing coefficient. However,
502: the observed coefficients are consistently, and significantly, more strongly
503: negative than expected. This strongly suggests that negative mixing arises from
504: this particular form of social interaction in which alters are
505: substitutable (Cook, Emerson et al.\ 1983). Note though, that some
506: network models, analyzing completely different forms of interaction,
507: with skewed degree distributions produce networks of zero or positive
508: assortative mixing (Newman 2002; Park and Newman 2003).
509: 
510: 
511: The six different assortative mixing coefficients of Table~\ref{tab:ass}
512: are all of the same sign and roughly of the same magnitude. This is
513: interesting since it suggests that the $r$-values are a result of other
514: structures (presumably the degree-sequence) rather than from the
515: behavior of individuals: There are no a priori reasons for
516: $r_\mathrm{in\: out}$ to be the same as e.g.\ $r_\mathrm{in\: in}$, as
517: a large $r_\mathrm{in\: out}$ means that actors that are active in the
518: community (have a high $k_\mathrm{out}$) tend to associate with those
519: who are successful in promoting themselves in the community (have a
520: high $k_\mathrm{in}$), while a large $r_\mathrm{in\: in}$ means that
521: the latter category has a preference towards each other.
522: 
523: Fig.~\ref{fig:ass}b shows the time development of the assortative
524: mixing coefficient $r_\mathrm{dir}$ (the time development of the other assortative
525: mixing coefficients of Table~\ref{tab:ass} is qualitatively
526: similar). We see that $r_\mathrm{dir}$ converges more quickly than the
527: average degree. This is not surprising since the correlation
528: coefficient is a function of the way ties are formed rather than the
529: size or average degree of the network. An interesting detail of
530: Fig.~\ref{fig:ass}b is the jump at $t\approx 300$ days in the flirt
531: (friendship request) network. This is due to the formation of a tie
532: between two of the most connected actors. (The fact that the flirt
533: network is by far the sparsest strengthens this effect.)
534: 
535: \subsection{Cumulative degree distributions are highly skewed}
536: 
537: \begin{figure*}
538:   \centering{\resizebox*{0.7\linewidth}{!}{\includegraphics{deg.eps}}}
539:   \caption{Cumulative degree distribution for the networks at the
540:     largest times, for all contacts (a), friendship confirmations and
541:     messages (b), guest book (c), and flirts (d).}
542:   \label{fig:deg}
543: \end{figure*}
544: 
545: The degree distribution has received much attention in comparative
546: analyses of complex networks since the work of Barab\'{a}si and Albert
547: (1999). A skewed degree distribution is commonly regarded as a
548: cumulative effect in the attachment of new arcs to the network (Simon
549: 1955; Barab\'{a}si and Albert 1999), and it offers a way to classify
550: different types of networks (Amaral, Scala et al.\ 2000). Indeed it
551: has been demonstrated that many apparently dissimilar types of
552: networks share the same highly skewed degree distributions of a
553: (truncated) power-law form (Albert and Barab\'{a}si 2002), indicating
554: an emerging scale-free structure. Such degree distributions are
555: generated through a growth process in which new arcs are drawn between
556: already existing vertices and new vertices only. However, a process
557: that reasonably describes the activity of an Internet community would
558: allow also for new arcs to be drawn between two already existing
vertices. Such a mixed process, however, would result in a stretched
560: exponential distribution, and not a power-law, and thus a stretched
561: exponential distribution is what we would expect to observe. Another
562: process that can be responsible for cutting the tails of power-law
563: degree distributions in real-world networks is a limited capacity of
564: the actors.
565: 
Following Liljeros, Edling et al.\ (2001) we measure the cumulative
567: degree distribution of all the pussokram.com networks, see
568: Fig.~\ref{fig:deg}. If the degree distribution follows a power-law
with exponent $-\gamma$ then the cumulative distribution will have the
exponent $-\alpha = -\gamma + 1$. All pussokram.com networks are
571: highly skewed, but none of them fits a power-law form across the whole
572: range observed. However, it is interesting to note that there are no
573: clear signs of the (inevitable) high-degree truncation in any of the
574: graphs (Fig.~\ref{fig:deg}). A previous study of the French nioki.com
575: has reported a power-law fit of the cumulative degree distribution
576: (Smith 2002). Our result might appear to set the pussokram.com
577: community apart from the nioki.com community, but a closer inspection
578: of our graphs and (Smith 2002) reveals a striking similarity in the
579: functional form of the distribution. We therefore conclude that the
580: dynamics shaping the degree-distribution is to a large extent the same
581: for the two communities.
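
To make the relation between the exponents of the degree distribution
and of its cumulative counterpart explicit, the following sketch treats
the degree as a continuous variable and assumes a pure power law
$p(k) = A k^{-\gamma}$ with $\gamma > 1$ (the constant $A$, the
integration variable $\kappa$, and the continuum approximation are
assumptions made here for illustration, not properties of the data):
\begin{equation*}
P(k) = \sum_{\kappa\geq k} p(\kappa) \approx \int_k^\infty A\,
\kappa^{-\gamma}\,\mathrm{d}\kappa = \frac{A}{\gamma-1}\,k^{-(\gamma-1)}~.
\end{equation*}
Under these assumptions the cumulative distribution decays with
exponent $\alpha = \gamma - 1$; for example, a degree distribution with
$\gamma = 3$ would give a cumulative distribution with $\alpha = 2$.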
582: 
583: \subsection{Evolution of average geodesic length}
584: 
585: As a general measure of how closely connected a graph is, the average
586: geodesic (shortest path) length is one of the most studied network
587: quantities. There is no unique natural definition of average geodesic
length in an arbitrary directed graph; the problem is the
589: contribution from disconnected pairs of vertices. One choice is to
590: measure the geodesic distance averaged over pairs of vertices in the
591: giant component:
592: \begin{equation}
593: l_\mathrm{GC}=\frac{1}{|A_\mathrm{GC}|} \sum_{(u,v)\in A_\mathrm{GC}}d(u,v)~,
594: \end{equation}
595: where $d(u, v)$ is the distance between $u$ and $v$, and
596: $A_\mathrm{GC}$ is the arc-set of the giant component. Another option
597: is to average the inverse geodesic length (Latora and Marchiori 2001), 
598: \begin{equation}
599: l^{-1}=\frac{1}{M} \sum_{(u,v)\in A}\frac{1}{d(u,v)}~,
600: \end{equation}
601: where $1/d(u, v)$ is defined as zero when no path exists from $u$ to
602: $v$. In the present paper we focus on $l^{-1}$, and $l_\mathrm{GC}$
603: for the reflexive closure of $G$. If the two measures agree, we can
604: infer that there is no additional effect influencing the shortest
605: paths in a substantial way, other than the bi-directional structure of
606: the largest connected subgraph.
607: 
608: \begin{figure}
609:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{xlen.eps}}}
  \caption{Time evolution of (a) the average geodesic length within
    the giant component of the reflexive closure and (b) the average
    inverse geodesic length.}
613:   \label{fig:xlen}
614: \end{figure}
615: 
616: As time evolves there are two conflicting mechanisms governing the
617: average geodesic length: The increasing number of vertices works for
618: an increase of $l$, whereas the increasing average degree makes $l$
619: shorter. For the pussokram.com data the latter effect dominates,
during the time span of our data set, to give a monotonically
decreasing $l_\mathrm{GC}$ (monotonically increasing $l^{-1}$) as shown
622: in Fig.~\ref{fig:xlen}. The same situation has been reported for
623: scientific collaboration networks (Barab\'{a}si, Jeong et al.\
624: 2002). Assuming the community outlives its members, $l$ will
625: eventually start to increase (when the number of inactive users slows
626: down the accelerated growth sufficiently).
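
The competition between the two mechanisms can be illustrated with a
standard random-graph estimate, used here only as a rough guide and not
as a model of the pussokram.com data. For a sparse random graph with
$N$ vertices and average degree $z$ (a symbol introduced for this
sketch), the number of vertices within distance $d$ of a given vertex
grows roughly as $z^d$, so the typical geodesic length follows from
$z^l \approx N$:
\begin{equation*}
l \approx \frac{\ln N}{\ln z}~.
\end{equation*}
Growth in $N$ at fixed $z$ thus lengthens $l$ only logarithmically,
whereas growth in $z$ shortens it. For instance, $N = 10^4$ and $z = 4$
give $l \approx 6.6$, while doubling both to $N = 2\times 10^4$ and
$z = 8$ gives $l \approx 4.8$.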
627: 
628: \subsection{Density of short circuits}
629: 
630: Acquaintance networks are expected to have a high degree of
631: transitivity (Wasserman and Faust 1994), or in other words, a high
632: density of triangles, since if person A knows person B and person C,
633: then person B and person C are likely to be acquainted. We apply a
634: commonly used measure that gives the fraction of triangles out of the
635: connected 3-paths of the graph (a quantity that was defined for
636: undirected graphs, but is trivially generalized to directed graphs,
637: for which we use subscript ``dir''). If we let $p(n)$ denote the
638: number of representations of paths\footnote{A representation of a path
639:   of length three is a triplet $(u,v,w)$ such that $(u,v)$ and $(v,w)$
  are arcs. In an undirected network a path has two representations
641:   and a triangle has six representations.} and $c(n)$ denote the
642: number of representations of circuits, of length $n$, then we can
643: express the clustering coefficient,\footnote{This quantity is
644:   sometimes called transitivity, sometimes clustering
  coefficient. Note, however, that it is not identical to Watts and
646:   Strogatz's (1998) clustering coefficient (where they average a local
647:   transitivity measure over the vertex set).} $C$, as:
648: \begin{equation}
649: C=\frac{c(3)}{p(3)}
650: \end{equation}
One can expect social networks with many heterosexual romantic
652: relationships, such as the pussokram.com networks, to have rather few
653: triangles.\footnote{Presumably, homosexual relationships are not the
654:   common type of romantic relationship among Swedish
655:   adolescents. Therefore we expect few triangles. As a corollary, in a
656:   community populated largely by homosexual individuals, the number of
657:   triangles would be much higher. Regrettably we cannot test this
658:   hypothesis with available data.
659: } To get a better picture of the density of short circuits we also
660: measure the density of circuits of length four:
661: \begin{equation}
662: D=\frac{c(4)}{p(4)}
663: \end{equation}
664: The $n$-behavior of $c(n) / p(n)$ varies from network to network, and
could possibly be an informative quantity in itself. A very high $C$
will in most cases probably imply a high $D$ (for an $R = 1$ network, two
667: triangles with one arc in common will contribute to $c(4)$), but the
668: reverse is less certain.
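
As a small worked illustration of these definitions (a hypothetical
graph, not taken from the data), consider an undirected graph on four
vertices $a$, $b$, $c$, $d$ with edges $ab$, $bc$, $ca$ and $cd$, and
assume, as in the usual transitivity measure, that the end vertices of
a path representation are distinct. A middle vertex $v$ of degree $k_v$
then contributes $k_v(k_v-1)$ triplets $(u,v,w)$, so
\begin{equation*}
p(3) = 2\cdot 1 + 2\cdot 1 + 3\cdot 2 + 1\cdot 0 = 10~,\qquad
c(3) = 6~,\qquad C = \frac{c(3)}{p(3)} = 0.6~,
\end{equation*}
where $c(3) = 6$ counts the six representations of the single triangle
$abc$. The graph contains no circuit through four distinct vertices, so
under the same assumption $c(4) = 0$ and $D = 0$.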
669: 
670: \begin{table*}
  \caption{Statistics for the fully-grown networks of
    pussokram.com, with nioki.com and arXiv.org networks provided for
    comparison. Statistics for corresponding randomized networks are
    within square brackets. Double hyphens indicate missing
    data. Note: * $p\leq 0.01$. ${}^\dagger$The `friends' and
    `arXiv.org' data sets are undirected; $M$ denotes the number of
    undirected edges (which is half the value of $M$ in a directed
    representation of the graph). nioki.com and arXiv.org data are not
    tested for significance.
}
\label{tab:misc}
682: \begin{ruledtabular}
683: \begin{tabular}{l|ccccccc}
684: \hline
685: network & all contacts & messages & guest book & friends & flirts & nioki.com & arXiv.org\\
686: $N$ & 29{$\,$}341 & 20{$\,$}691 & 21{$\,$}545 & 14{$\,$}278 & 8{$\,$}186 & 50{$\,$}259 & 52{$\,$}909 \\
687: $M$ & 174{$\,$}662 & 76{$\,$}257 & 73{$\,$}346 & 31{$\,$}871$^\dagger$ & 8{$\,$}744 & 405{$\,$}742 & 490{$\,$}600$^\dagger$\\
688: $R$ & 0.51 & 0.40 & 0.38 & 1 & 0 & 0.69 & 1\\
689: $l_\mathrm{GC}$ & 4.4 & 4.3 & 4.6 & 5.1 & 5.7 & 4.1 & 6.1\\
$l^{-1}$ & 0.12 & 0.10 & 0.084 & 0.18 & $4.0\times 10^{-4}$ & 0.209 &
691: 0.121\\
692: $C$ & 0.006 & 0.001* & 0.014* & 0.020* & 0 & 0.0065 & 0.45\\
693:  & [0.006] & [0.002] & [0.007] & [0.0044] & [0.001] & [0.0081] &
694: [0.0020]\\
695: $C_\mathrm{dir}$ & 0.012* & 0.005* & 0.014* &  - - & 0* & 0.0076 & -
696: -\\
& [0.007] & [0.003] & [0.005] & & [0] & [0.0077] & \\
698: $D$ & 0.017 & 0.006* & 0.022* & 0.020* & 0.212* & 0.013 & 0.35\\
699: & [0.009] & [0.004] & [0.008] & [0.004] & [0.004] & [0.0081] &
700: [0.0021]\\
701: $D_\mathrm{dir}$ & 0.016* & 0.008* & 0.015* & - - & 0 & 0.016 & - -\\
& [0.007] & [0.003] & [0.005] & & [0] & [0.0077] & \\
703: \end{tabular}
704: \end{ruledtabular}
705: \end{table*}
706: 
707: Values for $C_\mathrm{dir}$ and $D_\mathrm{dir}$ and their undirected
708: counterparts are shown in Table~\ref{tab:misc}. We note that, with a
709: few exceptions, the values for the real networks are significantly
710: larger than the randomized; the difference, however, is far less
dramatic than for the scientific collaboration network. This
contrast between the Internet community networks and the arXiv.org
data is easily explained by the fact that a paper with
714: $n_\mathrm{auth}\geq 3$ authors represents a fully connected subgraph of
715: $G$ (contributing with $n_{\mathrm{auth}}(n_{\mathrm{auth}} {-}1)
716: (n_{\mathrm{auth}} {-}2) / 3$ triangles). However, we would like to
717: stress that the values themselves are not very informative, compared
718: to their time dependence.
719: 
720: \begin{figure*}
721:   \centering{\resizebox*{0.65\linewidth}{!}{\includegraphics{clu.eps}}}
722:   \caption{Density of short circuits for the different networks (flirt
723:     network omitted as it contains very few 3- and 4-circuits).
724:   }
725:   \label{fig:clu}
726: \end{figure*}
727: 
728: The time development of $C$ and $D$ for different networks is shown in
729: Fig.~\ref{fig:clu}. As a quantity dependent on only the local network
730: structure the density of short circuits is an intrinsic quantity; and,
731: as seen for the clustering coefficient (Barab\'{a}si, Jeong et al.\ 2002),
732: these quantities approach their equilibrium values from
733: above. Interestingly, just as for the assortative mixing coefficient,
734: the relaxation towards equilibrium is faster for $C$ and $D$ than for
735: the average degree $M / N$; i.e.\ the density of short cycles is
736: rather independent of the average degree.
737: 
738: 
739: As can be seen in Fig.~\ref{fig:clu}, most $C$ and $D$ curves have
extrema in the middle of the time range (the densities of short
circuits are at their minima). The reason for this comes from a
742: conflict between counteracting mechanisms of different
743: time-scales. There are three natural time-scales in the system: The
744: average time between new registrations; the average time between new
745: contacts for an individual user; and the average life span of a user
746: in the community. The latter time-scale should be responsible for the
747: long-term behavior such as the increase towards equilibrium of $M /
748: N$. And as shorter circuits are more likely in a dense network, it is
749: natural that $C$ and $D$ increase in the large $t$ limit. The decrease
750: for early times is a finite size effect that can be seen in evolving
751: network models with constant average degree such as the
752: Barab\'{a}si-Albert model (Barab\'{a}si and Albert 1999; Barab\'{a}si,
753: Albert et al.\ 1999; Barab\'{a}si, Jeong et al.\ 2002) and extensions
754: (Holme and Kim 2002), where the $C$ and $D$ curves converge from
755: above. 
756: 
757: 
758: Another interesting aspect is that the values of $C$ and $D$, although
finite in the large $t$ limit, are much smaller than in the actor- and
760: scientific-collaboration networks. In an Internet community the way by
761: which people introduce strangers among their acquaintances to each
762: other (Newman 2001; Holme and Kim 2002) is likely not the mechanism
763: responsible for the finite clustering (remember that in network models
such as the Erd\H{o}s-R\'{e}nyi (1959) and Barab\'{a}si-Albert
765: (Barab\'{a}si and Albert 1999; Barab\'{a}si, Albert et al.\ 1999;
766: Barab\'{a}si, Jeong et al.\ 2002) models the clustering goes to zero as
767: the network grows). Instead a finite density of short circuits can be
768: explained by the tendency formulated in the proverbial
769: like-attracts-like, where the similarity is defined by signaled
770: social, psychological, and physiological traits.\footnote{Another
771:   possible explanation for the convergence of $C$ and $D$ to finite
772:   values is that short circuits are introduced from the offline world
773:   outside the community. Reading users' guest books, however, gives
774:   the impression that the vast majority of community-dyads were
775:   strangers offline. We believe that this effect is negligible, but we
776:   are unfortunately unable to go beyond speculation on this point.}
777: 
778: \begin{figure}
779:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{rewi.eps}}}
780:   \caption{Time evolution of original and rewired quantities. (a)
781:     shows data for the assortative mixing coefficient $r$ for the
782:     undirected all-contacts network, (b) is the clustering coefficient
783:     for the same data. The rewired data is obtained from 100 updating
784:     sweeps over all links, and indicated by the upper and lower hinges
785:     (border values between the first and second quartile, and third
786:     and fourth quartile respectively).}
787:   \label{fig:rewi}
788: \end{figure}
789: 
790: To further convince ourselves that the sampling time is large enough
791: we also use rewiring to examine the time evolution of two structural
792: measures (the assortative mixing coefficient and the clustering
793: coefficient for the undirected all-contacts network). As seen in
Fig.~\ref{fig:rewi} the rewired quantities converge on the same time
scale as $r$ and $C$, which reconfirms that the sampling time frame is
sufficient. We note that for $t > 200$ days the assortative mixing
coefficient is significantly lower than the rewired reference
curve. For the same time interval the rewired clustering coefficient
closely overlaps the measured $C$-value; for $t > 200$ days the actual
value overlaps the
mid-quartiles of the rewired data during around 30\% of the 512
days. For the initial `non-equilibrium' part ($t < 100$ days) of the
time-evolution the curves of the rewired and real networks
diverge. In this region the network is rather sparse (see
Fig.~\ref{fig:rewi}) which explains the low values of the rewired
$C$-curve. The high early values of $C$ seem contradictory to the
apparent absence of a tendency towards triangle formation during later
times. This means that the contact patterns of the early network are not
the same as later on. As it turns out, in the early community, a group
of actors contact each other rather frequently (rather more like
`chatting' than romantic contact making) whereas another group makes a
few contacts before quitting the community. We interpret this to mean
that a minimal number, or ``critical mass'' (cf.\
Schelling 1978), of people is required for the community to function. Before the
813: critical mass is reached, the users either have the community as a
814: chat room (a usage with a presumably smaller critical mass) or leave
815: it.
816: 
817: \section{Summary and conclusions}
818: 
819: 
820: We have investigated networks of communication between the users of
821: the Internet community pussokram.com. The four different means of
contact at pussokram.com define five different networks in our study
823: (one for each separately and one for all taken together). Apart from
824: recent studies of scientific collaboration networks and movie actor
825: networks, there are very few such phenomenological descriptions of
826: large social networks, and thus there is limited knowledge that our
827: findings can be related to.
828: 
829: 
The fact that the interaction under study takes place on the Internet
obviously creates special conditions for communication. We
believe that online interaction is exposed to fewer structural
forces than is typically the case in most other social
settings. For example, simultaneous interaction is not a prerequisite
for communication in an Internet community, so time as a structural
force is of less importance than in most other
settings. Neither does geographical space constrain
communication. In addition, the lower visibility of social signifiers
(compared to e.g.\ face-to-face interaction) and the relative
ease with which you can conceal your identity and transform your
appearance in online interaction are factors reducing the structure
forming forces at work in `offline' social activity. It is therefore
interesting to note that, despite these caveats, the networks under
study here are much more structured than what would be expected in a
random network.
846: 
847: 
848: To summarize our findings of the Internet community pussokram.com, we
849: see that:
850: \begin{itemize}
851: \item The average degree converges over time, but surprisingly we
852:     observe no cut-off in the degree distribution. Previous studies do
853:     suggest that there is an upper limit to the mean number of
854:     contacts (Marsden 1987), and on average we find this
    socio-cognitive limitation despite the fact that time and space are
    of less importance here. The reason we see continued growth in the
    cumulative degree distribution might be that it is relatively
    costless to have a high turnover of one's contacts in an online
859:     community. Contacts are established without much investment, and
860:     can also be dropped without much sanctioning. 
\item Reciprocity is rather low, and presumably lower than can be expected
862:   in a regular acquaintance network. Reciprocity levels quickly
863:   converge to a steady state. 
864: \item Most assortative mixing coefficients have small negative values,
  suggesting a pattern of disassortative mixing. This can partly be
866:   explained by the conventional effect from the skewed degree sequence
867:   (Newman 2002). The observed effect is significantly larger than can
868:   be expected solely from the degree distribution. An explanation for
869:   these higher $r$-values is the particular nature of the dating
870:   interaction (Cook, Emerson et al.\ 1983). We also find that mixing
871:   coefficients as a function of time converge rapidly. The
  disassortative mixing in the Internet community networks is in
873:   striking contrast to the strong assortative mixing seen in
874:   scientific collaboration networks, and the nice correspondence with
875:   previous work in sociology indicates that Internet communities
  indeed strongly resemble off-line social communities.
877: \item The cumulative degree distributions are highly skewed, being a
878:   mixture of previous mappings of acquaintance networks (Amaral, Scala
879:   et al.\ 2000)---for few contacts---and partnership networks
880:   (Liljeros, Edling et al.\ 2001)---for many contacts.
881: \item The geodesic length initially increases as new vertices are
882:  added to the network. But as the network settles the increase is
883:  limited by the growing average degree. Both $l_\mathrm{GC}$ and
 $l^{-1}$ show consistently that the average geodesic length is
885:  decreasing during the whole sample period (a situation that can only
886:  exist for a non-equilibrium network).
887: \item Clustering---the density of triangles---converges over time to
888:   non-zero values (as opposed to completely random networks). Still,
889:   values are probably on a much lower level than would be expected in
890:   offline acquaintance networks. The explanation for these low values
891:   is twofold---the lack of introduction as a mechanism for
892:   tie-formation, and the romantic profile of pussokram.com promoting
893:   romantic contacts. The latter aspect is also manifested in that the
894:   density of 4-circuits is larger than the density of triangles for
895:   the pussokram.com networks. Once again, the Internet community
896:   networks are different from the scientific collaboration network
897:   where clustering is larger than the density of 4-circuits.
898: \end{itemize}
899: An Internet community such as pussokram.com defines a structured
social network that shares more of the structuring forces with general
901: acquaintance networks than networks of professional collaborations
902: do. We believe that the precise timing resolution and fast dynamics
903: (giving a wide effective sampling time-frame) will make Internet
communities an invaluable object for future social network studies of
905: the largest scale.
906: 
907: \section*{References}
908: 
``e-print arXiv:'' refers to (yet unpublished) manuscripts uploaded to
the database arXiv.org.
911: 
912: \begin{list}{}{\setlength{\leftmargin}{5mm}\setlength{\rightmargin}{0mm}
913:     \setlength{\labelsep}{5mm}\setlength{\parsep}{2mm}
914:     \setlength{\itemindent}{-5mm}
915:     \setlength{\listparindent}{0mm}\setlength{\labelwidth}{0mm}
916:     \setlength{\itemsep}{0mm}\setlength{\partopsep}{0mm}}
917: 
\item Albert, R.\ and A.\ L.\ Barab\'{a}si (2002).\ ``Statistical
  mechanics of complex networks.'' \textit{Reviews of Modern Physics} \textbf{74}: 47-97.
920: \item Amaral, L.\ A.\ N., A.\ Scala, et al.\ (2000). ``Classes of
921:   small-world networks.'' \textit{Proceedings of the National Academy of
922:   Sciences of the United States of America}
923:   \textbf{97}(21): 11149-11152.
924: \item Anderson, B.\ S., C.\ Butts, et al.\ (1999). ``The interaction
925:   of size and density with graph-level indices.'' \textit{Social Networks}
926:   \textbf{21}(3): 239-267. 
927: \item Barab\'{a}si, A.\ L.\ and R.\ Albert (1999). ``Emergence of
928:   scaling in random networks.'' \textit{Science} \textbf{286}(5439): 509-512. 
929: \item Barab\'{a}si, A.\ L., R.\ Albert, et al.\ (1999). ``Mean-field
930:   theory for scale-free random networks.'' \textit{Physica A} \textbf{272}(1-2):
931:   173-187. 
932: \item Barab\'{a}si, A.\ L., H.\ Jeong, et al.\ (2002). ``Evolution of
933:   the social network of scientific collaborations.'' \textit{Physica A} \textbf{299}:
934:   559-564.
935: \item Butts, C.\ T.\ (2001). ``The complexity of social networks:
936:   theoretical and empirical findings.'' \textit{Social Networks} \textbf{23}(1): 31-71.
937: \item Cook, K.\ S., R.\ M.\ Emerson, et al.\ (1983). ``The
938:   distribution of power in exchange networks: Theory and experimental
  results.'' \textit{American Journal of Sociology} \textbf{89}(2): 275-305.
940: \item Dorogovtsev, S.\ N.\ and J.\ F.\ F.\ Mendes (2002). Accelerated
941:   growth of networks. Handbook of Graphs and Networks: From the Genome
942:   to the Internet.\ S.\ Bornholdt and H.\ G. Schuster.\ Berlin, Wiley-VCH. 
943: \item Dorogovtsev, S.\ N.\ and J.\ F.\ F.\ Mendes (2002). ``Evolution
944:   of networks.'' \textit{Advances in Physics} \textbf{51}(4): 1079-1187. 
945: \item Ebel, H., L.\ I.\ Mielsch, et al.\ (2002). ``Scale-free topology
946:   of e-mail networks.'' \textit{Physical Review E} \textbf{66}, art.\ no.\ 035103. 
\item Erd\H{o}s, P.\ and A.\ R\'{e}nyi (1959). ``On random graphs.''
  \textit{Publicationes Mathematicae Debrecen} \textbf{6}: 290-297.
949: \item Fararo, T.\ J.\ and M.\ H.\ Sunshine (1964). A study of a biased
950:   friendship net. Youth Development Center, Syracuse University,
951:   Syracuse. 
\item Gould, R.\ V.\ (2002). ``The origins of status hierarchies: A
953:   formal theory and empirical test.'' \textit{American Journal of Sociology}
954:   \textbf{107}(5): 1143-1178. 
955: \item Granovetter, M.\ (1973). ``Strength of weak ties.'' \textit{American
956:   Journal of Sociology} \textbf{78}(6): 1360-1380. 
957: \item Holme, P.\ and B.\ J.\ Kim (2002). ``Growing scale-free networks
958:   with tunable clustering.'' \textit{Physical Review E} \textbf{65}(2): art.\ no.\ 026107.
959: \item Latora, V.\ and M.\ Marchiori (2001). ``Efficient behavior of
960:   small-world networks.'' \textit{Physical Review Letters} \textbf{87}(19): art.\ no.\
961:   198701.
962: \item Liljeros, F., C.\ R.\ Edling, et al.\ (2001). ``The web of human
963:   sexual contacts.'' \textit{Nature} \textbf{411}(6840): 907-908. 
964: \item Marsden, P.\ V.\ (1987). ``Core discussion networks of
965:   Americans.'' \textit{American Sociological Review} \textbf{52}(1): 122-131. 
966: \item Morris, M.\ and M.\ Kretzschmar (1995). ``Concurrent
967:   Partnerships and Transmission Dynamics in Networks.'' \textit{Social
968:   Networks} \textbf{17}(3-4): 299-318. 
969: \item Newman, M.\ E.\ J.\ (2001). ``Clustering and preferential
970:   attachment in growing networks.'' \textit{Physical Review E} \textbf{64}(2): art.\
971:   no.\ 025102.
972: \item Newman, M.\ E.\ J.\ (2001). ``Scientific collaboration
973:   networks. I. Network construction and fundamental results.''
974:   \textit{Physical Review E} \textbf{64}(1): art.\ no.\ 016131.
975: \item Newman, M.\ E.\ J.\ (2002). ``Assortative mixing in networks.''
976:   \textit{Physical Review Letters} \textbf{89}, art.\ no.\ 208701.
977: \item Newman, M.\ E.\ J.\ (2003). ``The structure and function of
978:   complex networks.'' \textit{SIAM Review} \textbf{45}(2): 167-256. 
979: \item Park, J.\ and M.\ E.\ J.\ Newman (2003). ``Origin of degree
980:   correlations in the Internet and other networks.'' \textit{Physical Review E}
981:   \textbf{68}(2), art.\ no.\ 026112.
982: \item Pattison, P., S.\ Wasserman, et al.\ (2000). ``Statistical
983:   evaluation of algebraic constraints for social networks.'' \textit{Journal
984:   of Mathematical Psychology} \textbf{44}: 536-568. 
985: \item Roberts, J.\ M.\ (2000). ``Simple methods for simulating
986:   sociomatrices with given marginal totals.'' \textit{Social Networks} \textbf{22}(3):
987:   273-283. 
988: \item Rothaermel, F.\ T.\ and S.\ Sugiyama (2001). ``Virtual Internet
989:   communities and commercial success: individual and community-level
990:   theory grounded in the atypical case of TimeZone.com.'' \textit{Journal of
991:   Management} \textbf{27}(3): 297-312. 
992: \item Schelling, T.\ C.\ (1978). Micromotives and macrobehavior.\ New
993:   York, Norton.
994: \item Shen-Orr, S.\ S., R.\ Milo, et al.\ (2002). ``Network motifs in
995:   the transcriptional regulation network of Escherichia coli.'' \textit{Nature
996:   Genetics} \textbf{31}(1): 64-68. 
997: \item Simon, H.\ A.\ (1955). ``On a class of skew distribution
998:   functions.'' \textit{Biometrika} \textbf{42}: 425-440. 
999: \item Skvoretz, J.\ (1990). ``Biased net theory: Approximations,
1000:   simulations, and observations.'' \textit{Social Networks} \textbf{12}(3): 217-238. 
1001: \item Smith, R.\ (2002). ``Instant Messaging as a Scale-Free
  Network.'' e-print arXiv:cond-mat/0206378, unpublished.
1003: \item Wasserman, S.\ and K.\ Faust (1994). Social network analysis:
1004:   Methods and applications. Cambridge, Cambridge University Press. 
1005: \item Watts, D.\ J.\ (1999). ``Networks, dynamics, and the small-world
1006:   phenomenon.'' \textit{American Journal of Sociology} \textbf{105}(2): 493-527. 
1007: \item Watts, D.\ J.\ and S.\ H.\ Strogatz (1998). ``Collective
1008:   dynamics of `small-world' networks.'' \textit{Nature} \textbf{393}(6684): 440-442. 
1009: \item Wellman, B.\ (2001). ``Computer networks as social networks.''
1010:   \textit{Science} \textbf{293}(5537): 2031-2034. 
1011: \item Wellman, B.\ and C.\ A.\ Haythornthwaite (2002). The Internet in
1012:   everyday life. Oxford, Blackwell. 
1013: \end{list}
1014: 
1015: \end{document}
1016: