0210:cond-mat0210514/pok.tex

1: \documentclass[twocolumn,rmp,aps]{revtex4}

2:

3: \usepackage{dcolumn,graphicx,amsmath,amssymb,pxfonts}

4:

5: \begin{document}

6:

7: \title{Structure and Time-Evolution of an Internet Dating Community}

8:

9: \author{Petter \surname{Holme}}

10: \email{holme@tp.umu.se}

11: \affiliation{Department of Physics, Ume{\aa} University,

12:   901~87 Ume{\aa}, Sweden}

13:

14: \author{Christofer R.\ \surname{Edling}}

15: \affiliation{Department of Sociology, Stockholm University, 106~91

16:   Stockholm, Sweden}

17:

18: \author{Fredrik \surname{Liljeros}}

19: \affiliation{Department of Sociology, Stockholm University, 106~91

20:   Stockholm, Sweden}

21: \affiliation{Department of Medical

22:     Epidemiology and Biostatistics, Karolinska Institutet S-171 77

23:     Solna, Sweden}

24:

25: \begin{abstract}

26: We present statistics for the structure and time-evolution of a

27: network constructed from user activity in an Internet community. The

28: vastness and precise time resolution of an Internet community offer

29: unique possibilities to monitor social network formation and

30: dynamics. Time evolution of well-known quantities, such as clustering,

31: mixing (degree-degree correlations), average geodesic length, degree,

32: and reciprocity is studied. In contrast to earlier analyses of

33: scientific collaboration networks, mixing by degree between vertices

34: is found to be disassortative. Furthermore, both the evolutionary

35: trajectories of the average geodesic length and of the clustering

36: coefficients are found to have minima.

37: \end{abstract}

38:

39: \maketitle

40:

41: \footnotesize

42:

43: We thank Christian Wollter and Michael Lokner at pussokram.com, Stefan

44: Praszalowicz at nioki.com, and Niklas Angemyr and Reginald Smith for

45: granting and helping us get access to data. We thank Mark Newman

46: for comments on assortative mixing, and the editor and anonymous

47: reviewer for helpful comments. PH is partially supported by the

48: Swedish Research Council through contract no.\ 2002-4135. CRE is supported

49: by the Bank of Sweden Tercentenary Foundation. FL is supported by the

50: National Institute of Public Health.

51:

52: \normalsize

53:

54: \section{Introduction}

55: With the growing interest in social network analysis from the physics

56: community, a new research area is emerging at the intersection of

57: statistical physics and sociology (Albert and Barab\'{a}si 2002;

58: Dorogovtsev and Mendes 2002; Newman 2003). Sociologists have been

59: interested in network analysis for at least half a century, and with

60: mathematicians and statisticians they have developed a set of tools to

61: analyze positions, structures, and processes of social networks

62: (Wasserman and Faust 1994; Butts 2001). Although there are exceptions

63: (Fararo and Sunshine 1964; Skvoretz 1990), most sociological and

64: anthropological studies of networks have focused on small-group

65: interaction or cognitive networks. In one respect this is quite

66: natural as most groups and formal organizations are of small

67: size. Also, a pragmatic reason for this is that data collection of

68: large social networks, behavioral or cognitive, is cumbersome and

69: often practically impossible to carry through. Therefore, although

70: recent analyses (Watts and Strogatz 1998; Watts 1999; Newman 2001)

71: have brought new attention to comparative analysis of large-scale

72: social networks, the statistical physics method, emphasizing the limit

73: of large system sizes (Albert and Barab\'{a}si 2002), has been of

74: limited utility. However, the extended use of database technology

75: provides new possibilities for constructing real world networks for the

76: analysis of e.g.\ movie-actor networks (Watts and Strogatz 1998) and

77: co-authorship in science (Newman 2001). Surely, these networks reflect

78: social interaction, but they are also heavily constrained by the logic

79: of a particular industry or a particular professional activity. Thus,

80: to allow for exploration of the possible universal properties of

81: social networks in general, there is still an urgent need to analyze

82: other types of large empirical social networks. In this paper we

83: report on an investigation of a large social network, aiming to give a

84: phenomenological description that will hopefully shed some new light

85: on the processes forming the structure of social networks. To put

86: results in context, we try to compare our findings to other studies

87: whenever possible, and to contrast parameters to what would be

88: expected from a random network with similar characteristics.

89:

90: To construct network data and large graphs based on more spontaneous

91: patterns of human interaction than e.g.\ co-authorship and

92: co-actorship, one can consider data from e-mail exchange (Ebel,

93: Mielsch et al.\ 2002) or user activity in Internet communities

94: (Rothaermel and Sugiyama 2001; Smith 2002). The present work belongs

95: to the latter category, with a strong focus on the dynamics of the

96: network. In contrast to previous studies of Internet communities

97: (Smith 2002), we use down-to-the-second timing of the communication to

98: investigate time evolution and obtain steady state estimates of

99: well-known measures of graph structure. We use data from a Swedish

100: Internet community called pussokram.com (roughly ``kiss'n'hug'' in

101: English) that is primarily targeted at adolescents and young

102: adults. The community provides an arena for flirting, dating, and

103: other romantic communication; as well as communication for

104: non-romantic friendship.

105:

106: Studies suggest that online interaction is driven by the same needs as

107: face-to-face interaction, and should not be regarded as a separate

108: arena but as an integrated part of modern social life (Wellman and

109: Haythornthwaite 2002). Thus communicative actions taken by members of

110: the community can be expected to share many features with the web of

111: human acquaintances and romances in the social off-line world. Indeed,

112: for many people in contemporary Western societies, interaction on the

113: Internet is as real as any other interaction (Wellman 2001). Internet

114: communities are interesting by and for themselves, but this suggests

115: that the formation and dynamics of social networks in an Internet

116: community can share the same generic properties as all social

117: acquaintance networks, and that the study of Internet communities can

118: provide important information for enhancing our understanding of

119: social networks in general.

120:

121: The paper is divided into four sections. In the next section we give a

122: detailed description of the functions of the Internet community in

123: focus. The third section contains statistical analyses and

124: presentation of results that we summarize and discuss in the fourth

125: and concluding section.

126:

127: \section{The Internet community pussokram.com\label{sec:pok}}

128:

129: Pussokram.com is a Swedish Internet community primarily intended for

130: romantic communication and targeted at adolescents and young

131: adults. The community had around 30$\,$000 active users during the spring

132: and summer of 2002; the mean user age is 21 years, and approximately 70

133: percent of the users are women (therefore, and to simplify, we will

134: use the female gender when referring to users in this paper). Both age

135: and sex are self-reported. It is possible to have multiple accounts on

136: the community. A crude check on the number of accounts linked to every

137: unique e-mail address indicates that this is not very common (more

138: than 99.7\% of the membership accounts are associated with a unique

139: e-mail address and no e-mail address is associated with more than 5

140: accounts).\footnote{Of course it is possible to use a unique e-mail

141:   address for every account, but since this information

142:   is not revealed, it is hard to see why one would go through the extra

143:   effort of doing so.} Our data consists of all the user activities on

144: pussokram.com logged for 512 days from 13:39:25 on February 13, 2001

145: ($t = 0$) to 13:28:19 on July 10, 2002. The smallest time-unit on the

146: log is 1 second. We analyze the activity of all users registered at

147: time $t = 0$, as well as the activity of any new users during this time

148: span.\footnote{Personal integrity is of course an issue here. For the

149:   analysis, we study the anonymized data to prevent any intrusion of

150:   privacy, and we do not have access to specific message

151:   contents. Like everyone else, we can read the guest books, but still

152:   we cannot link a user (and her guest book) to the vertices of the

153:   network. Thus, we cannot identify any specific individual person in

154:   the data. We do not even have data that can be cross-examined with

155:   other databases (like computer IP-addresses) to detect users'

156:   identities.} Time $t = 0$ defines the start-up day for this particular

157: community. However, prior to $t = 0$, there was a mail server for

158: sending anonymous love messages on the Internet. Registered users of

159: this service had their accounts automatically transferred to

160: pussokram.com. We only study activity on the community; nevertheless,

161: this recruitment might induce higher initial growth of active users.

162:

163: \begin{figure*}

164:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{pok.eps}}}

165:   \caption{Screenshot of a typical user homepage at

166:     pussokram.com. ``User A'', ``User B'', etc.\ symbolize user names. (The

167:     translation is due to the authors. Italics denote a description

168:     rather than a translation.)

169: }

170:   \label{fig:pok}

171: \end{figure*}

172:

173: Pussokram.com has a pronounced romantic profile, where:

174: \begin{itemize}

175: \item Users are encouraged to send messages to others that they are

176:   secretly in love with.

177: \item The provider answers questions related to love and sex posed by

178:   the users under the pseudonym Dr.\ Love.

179: \item The design of the HTML-pages makes use of a romantic iconography

180:   well known to the targeted users (with Valentine's hearts, deep red

181:   colors, etc., see Fig.~\ref{fig:pok}). Nevertheless, a quick glance

182:   through some of the public guest books reveals that many of the

183:   contacts taken are also non-romantic.

184: \end{itemize}

185:

186: \subsection{Types of contacts in pussokram.com}

187:

188: There are four major modes of communication at pussokram.com. We study

189: each of the networks generated by these four types of contacts

190: separately and we also study the union of these networks generated by

191: any of these contacts. A brief description of the four types of

192: contacts follows:

193: \begin{itemize}

194: \item The \textbf{Messages} are in effect intra-community e-mails. These

195: are private in the sense that no one in the community, except the

196: sender and receiver, can access them. Not even information on how many

197: messages other users have received is retrievable by other users.

198: \item In \textbf{Guest book} signing, each user has a guest book that

199:   every community member is free to write in.

200: \item \textbf{Flirt} or ``friendship request:'' User A can ask user B to

201:   be her friend. If user B accepts user A's request then they can both

202:   easily see if the other is online whenever they are logged onto

203:   pussokram.com. Information on the friends of a specific user is

204:   private to the user only.

205: \item \textbf{Friendship}: A friendship relation is established after

206:   acceptance of a friendship request, as described above. The

207:   friendship network is thus bi-directional. A friendship can be

208:   canceled by any of the friends.

209: \end{itemize}

210:

211: \subsection{Ways to receive attention and search users}

212:

213: Unless engaged in peer-to-peer contact of some sort, users at

214: pussokram.com are relatively anonymous towards each other. There is

215: reason to believe that knowledge about the prior interactive behavior

216: of other individuals structures the present interactive behavior of a

217: given individual (the so-called imitation factor). Only limited

218: information about a user's interaction history is available to other

219: users, but there are several ways for a user to draw attention to

220: herself (i.e.\ to direct other users to her community homepage), and

221: for users to find information about others. Here we summarize various

222: ways that can be used to receive attention, search for other users,

223: and promote oneself at pussokram.com. The following information is

224: displayed when a logged-on user browses the pussokram.com website:

225: \begin{itemize}

226: \item The username of the most recently registered community member.

227: \item The name of the most recently edited diary (each user has space

228:   open for others to read, intended as a diary).

229: \item The names of the most recent users to browse a specific user's

230:   homepage.

231: \item The names of similar users are displayed on a specific user's

232:   homepage. Similarity is assessed through self-reported background

233:   variables.

234: \item A long interview with the ``user of the week'' (although updated

235:   less often than weekly). This is an epithet that users can apply for.

236: \item Photographs of 10--20 users are displayed at the login-page.

237: \end{itemize}

238:

239: A user can search out other users with a search engine (the

240: ``s\"{o}kofinder''---in English ``search'n'finder''---in

241: Fig.~\ref{fig:pok}) that handles the following

242: criteria: Sub-string of the username, gender, age, place of residence,

243: online status, and if a user has provided a photograph of

244: herself. Presumably, these are the characteristics that drive user

245: activity, but because it is hard to assess their validity, and because

246: we are only interested in structural properties, we do not conduct any

247: analysis on them.

248:

249: \subsection{Comparisons with other empirical and statistical networks}

250:

251: For comparison we also use networks by instant messaging at the French

252: Internet community nioki.com and scientific collaboration (or, rather,

253: co-authorship) networks. nioki.com and pussokram.com are rather

254: similar, both in terms of content and design, but compared to

255: pussokram.com, nioki.com is even more youth oriented and not as

256: focused on romantic relations as pussokram.com. Besides the

257: possibility of searching for user names, nioki.com has two search

258: procedures \textit{recherche l'amiti\'{e}} (search for friendship) and

259: \textit{recherche l'amour} (search for love), where one can fill out

260: questionnaires to find other users that match one's preferences. In the

261: nioki.com network, an arc connects user A to user B if user B is in

262: user A's list of contacts (for details see Smith 2002). In the

263: scientific collaboration networks (Newman 2001) the vertices are

264: scientists who have uploaded manuscripts to the Los Alamos preprint

265: repository arXiv.org; arcs are added between scientists who have

266: co-authored a paper. In contrast to the pussokram.com and nioki.com

267: networks, ties in the scientific collaboration network are

268: bi-directional. Note that the pussokram.com networks are dynamic,

269: while we only have access to snapshot data of nioki.com and scientific

270: collaboration networks. For this reason we can only make comparisons

271: between the static properties of these networks.

272:

273: In addition, following (Anderson, Butts et al.\ 1999; Pattison,

274: Wasserman et al.\ 2000; Shen-Orr, Milo et al.\ 2002), we compare some

275: observed quantities to the corresponding average values from

276: randomized networks with the same degree-sequence as the original. By

277: this approach, we examine how aspects of structure other than the

278: degree sequence influence the quantities. Every known real social

279: network deviates from the average randomized network to a greater or

280: lesser extent, depending on the social forces structuring the

281: interaction. For example, with regard to the present case, we believe

282: that an Internet community network will be closer to the average

283: randomized network than several other types of social networks,

284: because time and space constraints are much less pressing than in,

285: e.g., a kinship network. These randomized networks are generated by

286: sequentially going through all directed arcs A-B and, for every such

287: arc, randomly selecting another arc, C-D, and then rewiring so that A-D

288: forms one arc and C-B forms another. The choice of C-D is made with

289: uniform randomness among all arcs that would not introduce a loop or a

290: multiple arc. We use this algorithm to generate $\sim 3000$ networks and the

291: quantities are averaged over these networks. This procedure is inspired

292: by Roberts (2000). However, it differs from Roberts in the sense that

293: we use sweeps over all arcs (where each arc is rewired at least once)

294: as the unit of iterations of the algorithm.\footnote{To be precise, our

295:   algorithm runs as follows: We go sequentially through the arc

296:   set $A$ (see Sect.~\ref{sec:stat}). For every arc $(v,w)$ we

297:   construct a set $A'$ of arcs such that if a member $(v',w')$ of $A'$

298:   is to be rewired with $(v,w)$---i.e.\ so that $(v,w)$ and $(v',w')$

299:   are replaced by $(v,w')$ and $(v',w)$---then no loops or multiple

300:   arcs are formed. Then we choose one of the arcs in $A'$ with uniform

301:   randomness and rewire that arc with $(v,w)$.}
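As an illustration only, one sweep of the rewiring procedure described above could be sketched in Python as follows. The function names \texttt{rewire\_sweep} and \texttt{degree\_sequences} are hypothetical. The sketch rebuilds the candidate set for every arc, so a sweep costs time quadratic in the number of arcs, and it simply skips an arc that has no admissible partner. These are assumptions of the sketch and are not taken from the implementation used for the paper.

```python
import random
from collections import Counter


def degree_sequences(arcs):
    """Return (out-degree, in-degree) Counters for a list of arcs."""
    return Counter(v for v, _ in arcs), Counter(w for _, w in arcs)


def rewire_sweep(arcs, rng=random):
    """One degree-preserving rewiring sweep over a list of directed arcs.

    `arcs` is a list of (source, target) pairs without self-loops or
    duplicates; it is modified in place and also returned.
    """
    arc_set = set(arcs)
    for i in range(len(arcs)):
        v, w = arcs[i]
        # Admissible partners (v2, w2): replacing (v, w) and (v2, w2) by
        # (v, w2) and (v2, w) must create neither a loop nor a multiple arc.
        candidates = [
            j for j, (v2, w2) in enumerate(arcs)
            if j != i
            and v != w2 and v2 != w          # no self-loops
            and (v, w2) not in arc_set       # no multiple arcs
            and (v2, w) not in arc_set
        ]
        if not candidates:
            continue  # assumption: leave the arc unchanged if nothing fits
        j = rng.choice(candidates)           # uniform choice among candidates
        v2, w2 = arcs[j]
        arc_set -= {(v, w), (v2, w2)}
        arc_set |= {(v, w2), (v2, w)}
        arcs[i], arcs[j] = (v, w2), (v2, w)
    return arcs
```

Each swap keeps the out-degree of both sources and the in-degree of both targets, so the degree sequence of the graph is unchanged by a sweep.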

302:

303: \section{Statistical analysis\label{sec:stat}}

304:

305: The pussokram.com network consists of all registered users and the

306: communication flow between these users as described

307: above. Communication is conceived of as directed links between

308: users. This is translated into a graph of vertices (users) and arcs

309: (ties). Vertices are added to the network the first time a registered

310: user is active, i.e.\ the first time the user sends or receives a

311: message, signs a guest book, or sends or accepts a friendship request

312: as described above. Each of these interactions defines a unique

313: network, and by adding an arc for any activity one gets a total

314: network of online activities. We thus study five networks, and for

315: each of them the vertex set is empty at $t = 0$. We represent the

316: network as a directed graph, $G = (V, A)$, where $V$ is the vertex set

317: and $A$ is the set of arcs, or ordered pairs of vertices. $N = |V|$

318: denotes the order (number of vertices) of $G$, and $M = |A|$

319: represents the number of arcs. Sometimes we study properties of the

320: undirected graph obtained by taking the symmetric closure of

321: $G$.\footnote{I.e.\ the graph obtained by adding $(v,u)$ to $A$

322:   whenever $(u,v)\in A$ and $(v,u)\notin A$.}

323:

324: \begin{figure}

325:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{len.eps}}}

326:   \caption{Number of vertices (a) and average

327:     degree (b) as functions of time.}

328:   \label{fig:len}

329: \end{figure}

330:

331: \subsection{Decreasing growth rate of network size and convergence of

332:   average degree}

333:

334: For each network, the number of vertices, $N$, as a

335: function of time during the sampling is displayed in

336: Fig.~\ref{fig:len}(a), and the average degree, i.e.\ the average

337: number of arcs per vertex, $M / N$, is displayed in

338: Fig.~\ref{fig:len}(b). As can be seen, both the number of vertices and

339: the average degree are increasing as a function of time, but at a

340: decreasing growth rate. The average degree appears to converge to a

341: constant, but for $t < 100$, it increases as a power function. The

342: more rapid growth rate in the beginning of the period is explained by

343: the fact that old users log on for the first time during our sampling

344: period (see discussion in Section~\ref{sec:pok}). The decreasing

345: growth, and apparent approach to equilibrium, stand in contrast to the

346: accelerated growth of the Internet and the World Wide Web (Dorogovtsev

347: and Mendes 2002), as well as the linear growth of scientific

348: co-authorship networks extracted from article databases (Newman 2001;

349: Newman 2001; Barab\'{a}si, Jeong et al.\ 2002). However, in social

350: networks, the average degree cannot be increasing without bounds, and

351: this goes for scientific collaboration networks too. We believe the

352: difference stems from a wider effective sampling time frame---due to

353: the much more rapid dynamics of an Internet community (compared to

354: scientific collaborations) we are, relatively speaking, able to follow

355: the process for a much longer period. In the sense that $G$ is a

356: steadily growing dynamic network, we deal with a non-equilibrium

357: representation of the social situation. When we speak of the network

358: ``reaching equilibrium,'' we refer to when all quantities that are

359: bounded as a function of $N$ (such as the average degree) are reaching

360: their constant limits.
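Curves such as those in Fig.~\ref{fig:len} can be obtained from a single cumulative pass over a time-stamped log. The following Python sketch assumes a hypothetical, time-ordered list of \texttt{(t, sender, receiver)} events; the actual log format is not described here, and the function name \texttt{growth\_curves} is only illustrative.

```python
def growth_curves(events):
    """Return a list of (t, N, M / N) after each event of a time-ordered log.

    `events` holds (t, sender, receiver) triples. Repeated contacts between
    the same ordered pair add no new arc, since A is a set of ordered pairs.
    """
    vertices, arcs, curve = set(), set(), []
    for t, sender, receiver in events:
        vertices.update((sender, receiver))  # a vertex appears when first active
        arcs.add((sender, receiver))         # an arc is an ordered pair
        curve.append((t, len(vertices), len(arcs) / len(vertices)))
    return curve


# Toy log: three distinct arcs among three users.
print(growth_curves([(1, "a", "b"), (2, "b", "a"), (5, "a", "b"), (9, "c", "a")]))
```

Restricting the event list to one type of contact gives the corresponding curve for that network, and using all events gives the curve for the all-contacts network.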

361:

362: \begin{figure}

363:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{ass.eps}}}

364:   \caption{Reciprocity $R$ (a), and (b) assortative mixing coefficient

365:     $r_\mathrm{dir}$ as functions of time.}

366:   \label{fig:ass}

367: \end{figure}

368:

369: \subsection{Reciprocity varies between networks}

370:

371: Various types of social relations differ in direction, intensity, and

372: frequency (Granovetter 1973). Messages between agents with different

373: social status for example, tend to be unevenly distributed (Gould

374: 2002). In the present analysis, we can investigate the reciprocity of

375: communicative action by looking at the direction of the communication

376: flow between any two users. For example, if user A sends a friendship

377: request to user B, we observe a link between user A and user B, and

378: note an arc between the two vertices. But it makes quite a difference

379: whether user B accepts the invitation or not, i.e.\ whether we note one

380: or two arcs between the vertices. We define the reciprocity, $R$, as the

381: fraction of mutual dyads, i.e.\ the ratio between the number of

382: vertex-pairs $\{v,w\}$ that occur in two arcs ($(v,w)$ and $(w,v)$) and the number of

383: vertex-pairs that occur in at least one arc. More formally:

384: \begin{equation}

385: R=\frac{2M}{M_2}-1~,\label{eq:rec}

386: \end{equation}

387: where $M_2$ is the number of arcs in the symmetric closure of $G$ (i.e.\ $G$ with every arc reciprocated). $R$

388: lies in the interval $[0,1]$; if $(u,v)$ is an arc then $R =

389: 0$ implies that $(v,u)$ is not an arc and $R = 1$ implies that $(v,u)$

390: is an arc.
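As a brief check of Eq.~(\ref{eq:rec}), consider the following sketch, in which the symbols $m$ and $a$ are introduced only for illustration and are not used elsewhere in the paper. Let $m$ be the number of vertex pairs joined by arcs in both directions and $a$ the number of pairs joined by exactly one arc. Then $M = 2m + a$, and since $M_2$ counts every connected pair twice, $M_2 = 2(m + a)$, so that
\begin{equation*}
\frac{2M}{M_2}-1=\frac{2m+a}{m+a}-1=\frac{m}{m+a}=R~.
\end{equation*}
For instance, the arc set $\{(1,2),(2,1),(2,3)\}$ has $M = 3$ and $M_2 = 4$, giving $R = 6/4 - 1 = 1/2$: one of the two connected pairs is mutual.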

391:

392: \begin{table*}

393:   \caption{Assortative mixing coefficients, $r$, for five

394:     pussokram.com networks, and for nioki.com and arXiv.org

395:     networks. Statistics for corresponding randomized networks are

396:     within square brackets. Differences between the various mixing

397:     coefficients are discussed in the text. Double hyphens indicate

398:     missing data. Note: * $p\leq0.01$. The nioki.com and arXiv.org data are

399:     not tested for significance.}

400: \label{tab:ass}

401: \begin{ruledtabular}

402: \begin{tabular}{l|ccccccc}

403: \hline

404: network & $N$ & $r$ & $r_\mathrm{dir}$ & $r_\mathrm{in\: in}$ &

405: $r_\mathrm{in\: out}$ & $r_\mathrm{out\: in}$ & $r_\mathrm{out\:

406:   out}$\\

407: all contacts & 29{$\,$}341& {--}0.048* & {--}0.059* &  {--}0.063*&{--}0.046* & {--}0.071*& {--}0.050* \\

408:  & & [{--}0.043]& [{--}0.041]& [{--}0.028]&[{--}0.021] & [{--}0.049]& [{--}0.035]\\

409: messages & 21{$\,$}545 & {--}0.055* & {--}0.083* & 0.054* & {--}0.056* & {--}0.076* & {--}0.087* \\

410:  & & [{--}0.053] & [{--}0.061] & [{--}0.013] & [{--}0.011] & [{--}0.058] & [{--}0.057] \\

411: guest book & 20{$\,$}691 & {--}0.073* & {--}0.085* & {--}0.097* & {--}0.043* & {--}0.088* & {--}0.053* \\

412:  & & [{--}0.049] & [{--}0.038] & [{--}0.024] & [{--}0.015] & [{--}0.042] & [{--}0.026] \\

413: friends

414:  & 14{$\,$}278 & {--}0.042* & - - & - - & - - & - - & - - \\

415:  & & [0.031] & - - & - - & - - & - - & - - \\

416: flirts & 8{$\,$}186 & {--}0.12* & {--}0.12* & {--}0.006 & {--}0.022 & {--}0.12* & {--}0.042* \\

417:  & & [{--}0.12] & [{--}0.10] & [0.016] & [{--}0.002] & [{--}0.10] & [{--}0.013] \\

418: nioki.com & 50{$\,$}259 & {--}0.13 & {--}0.10 & {--}0.088 & {--}0.084 & {--}0.10 & {--}0.095 \\

419:  & & [{--}0.034] & [{--}0.014] & [{--}0.018] & [{--}0.014] & [{--}0.020] & [{--}0.016] \\

420: arXiv.org & 52{$\,$}909 & 0.36 & - - & - - & - - & - - & - - \\

421:  & & [{--}0.034] & - - & - - & - - & - - & - - \\

422: \end{tabular}

423: \end{ruledtabular}

424: \end{table*}

425:

426: The time evolution of the reciprocity can be seen in Fig.~\ref{fig:ass}(a). As is

427: evident from the figure, reciprocity levels differ between the

428: networks. By definition, the friendship network has

429: reciprocity of 1. And by the same token, the flirt network has a

430: reciprocity equal to zero. For the other three networks, the curves

431: converge to values around 0.4 for the guest book and messages

432: networks, and 0.5 for the all contacts network (see Table~\ref{tab:ass}).

433: It is hard to judge whether these are high or low values of

434: reciprocity. They are, however, compatible with data for the French

435: Internet community nioki.com. We normally assume acquaintance networks

436: to have a high degree of reciprocity, but one reason to expect a lower

437: value for online interaction is that an actor feels less social

438: pressure to respond to a communicative act over the Internet than in a

439: face-to-face or telephone encounter, for example.

440:

441: \subsection{Disassortative mixing coefficients of the pussokram.com networks}

442:

443: Together with the degree distribution, the degree-degree correlation

444: is considered to govern much of the network's robustness towards

445: disturbances as well as the information flow. In other contexts the

446: discussion is usually phrased in terms of resilience against epidemics

447: and attack. A positive degree-degree correlation is also referred to

448: as assortative mixing by degree, and it means that vertices of

449: high degree preferentially attach to each other, and vice versa. For

450: example, assortative mixing makes the networks more vulnerable to

451: outbreaks of diseases, and more robust against strategic attack

452: (Newman 2002), because if people with many contacts are connected to

453: other people with many contacts, the epidemic threshold will be

454: lowered. Disassortative mixing, on the other hand, gives rise to

455: larger epidemics (Morris and Kretzschmar 1995).

456:

457:

458: We measure assortative mixing by calculating Pearson's correlation

459: coefficient $r$ for the degrees at either side of an edge as suggested

460: by Newman (2002):

461: \begin{equation}\label{eq:r}

462: r=\frac{\langle k_\mathrm{to} k_\mathrm{from}\rangle -\langle

463:   k_\mathrm{to}\rangle\langle k_\mathrm{from}\rangle} {\sqrt{\langle

464:   k_\mathrm{to}^2\rangle-\langle k_\mathrm{to}\rangle^2}\sqrt{\langle

465:   k_\mathrm{from}^2\rangle-\langle k_\mathrm{from}\rangle^2}}

466: \end{equation}

467:

468: In Eq.~(\ref{eq:r}), $\langle\cdots\rangle$ denotes the average

469: over arcs, $k_\mathrm{from}$ is some (in-, out-, or total) degree of

470: the vertex that the arc starts from, and $k_\mathrm{to}$ is some degree of the

471: vertex that the arc leads to. We look at $r$ for total degree of both

472: bi-directional (where the symmetric closure has been taken if the

473: network is not bi-directional by definition) and directed graphs,

474: $r_\mathrm{dir}$. Furthermore, we measure the four combinations of in-

475: and out-degree correlations; e.g.\ the out-in correlation coefficient

476: indicates whether users that have many contacts (high out-degree)

477: prefer to communicate with those users that themselves receive

478: communication from many users (high in-degree).
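To make the definition concrete, the following Python sketch evaluates Eq.~(\ref{eq:r}) directly as a Pearson correlation over arcs. The function name \texttt{mixing\_coefficient} and its \texttt{from\_kind} and \texttt{to\_kind} arguments are hypothetical, and the sketch assumes that both degree samples have non-zero variance; it is meant as an illustration of the formula rather than as the code used for Table~\ref{tab:ass}.

```python
from collections import Counter
from math import sqrt


def mixing_coefficient(arcs, from_kind="out", to_kind="in"):
    """Pearson correlation, over arcs, between a degree of the source
    vertex and a degree of the target vertex.

    `from_kind` and `to_kind` are each "in", "out", or "total".
    Raises ZeroDivisionError if either degree sample has zero variance.
    """
    k_out = Counter(v for v, _ in arcs)
    k_in = Counter(w for _, w in arcs)

    def degree(u, kind):
        if kind == "in":
            return k_in[u]
        if kind == "out":
            return k_out[u]
        return k_in[u] + k_out[u]  # "total"

    x = [degree(v, from_kind) for v, _ in arcs]  # k_from for every arc
    y = [degree(w, to_kind) for _, w in arcs]    # k_to for every arc
    n = len(arcs)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    var_y = sum(b * b for b in y) / n - mean_y ** 2
    return cov / sqrt(var_x * var_y)


# A star with arcs in both directions between the hub 0 and three leaves:
# every arc joins the high-degree hub to a low-degree leaf.
star = [(0, i) for i in (1, 2, 3)] + [(i, 0) for i in (1, 2, 3)]
print(mixing_coefficient(star, "total", "total"))
```

Passing \texttt{"out"} and \texttt{"in"} gives the out-in coefficient described above, and the other three combinations follow in the same way. For the bi-directional coefficient $r$, the arc list would first be completed so that every arc appears in both directions.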

479:

480: The values for pussokram.com and other networks are displayed in

481: Table~\ref{tab:ass}. Interestingly enough, all the pussokram.com networks,

482: as well as the nioki.com network, display a significant disassortative

483: mixing for all types of degree-degree correlations. This is in

484: contrast to what has been measured for (scientific-, actor-, and

485: business-) collaboration networks (Newman 2002). To set these results

486: in perspective we also measure $r$ for a scientific collaboration

487: network, which clearly displays a positive assortative mixing

488: coefficient. Maybe an assortative mixing is significant only to

489: interaction in competitive areas, such as professional collaborations

490: (where only already big names are likely to be successful in

491: collaborating with other big names). This result relates to research

492: on exchange networks that claims that negative mixing is optimal when

493: actors are substitutable, as for example in friendship and dating

494: networks (Cook, Emerson et al.\ 1983). In contrast, mixing in professional

495: collaboration is positive because both knowledge and already

496: established channels for cooperation screen off potential alternative

497: collaborators. Another issue is the skewness of the degree

498: distribution. Intuitively, a large spread in the degree distribution

499: will increase the likelihood of observing negative mixing. And as can

500: be seen from the randomized networks in Table~\ref{tab:ass}, given the degree

501: distribution we would expect a negative mixing coefficient. However,

502: the observed coefficients are consistently, and significantly, more negative

503: than expected. This strongly suggests that negative mixing arises from

504: this particular form of social interaction in which alters are

505: substitutable (Cook, Emerson et al.\ 1983). Note though, that some

506: network models, analyzing completely different forms of interaction,

507: with skewed degree distributions produce networks of zero or positive

508: assortative mixing (Newman 2002; Park and Newman 2003).

509:

510:

511: The six different assortative mixing coefficients of Table~\ref{tab:ass}

512: are all of the same sign and roughly of the same magnitude. This is

513: interesting since it suggests that the $r$-values are a result of other

514: structures (presumably the degree-sequence) rather than from the

515: behavior of individuals: There are no a priori reasons for

516: $r_\mathrm{out\: in}$ to be the same as e.g.\ $r_\mathrm{in\: in}$, as

517: a large $r_\mathrm{out\: in}$ means that actors that are active in the

518: community (have a high $k_\mathrm{out}$) tend to associate with those

519: who are successful in promoting themselves in the community (have a

520: high $k_\mathrm{in}$), while a large $r_\mathrm{in\: in}$ means that

521: the latter category has a preference towards each other.

522:

523: Fig.~\ref{fig:ass}(b) shows the time development of the assortative

524: mixing coefficient $r_\mathrm{dir}$ (the time development of the other assortative

525: mixing coefficients of Table~\ref{tab:ass} is qualitatively

526: similar). We see that $r_\mathrm{dir}$ converges more quickly than the

527: average degree. This is not surprising since the correlation

528: coefficient is a function of the way ties are formed rather than the

529: size or average degree of the network. An interesting detail of

530: Fig.~\ref{fig:ass}b is the jump at $t\approx 300$ days in the flirt

531: (friendship request) network. This is due to the formation of a tie

532: between two of the most connected actors. (The fact that the flirt

533: network is by far the sparsest strengthens this effect.)

534:

535: \subsection{Cumulative degree distributions are highly skewed}

536:

537: \begin{figure*}

538:   \centering{\resizebox*{0.7\linewidth}{!}{\includegraphics{deg.eps}}}

539:   \caption{Cumulative degree distribution for the networks at the

540:     largest times, for all contacts (a), friendship confirmations and

541:     messages (b), guest book (c), and flirts (d).}

542:   \label{fig:deg}

543: \end{figure*}

544:

545: The degree distribution has received much attention in comparative

546: analyses of complex networks since the work of Barab\'{a}si and Albert

547: (1999). A skewed degree distribution is commonly regarded as a

548: cumulative effect in the attachment of new arcs to the network (Simon

549: 1955; Barab\'{a}si and Albert 1999), and it offers a way to classify

550: different types of networks (Amaral, Scala et al.\ 2000). Indeed it

551: has been demonstrated that many apparently dissimilar types of

552: networks share the same highly skewed degree distributions of a

553: (truncated) power-law form (Albert and Barab\'{a}si 2002), indicating

554: an emerging scale-free structure. Such degree distributions are

555: generated through a growth process in which new arcs are drawn between

already existing vertices and new vertices only. However, a process
that reasonably describes the activity of an Internet community would
also allow new arcs to be drawn between two already existing
vertices. Such a mixed process would result in a stretched
exponential distribution rather than a power-law, and a stretched
exponential distribution is thus what we would expect to observe. Another

562: process that can be responsible for cutting the tails of power-law

563: degree distributions in real-world networks is a limited capacity of

564: the actors.
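
For reference, the functional forms contrasted above can be written, in
one common parameterization, as follows. The symbols $k_c$, $k_0$ and
$\beta$ are introduced here for illustration only and are not fitted to
our data:
\begin{equation*}
p(k)\sim k^{-\gamma}~,\qquad
p(k)\sim k^{-\gamma}\,e^{-k/k_c}~,\qquad
p(k)\sim \exp\left[-(k/k_0)^{\beta}\right]~,
\end{equation*}
i.e.\ a pure power-law, a power-law truncated by an exponential cut-off
at $k_c$, and a stretched exponential with $0<\beta<1$. On a
double-logarithmic plot the first form is a straight line, the second
bends down sharply around $k\approx k_c$, and the third curves downward
gradually over the whole range.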

565:

Following Liljeros, Edling et al.\ (2001), we measure the cumulative
degree distribution of all the pussokram.com networks, see
Fig.~\ref{fig:deg}. If the degree distribution follows a power-law
$p(k)\sim k^{-\gamma}$, then the cumulative distribution
$P(k)\sim k^{-\alpha}$ will have the exponent $\alpha = \gamma - 1$.
All pussokram.com networks are
highly skewed, but none of them fits a power-law form across the whole
range observed. However, it is interesting to note that there are no
clear signs of the (inevitable) high-degree truncation in any of the
graphs (Fig.~\ref{fig:deg}). A previous study of the French community
nioki.com reported a power-law fit of the cumulative degree distribution
(Smith 2002). Our result might appear to set the pussokram.com
community apart from the nioki.com community, but a closer inspection
of our graphs and those of Smith (2002) reveals a striking similarity in the
functional form of the distribution. We therefore conclude that the
dynamics shaping the degree distribution are to a large extent the same
for the two communities.
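
The relation between the exponents of the degree distribution and of its
cumulative form, used above, can be checked with a short calculation. As
a sketch, assume a pure power-law $p(k)\sim k^{-\gamma}$ with
$\gamma>1$ and treat $k$ as a continuous variable (both are simplifying
assumptions):
\begin{equation*}
P(k)=\int_k^\infty p(k')\,\mathrm{d}k' \sim \int_k^\infty
k'^{-\gamma}\,\mathrm{d}k' = \frac{k^{-(\gamma-1)}}{\gamma-1}~.
\end{equation*}
The cumulative distribution thus decays as $k^{-(\gamma-1)}$: its slope
on a double-logarithmic plot, $-(\gamma-1)$, equals the slope of the
degree distribution, $-\gamma$, plus one. A degree distribution with
$\gamma=3$, for example, would appear with slope $-2$ in a cumulative
plot such as Fig.~\ref{fig:deg}.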

582:

583: \subsection{Evolution of average geodesic length}

584:

585: As a general measure of how closely connected a graph is, the average

586: geodesic (shortest path) length is one of the most studied network

quantities. There is no unique natural definition of average geodesic
length in an arbitrary directed graph; the problem is the
contribution from disconnected pairs of vertices. One choice is to

590: measure the geodesic distance averaged over pairs of vertices in the

591: giant component:

592: \begin{equation}

593: l_\mathrm{GC}=\frac{1}{|A_\mathrm{GC}|} \sum_{(u,v)\in A_\mathrm{GC}}d(u,v)~,

594: \end{equation}

595: where $d(u, v)$ is the distance between $u$ and $v$, and

596: $A_\mathrm{GC}$ is the arc-set of the giant component. Another option

597: is to average the inverse geodesic length (Latora and Marchiori 2001),

598: \begin{equation}

599: l^{-1}=\frac{1}{M} \sum_{(u,v)\in A}\frac{1}{d(u,v)}~,

600: \end{equation}

601: where $1/d(u, v)$ is defined as zero when no path exists from $u$ to

602: $v$. In the present paper we focus on $l^{-1}$, and $l_\mathrm{GC}$

603: for the reflexive closure of $G$. If the two measures agree, we can

604: infer that there is no additional effect influencing the shortest

605: paths in a substantial way, other than the bi-directional structure of

606: the largest connected subgraph.

607:

608: \begin{figure}

609:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{xlen.eps}}}

  \caption{Time evolution of (a) the average geodesic length within
    the giant component of the reflexive closure and (b) the average
    inverse geodesic length.}

613:   \label{fig:xlen}

614: \end{figure}

615:

As time evolves there are two conflicting mechanisms governing the
average geodesic length: the increasing number of vertices tends to
increase $l$, whereas the increasing average degree makes $l$
shorter. For the pussokram.com data the latter effect dominates
during the time span of our data set, giving a monotonically
decreasing $l_\mathrm{GC}$ (monotonically increasing $l^{-1}$) as shown
in Fig.~\ref{fig:xlen}. The same situation has been reported for
scientific collaboration networks (Barab\'{a}si, Jeong et al.\
2002). Assuming the community outlives its members, $l$ will
eventually start to increase (when the number of inactive users slows
down the accelerated growth sufficiently).
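
The competition between the two mechanisms can be illustrated with a
standard estimate for random graphs, used here only as a rough guide and
not as a model of the pussokram.com data. For a random graph with $N$
vertices and average degree $z$ (a symbol introduced here for
illustration), the typical geodesic length scales as
\begin{equation*}
l\approx\frac{\ln N}{\ln z}~,\qquad\mbox{so that}\qquad
\frac{\mathrm{d}l}{l}=\frac{\mathrm{d}(\ln N)}{\ln N}-
\frac{\mathrm{d}(\ln z)}{\ln z}~.
\end{equation*}
Growth in $N$ at fixed $z$ lengthens the geodesics only logarithmically,
while growth in $z$ at fixed $N$ shortens them. In this estimate $l$
decreases whenever $\ln z$ grows proportionally faster than $\ln N$.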

627:

628: \subsection{Density of short circuits}

629:

630: Acquaintance networks are expected to have a high degree of

631: transitivity (Wasserman and Faust 1994), or in other words, a high

632: density of triangles, since if person A knows person B and person C,

633: then person B and person C are likely to be acquainted. We apply a

634: commonly used measure that gives the fraction of triangles out of the

635: connected 3-paths of the graph (a quantity that was defined for

636: undirected graphs, but is trivially generalized to directed graphs,

637: for which we use subscript ``dir''). If we let $p(n)$ denote the

number of representations of paths\footnote{A representation of a path
  of length three is a triplet $(u,v,w)$ such that $(u,v)$ and $(v,w)$
  are arcs. In an undirected network a path has two representations
  and a triangle has six representations.} and $c(n)$ denote the
number of representations of circuits, of length $n$, then we can
express the clustering coefficient,\footnote{This quantity is
  sometimes called transitivity, sometimes clustering
  coefficient. Note, however, that it is not identical to Watts and
  Strogatz's (1998) clustering coefficient (where they average a local
  transitivity measure over the vertex set).} $C$, as:

648: \begin{equation}

649: C=\frac{c(3)}{p(3)}

650: \end{equation}

One can expect social networks with many heterosexual romantic
relationships, such as the pussokram.com networks, to have rather few

653: triangles.\footnote{Presumably, homosexual relationships are not the

654:   common type of romantic relationship among Swedish

655:   adolescents. Therefore we expect few triangles. As a corollary, in a

656:   community populated largely by homosexual individuals, the number of

657:   triangles would be much higher. Regrettably we cannot test this

658:   hypothesis with available data.

659: } To get a better picture of the density of short circuits we also

660: measure the density of circuits of length four:

661: \begin{equation}

662: D=\frac{c(4)}{p(4)}

663: \end{equation}

The $n$-behavior of $c(n) / p(n)$ varies from network to network, and
could possibly be an informative quantity in itself. A very high $C$
will in most cases probably imply a high $D$ (for a network with $R = 1$, two
triangles with one arc in common will contribute to $c(4)$), but the
reverse is less certain.
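
As a small worked illustration of the definition of $C$ (assuming, as we
read the footnote on representations, that the vertices $u$, $v$, $w$
of a representation are distinct), consider a hypothetical undirected
graph on the vertices $a$, $b$, $c$, $d$ with edges $ab$, $bc$, $ca$ and
$ad$, i.e.\ a triangle with one pendant vertex. A vertex of degree $k$
is the middle vertex of $\binom{k}{2}$ undirected paths, each with two
representations, and the single triangle has six representations:
\begin{equation*}
p(3)=2\left[\binom{3}{2}+\binom{2}{2}+\binom{2}{2}+\binom{1}{2}\right]
=2\,(3+1+1+0)=10~,\qquad c(3)=6~,\qquad C=\frac{6}{10}=0.6~.
\end{equation*}
This agrees with the familiar form of the transitivity, three times the
number of triangles divided by the number of connected triples, which
here gives $3\times 1/5=0.6$.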

669:

\begin{table*}
  \caption{Statistics for the fully-grown pussokram.com networks, with
    the nioki.com and arXiv.org networks provided for
    comparison. Statistics for corresponding randomized networks are
    within square brackets. Double hyphens indicate missing
    data. Note: * $p\leq 0.01$. ${}^\dagger$The `friends' and
    `arXiv.org' data sets are undirected; $M$ denotes the number of
    undirected edges (which is half the value of $M$ in a directed
    representation of the graph). nioki.com and arXiv.org data are not
    tested for significance.
}
\label{tab:misc}
\begin{ruledtabular}
\begin{tabular}{l|ccccccc}
\hline
network & all contacts & messages & guest book & friends & flirts & nioki.com & arXiv.org\\
$N$ & 29{$\,$}341 & 20{$\,$}691 & 21{$\,$}545 & 14{$\,$}278 & 8{$\,$}186 & 50{$\,$}259 & 52{$\,$}909 \\
$M$ & 174{$\,$}662 & 76{$\,$}257 & 73{$\,$}346 & 31{$\,$}871$^\dagger$ & 8{$\,$}744 & 405{$\,$}742 & 490{$\,$}600$^\dagger$\\
$R$ & 0.51 & 0.40 & 0.38 & 1 & 0 & 0.69 & 1\\
$l_\mathrm{GC}$ & 4.4 & 4.3 & 4.6 & 5.1 & 5.7 & 4.1 & 6.1\\
$l^{-1}$ & 0.12 & 0.10 & 0.084 & 0.18 & $4.0\times 10^{-4}$ & 0.209 & 0.121\\
$C$ & 0.006 & 0.001* & 0.014* & 0.020* & 0 & 0.0065 & 0.45\\
 & [0.006] & [0.002] & [0.007] & [0.0044] & [0.001] & [0.0081] & [0.0020]\\
$C_\mathrm{dir}$ & 0.012* & 0.005* & 0.014* & - - & 0* & 0.0076 & - -\\
 & [0.007] & [0.003] & [0.005] & & [0] & [0.0077] & \\
$D$ & 0.017 & 0.006* & 0.022* & 0.020* & 0.212* & 0.013 & 0.35\\
 & [0.009] & [0.004] & [0.008] & [0.004] & [0.004] & [0.0081] & [0.0021]\\
$D_\mathrm{dir}$ & 0.016* & 0.008* & 0.015* & - - & 0 & 0.016 & - -\\
 & [0.007] & [0.003] & [0.005] & & [0] & [0.0077] & \\
\end{tabular}
\end{ruledtabular}
\end{table*}

706:

Values for $C_\mathrm{dir}$ and $D_\mathrm{dir}$ and their undirected
counterparts are shown in Table~\ref{tab:misc}. We note that, with a
few exceptions, the values for the real networks are significantly
larger than those for the randomized networks; the difference, however,
is far less dramatic than for the scientific collaboration network. This
contrast between the Internet community networks and the arXiv.org
data is easily explained by the fact that a paper with
$n_\mathrm{auth}\geq 3$ authors represents a fully connected subgraph of
$G$ (contributing with $n_{\mathrm{auth}}(n_{\mathrm{auth}} {-}1)
(n_{\mathrm{auth}} {-}2) / 3$ triangles). However, we would like to
stress that the values themselves are not very informative, compared
to their time dependence.
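
The triangle count for a fully connected subgraph depends on what is
counted. The following bookkeeping is one consistent reading of the
figure quoted above, not an additional result; $n$ is shorthand for
$n_\mathrm{auth}$. A complete graph on $n$ vertices contains
\begin{equation*}
\binom{n}{3}=\frac{n(n-1)(n-2)}{6}
\end{equation*}
undirected triangles. If every edge is represented by a pair of opposite
arcs, each triangle corresponds to two directed 3-circuits (one per
orientation), giving $n(n-1)(n-2)/3$, and to six representations in the
sense used for $c(3)$, giving $n(n-1)(n-2)$. For $n=4$ these counts are
4, 8 and 24, respectively.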

719:

720: \begin{figure*}

721:   \centering{\resizebox*{0.65\linewidth}{!}{\includegraphics{clu.eps}}}

722:   \caption{Density of short circuits for the different networks (flirt

723:     network omitted as it contains very few 3- and 4-circuits).

724:   }

725:   \label{fig:clu}

726: \end{figure*}

727:

728: The time development of $C$ and $D$ for different networks is shown in

729: Fig.~\ref{fig:clu}. As a quantity dependent on only the local network

730: structure the density of short circuits is an intrinsic quantity; and,

731: as seen for the clustering coefficient (Barab\'{a}si, Jeong et al.\ 2002),

732: these quantities approach their equilibrium values from

733: above. Interestingly, just as for the assortative mixing coefficient,

734: the relaxation towards equilibrium is faster for $C$ and $D$ than for

the average degree $M / N$; i.e.\ the density of short circuits is

736: rather independent of the average degree.

737:

738:

As can be seen in Fig.~\ref{fig:clu}, most $C$ and $D$ curves have
extrema in the middle of the time range (where the densities of short
circuits are at their minima). This results from a
conflict between counteracting mechanisms with different
time-scales. There are three natural time-scales in the system: the
average time between new registrations; the average time between new
contacts for an individual user; and the average life span of a user
in the community. The last time-scale should be responsible for the
long-term behavior such as the increase towards equilibrium of $M /
N$. Since short circuits are more likely in a dense network, it is
natural that $C$ and $D$ increase in the large $t$ limit. The decrease
at early times is a finite-size effect that can be seen in evolving
network models with constant average degree such as the
Barab\'{a}si-Albert model (Barab\'{a}si and Albert 1999; Barab\'{a}si,
Albert et al.\ 1999; Barab\'{a}si, Jeong et al.\ 2002) and extensions
(Holme and Kim 2002), where the $C$ and $D$ curves converge from
above.

756:

757:

Another interesting aspect is that the values of $C$ and $D$, although
finite in the large $t$ limit, are much smaller than in the actor and
scientific collaboration networks. In an Internet community the way by
which people introduce strangers among their acquaintances to each
other (Newman 2001; Holme and Kim 2002) is likely not the mechanism
responsible for the finite clustering (remember that in network models
such as the Erd\H{o}s-R\'{e}nyi (1959) and Barab\'{a}si-Albert
(Barab\'{a}si and Albert 1999; Barab\'{a}si, Albert et al.\ 1999;
Barab\'{a}si, Jeong et al.\ 2002) models the clustering goes to zero as
the network grows). Instead a finite density of short circuits can be

768: explained by the tendency formulated in the proverbial

769: like-attracts-like, where the similarity is defined by signaled

770: social, psychological, and physiological traits.\footnote{Another

771:   possible explanation for the convergence of $C$ and $D$ to finite

772:   values is that short circuits are introduced from the offline world

773:   outside the community. Reading users' guest books, however, gives

774:   the impression that the vast majority of community-dyads were

775:   strangers offline. We believe that this effect is negligible, but we

776:   are unfortunately unable to go beyond speculation on this point.}

777:

778: \begin{figure}

779:   \centering{\resizebox*{\linewidth}{!}{\includegraphics{rewi.eps}}}

780:   \caption{Time evolution of original and rewired quantities. (a)

781:     shows data for the assortative mixing coefficient $r$ for the

782:     undirected all-contacts network, (b) is the clustering coefficient

783:     for the same data. The rewired data is obtained from 100 updating

784:     sweeps over all links, and indicated by the upper and lower hinges

785:     (border values between the first and second quartile, and third

786:     and fourth quartile respectively).}

787:   \label{fig:rewi}

788: \end{figure}

789:

To further convince ourselves that the sampling time is large enough
we also use rewiring to examine the time evolution of two structural
measures (the assortative mixing coefficient and the clustering
coefficient for the undirected all-contacts network). As seen in
Fig.~\ref{fig:rewi} the rewired quantities converge on the same time
scale as $r$ and $C$, which reconfirms that the sampling time frame is
sufficient. We note that for $t > 200$ days the assortative mixing
coefficient is significantly lower than the rewired reference
curve. For the same time interval the rewired clustering coefficient
closely overlaps the measured $C$-value; for $t > 200$ days the actual
value overlaps the
mid-quartiles of the rewired data during around 30\% of the 512
days. For the initial `non-equilibrium' part ($t < 100$ days) of the
time evolution the curves of the rewired and real networks
diverge. In this region the network is rather sparse (see
Fig.~\ref{fig:rewi}), which explains the low values of the rewired
$C$-curve. The high early values of $C$ seem to contradict the
apparent absence of a tendency towards triangle formation at later
times. This means that the contact patterns of the early network are not
the same as later on. As it turns out, in the early community, a group
of actors contact each other rather frequently (more like
`chatting' than romantic contact making) whereas another group makes a
few contacts before quitting the community. We interpret this as
indicating that a minimal number, or ``critical mass'' (cf.\
Schelling 1978), of people is required for the community to function.
Before the
critical mass is reached, the users either use the community as a
chat room (a usage with a presumably smaller critical mass) or leave
it.

816:

817: \section{Summary and conclusions}

818:

819:

820: We have investigated networks of communication between the users of

the Internet community pussokram.com. The four different means of
contact at pussokram.com define five different networks in our study

823: (one for each separately and one for all taken together). Apart from

824: recent studies of scientific collaboration networks and movie actor

825: networks, there are very few such phenomenological descriptions of

826: large social networks, and thus there is limited knowledge that our

827: findings can be related to.

828:

829:

It is obvious that the fact that the interaction under study takes
place on the Internet creates special conditions for communication. We
believe that online interaction is exposed to fewer structural
forces than is typically the case in most other social
settings. For example, simultaneous interaction is not a prerequisite
for communication in an Internet community, so time as a structural
force is of less importance than in most other
settings. Neither does geographical space constrain
communication. In addition, the lower visibility of social signifiers
(compared to e.g.\ face-to-face interaction) and the relative
ease with which you can conceal your identity and transform your
appearance in online interaction are factors reducing the
structure-forming forces at work in `offline' social activity. It is
therefore interesting to note that, despite these caveats, the networks
under study here are much more structured than what would be expected
in a random network.

846:

847:

848: To summarize our findings of the Internet community pussokram.com, we

849: see that:

850: \begin{itemize}

\item The average degree converges over time, but surprisingly we
    observe no cut-off in the degree distribution. Previous studies do
    suggest that there is an upper limit to the mean number of
    contacts (Marsden 1987), and on average we find this
    socio-cognitive limitation despite the fact that time and space are
    of less importance here. The reason we see continued growth in the
    cumulative degree distribution might be that it is relatively
    costless to have a high turnover of one's contacts in an online
    community. Contacts are established without much investment, and
    can also be dropped without much sanctioning.

\item Reciprocity is rather low, and presumably lower than can be expected
  in a regular acquaintance network. Reciprocity levels quickly
  converge to a steady state.

\item Most assortative mixing coefficients have small negative values,
  suggesting a pattern of disassortative mixing. This can partly be
  explained by the conventional effect from the skewed degree sequence
  (Newman 2002). The observed effect is significantly larger than can
  be expected solely from the degree distribution. An explanation for
  these higher $r$-values is the particular nature of the dating
  interaction (Cook, Emerson et al.\ 1983). We also find that mixing
  coefficients as a function of time converge rapidly. The
  disassortative mixing in the Internet community networks is in
  striking contrast to the strong assortative mixing seen in
  scientific collaboration networks, and the nice correspondence with
  previous work in sociology indicates that Internet communities
  indeed strongly resemble off-line social communities.

877: \item The cumulative degree distributions are highly skewed, being a

878:   mixture of previous mappings of acquaintance networks (Amaral, Scala

879:   et al.\ 2000)---for few contacts---and partnership networks

880:   (Liljeros, Edling et al.\ 2001)---for many contacts.

\item Adding new vertices to the network tends to increase the
 geodesic length, but this increase is counteracted by the growing
 average degree. Both $l_\mathrm{GC}$ and
 $l^{-1}$ show consistently that the average geodesic length is
 decreasing during the whole sample period (a situation that can only
 exist for a non-equilibrium network).

887: \item Clustering---the density of triangles---converges over time to

888:   non-zero values (as opposed to completely random networks). Still,

889:   values are probably on a much lower level than would be expected in

890:   offline acquaintance networks. The explanation for these low values

891:   is twofold---the lack of introduction as a mechanism for

892:   tie-formation, and the romantic profile of pussokram.com promoting

893:   romantic contacts. The latter aspect is also manifested in that the

894:   density of 4-circuits is larger than the density of triangles for

895:   the pussokram.com networks. Once again, the Internet community

896:   networks are different from the scientific collaboration network

897:   where clustering is larger than the density of 4-circuits.

898: \end{itemize}

An Internet community such as pussokram.com defines a structured
social network that shares more of the structuring forces with general
acquaintance networks than networks of professional collaborations
do. We believe that the precise timing resolution and fast dynamics
(giving a wide effective sampling time-frame) will make Internet
communities an invaluable object for future social network studies of
the largest scale.

906:

907: \section*{References}

908:

``e-print arXiv:'' refers to (as yet unpublished) manuscripts uploaded to
the database arXiv.org.

911:

912: \begin{list}{}{\setlength{\leftmargin}{5mm}\setlength{\rightmargin}{0mm}

913:     \setlength{\labelsep}{5mm}\setlength{\parsep}{2mm}

914:     \setlength{\itemindent}{-5mm}

915:     \setlength{\listparindent}{0mm}\setlength{\labelwidth}{0mm}

916:     \setlength{\itemsep}{0mm}\setlength{\partopsep}{0mm}}

917:

\item Albert, R.\ and A.\ L.\ Barab\'{a}si (2002).\ ``Statistical
  mechanics of complex networks.'' \textit{Reviews of Modern Physics} \textbf{74}: 47-97.

920: \item Amaral, L.\ A.\ N., A.\ Scala, et al.\ (2000). ``Classes of

921:   small-world networks.'' \textit{Proceedings of the National Academy of

922:   Sciences of the United States of America}

923:   \textbf{97}(21): 11149-11152.

924: \item Anderson, B.\ S., C.\ Butts, et al.\ (1999). ``The interaction

925:   of size and density with graph-level indices.'' \textit{Social Networks}

926:   \textbf{21}(3): 239-267.

927: \item Barab\'{a}si, A.\ L.\ and R.\ Albert (1999). ``Emergence of

928:   scaling in random networks.'' \textit{Science} \textbf{286}(5439): 509-512.

929: \item Barab\'{a}si, A.\ L., R.\ Albert, et al.\ (1999). ``Mean-field

930:   theory for scale-free random networks.'' \textit{Physica A} \textbf{272}(1-2):

931:   173-187.

932: \item Barab\'{a}si, A.\ L., H.\ Jeong, et al.\ (2002). ``Evolution of

933:   the social network of scientific collaborations.'' \textit{Physica A} \textbf{299}:

934:   559-564.

935: \item Butts, C.\ T.\ (2001). ``The complexity of social networks:

936:   theoretical and empirical findings.'' \textit{Social Networks} \textbf{23}(1): 31-71.

937: \item Cook, K.\ S., R.\ M.\ Emerson, et al.\ (1983). ``The

938:   distribution of power in exchange networks: Theory and experimental

  results.'' \textit{American Journal of Sociology} \textbf{89}(2): 275-305.

940: \item Dorogovtsev, S.\ N.\ and J.\ F.\ F.\ Mendes (2002). Accelerated

941:   growth of networks. Handbook of Graphs and Networks: From the Genome

942:   to the Internet.\ S.\ Bornholdt and H.\ G. Schuster.\ Berlin, Wiley-VCH.

943: \item Dorogovtsev, S.\ N.\ and J.\ F.\ F.\ Mendes (2002). ``Evolution

944:   of networks.'' \textit{Advances in Physics} \textbf{51}(4): 1079-1187.

945: \item Ebel, H., L.\ I.\ Mielsch, et al.\ (2002). ``Scale-free topology

946:   of e-mail networks.'' \textit{Physical Review E} \textbf{66}, art.\ no.\ 035103.

\item Erd\H{o}s, P.\ and A.\ R\'{e}nyi (1959). ``On random graphs.''
  \textit{Publicationes Mathematicae Debrecen} \textbf{6}: 290-297.

949: \item Fararo, T.\ J.\ and M.\ H.\ Sunshine (1964). A study of a biased

950:   friendship net. Youth Development Center, Syracuse University,

951:   Syracuse.

\item Gould, R.\ V.\ (2002). ``The origins of status hierarchies: A

953:   formal theory and empirical test.'' \textit{American Journal of Sociology}

954:   \textbf{107}(5): 1143-1178.

955: \item Granovetter, M.\ (1973). ``Strength of weak ties.'' \textit{American

956:   Journal of Sociology} \textbf{78}(6): 1360-1380.

957: \item Holme, P.\ and B.\ J.\ Kim (2002). ``Growing scale-free networks

958:   with tunable clustering.'' \textit{Physical Review E} \textbf{65}(2): art.\ no.\ 026107.

959: \item Latora, V.\ and M.\ Marchiori (2001). ``Efficient behavior of

960:   small-world networks.'' \textit{Physical Review Letters} \textbf{87}(19): art.\ no.\

961:   198701.

962: \item Liljeros, F., C.\ R.\ Edling, et al.\ (2001). ``The web of human

963:   sexual contacts.'' \textit{Nature} \textbf{411}(6840): 907-908.

964: \item Marsden, P.\ V.\ (1987). ``Core discussion networks of

965:   Americans.'' \textit{American Sociological Review} \textbf{52}(1): 122-131.

966: \item Morris, M.\ and M.\ Kretzschmar (1995). ``Concurrent

967:   Partnerships and Transmission Dynamics in Networks.'' \textit{Social

968:   Networks} \textbf{17}(3-4): 299-318.

969: \item Newman, M.\ E.\ J.\ (2001). ``Clustering and preferential

970:   attachment in growing networks.'' \textit{Physical Review E} \textbf{64}(2): art.\

971:   no.\ 025102.

972: \item Newman, M.\ E.\ J.\ (2001). ``Scientific collaboration

973:   networks. I. Network construction and fundamental results.''

974:   \textit{Physical Review E} \textbf{64}(1): art.\ no.\ 016131.

975: \item Newman, M.\ E.\ J.\ (2002). ``Assortative mixing in networks.''

976:   \textit{Physical Review Letters} \textbf{89}, art.\ no.\ 208701.

977: \item Newman, M.\ E.\ J.\ (2003). ``The structure and function of

978:   complex networks.'' \textit{SIAM Review} \textbf{45}(2): 167-256.

979: \item Park, J.\ and M.\ E.\ J.\ Newman (2003). ``Origin of degree

980:   correlations in the Internet and other networks.'' \textit{Physical Review E}

981:   \textbf{68}(2), art.\ no.\ 026112.

982: \item Pattison, P., S.\ Wasserman, et al.\ (2000). ``Statistical

983:   evaluation of algebraic constraints for social networks.'' \textit{Journal

984:   of Mathematical Psychology} \textbf{44}: 536-568.

985: \item Roberts, J.\ M.\ (2000). ``Simple methods for simulating

986:   sociomatrices with given marginal totals.'' \textit{Social Networks} \textbf{22}(3):

987:   273-283.

988: \item Rothaermel, F.\ T.\ and S.\ Sugiyama (2001). ``Virtual Internet

989:   communities and commercial success: individual and community-level

990:   theory grounded in the atypical case of TimeZone.com.'' \textit{Journal of

991:   Management} \textbf{27}(3): 297-312.

992: \item Schelling, T.\ C.\ (1978). Micromotives and macrobehavior.\ New

993:   York, Norton.

994: \item Shen-Orr, S.\ S., R.\ Milo, et al.\ (2002). ``Network motifs in

995:   the transcriptional regulation network of Escherichia coli.'' \textit{Nature

996:   Genetics} \textbf{31}(1): 64-68.

997: \item Simon, H.\ A.\ (1955). ``On a class of skew distribution

998:   functions.'' \textit{Biometrika} \textbf{42}: 425-440.

999: \item Skvoretz, J.\ (1990). ``Biased net theory: Approximations,

1000:   simulations, and observations.'' \textit{Social Networks} \textbf{12}(3): 217-238.

1001: \item Smith, R.\ (2002). ``Instant Messaging as a Scale-Free

1002:   Network.'' eprint  arXiv:cond-mat/0206378, unpublished.

1003: \item Wasserman, S.\ and K.\ Faust (1994). Social network analysis:

1004:   Methods and applications. Cambridge, Cambridge University Press.

1005: \item Watts, D.\ J.\ (1999). ``Networks, dynamics, and the small-world

1006:   phenomenon.'' \textit{American Journal of Sociology} \textbf{105}(2): 493-527.

1007: \item Watts, D.\ J.\ and S.\ H.\ Strogatz (1998). ``Collective

1008:   dynamics of `small-world' networks.'' \textit{Nature} \textbf{393}(6684): 440-442.

1009: \item Wellman, B.\ (2001). ``Computer networks as social networks.''

1010:   \textit{Science} \textbf{293}(5537): 2031-2034.

1011: \item Wellman, B.\ and C.\ A.\ Haythornthwaite (2002). The Internet in

1012:   everyday life. Oxford, Blackwell.

1013: \end{list}

1014:

1015: \end{document}

1016: