0403:q-bio0403021/vac.tex

1: \documentclass[floatfix,twocolumn,rmp,showpacs,superscriptaddress]{revtex4}

2:

3: \usepackage{dcolumn,graphicx,amsmath,amssymb,txfonts}

4:

5: \begin{document}

6:

7: \title{Efficient local strategies for vaccination and network attack}

8:

9: \author{Petter Holme}

10: \email{holme@tp.umu.se}

11: \affiliation{Department of Physics, Ume{\aa} University, 901 87

12:   Ume{\aa}, Sweden}

13: \affiliation{NORDITA, Blegdamsvej 17, 2100 Copenhagen, Denmark}

14:

15: \begin{abstract} % paper --> Letter (for APS)

16:   We study how a fraction of a population should be vaccinated to most

17:   efficiently stop epidemics. We argue that only local information

18:   (about the neighborhood of specific vertices) is usable in practice,

19:   and hence we consider only local vaccination strategies. The

20:   efficiency of the vaccination strategies is investigated with both

21:   static and dynamical measures. Among other things we find that the

22:   most efficient strategy for many real-world situations is to

23:   iteratively vaccinate the neighbor of the previous vaccinee that has

24:   most links out of the neighborhood.

25: \end{abstract}

26:

27: \pacs{89.65.--s, 89.75.Hc, 89.75.--k}

28:

29: \maketitle

30:

31: \section{Introduction}

32:

33: Diseases spread over networks. The spreading dynamics are closely

34: related to the structure of networks. For this reason network

35: epidemiology has turned into one of the most vibrant subdisciplines of

36: complex network studies~\cite{gies,lea:sex,mejn:rev}. A topic of great

37: practical importance within network epidemiology is the vaccination

38: problem: How should a population be vaccinated to most efficiently

39: prevent a disease from turning into an epidemic? For economic reasons it is

40: often not possible to vaccinate the whole population. Some vaccines

41: have severe side effects and for this reason one may also want to keep

42: the number of vaccinated individuals low. So if cheap vaccines, free of

43: side effects, do not exist, then having an efficient vaccination

44: strategy is essential for saving both money and lives. If all ties

45: within the population are known, then the target persons for

46: vaccination can be identified using sophisticated global strategies

47: (cf.~\cite{our:attack}); but this is hardly possible for nation-wide

48: (or larger) vaccination campaigns. In a seminal paper Cohen \textit{et

49:   al.}~\cite{chn:vacc} suggested a vaccination strategy that only

50: requires a person to estimate which other persons he, or she, gets

51: close enough to for the disease to spread to---i.e., to name the

52: ``neighbors'' in the network over which the disease spreads. For

53: networks with a skewed distribution of degree (number of neighbors) the

54: strategy to vaccinate a neighbor of a randomly chosen person is much

55: more efficient than a random vaccination. In this work we assume that each

56: individual knows a little bit more about his, or her, neighborhood

57: than just the names of the neighbors: We also assume that an

58: individual can guess the degree of the neighbors and the ties from one

59: neighbor to another. This assumption is not very unrealistic---people

60: are believed to have a good understanding of their social

61: surroundings (this is, for example, part of the explanation for the

62: ``navigability'' of social networks)~\cite{watts:search}.

63:

64: Finding the optimal set of vaccinees is closely related to the attack

65: vulnerability problem~\cite{our:attack,alb:attack}. The

66: major difference is the dynamic system that is confined to the

67: network---disease spreading for the vaccination problem and

68: information flow for the attack vulnerability problem. To be able to

69: protect the network efficiently one needs to know the worst case

70: attacking scenario. Large scale network attacks are, presumably, based

71: on local (rather than global) network information. So, a

72: grave scenario would be if the network were attacked with the same

73: strategy that is most efficient for vaccination. We will use the

74: vaccination problem as the framework for our discussion, but the

75: results apply to network attack as well.

76:

77: \section{Preliminaries}

78:

79: In our discussion we will use two measures of network structure: The

80: \textit{clustering coefficient} $C$ of the network is defined as the

81: ratio of triangles with respect to connected triples normalized to the

82: interval $[0,1]$~\cite{bw:sw}. If $C=1$ there is a maximal number of

83: triangles (given a set of connected triples); if $C=0$ the graph has

84: no triangles. We also measure the degree-degree correlations through

85: the \textit{assortative mixing

86:   coefficient} defined as~\cite{mejn:assmix}

87: \begin{equation}

88:   r=\frac{4\langle k_1\, k_2\rangle - \langle k_1 + k_2\rangle^2}

89:   {2\langle k_1^2+k_2^2\rangle - \langle k_1+ k_2\rangle^2}~,

90: \end{equation}

91: where $k_i$ is the degree of the $i$'th argument of an edge in a list

92: of the edges, and $\langle\:\cdot\:\rangle$ denotes average over

93: that edge-list. We let $N$ denote the number of

94: vertices and $M$ the number of edges.
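As an illustration of the two measures, the following is a minimal Python sketch and not the code used for the results of this paper. It assumes an undirected simple graph given as a list of vertex pairs, and it assumes that $C$ is computed as the number of closed connected triples divided by the number of all connected triples (three times the number of triangles over the number of triples). The function names \texttt{assortativity} and \texttt{clustering} are hypothetical.

```python
def assortativity(edges):
    """Assortative mixing coefficient r of an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    m = len(edges)
    # Averages over the edge list; k1 and k2 are the degrees of the two ends.
    mean_prod = sum(degree[u] * degree[v] for u, v in edges) / m
    mean_sum = sum(degree[u] + degree[v] for u, v in edges) / m
    mean_sq = sum(degree[u] ** 2 + degree[v] ** 2 for u, v in edges) / m
    # The denominator vanishes for regular graphs, where r is undefined.
    return (4 * mean_prod - mean_sum ** 2) / (2 * mean_sq - mean_sum ** 2)


def clustering(edges):
    """Closed connected triples divided by all connected triples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    closed = 0  # every triangle is counted once per corner, i.e. 3 times
    for nb in adj.values():
        nb = sorted(nb)
        for i in range(len(nb)):
            for j in range(i + 1, len(nb)):
                if nb[j] in adj[nb[i]]:
                    closed += 1
    return closed / triples if triples else 0.0
```

For a star graph every edge joins the hub to a leaf, so the sketch returns $r=-1$, the most disassortative value.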

95:

96: \section{The networks}

97:

98: We will test the vaccination strategies we propose on both real-world

99: and model networks.

100: The first real-world network is a scientific

101: collaboration network~\cite{mejn:scicolpnas}. The vertices of this

102: network are scientists who have uploaded manuscripts to the preprint

103: repository arxiv.org. An edge between two authors means that

104: they have coauthored a preprint. We also study two small real-world

105: social networks: One constructed from an observational study of

106: friendships in a karate club, another based on an interview survey

107: among prisoners. The edges of these small networks are, probably, more

108: relevant for disease spreading than the arxiv network, but may suffer

109: from finite size effects. The three model networks are: 1. The Holme-Kim

110: (HK) model~\cite{hk:model} that produces networks with a power-law degree

111: distribution and tunable clustering. Basically, it is a

112: Barab\'{a}si-Albert (BA) type growth model based on preferential

113: attachment~\cite{ba:model}---just as the BA model,

114: it has one parameter $m=M/N$ controlling the average degree and one

115: (additional) parameter $m_t\in [0,m-1]$ controlling the clustering. We

116: will use $M=2N=4000$ and $m=m_t+1=2$ giving the maximal clustering for

117: the given $N$ and $M$. 2. The networked seceder model, modeling social

118: networks with a community structure and exponentially decaying

119: degree distributions~\cite{our:seceder}. Briefly, it works by

120: sequentially updating the vertices by, for each vertex $v$, rewiring

121: all $v$'s edges to the neighborhood of a peripheral vertex. With a

122: probability $r$ an edge of $v$ can be rewired to a random vertex (so

123: $r$ controls the degree of community structure). We use the parameter

124: values $M=3N=6600$, $r=0.1$ and $10M$ iterations starting from an

125: Erd\H{o}s-R\'{e}nyi network~\cite{er:on}. 3. The Watts-Strogatz (WS)

126: model~\cite{wattsstrogatz} generates networks with exponentially decaying

127: degree distributions and tunable clustering. The WS model starts from

128: the vertices on a circular topology with edges between vertices

129: separated by 1 to $k$ steps on the circle. Then one goes through the

130: edges and rewires one end of each to a randomly selected vertex with

131: probability $P$. We use $P=0.05$ and $M=kN=2N=4000$.

132:

133: \begin{table}

134: \caption{Statistics of the networks. Note that the arxiv, prison and

135:   seceder model networks are not connected---the largest connected

136:   components contain $48561$, $58$ and $2162(1)$ nodes, respectively.

137: }

138: \label{tab:stat}

139: \begin{ruledtabular}

140:   \begin{tabular}{l|llll}

141:     network & $N$ & $M$ & $C$ & $r$ \\\hline

142:     arxiv & 58342 & 294901 & 0.420 & +0.324 \\

143:     karate club & 34 & 78 & 0.256 & --0.476\\

144:     prison & 67 & 85 & 0.310 & +0.161\\

145:     HK & 2000 & 4000 & 0.1753(1) & --0.0458(1) \\

146:     seceder & 2200 & 6600 & 0.266(1) & +0.012(2)\\

147:     WS & 2000 & 4000 & 0.4219(1) & --0.01267(2) \\

148:   \end{tabular}

149: \end{ruledtabular}

150: \end{table}

151:

152: \begin{figure*}

153:   \resizebox*{\linewidth}{!}{\includegraphics{s1.eps}}

154:   \caption{

155:     The size of the largest connected component $S_1$ as a function of

156:   the fraction of vaccinated vertices for the (a) arxiv, (b) karate

157:   club, (c) prison, (d) HK model, (e) seceder model and (f) WS model

158:   network. Error bars are smaller than the symbol size. Lines are

159:   guides for the eyes.

160:   }

161:   \label{fig:s1}

162: \end{figure*}

163:

164: \section{The strategies}

165:

166: Now we turn to the definition of the strategies. We assume a fraction

167: $f$ of the population is to be vaccinated. As a reference we consider

168: random vaccination (\textsc{Rnd}, equivalent to site percolation). We use

169: the above mentioned \textit{neighbor vaccination} (\textsc{RNb})---to

170: vaccinate the neighbor of randomly chosen vertices---and the trivial improvement~\cite{bjk:pfs} if

171: knowledge about the neighbors' degrees is included: Pick a vertex at

172: random and vaccinate one (randomly chosen) of its highest-degree

173: neighbors (we call it \textsc{Deg}). To avoid overvaccination of a

174: neighborhood one can consider vaccinating the neighbor of a vertex $v$

175: with a maximal number of edges out of $v$'s neighborhood

176: (\textsc{Out}). For all strategies except \textsc{Rnd}

177: we also consider ``chained'' versions where one, instead of vaccinating a

178: neighbor of a randomly chosen vertex, vaccinates a neighbor of the vertex

179: vaccinated in the previous time step (if all neighbors are vaccinated

180: a neighbor of a random vertex is chosen instead). For the acronyms of

181: the chained versions a suffix ``C'' is added.
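The strategies can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the implementation behind the results: the network is a dictionary \texttt{adj} mapping each vertex to the set of its neighbors, only not-yet-vaccinated neighbors are candidates, degrees are counted in the original network, ties are broken at random, and the target number of vaccinees is capped by the number of non-isolated vertices so that the loop terminates. The names \texttt{pick\_neighbor} and \texttt{vaccinate} are hypothetical.

```python
import random


def pick_neighbor(adj, v, vaccinated, strategy, rng):
    """Return an unvaccinated neighbor of v chosen by strategy, or None."""
    candidates = sorted(u for u in adj[v] if u not in vaccinated)
    if not candidates:
        return None
    if strategy == "RNb":            # random neighbor
        return rng.choice(candidates)
    if strategy == "Deg":            # neighbor of highest degree
        score = {u: len(adj[u]) for u in candidates}
    elif strategy == "Out":          # most edges leaving v's neighborhood
        closed = adj[v] | {v}
        score = {u: len(adj[u] - closed) for u in candidates}
    else:
        raise ValueError("unknown strategy: %r" % strategy)
    best = max(score.values())
    return rng.choice([u for u in candidates if score[u] == best])


def vaccinate(adj, f, strategy="Out", chained=False, seed=None):
    """Return the set of vaccinated vertices for a fraction f."""
    rng = random.Random(seed)
    vertices = sorted(adj)
    target = int(f * len(vertices))
    if strategy == "Rnd":            # reference case: random vaccination
        return set(rng.sample(vertices, target))
    # Isolated vertices can never be reached as somebody's neighbor.
    target = min(target, sum(1 for v in vertices if adj[v]))
    vaccinated, last = set(), None
    while len(vaccinated) < target:
        # Chained versions continue from the previous vaccinee.
        v = last if (chained and last is not None) else rng.choice(vertices)
        u = pick_neighbor(adj, v, vaccinated, strategy, rng)
        if u is None:
            last = None              # restart from a random vertex
            continue
        vaccinated.add(u)
        last = u
    return vaccinated
```

A call such as \texttt{vaccinate(adj, 0.2, "Out", chained=True)} then corresponds to \textsc{OutC} with $f=0.2$ under these assumptions.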

182:

183: \begin{figure}

184:   \resizebox*{0.9 \linewidth}{!}{\includegraphics{dyn.eps}}

185:   \caption{The average number of vertices that are infected at least

186:   once during an outbreak $s$ for (a) the SIR and (b) the SIS disease

187:   dynamics. Error bars are of the order of the symbol size. Lines are

188:   guides for the eyes.}

189:   \label{fig:dyn}

190: \end{figure}

191:

192: \section{Results and analysis}

193:

194: The results of this paper are presented in three sections: First we

195: study how the number of vertices in the largest connected subgraph

196: $S_1$ depends on the fraction $f$ of vaccinated vertices. Then we

197: show that the conclusions from $S_1$ also hold for dynamical simulations

198: of disease spreading. To interpret the results we also investigate

199: $S_1$ for a fixed $f$ as a function of the clustering and assortative

200: mixing coefficients.

201:

202: \subsection{Static efficiency}

203:

204: As a static efficiency measure we consider the average size of the

205: largest connected component of susceptible (non-vaccinated) vertices,

206: $S_1$. We average over $n_\mathrm{vac}=1000$ runs of the vaccination

207: procedures. The model networks are also averaged over

208: $n_\mathrm{net}=100$ network realizations. (Smaller or larger

209: $n_\mathrm{vac}$ and $n_\mathrm{net}$ do not make any qualitative

210: difference.) In Fig.~\ref{fig:s1} we display $S_1$ as a

211: function of $f$. For all except the WS model network the \textsc{Deg}

212: and \textsc{Out} (chained and unchained versions) form the most

213: efficient set of strategies. Within this group the order of efficiency

214: varies: For the arxiv network the \textsc{Out} strategy is more than

215: twice as efficient as any other for $0.25\lesssim f\lesssim 0.4$. For

216: the HK and seceder model networks the chained strategies are

217: considerably more efficient than the unchained ones. We note that the

218: difference between the chained and unchained versions of \textsc{Out}

219: and \textsc{Deg} is bigger than between \textsc{Out} and \textsc{Deg}

220: (or \textsc{OutC} and \textsc{DegC}). \textsc{Out} does converge to

221: \textsc{Deg} in the limit of vanishing $C$ but all networks we test

222: have rather high clustering. Another interesting observation is that

223: even if the degree distribution is narrow, such as for the seceder

224: model of Fig.~\ref{fig:s1}(e) (where $P(k)\sim \exp(-k)$) the more

225: elaborate strategies are much more efficient than random

226: vaccination. This is especially clear for higher $f$ which suggests

227: that the structural change of the network of susceptible vertices

228: during the vaccination procedure is an important factor for the

229: overall efficiency. For the WS model network the chained algorithms

230: perform worse than random vaccination. This is in contrast to

231: all other networks. We conclude that epidemiology-related results

232: regarding the WS model networks should be generalized to

233: real-world systems only with caution.
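For a single vaccination run, the static measure can be evaluated with a breadth-first search over the susceptible vertices. The following Python lines are a minimal sketch, assuming the network is a dictionary \texttt{adj} of neighbor sets and \texttt{vaccinated} is a set of vertices; the function name \texttt{largest\_component\_size} is hypothetical. $S_1$ as used here would be the average of this quantity over vaccination runs and network realizations.

```python
from collections import deque


def largest_component_size(adj, vaccinated):
    """Size of the largest connected component of unvaccinated vertices."""
    seen = set(vaccinated)       # vaccinated vertices are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:             # breadth-first search from start
            v = queue.popleft()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
        best = max(best, size)
    return best
```

On a path of five vertices, vaccinating the middle vertex leaves two components of two vertices each, so the function returns 2.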

234:

235: \subsection{Dynamic efficiency}

236:

237: Static measures of vaccination efficiency are potential

238: over-simplifications---there is a chance that the interplay between

239: disease dynamics and the underlying network structure has a

240: significant role. To motivate the use of $S_1$ we also investigate the

241: SIS and SIR models~\cite{gies} on vaccinated networks. In the SIS

242: model a vertex goes from ``susceptible'' (S) to ``infected'' (I) and

243: back to S. The SIR model is just the same, except that an

244: infected vertex goes to the ``removed'' (R) state and remains

245: there. The probability to go from S to I (per contact) is zero for

246: vaccinated vertices and $\lambda=0.05$ for the rest. The I state lasts

247: $\delta=2.5$ time steps. We use synchronous updating and one randomly

248: chosen initially infected

249: person. The disease dynamics are averaged over $n_\mathrm{dis}=100$ runs for

250: each of the $n_\mathrm{vac}=1000$ runs of the vaccination schemes.  In

251: Fig.~\ref{fig:dyn}(a) we plot the average number of individuals that

252: at least once have been infected during an outbreak $s$---i.e., until

253: there are no I-vertices left, or (for SIS) the system has reached an endemic

254: state (defined in the simulations as when there are no susceptible

255: vertices that have not had the disease at least once)---for the arxiv

256: network. Other networks and simulation parameters give qualitatively

257: similar results. Qualitatively, the overall picture from the $S_1$

258: calculations remains---the chained and unchained \textsc{Deg} and

259: \textsc{Out} strategies are very efficient, and the chained versions

260: are more efficient than the unchained. A difference is that the

261: unchained \textsc{RNb} also performs rather well. Quantitatively, the

262: differences between the strategies are huge; this is a result of the

263: threshold behaviors of the SIS and SIR models~\cite{chn:vacc}. The

264: conclusion of Fig.~\ref{fig:dyn} (and similar plots for other

265: networks) is that the order of the strategies' efficiencies is

266: largely the same as concluded from the $S_1(f)$-curves. But if high

267: resolution is required, the measurement of network fragility has to be

268: specific for the studied system.
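A single SIR outbreak on a vaccinated network can be sketched as below. This is a hedged Python illustration and not the simulation code of this paper. It assumes synchronous updating, an initially infected vertex drawn uniformly from the unvaccinated vertices, and an integer infection time \texttt{duration}; how a non-integer $\delta$ such as $2.5$ maps onto discrete time steps is not specified in the text, so the default \texttt{duration=2} is only a placeholder. The name \texttt{sir\_outbreak\_size} is hypothetical.

```python
import random


def sir_outbreak_size(adj, vaccinated, lam=0.05, duration=2, seed=None):
    """Number of vertices infected at least once in one SIR outbreak."""
    rng = random.Random(seed)
    susceptible = [v for v in sorted(adj) if v not in vaccinated]
    if not susceptible:
        return 0
    patient_zero = rng.choice(susceptible)
    remaining = {patient_zero: duration}   # infected vertex -> steps left
    ever_infected = {patient_zero}         # vertices in state I or R
    while remaining:
        newly = set()
        for v in remaining:                # every I-vertex contacts neighbors
            for u in adj[v]:
                if (u not in vaccinated and u not in ever_infected
                        and rng.random() < lam):
                    newly.add(u)
        # Synchronous update: age current infections, then add new ones.
        remaining = {v: t - 1 for v, t in remaining.items() if t > 1}
        for u in newly:
            remaining[u] = duration
        ever_infected |= newly
    return len(ever_infected)
```

Averaging the return value over many seeds and many vaccination runs would give an estimate of $s$ under these assumptions. With \texttt{lam=1.0} the outbreak fills the whole susceptible component of the initially infected vertex, which ties the dynamic measure to the static one in that limit.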

269:

270: \begin{figure}

271:   \resizebox*{0.9 \linewidth}{!}{\includegraphics{rc.eps}}

272:   \caption{How the size of the largest connected component after vaccination

273:     of 20\% of the population depends on clustering and degree-degree

274:   correlations. (a) shows $S_1(f=0.2)$ plotted against $r$. (b) shows

275:   $S_1(f=0.2)$

276:   as a function of $C$. The networks have the same size and degree

277:   sequence as the arxiv network. Error bars are smaller than the

278:   symbol size. Lines are guides for the eyes.}

279:   \label{fig:rew}

280: \end{figure}

281:

282: \subsection{The role of clustering and assortative mixing}

283:

284: To gain some insight into how the network structure governs the relative

285: efficiencies of the strategies we measure $S_1(f=0.2)$ for varying

286: assortative mixing and clustering coefficients. The results hold for

287: other small $f$ values. We keep the size and

288: degree sequence constant to the values of the arxiv network. To

289: perform this sampling we rewire pairs of edges $(v_1,v_2)$ and

290: $(w_1,w_2)$ to $(v_1,w_2)$ and $(w_1,v_2)$ (unless this would

291: introduce a self-edge or multiple edges). To ensure that the

292: $n_\mathrm{rew}=100$ rewiring realizations are independent we start

293: with rewiring $n_\mathrm{init}=3M$ pairs of edges. Then we go through

294: pairs of edges randomly and execute only changes that make the

295: current $r$ or $C$ closer to their target values. When the value of

296: $r$ or $C$ is within $0.1\%$ of the target value the iteration is

297: stopped. The results seen in Fig.~\ref{fig:rew} show that, just as

298: before, the \textsc{Out} and \textsc{Deg} strategies, chained or

299: unchained, are most efficient throughout the parameter space. The

300: unchained versions are most efficient for $r\gtrsim 0.3$. An

301: explanation is that, for high $r$, the chained versions will

302: effectively only vaccinate the

303: highly connected vertices (that are grouped together for very high $r$)

304: and leave chains of low-degree vertices unvaccinated. The

305: $C$-dependence plotted in Fig.~\ref{fig:rew}(b) shows that the

306: unchained versions outperform the chained versions for $C\gtrsim

307: 0.15$. This is possibly because the chains, for

308: combinatorial reasons, get stuck in one part of the network. It is not

309: an effect of biased

310: degree-degree correlations since if the rewiring procedure is conditioned to a

311: fixed $r$ Fig.~\ref{fig:rew}(b) remains essentially unaltered. We note

312: that the structure of the original arxiv network differs from the

313: rewired networks. For example, at $f=0.2$ of Fig.~\ref{fig:s1}(a) the

314: \textsc{Out} is 22\% more efficient than \textsc{OutC}, but in

315: Fig.~\ref{fig:rew} the \textsc{Out} and \textsc{OutC} curves differ

316: very little. For the \textsc{RNb} strategy the chained version is better than

317: the unchained throughout the range of $r$ and $C$ values.
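The degree-preserving rewiring step described in this section can be sketched in Python as follows. This is a minimal illustration under assumptions, not the code used for Fig.~\ref{fig:rew}: edges are stored as ordered pairs in a list, a set of unordered pairs is kept to detect multiple edges, and only the initial randomization phase is shown. The subsequent phase, which accepts only swaps that move $r$ or $C$ toward a target value, would wrap the same swap in an accept-or-undo test. The names \texttt{try\_swap} and \texttt{randomize} are hypothetical.

```python
import random


def try_swap(edges, edge_set, rng):
    """Attempt (v1,v2),(w1,w2) -> (v1,w2),(w1,v2); return True if applied."""
    i, j = rng.sample(range(len(edges)), 2)
    (v1, v2), (w1, w2) = edges[i], edges[j]
    if v1 == w2 or w1 == v2:
        return False                      # would introduce a self-edge
    new_i, new_j = (v1, w2), (w1, v2)
    if frozenset(new_i) in edge_set or frozenset(new_j) in edge_set:
        return False                      # would introduce a multiple edge
    edge_set.discard(frozenset(edges[i]))
    edge_set.discard(frozenset(edges[j]))
    edge_set.add(frozenset(new_i))
    edge_set.add(frozenset(new_j))
    edges[i], edges[j] = new_i, new_j     # degrees of all four ends unchanged
    return True


def randomize(edges, n_attempts, seed=None):
    """Return a rewired copy of edges after n_attempts swap attempts."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edges}
    for _ in range(n_attempts):
        try_swap(edges, edge_set, rng)
    return edges
```

Because each accepted swap removes one edge end and adds one edge end at every vertex involved, the degree sequence is preserved exactly, which is the property the sampling relies on.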

318:

319: \section{Summary and conclusions}

320:

321: To summarize, we have investigated strategies for vaccination and

322: network attack that are based only on the knowledge of the

323: neighborhood---information that humans arguably possess and

324: utilize. Both static and dynamical measures of efficiency are

325: studied. For most networks, regardless of the number of vaccinated

326: vertices, the most efficient strategies are to choose a vertex $v$ and

327: vaccinate a neighbor of $v$ with highest degree (\textsc{Deg}), or the

328: neighbor of $v$ with most links out of $v$'s neighborhood

329: (\textsc{Out}). The vertex $v$ can be picked either as the most recently vaccinated

330: vertex (chained selection) or at random (unchained selection). For

331: real-world networks the chained versions tend to outperform the

332: unchained ones, whereas this situation is reversed for the three types

333: of model networks we study. We investigate the relative efficiency of

334: chained and unchained strategies further by sampling random networks

335: with a fixed degree sequence and varying assortative mixing and

336: clustering coefficients. We find that the unchained strategies are

337: preferable for networks with a very high clustering or strong

338: positive assortative mixing (larger values than seen in real-world

339: networks). In Ref.~\cite{chn:vacc} the authors propose

340: the strategy to vaccinate  a random neighbor of a randomly selected

341: vertex. This strategy (\textsc{RNb}) requires less information of the

342: neighborhood than \textsc{Deg} and \textsc{Out} do. Thus the

343: practical procedure gets simpler: One only has to ask a person

344: ``name a person you meet regularly'' rather than ``name the acquaintance of yours who regularly meets the most people you are not

345: acquainted with'' (for \textsc{Out}). (``Meet regularly''

346: should be replaced with some phrase signifying a high risk of infection

347: transfer for the pathogen in question.) On the other hand, if the

348: information of the neighborhoods is incomplete \textsc{Deg} and

349: \textsc{Out} will, effectively, be reduced to \textsc{RNb} (and thus not

350: perform worse than \textsc{RNb}). To epitomize, choosing the people to

351: vaccinate in the right way will save a tremendous amount of vaccine

352: and side-effect cases. The best strategy can only be selected by

353: considering both the structure of the network the pathogen spreads

354: over, and the disease dynamics. If nothing of this is known the

355: \textsc{OutC} strategy is our recommendation---it is better, or not much

356: worse, than the best strategy in most cases. Together with

357: \textsc{DegC}, \textsc{OutC} is most efficient for low clustering

358: and assortative mixing coefficients, which is the region of parameter

359: space for sexually transmitted diseases---the most interesting case

360: for network based vaccination schemes (due to the well-definedness of

361: sexual networks).

362:

363:

364: \section*{Acknowledgements}

365:

366:

367: The author is grateful for comments from M.\ Rosvall and acknowledges

368: support from the Swedish Research Council through contract no.\

369: 2002-4135.

370:

371: \begin{thebibliography}{10}

372:

373: \bibitem{alb:attack}

374: R.~Albert, H.~Jeong, and A.-L. Barab\'{a}si, \textit{Attack and error tolerance

375:   of complex networks}, Nature \textbf{406} (2000), pp.~378-382.

376:

377: \bibitem{ba:model}

378: A.-L. Barab\'{a}si and R.~Albert, \textit{Emergence of scaling in random

379:   networks}, Science \textbf{286} (1999), pp.~509-512.

380:

381: \bibitem{bw:sw}

382: A.~Barrat and M.~Weigt, \textit{On the properties of small-world network

383:   models}, Eur. Phys. J. B \textbf{13} (2000), pp.~547-560.

384:

385: \bibitem{chn:vacc}

386: R.~Cohen, S.~Havlin, and D.~ben Avraham, \textit{Efficient immunization

387:   strategies for computer networks and populations}, Phys. Rev. Lett.

388:   \textbf{91} (2003), art.~no.\ 247901.

389:

390: \bibitem{er:on}

391: P.~Erd\H{o}s and A.~R\'{e}nyi, \textit{On random graphs {I}}, Publ. Math.

392:   Debrecen \textbf{6} (1959), pp.~290-297.

393:

394: \bibitem{gies}

395: J.~Giesecke, \textit{Modern infectious disease epidemiology}, Arnold, London,

396:   2~ed., 2002.

397:

398: \bibitem{our:seceder}

399: A.~Gr\"{o}nlund and P.~Holme, \textit{Networking the seceder model: Group

400:   formation in social and economic systems}.

401: \newblock e-print: cond-mat/0312010.

402:

403: \bibitem{hk:model}

404: P.~Holme and B.~J. Kim, \textit{Growing scale-free networks with tunable

405:   clustering}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 026107.

406:

407: \bibitem{our:attack}

408: P.~Holme, B.~J. Kim, C.~N. Yoon, and S.~K. Han, \textit{Attack vulnerability of

409:   complex networks}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 056109.

410:

411: \bibitem{bjk:pfs}

412: B.~J. Kim, C.~N. Yoon, S.~K. Han, and H.~Jeong, \textit{Path finding strategies

413:   in scale-free networks}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 027103.

414:

415: \bibitem{lea:sex}

416: F.~Liljeros, C.~R. Edling, and L.~A. {Nunes Amaral}, \textit{Sexual networks:

417:   Implication for the transmission of sexually transmitted infection}, Microbes

418:   Infect. \textbf{5} (2003), pp.~189-196.

419:

420: \bibitem{mejn:scicolpnas}

421: M.~E.~J. Newman, \textit{The structure of scientific collaboration networks},

422:   Proc. Natl. Acad. Sci. USA \textbf{98} (2001), pp.~404-409.

423:

424: \bibitem{mejn:assmix}

425: \leavevmode\vrule height 2pt depth -1.6pt width 23pt, \textit{Assortative

426:   mixing in networks}, Phys. Rev. Lett. \textbf{89} (2002), art.~no.\ 208701.

427:

428: \bibitem{mejn:rev}

429: \leavevmode\vrule height 2pt depth -1.6pt width 23pt, \textit{The structure and

430:   function of complex networks}, SIAM Rev. \textbf{45} (2003), pp.~167-256.

431:

432: \bibitem{watts:search}

433: D.~J. Watts, P.~S. Dodds, and M.~E.~J. Newman, \textit{Identity and search in

434:   social networks}, Science \textbf{296} (2002), pp.~1302-1305.

435:

436: \bibitem{wattsstrogatz}

437: D.~J. Watts and S.~H. Strogatz, \textit{Collective dynamics of {`small-world'}

438:   networks}, Nature \textbf{393} (1998), pp.~440-442.

439:

440: \end{thebibliography}

441:

442: \end{document}

443: