1: \documentclass[floatfix,twocolumn,rmp,showpacs,superscriptaddress]{revtex4}
2:
3: \usepackage{dcolumn,graphicx,amsmath,amssymb,txfonts}
4:
5: \begin{document}
6:
7: \title{Efficient local strategies for vaccination and network attack}
8:
9: \author{Petter Holme}
10: \email{holme@tp.umu.se}
11: \affiliation{Department of Physics, Ume{\aa} University, 901 87
12: Ume{\aa}, Sweden}
13: \affiliation{NORDITA, Blegdamsvej 17, 2100 Copenhagen, Denmark}
14:
15: \begin{abstract} % paper --> Letter (for APS)
16: We study how a fraction of a population should be vaccinated to most
17: efficiently stop epidemics. We argue that only local information
18: (about the neighborhood of specific vertices) is usable in practice,
19: and hence we consider only local vaccination strategies. The
20: efficiency of the vaccination strategies is investigated with both
21: static and dynamical measures. Among other things we find that the
22: most efficient strategy for many real-world situations is to
23: iteratively vaccinate the neighbor of the previous vaccinee that has
24: most links out of the neighborhood.
25: \end{abstract}
26:
27: \pacs{89.65.--s, 89.75.Hc, 89.75.--k}
28:
29: \maketitle
30:
31: \section{Introduction}
32:
33: Diseases spread over networks. The spreading dynamics are closely
34: related to the structure of networks. For this reason network
epidemiology has turned into one of the most vibrant subdisciplines of
complex network studies~\cite{gies,lea:sex,mejn:rev}. A topic of great
practical importance within network epidemiology is the vaccination
problem: How should a population be vaccinated to most efficiently
prevent a disease from turning into an epidemic? For economic reasons it is
often not possible to vaccinate the whole population. Some vaccines
have severe side effects, and for this reason one may also want to keep
the number of vaccinated individuals low. So if cheap vaccines, free of
side effects, do not exist, then having an efficient vaccination
strategy is essential for saving both money and lives. If all ties
within the population are known, then the target persons for
46: vaccination can be identified using sophisticated global strategies
47: (cf.~\cite{our:attack}); but this is hardly possible for nation-wide
48: (or larger) vaccination campaigns. In a seminal paper Cohen \textit{et
49: al.}~\cite{chn:vacc} suggested a vaccination strategy that only
50: requires a person to estimate which other persons he, or she, gets
51: close enough to for the disease to spread to---i.e., to name the
``neighbors'' in the network over which the disease spreads. For
networks with a skewed distribution of degree (number of neighbors) the
strategy of vaccinating a neighbor of a randomly chosen person is much
more efficient than random vaccination. In this work we assume that each
56: individual knows a little bit more about his, or her, neighborhood
57: than just the names of the neighbors: We also assume that an
58: individual can guess the degree of the neighbors and the ties from one
59: neighbor to another. This assumption is not very unrealistic---people
60: are believed to have a good understanding of their social
61: surroundings (this is, for example, part of the explanation for the
62: ``navigability'' of social networks)~\cite{watts:search}.
63:
64: Finding the optimal set of vaccinees is closely related to the attack
65: vulnerability problem~\cite{our:attack,alb:attack}. The
66: major difference is the dynamic system that is confined to the
67: network---disease spreading for the vaccination problem and
68: information flow for the attack vulnerability problem. To be able to
69: protect the network efficiently one needs to know the worst case
70: attacking scenario. Large scale network attacks are, presumably, based
71: on local (rather than global) network information. So, a
grave scenario would be if the network were attacked with the same
strategy that is most efficient for vaccination. We will use the
vaccination problem as the framework for our discussion, but the
results apply to network attack as well.
76:
77: \section{Preliminaries}
78:
In our discussion we will use two measures of network structure. The
first is the \textit{clustering coefficient} $C$ of the network, defined as the
ratio of triangles to connected triples, normalized to the
interval $[0,1]$~\cite{bw:sw}. If $C=1$ there is a maximal number of
triangles (given a set of connected triples); if $C=0$ the graph has
no triangles. We also measure the degree-degree correlations through
the \textit{assortative mixing
coefficient}, defined as~\cite{mejn:assmix}
87: \begin{equation}
88: r=\frac{4\langle k_1\, k_2\rangle - \langle k_1 + k_2\rangle^2}
89: {2\langle k_1^2+k_2^2\rangle - \langle k_1+ k_2\rangle^2}~,
90: \end{equation}
where $k_i$ is the degree of the $i$th argument of an edge in a list
of the edges, and $\langle\:\cdot\:\rangle$ denotes the average over
that edge list. We let $N$ denote the number of
vertices and $M$ the number of edges.
95:
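The two measures can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the network is given as a list of undirected edges without self-edges or multiple edges, the function names are hypothetical, and $C$ is computed as three times the number of triangles divided by the number of connected triples, which is one common reading of the normalization described above.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Map every vertex to the set of its neighbors (undirected, simple graph)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def assortativity(edges):
    """Assortative mixing coefficient r, averaged over the edge list."""
    adj = build_adjacency(edges)
    m = len(edges)
    k1k2 = sum(len(adj[u]) * len(adj[v]) for u, v in edges) / m
    ksum = sum(len(adj[u]) + len(adj[v]) for u, v in edges) / m
    ksq = sum(len(adj[u]) ** 2 + len(adj[v]) ** 2 for u, v in edges) / m
    denominator = 2 * ksq - ksum ** 2
    if denominator == 0:
        # All edge ends have the same degree, so r is undefined.
        return float("nan")
    return (4 * k1k2 - ksum ** 2) / denominator


def clustering(edges):
    """Closed connected triples divided by all connected triples."""
    adj = build_adjacency(edges)
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # Each triangle is counted once per corner, i.e. three times in total.
    closed = sum(1 for nb in adj.values()
                 for a, b in combinations(nb, 2) if b in adj[a])
    return closed / triples if triples else 0.0
```

For a path of four vertices this sketch gives $r=-0.5$, and for a single triangle it gives $C=1$.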
96: \section{The networks}
97:
98: We will test the vaccination strategies we propose on both real-world
99: and model networks.
100: The first real-world network is a scientific
101: collaboration network~\cite{mejn:scicolpnas}. The vertices of this
102: network are scientists who have uploaded manuscripts to the preprint
103: repository arxiv.org. An edge between two authors means that
104: they have coauthored a preprint. We also study two small real-world
105: social networks: One constructed from an observational study of
106: friendships in a karate club, another based on an interview survey
among prisoners. The edges of these small networks are, probably, more
relevant for disease spreading than those of the arxiv network, but the
networks may suffer from finite size effects. The three model networks are: 1. The Holme-Kim
(HK) model~\cite{hk:model}, which produces networks with a power-law degree
distribution and tunable clustering. Basically, it is a
Barab\'{a}si-Albert (BA) type growth model based on preferential
attachment~\cite{ba:model}. Just like the BA model,
it has one parameter $m=M/N$ controlling the average degree, and it has one
(additional) parameter $m_t\in [1,m]$ controlling the clustering. We
116: will use $M=2N=4000$ and $m=m_t+1=4$ giving the maximal clustering for
117: the given $N$ and $M$. 2. The networked seceder model, modeling social
118: networks with a community structure and exponentially decaying
119: degree distributions~\cite{our:seceder}. Briefly, it works by
120: sequentially updating the vertices by, for each vertex $v$, rewiring
121: all $v$'s edges to the neighborhood of a peripheral vertex. With a
122: probability $r$ an edge of $v$ can be rewired to a random vertex (so
123: $r$ controls the degree of community structure). We use the parameter
124: values $M=3N=6600$, $r=0.1$ and $10M$ iterations on an
125: Erd\H{o}s-R\'{e}nyi network~\cite{er:on}. 3. The Watts-Strogatz (WS)
126: model~\cite{wattsstrogatz} generates networks with exponentially decaying
degree distributions and tunable clustering. The WS model starts from
vertices on a circular topology with edges between vertices
separated by 1 to $k$ steps on the circle. Then one goes through the
edges and rewires one end of each to a randomly selected vertex with
probability $P$. We use $P=0.05$ and $M=kN=2N=4000$.
132:
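The WS construction described above can be sketched in a few lines of Python. This is an illustration, not the code used for the results: the function name is hypothetical, and the order in which edges are visited, which end of an edge is kept, and what happens to a proposal that would create a self-edge or a multiple edge (here it is simply dropped) are assumptions that the text does not specify.

```python
import random


def watts_strogatz(n, k, p, rng=None):
    """Return a list of undirected edges (u, v) of a WS-type network.

    Assumes n > 2 * k so that the ring lattice has no duplicate edges.
    """
    rng = rng or random.Random()
    # Ring lattice: connect each vertex to the k next vertices on the circle.
    lattice = [(v, (v + step) % n) for v in range(n) for step in range(1, k + 1)]
    present = {frozenset(e) for e in lattice}
    edges = []
    for u, v in lattice:
        if rng.random() < p:
            w = rng.randrange(n)
            # Assumption: a proposal that would give a self-edge or a
            # multiple edge is dropped and the edge stays where it was.
            if w != u and frozenset((u, w)) not in present:
                present.remove(frozenset((u, v)))
                present.add(frozenset((u, w)))
                v = w
        edges.append((u, v))
    return edges
```

With the parameter values quoted above one would call \texttt{watts\_strogatz(2000, 2, 0.05)}, which returns $kN=4000$ edges.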
133: \begin{table}
\caption{Statistics of the networks. Note that the arxiv, prison and
seceder model networks are not connected; their largest connected
components contain $48561$, $58$ and $2162(1)$ nodes, respectively.
137: }
138: \label{tab:stat}
139: \begin{ruledtabular}
140: \begin{tabular}{l|llll}
141: network & $N$ & $M$ & $C$ & $r$ \\\hline
142: arxiv & 58342 & 294901 & 0.420 & +0.324 \\
143: karate club & 34 & 78 & 0.256 & --0.476\\
144: prison & 67 & 85 & 0.310 & +0.161\\
145: HK & 2000 & 4000 & 0.1753(1) & --0.0458(1) \\
146: seceder & 2200 & 6600 & 0.266(1) & +0.012(2)\\
147: WS & 2000 & 4000 & 0.4219(1) & --0.01267(2) \\
148: \end{tabular}
149: \end{ruledtabular}
150: \end{table}
151:
152: \begin{figure*}
153: \resizebox*{\linewidth}{!}{\includegraphics{s1.eps}}
154: \caption{
155: The size of the largest connected component $S_1$ as a function of
156: the fraction of vaccinated vertices for the (a) arxiv, (b) karate
157: club, (c) prison, (d) HK model, (e) seceder model and (f) WS model
158: network. Error bars are smaller than the symbol size. Lines are
159: guides for the eyes.
160: }
161: \label{fig:s1}
162: \end{figure*}
163:
164: \section{The strategies}
165:
166: Now we turn to the definition of the strategies. We assume a fraction
167: $f$ of the population is to be vaccinated. As a reference we consider
168: random vaccination (\textsc{Rnd}, equivalent to site percolation). We use
the above-mentioned \textit{neighbor vaccination} (\textsc{RNb}), which
vaccinates a neighbor of a randomly chosen vertex, and the trivial improvement~\cite{bjk:pfs} if
knowledge about the neighbors' degrees is included: Pick a vertex at
random and vaccinate one (randomly chosen) of its highest-degree
neighbors (we call it \textsc{Deg}). To avoid overvaccination of a
neighborhood one can instead vaccinate the neighbor of a vertex $v$
with the maximal number of edges out of $v$'s neighborhood
(\textsc{Out}). For all strategies except \textsc{Rnd}
we also consider ``chained'' versions where one, instead of vaccinating a
neighbor of a randomly chosen vertex, vaccinates a neighbor of the vertex
vaccinated in the previous time step (if all neighbors are vaccinated,
a neighbor of a random vertex is chosen instead). For the acronyms of
181: the chained versions a suffix ``C'' is added.
182:
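The strategy definitions can be summarized in a Python sketch. It is a minimal illustration under assumptions that the text leaves open: the network is a dictionary \texttt{adj} mapping each vertex to the set of its neighbors, degrees are counted in the original network (vaccinated vertices included), ties are broken at random, and the function names are hypothetical. The ``edges out of $v$'s neighborhood'' of a neighbor $u$ are taken to be the edges from $u$ to vertices that are neither $v$ nor neighbors of $v$.

```python
import random


def pick_target(adj, vaccinated, v, strategy, rng):
    """Choose which neighbor of v to vaccinate, or None if none is left."""
    candidates = sorted(u for u in adj[v] if u not in vaccinated)
    if not candidates:
        return None
    if strategy == "RNb":
        return rng.choice(candidates)
    if strategy == "Deg":
        score = {u: len(adj[u]) for u in candidates}
    elif strategy == "Out":
        hood = adj[v] | {v}
        # Count only the edges of u that leave the neighborhood of v.
        score = {u: len(adj[u] - hood) for u in candidates}
    else:
        raise ValueError(strategy)
    best = max(score.values())
    return rng.choice([u for u in candidates if score[u] == best])


def vaccinate(adj, f, strategy, chained=False, rng=None):
    """Return the set of vaccinated vertices for a fraction f."""
    rng = rng or random.Random()
    nodes = sorted(adj)
    goal = int(f * len(nodes))
    if strategy == "Rnd":
        return set(rng.sample(nodes, goal))
    # Vertices without neighbors can never be named by anyone.
    goal = min(goal, sum(1 for v in nodes if adj[v]))
    vaccinated, last = set(), None
    while len(vaccinated) < goal:
        v = last if chained and last is not None else rng.choice(nodes)
        target = pick_target(adj, vaccinated, v, strategy, rng)
        if target is None:
            last = None  # fall back to a random starting vertex
            continue
        vaccinated.add(target)
        last = target
    return vaccinated
```

In this sketch \textsc{OutC} at $f=0.2$ corresponds to \texttt{vaccinate(adj, 0.2, "Out", chained=True)}.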
183: \begin{figure}
184: \resizebox*{0.9 \linewidth}{!}{\includegraphics{dyn.eps}}
\caption{The average number of vertices that are infected at least once
during an outbreak, $s$, for (a) the SIR and (b) the SIS disease
dynamics. Error bars are of the order of the symbol size. Lines are
guides for the eyes.}
189: \label{fig:dyn}
190: \end{figure}
191:
192: \section{Results and analysis}
193:
194: The results of this paper are presented in three sections: First we
195: study how the number of vertices in the largest connected subgraph
196: $S_1$ depends on the fraction $f$ of vaccinated vertices. Then we
197: show that the conclusions from $S_1$ also hold for dynamical simulations
198: of disease spreading. To interpret the results we also investigate
199: $S_1$ for a fixed $f$ as a function of the clustering and assortative
200: mixing coefficients.
201:
202: \subsection{Static efficiency}
203:
As a static efficiency measure we consider the average size of the
largest connected component of susceptible (non-vaccinated) vertices,
$S_1$. We average over $n_\mathrm{vac}=1000$ runs of the vaccination
procedures. The model networks are also averaged over
$n_\mathrm{net}=100$ network realizations. (Smaller or larger
$n_\mathrm{vac}$ and $n_\mathrm{net}$ do not make any qualitative
difference.) In Fig.~\ref{fig:s1} we display $S_1$ as a
211: function of $f$. For all except the WS model network the \textsc{Deg}
212: and \textsc{Out} (chained and unchained versions) form the most
213: efficient set of strategies. Within this group the order of efficiency
214: varies: For the arxiv network the \textsc{Out} strategy is more than
215: twice as efficient as any other for $0.25\lesssim f\lesssim 0.4$. For
216: the HK and seceder model networks the chained strategies are
217: considerably more efficient than the unchained ones. We note that the
218: difference between the chained and unchained versions of \textsc{Out}
219: and \textsc{Deg} is bigger than between \textsc{Out} and \textsc{Deg}
(or \textsc{OutC} and \textsc{DegC}). \textsc{Out} does converge to
\textsc{Deg} in the limit of vanishing $C$, but all networks we test
have rather high clustering. Another interesting observation is that
even if the degree distribution is narrow, such as for the seceder
model of Fig.~\ref{fig:s1}(e) (where $P(k)\sim \exp(-k)$), the more
elaborate strategies are much more efficient than random
vaccination. This is especially clear for higher $f$, which suggests
that the structural change of the network of susceptible vertices
during the vaccination procedure is an important factor for the
overall efficiency. For the WS model network the chained algorithms
perform worse than random vaccination. This is in contrast to
all other networks. We conclude that epidemiology-related results
regarding the WS model networks should be cautiously generalized to
real-world systems.
234:
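The static measure for a single vaccination run can be sketched as a breadth-first search over the non-vaccinated vertices. The function name and the dictionary-of-sets representation \texttt{adj} are assumptions made for this illustration; averaging over vaccination runs and network realizations is left out.

```python
from collections import deque


def largest_component_size(adj, vaccinated):
    """Size of the largest connected component of non-vaccinated vertices."""
    seen = set(vaccinated)  # vaccinated vertices are never entered
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    queue.append(u)
        best = max(best, size)
    return best
```

On a path of five vertices, vaccinating the middle vertex leaves two components of size two, so the function returns 2.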
235: \subsection{Dynamic efficiency}
236:
Static measures of vaccination efficiency are potential
over-simplifications: there is a chance that the interplay between
disease dynamics and the underlying network structure has a
significant role. To motivate the use of $S_1$ we also investigate the
SIS and SIR models~\cite{gies} on vaccinated networks. In the SIS
model a vertex goes from ``susceptible'' (S) to ``infected'' (I) and
back to S. The SIR model is the same, except that an
infected vertex goes to the ``removed'' (R) state and remains
there. The probability to go from S to I (per contact) is zero for
vaccinated vertices and $\lambda=0.05$ for the rest. The I state lasts
$\delta=2.5$ time steps. We use synchronous updating and one randomly
chosen initially infected
person. The disease dynamics are averaged over $n_\mathrm{dis}=100$ runs for
each of the $n_\mathrm{vac}=1000$ runs of the vaccination schemes. In
Fig.~\ref{fig:dyn}(a) we plot, for the arxiv network, the average number $s$
of individuals that have been infected at least once during an outbreak,
i.e., until there are no I-vertices left or (for SIS) the system has
reached an endemic state (defined in the simulations as when there are no susceptible
vertices that have not had the disease at least once). Other networks and simulation parameters give qualitatively
similar results. Qualitatively, the overall picture from the $S_1$
calculations remains: the chained and unchained \textsc{Deg} and
\textsc{Out} strategies are very efficient, and the chained versions
are more efficient than the unchained. A difference is that the
unchained \textsc{RNb} also performs rather well. Quantitatively, the
differences between the strategies are huge; this is a result of the
threshold behaviors of the SIS and SIR models~\cite{chn:vacc}. The
conclusion of Fig.~\ref{fig:dyn} (and similar plots for other
networks) is that the order of the strategies' efficiencies is
largely the same as concluded from the $S_1(f)$-curves. But if high
resolution is required, the measurement of network fragility has to be
specific for the studied system.
269:
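A single SIR outbreak on a vaccinated network can be sketched as follows. This is an illustration under explicit assumptions, not the simulation code behind Fig.~\ref{fig:dyn}: the text does not say how the non-integer $\delta=2.5$ is realized with synchronous updating, so the sketch takes an integer \texttt{duration} instead; the initially infected vertex is drawn among the non-vaccinated vertices; and the function name is hypothetical. The SIS variant is not shown.

```python
import random


def sir_outbreak(adj, vaccinated, lam=0.05, duration=2, rng=None):
    """Number of vertices infected at least once in one SIR outbreak."""
    rng = rng or random.Random()
    susceptible = set(adj) - set(vaccinated)
    if not susceptible:
        return 0
    seed = rng.choice(sorted(susceptible))
    susceptible.discard(seed)
    infected = {seed: duration}  # vertex -> remaining time steps in state I
    ever_infected = 1
    while infected:
        newly = set()
        for v in infected:
            for u in adj[v]:
                # One transmission trial per infected neighbor and time step.
                if u in susceptible and u not in newly and rng.random() < lam:
                    newly.add(u)
        # Synchronous update: age the old infections, then add the new ones.
        infected = {v: t - 1 for v, t in infected.items() if t > 1}
        for u in newly:
            susceptible.discard(u)
            infected[u] = duration
        ever_infected += len(newly)
    return ever_infected
```

Averaging the return value over many calls, and over many vaccination runs, would give an estimate of $s$ in the sense used above.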
270: \begin{figure}
271: \resizebox*{0.9 \linewidth}{!}{\includegraphics{rc.eps}}
\caption{How the size of the largest connected component after vaccination
of 20\% of the population depends on clustering and degree-degree
274: correlations. (a) shows $S_1(f=0.2)$ plotted against $r$. (b) shows
275: $S_1(f=0.2)$
276: as a function of $C$. The networks have the same size and degree
277: sequence as the arxiv network. Error bars are smaller than the
278: symbol size. Lines are guides for the eyes.}
279: \label{fig:rew}
280: \end{figure}
281:
282: \subsection{The role of clustering and assortative mixing}
283:
To gain some insight into how the network structure governs the relative
efficiencies of the strategies we measure $S_1(f=0.2)$ for varying
assortative mixing and clustering coefficients. The results hold for
other small $f$ values. We keep the size and
degree sequence fixed to the values of the arxiv network. To
perform this sampling we rewire pairs of edges $(v_1,v_2)$ and
$(w_1,w_2)$ to $(v_1,w_2)$ and $(w_1,v_2)$ (unless this would
introduce a self-edge or multiple edges). To ensure that the
$n_\mathrm{rew}=100$ rewiring realizations are independent we start
by rewiring $n_\mathrm{init}=3M$ pairs of edges. Then we go through
pairs of edges randomly and execute only changes that bring the
current $r$ or $C$ closer to its target value. When the value of
$r$ or $C$ is within $0.1\%$ of the target value the iteration is
stopped. The results seen in Fig.~\ref{fig:rew} show that, just as
before, the \textsc{Out} and \textsc{Deg} strategies, chained or
unchained, are most efficient throughout the parameter space. The
300: unchained versions are most efficient for $r\gtrsim 0.3$. An
301: explanation is that, for high $r$, the chained versions will
302: effectively only vaccinate the
highly connected vertices (that are grouped together for very high $r$)
and leave chains of low-degree vertices unvaccinated. The
$C$-dependence plotted in Fig.~\ref{fig:rew}(b) shows that the
unchained versions outperform the chained versions for $C\gtrsim
0.15$. This is possibly because the chains, for
combinatorial reasons, get stuck in one part of the network. It is not
an effect of biased
degree-degree correlations, since if the rewiring procedure is conditioned on a
fixed $r$, Fig.~\ref{fig:rew}(b) remains essentially unaltered. We note
that the structure of the original arxiv network differs from the
rewired networks. For example, at $f=0.2$ in Fig.~\ref{fig:s1}(a)
\textsc{Out} is 22\% more efficient than \textsc{OutC}, but in
315: Fig.~\ref{fig:rew} the \textsc{Out} and \textsc{OutC} curves differ
316: very little. For the \textsc{RNb} strategy the chained version is better than
317: the unchained throughout the range of $r$ and $C$ values.
318:
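The initial randomization step of the rewiring procedure can be sketched in Python. This is a sketch under assumptions: the function name is hypothetical, the orientation of the second edge is randomized so that both possible pairings can occur, and \texttt{n\_attempts} counts proposed swaps, including rejected ones. The subsequent biased step, which accepts only swaps that move $r$ or $C$ toward a target value, is not shown.

```python
import random


def randomize_edges(edges, n_attempts, rng=None):
    """Degree-preserving randomization by swapping the ends of edge pairs."""
    rng = rng or random.Random()
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        v1, v2 = edges[i]
        w1, w2 = edges[j]
        # Assumption: pick the orientation of the second edge at random.
        if rng.random() < 0.5:
            w1, w2 = w2, w1
        new_a, new_b = (v1, w2), (w1, v2)
        if v1 == w2 or w1 == v2:
            continue  # would introduce a self-edge
        if frozenset(new_a) in present or frozenset(new_b) in present:
            continue  # would introduce a multiple edge
        present.discard(frozenset(edges[i]))
        present.discard(frozenset(edges[j]))
        present.add(frozenset(new_a))
        present.add(frozenset(new_b))
        edges[i], edges[j] = new_a, new_b
    return edges
```

Every accepted swap keeps the degree of each of the four vertices involved, so the degree sequence of the returned edge list equals that of the input.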
319: \section{Summary and conclusions}
320:
321: To summarize, we have investigated strategies for vaccination and
322: network attack that are based only on the knowledge of the
323: neighborhood---information that humans arguably possess and
324: utilize. Both static and dynamical measures of efficiency are
325: studied. For most networks, regardless of the number of vaccinated
326: vertices, the most efficient strategies are to choose a vertex $v$ and
327: vaccinate a neighbor of $v$ with highest degree (\textsc{Deg}), or the
328: neighbor of $v$ with most links out of $v$'s neighborhood
(\textsc{Out}). The vertex $v$ can be picked either as the most recently vaccinated
vertex (chained selection) or at random (unchained selection). For
331: real-world networks the chained versions tend to outperform the
332: unchained ones, whereas this situation is reversed for the three types
333: of model networks we study. We investigate the relative efficiency of
334: chained and unchained strategies further by sampling random networks
335: with a fixed degree sequence and varying assortative mixing and
336: clustering coefficients. We find that the unchained strategies are
preferable for networks with very high clustering or strong
positive assortative mixing (larger values than seen in real-world
339: networks). In Ref.~\cite{chn:vacc} the authors propose
340: the strategy to vaccinate a random neighbor of a randomly selected
341: vertex. This strategy (\textsc{RNb}) requires less information of the
342: neighborhood than \textsc{Deg} and \textsc{Out} do. Thus the
practical procedure gets simpler: One only has to ask a person to
``name a person you meet regularly'' rather than ``name the acquaintance of yours who regularly meets the most people you are not
acquainted with'' (for \textsc{Out}). (``Meet regularly''
should be replaced with some phrase signifying a high risk of infection
transfer for the pathogen in question.) On the other hand, if the
information about the neighborhoods is incomplete, \textsc{Deg} and
\textsc{Out} will, effectively, be reduced to \textsc{RNb} (and thus not
perform worse than \textsc{RNb}). In short, choosing the people to
vaccinate in the right way will save a tremendous amount of vaccine
and avoid many side-effect cases. The best strategy can only be selected by
considering both the structure of the network the pathogen spreads
over, and the disease dynamics. If none of this is known, the
\textsc{OutC} strategy is our recommendation, since it is better, or not much
worse, than the best strategy in most cases. Together with
357: \textsc{DegC}, \textsc{OutC} is most efficient for low clustering
358: and assortative mixing coefficients, which is the region of parameter
359: space for sexually transmitted diseases---the most interesting case
360: for network based vaccination schemes (due to the well-definedness of
361: sexual networks).
362:
363:
364: \section*{Acknowledgements}
365:
366:
367: The author is grateful for comments from M.\ Rosvall and acknowledges
368: support from the Swedish Research Council through contract no.\
369: 2002-4135.
370:
371: \begin{thebibliography}{10}
372:
373: \bibitem{alb:attack}
374: R.~Albert, H.~Jeong, and A.-L. Barab\'{a}si, \textit{Attack and error tolerance
375: of complex networks}, Nature \textbf{406} (2000), pp.~378-382.
376:
377: \bibitem{ba:model}
378: A.-L. Barab\'{a}si and R.~Albert, \textit{Emergence of scaling in random
379: networks}, Science \textbf{286} (1999), pp.~509-512.
380:
381: \bibitem{bw:sw}
382: A.~Barrat and M.~Weigt, \textit{On the properties of small-world network
383: models}, Eur. Phys. J. B \textbf{13} (2000), pp.~547-560.
384:
385: \bibitem{chn:vacc}
386: R.~Cohen, S.~Havlin, and D.~ben Avraham, \textit{Efficient immunization
387: strategies for computer networks and populations}, Phys. Rev. Lett.
388: \textbf{91} (2003), art.~no.\ 247901.
389:
390: \bibitem{er:on}
391: P.~Erd\H{o}s and A.~R\'{e}nyi, \textit{On random graphs {I}}, Publ. Math.
392: Debrecen \textbf{6} (1959), pp.~290-297.
393:
394: \bibitem{gies}
395: J.~Giesecke, \textit{Modern infectious disease epidemiology}, Arnold, London,
396: 2~ed., 2002.
397:
398: \bibitem{our:seceder}
399: A.~Gr\"{o}nlund and P.~Holme, \textit{Networking the seceder model: Group
400: formation in social and economic systems}.
401: \newblock e-print: cond-mat/0312010.
402:
403: \bibitem{hk:model}
404: P.~Holme and B.~J. Kim, \textit{Growing scale-free networks with tunable
405: clustering}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 026107.
406:
407: \bibitem{our:attack}
408: P.~Holme, B.~J. Kim, C.~N. Yoon, and S.~K. Han, \textit{Attack vulnerability of
409: complex networks}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 066109.
410:
411: \bibitem{bjk:pfs}
412: B.~J. Kim, C.~N. Yoon, S.~K. Han, and H.~Jeong, \textit{Path finding strategies
413: in scale-free networks}, Phys. Rev. E \textbf{65} (2002), art.~no.\ 027103.
414:
415: \bibitem{lea:sex}
416: F.~Liljeros, C.~R. Edling, and L.~A. {Nunes Amaral}, \textit{Sexual networks:
417: Implication for the transmission of sexually transmitted infection}, Microbes
418: Infect. \textbf{5} (2003), pp.~189-196.
419:
420: \bibitem{mejn:scicolpnas}
421: M.~E.~J. Newman, \textit{The structure of scientific collaboration networks},
422: Proc. Natl. Acad. Sci. USA \textbf{98} (2001), pp.~404-409.
423:
424: \bibitem{mejn:assmix}
425: \leavevmode\vrule height 2pt depth -1.6pt width 23pt, \textit{Assortative
426: mixing in networks}, Phys. Rev. Lett. \textbf{89} (2002), art.~no.\ 208701.
427:
428: \bibitem{mejn:rev}
429: \leavevmode\vrule height 2pt depth -1.6pt width 23pt, \textit{The structure and
430: function of complex networks}, SIAM Rev. \textbf{45} (2003), pp.~167-256.
431:
432: \bibitem{watts:search}
433: D.~J. Watts, P.~S. Dodds, and M.~E.~J. Newman, \textit{Identity and search in
434: social networks}, Science \textbf{296} (2002), pp.~1302-1305.
435:
436: \bibitem{wattsstrogatz}
437: D.~J. Watts and S.~H. Strogatz, \textit{Collective dynamics of {`small-world'}
438: networks}, Nature \textbf{393} (1998), pp.~440-442.
439:
440: \end{thebibliography}
441:
442: \end{document}
443: