1: \documentclass{cccg05}
2: \usepackage{graphicx,amssymb,amsmath}
3: \usepackage{subfigure}
4: \usepackage{epsfig}
5:
6: %----------------------- Macros and Definitions --------------------------
7:
8: % Add all additional macros here, do NOT include any additional files.
9:
10: \newcommand{\stress}{\mbox{\it stress}}
11: \newcommand{\st}{\mbox{\it st}}
12: \newcommand{\Ar}{\mbox{\it Ar}}
13: \newcommand{\betw}{\mbox{\it betw}}
14:
15: % The environments theorem (Theorem), invar (Invariant), lemma (Lemma),
16: % cor (Corollary), obs (Observation), conj (Conjecture), and prop
17: % (Proposition) are already defined in the cccg05.cls file.
18: % Add additional environments only if you REALLY need them.
19:
20: %----------------------- Title -------------------------------------------
21:
22: \title{A New Approach for Boundary Recognition in Geometric Sensor Networks}
23:
24: \author{S\'andor P.~Fekete\thanks{Department of Mathematical Optimization,
Braunschweig University of Technology, {\tt [s.fekete,a.kroeller]@tu-bs.de}}
26: \and
Michael Kaufmann\thanks{Department of Computer Science, University of T\"ubingen, {\tt [mk,lehmannk]@informatik.uni-tuebingen.de}}
28: \and
29: Alexander Kr\"oller\footnotemark[1]\ \thanks{Supported by the German Research Foundation (DFG) within the focus program ``Algorithms for Large and Complex Networks'' (SPP 1126), grant Fe407/8-1.}
30: \and
31: Katharina Lehmann\footnotemark[2]\ \thanks{Supported by the German Research Foundation (DFG) within the focus program ``Algorithms for Large and Complex
32: Networks'' (SPP 1126), grant Ka812/11-1.}
33: }
34:
35: % Add the appropriate index information!
36:
37: \index{Fekete, S\'andor P.}
38: \index{Kaufmann, Michael}
39: \index{Kr\"oller, Alexander}
40: \index{Lehmann, Katharina}
41:
42: %------------------------------ Text -------------------------------------
43:
44: \begin{document}
45: \maketitle
46:
47: \begin{abstract}
48: We describe a new approach for dealing with the following central
49: problem in the self-organization of a geometric sensor network:
We are given a polygonal region $R$ and a large, dense set of sensor nodes that are scattered
uniformly at random in $R$. There is no central control unit, and nodes can only communicate locally by
wireless radio with all other nodes that are within communication radius $r$,
without knowing their coordinates or distances to other nodes.
54: The objective is to develop a simple distributed protocol that allows
55: nodes to identify themselves as being located near the boundary of $R$
56: and form connected pieces of the boundary.
57: We give a comparison of several centrality measures commonly
58: used in the analysis of social networks and show that
59: {\em restricted stress centrality} is particularly
60: suited for geometric networks; we provide mathematical as
61: well as experimental evidence for the quality of this measure.
62: \end{abstract}
63:
64: \section{Introduction}
65: \label{sec:intro}
66:
67: %{\bf Sensor Networks.}
In recent years, the study of wireless sensor networks (WSN) has become
69: a rapidly developing research area that offers fascinating
70: perspectives for combining technical progress with new applications of
71: distributed computing. Typical scenarios involve a large swarm of
72: small and inexpensive processor nodes, each with limited computing and
73: communication resources, that are distributed in some geometric
74: region; communication is performed by wireless radio with limited
75: range. As energy consumption is a limiting factor for the lifetime of
76: a node, communication has to be minimized. Upon start-up, the swarm
77: forms a decentralized and self-organizing network that surveys the
78: region.
79:
80: \begin{figure}[t]
81: \begin{center}
82: \centering
83: %\hspace*{0.03\textwidth}
84: \subfigure[60,000 sensor nodes, distributed uniformly at random in a polygonal region.\label{fig:city:b}]{
85: \epsfig{file=lq-ocp2-60k-70.eps,width=0.70\columnwidth}
86: }
87: \subfigure[A zoom into (a) shows the communication graph.\label{fig:city:c}]{
88: \epsfig{file=lq-ocp3-60k-70.eps,width=0.35\columnwidth}
89: }
90: %\hspace*{0.03\textwidth}
91: \subfigure[A further zoom into (b) shows the communication ranges.\label{fig:city:d}]{
92: \epsfig{file=lq-ocp4-60k-70.eps,width=0.35\columnwidth}
93: }
94: \vspace*{-6mm}
95: \caption{Scenario of a geometric sensor network, obtained by scattering sensor nodes in the street network surrounding Braunschweig University of Technology.}
96: \label{fig:city}
97: \end{center}
98: \vspace*{-6mm}
99: \end{figure}
100:
101: From an algorithmic point of view, the characteristics of a sensor
102: network require working under a paradigm that is different from
103: classical models of computation: Absence of a central control unit,
104: limited capabilities of nodes, and limited communication between nodes
105: require developing new algorithmic ideas that combine methods of
106: distributed computing and network protocols with traditional
107: centralized network algorithms. In other words: How can we use a
108: limited amount of strictly local information in order to achieve
109: distributed knowledge of global network properties?
110:
111: This task is much simpler if the exact
112: location of each node is known. Computing node coordinates
113: has received a considerable amount of attention.
114: Unfortunately, computing exact coordinates requires the use of
115: special location hardware like GPS, or alternatively,
116: scanning devices, imposing physical demands on size and structure
117: of sensor nodes. As we demonstrated in our paper~\cite{kfb-kl-05},
118: current methods for computing coordinates based on anchor points
119: and distance estimates encounter serious
120: difficulties in the presence of even small inaccuracies, which are
121: unavoidable in practice.
122:
123: As shown in \cite{fkp-nbtrsn-04}, there is a way to sidestep many of the above
124: difficulties, as some structural location aspects do {\em not}
125: depend on coordinates.
126: This is particularly relevant for sensor networks
127: that are deployed in an environment
128: with interesting geometric features. (See \cite{fkp-nbtrsn-04}
for a more detailed discussion.) Clearly, scenarios such as the one
shown in Figure~\ref{fig:city} pose a number of interesting
131: geometric questions. Conversely, exploiting the basic fact
132: that the communication graph of a sensor network
133: has a number of geometric properties provides
134: an elegant way to extract structural information.
135:
136: One key aspect of location awareness is {\em boundary recognition},
137: making sensors close to the boundary of the surveyed region
138: aware of their position and letting them form
connected {\em boundary strips} along each boundary.
140: This is of major importance for keeping track of events entering or
141: leaving the region, as well as for communication with the
142: outside. Neglecting the existence of holes in the region may also
143: cause problems in communication, as routing along shortest paths tends
144: to put an increased load on nodes along boundaries, exhausting their
145: energy supply prematurely; thus, a moderately-sized hole (caused by
146: obstacles, by an event, or by a cluster of failed nodes) may tend to
147: grow larger and larger.
148:
149: We show that using a combination of geometry, stochastics, and tools
150: from social networks, a considerable amount of location awareness can indeed be
151: achieved in a large swarm of sensor nodes without any use of location
152: hardware. The result is a relatively simple distributed algorithm
153: for boundary recognition in large geometric sensor networks that shows
154: excellent performance for test networks with 80,000 nodes.
155:
156: \section{Centrality Measures for Social Networks}
157: \label{social}
158: A different area studying large and complex graphs is the field
159: of {\em Social Networks}, where nodes represent individuals
160: in a large collective, and edges indicate some interaction between
161: them. (See the recent book \cite{be-namf-05} for an overview and an extensive
162: list of references.) Identifying asymmetries within a network
163: is a natural approach; one particular way of doing this is based
164: on so-called centrality indices, i.e., real-valued functions that
165: assign high values to more ``central'' nodes, while ``boundary'' nodes
166: get low values.
167:
168: In the last five decades, many different centrality
169: indices have been proposed. There are two major classes: One is based
170: on local properties of the graph, so it is particularly suited for
171: typical scenarios of sensor networks and will be discussed in some detail.
172: The other class is based on more global properties, e.g.,
173: the computation of eigenvalues of the adjacency matrix, so it is less
174: useful for our purposes.
175:
176: %\pagebreak
177: %\vspace*{-6mm}
178: Centrality indices of the first class can be subdivided into three
179: subclasses: The first considers the distances to other vertices,
180: the second determines the number of vertices at a given
181: distance, while the third makes use of shortest
182: paths containing a given vertex.
183:
184: Considering the maximum distance to another vertex in the graph
185: (based on hop-count) does not reflect local topological structures
186: in a sensor network; in particular, it fails to indicate closeness
187: to interior boundaries. The size of the $k$-hop neighborhood
188: is better suited, and (for the simple choice $k=1$) was indeed the basis
189: for our approach described in \cite{fkp-nbtrsn-04}, as it is an indicator
190: for the size of the intersection of the communication range of
191: a node with $R$.
192: It is tempting to try to improve the results by increasing $k$,
193: but this is not without drawbacks with respect to topological properties,
194: as a boundary node
195: close to a ``thick'' part of $R$ may get a better value
196: than an interior node that is located in a ``thin'' part of the region.
197: See Figure~\ref{fig:cent:a} for a scenario with 80,000 nodes;
198: index values are represented on a color scale from dark (low)
199: to light (high).
200:
201: \begin{figure}
202: \begin{center}
203: \centering
204: \includegraphics[width=0.6\columnwidth]{lq-c-khop-4.eps}
\caption{$k$-hop neighborhood for $k=4$.}
206: \label{fig:cent:a}
207: \end{center}
208: \vspace*{-6mm}
209: \end{figure}
210:
This leaves the structure of shortest paths. In particular,
the {\em stress centrality} $\stress(v)$ is defined as the number
of shortest paths containing $v$:
\begin{equation}
\stress(v) := \sum_{s \in V}\sum_{t \not = s \in V} \sigma_{st}(v),
\end{equation}
where $\sigma_{st}(v)$ denotes the number of shortest paths between $s$ and $t$ that contain $v$.
Only considering the set $V_\delta(v)$ of vertices within a given
hop distance $\delta$ of $v$ yields
the {\em restricted stress centrality}:
\begin{equation}
\stress(v, \delta) := \sum_{s \in V_\delta(v)}\sum_{t \not = s \in V_\delta(v)} \sigma_{st}(v).
\end{equation}
223:
224: \begin{figure*}
225: \begin{center}
226: %\centering
227: %\hspace*{0.03\textwidth}
228: %\subfigure[$k$-hop neighborhood for $k$=4.\label{fig:cent:a}]{
229: %\epsfig{file=lq-c-khop-4.eps,width=0.45\textwidth}
230: %}
231: \subfigure[Betweenness centrality.\label{fig:cent:b}]{
232: \epsfig{file=lq-c-between-5.eps,width=0.65\columnwidth}
233: }
234: %\\
235: %\hspace*{0.03\textwidth}
236: \subfigure[Stress centrality.\label{fig:cent:c}]{
237: \epsfig{file=lq-c-stress.eps,width=0.65\columnwidth}
238: }
239: %\\
240: %\hspace*{0.03\textwidth}
241: \subfigure[Restricted stress centrality with threshold filter.\label{fig:cent:d}]{
242: \epsfig{file=lq-c-stress-thresh.eps,width=0.65\columnwidth}
243: }
244: \vspace*{-6mm}
245: \caption{Performance of different centrality measures, shown for a scenario of 80,000 nodes distributed uniformly at random.}
246: \label{fig:perform}
247: \end{center}
248: \vspace*{-6mm}
249: \end{figure*}
250:
251: In the context of a communication network, this measure can be
252: motivated as follows:
If each vertex sends a message to every other vertex along all shortest paths,
the stress centrality counts how many times vertex $v$ is busy with
passing on a message. As there may be many shortest paths,
it is reasonable to assume instead that a vertex $s$
sending a message to some other vertex $t$ uses each of their shortest paths with the same probability, i.e., $1/\sigma_{st}$, where
$\sigma_{st}$ denotes the number of shortest paths between $s$ and $t$.
The probability that a vertex $v$ has to transport the message is thus
given by $\rho_{st}(v):=\frac{\sigma_{st}(v)}{\sigma_{st}}$.
The {\it betweenness centrality} $\betw(v)$ is defined as the sum over all $\rho_{st}(v)$:
262: \begin{equation}
263: \betw(v) := \sum_{s \in V}\sum_{t \in V} \rho_{st}(v).
264: \end{equation}
265: See Figure~\ref{fig:cent:b} for the evaluation of betweenness centrality
266: for our example, while Figure~\ref{fig:cent:c} shows the stress centrality.
267: (Again, low values are indicated by dark dots, while high values are represented
268: by light color.)
269: A detailed analysis for restricted stress centrality
270: is given in the following section.
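
To illustrate the difference between the two measures, consider a minimal
example (not taken from our experiments): a cycle on four vertices
$s,a,t,b$ with edges $sa$, $at$, $tb$, and $bs$. There are two shortest paths
between $s$ and $t$, one via $a$ and one via $b$, so
\[
\sigma_{st}=2,\qquad \sigma_{st}(a)=1,\qquad
\rho_{st}(a)=\frac{\sigma_{st}(a)}{\sigma_{st}}=\frac{1}{2}.
\]
The pairs $\{s,b\}$ and $\{t,b\}$ are adjacent, so their shortest paths do not pass through $a$.
Assuming that only pairs not containing $a$ are considered and that each
unordered pair is counted once, this gives $\stress(a)=1$ and
$\betw(a)=\frac{1}{2}$: stress centrality counts every shortest path through
a vertex in full, while betweenness centrality weights it by the
fraction of all shortest paths between the same endpoints.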
271:
272: %\medskip {\bf Our Results.} We show that distributed location
273: %awareness can be achieved without the help of location hardware. In
274: %particular:
275: %
276: %\begin{itemize}
277: %\item We describe how to recognize the nodes that are near the
278: %boundary of the region. The underlying geometric idea is quite
279: %simple, but it requires some effort on both stochastics and
280: %communication to make it work.
281: %\end{itemize}
282: %%
283: %The rest of this paper is organized as follows. In Section~\ref{sec:prelim}
284: %we give some basic notation and state our underlying model assumptions.
285: %In Section~\ref{sec:tree} we describe how to obtain an auxiliary
286: %tree structure that is used for computing and distributing
287: %global network parameters. Section~\ref{sec:prob} gives a brief
288: %overview of probabilistic aspects that are used in the rest
289: %of the paper to allow topology recognition. Section~\ref{sec:bound}
290: %describes how to perform boundary recognition, while Section~\ref{sec:high}
291: %gives a sketch of how to compute more advanced properties.
292: %Section~\ref{sec:experiments} describes implementation issues
293: %and shows some of our experiments. Finally, Section~\ref{sec:future}
294: %discusses the possibilities for further progress based on our work.
295: %
296:
297: \section{Using Restricted Stress Centrality}
298: \label{stress}
299: In the context of a sensor network, it takes a number of algorithmic
300: steps to evaluate a measure and use the results for extracting
301: global features like boundaries. Some of those details are described
302: in our paper \cite{fkp-nbtrsn-04}, and can be used analogously for
303: other measures: Using an auxiliary tree structure (which is easy
304: to obtain), we can aggregate local results globally in order
305: to determine appropriate threshold values. Once a threshold has been set,
306: it can be distributed to all nodes in the network; after that, each
307: node simply checks whether its centrality index is above or below
308: the threshold, resulting in a classification as ``interior'' or ``boundary''.
309: A good index must have the following properties:
310: \begin{itemize}
311: \item It should require only simple local computations for each node.
312: \item Setting a good threshold value should be relatively easy.
313: In other words: The distributions for interior nodes and for boundary nodes
314: should be well-separated.
315: \end{itemize}
316:
317: \begin{theorem}
318: \label{th:sep}
319: Using the restricted stress centrality $\stress(v,1)$,
320: nodes are classified correctly with high probability
321: for sufficiently large node density.
322: \end{theorem}
323:
324: See Figure~\ref{fig:cent:d} for the result for restricted stress centrality
325: for relatively moderate density:
326: It can be seen that all boundary nodes are correctly classified. The
327: interior contains a number of false positives, which can be eliminated
328: by additional filters.
329:
{\bf Discussion of Theorem~\ref{th:sep}.}
Let $v$ be a node in the network, and let $\delta(v)$ be the number
of neighbors of $v$. Then $\stress(v,1)$ is the number
of pairs of nonadjacent neighbors of $v$, and the normalized
coefficient $\st(v):=\frac{2\stress(v,1)}{\delta(v)(\delta(v)-1)}$
describes the fraction of pairs of neighbors that are nonadjacent,
i.e., that have a shortest-path connection via $v$, so
$\mathbb{E}[\stress(v,1)]=\mathbb{E}[\st(v)]\binom{\mathbb{E}[\delta(v)]}{2}$.
Now consider any neighbor $w$ of $v$. Let $C(v):=\{p\in R\mid d(p,v)\leq r\}$
be the portion of $R$ that is within communication range of $v$.
See Figure~\ref{fig:circles}; let $N_w:=C(v)\cap C(w)$, and
$M_w:=C(v)\setminus C(w)$. For a uniform random distribution,
the expected fraction of neighbors of $v$ that are not adjacent
to $w$ corresponds to the ratio of areas
$\frac{\Ar(M_w)}{\Ar(C(v))}$.
Integrating over all possible positions of $w$, we get
an overall expected value
$\mathbb{E}[\st(v)]=\frac{1}{\Ar(C(v))}\int_{w\in C(v)}\frac{\Ar(M_w)}{\Ar(C(v))}\,dw$.
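
As a small combinatorial illustration (a hypothetical configuration, chosen
only to make the definition concrete), let $v$ have $\delta(v)=4$ neighbors
$a,b,c,d$, and suppose that the only edges among these neighbors are $ab$ and $bc$.
Of the $\binom{4}{2}=6$ pairs of neighbors, the four pairs
$\{a,c\}$, $\{a,d\}$, $\{b,d\}$, and $\{c,d\}$ are nonadjacent, so
\[
\stress(v,1)=4,\qquad \st(v)=\frac{2\cdot 4}{4\cdot 3}=\frac{2}{3}.
\]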
348:
349: \begin{figure}[h]
350: \centering
351: \includegraphics[width=.45\columnwidth]{lq-circles.eps}
352: \vspace*{-3mm}
\caption{For any given neighbor $w$ of $v$, the expected fraction of
neighbors of $v$ that are not neighbors of $w$ is given by
$\frac{|M|}{|N\cup M|}$, where $M=M_w$, $N=N_w$, and $|\cdot|$ denotes area.}
356: \label{fig:circles}
357: \vspace*{-3mm}
358: \end{figure}
359:
As the size of the areas also depends on the distance $s$ of $v$
from the boundary, solving this integral in closed form
for all $s$ would require finding a primitive that contains $s$ as an explicit
parameter; this appears to be hopeless, even using ideas as described
in \cite{geo.prob}. However, for specific values of $s$,
an explicit numerical calculation is possible:
For $s\geq r=1$ and $d(w,v)=x$, the area of $N_w$ turns out to be
$2\left(\arccos\left(\frac{x}{2}\right)-\frac{1}{2}\sin\left(2\arccos\left(\frac{x}{2}\right)\right)\right)$,
and $\Ar(M_w)=\pi-\Ar(N_w)$.
As the distance $x$ of a uniformly distributed neighbor $w$ from $v$ has density $2x$ on $[0,1]$,
the resulting integral $\sigma=\int_0^1 2x\left(1-\frac{2\left(\arccos\left(\frac{x}{2}\right)-\frac{1}{2}\sin\left(2\arccos\left(\frac{x}{2}\right)\right)\right)}{\pi}\right)dx$ can be solved numerically,
resulting in a value of $\sigma=0.4134966716$.
370: %Similarly, the resulting value
371: %for $s=0$ is xxx.
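
For the interior case $s\geq r=1$, the numerical value can be cross-checked
in closed form. The following sketch writes $A(x)$ (a symbol introduced here)
for the area of the intersection of two unit disks whose centers are at
distance $x$, uses the identity
$\sin\left(2\arccos\left(\frac{x}{2}\right)\right)=\frac{x}{2}\sqrt{4-x^2}$,
and weights by the density $2x$ of the distance of a uniformly distributed
point of the unit disk from its center:
\begin{align*}
A(x) &= 2\arccos\left(\tfrac{x}{2}\right)-\tfrac{x}{2}\sqrt{4-x^2},\\
\int_0^1 x\,A(x)\,dx &= \tfrac{\pi}{2}-\tfrac{3\sqrt{3}}{8},\\
\int_0^1 2x\Bigl(1-\tfrac{A(x)}{\pi}\Bigr)dx
 &= 1-\tfrac{2}{\pi}\Bigl(\tfrac{\pi}{2}-\tfrac{3\sqrt{3}}{8}\Bigr)\\
 &= \tfrac{3\sqrt{3}}{4\pi}\approx 0.4135.
\end{align*}
This agrees with the value of $\sigma$ obtained numerically.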
372:
373: For determining threshold values for separating interior and
374: boundary values of $\st$, we also need the random distribution
375: of $\st$ for different values of $s$. These distributions
376: can be determined with additional numerical computations; using
377: a Monte-Carlo simulation, we obtained distributions
378: like the ones in Figure~\ref{fig:dist}: Shown are the distributions
379: for 20 expected neighbors (\ref{fig:dist:a})
and for 200 expected neighbors (\ref{fig:dist:b}); the left
(red) curve shows the distribution of $\st$ for a node $v$ on the
boundary, while the right (green/blue) curve shows the distribution
for a node completely in the interior of $R$.
The probability of error for a specific threshold is given
by the normalized area to the right of the threshold below
the left curve (false negatives)
or by the normalized area to the left of the threshold below
the right curve (false positives). Clearly, the error becomes
arbitrarily small for large neighborhood size.
390: \QED
391:
For intermediate sizes
such as the one in our example, choosing a relatively large threshold
394: value avoids too many false negatives, at the expense of a limited
395: ratio of false positives.
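
In formulas, treating the distributions as continuous and writing $f_B$ and
$f_I$ (names introduced here for illustration) for the probability densities
of $\st$ for a boundary node and for an interior node, a threshold $\theta$
applied to $\st$ has error probabilities
\[
P_{\mathrm{FN}}(\theta)=\int_\theta^1 f_B(x)\,dx,\qquad
P_{\mathrm{FP}}(\theta)=\int_0^\theta f_I(x)\,dx,
\]
so increasing $\theta$ lowers the false-negative rate and raises the
false-positive rate.
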
396: \begin{figure}
397: \begin{center}
398: \centering
399: \subfigure[Distributions for neighborhood size 20.\label{fig:dist:a}]{
400: \epsfig{file=lq-keep_n10000_p0.002_x50000.eps,width=0.8\columnwidth}
401: }
402: \subfigure[Distributions for neighborhood size 200.\label{fig:dist:b}]{
403: \epsfig{file=lq-keep_n100000_p0.002_x50000.eps,width=0.80\columnwidth}
404: }
405: \vspace*{-3mm}
406: \caption{Random distribution of restricted stress centrality for a node on the boundary and in the interior,
407: for different neighborhood sizes.}
408: \label{fig:dist}
409: \end{center}
410: \vspace*{-3mm}
411: \end{figure}
412:
413: \section{Algorithm}
414:
In \cite{fkp-nbtrsn-04}, we showed how to estimate
$\mathbb{E}[\delta(v)]$ for a node $v$ of boundary distance $s\geq r$,
i.e., a node in the interior of the network. The algorithm constructs a
tree, collects a node degree histogram, and floods the result to all
nodes. Both the total runtime of the algorithm and the total size of
messages are $\mathcal{O}(|V|\log^2|V|)$. Each node stores a constant
threshold value
$0 < \theta < \sigma$ that has been chosen in advance. If
\[ \stress(v,1)\leq \theta\binom{\mathbb{E}[\delta(v)]}{2} \;, \]
the node declares itself to be a boundary node. In experiments, we
found $\theta=1/3$ to be a particularly good choice.
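
As a numerical illustration (the neighborhood size matches the smaller
simulation above; the node counts are hypothetical), let
$\mathbb{E}[\delta(v)]=20$ and $\theta=1/3$. Then
\[
\theta\binom{20}{2}=\frac{190}{3}\approx 63.3,
\qquad
\sigma\binom{20}{2}\approx 0.4135\cdot 190\approx 78.6,
\]
so a node with 60 pairs of nonadjacent neighbors declares itself to be a
boundary node, while a node with about 79 such pairs, the count expected in
the interior, does not.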
426:
427:
428: \section{Conclusion}
429:
430: We showed that restricted stress centrality is a useful index
431: for extracting topological boundary information from a geometric
sensor network, provided that the nodes are placed according to
a suitable random distribution. As this is a rather strong assumption,
434: it appears desirable to come up with more general methods.
435: Moreover, an approach based on random distributions
436: may still fail in some rare cases
437: (even though the probability of failure is extremely low),
438: so it is particularly interesting to develop
439: deterministic methods for boundary recognition.
440: Such an approach is described in our forthcoming paper
441: \cite{fkfp-dbrlgsn-05}.
442:
443: %---------------------------- Bibliography -------------------------------
444:
445: % Please add the contents of the .bbl file
446:
447: \small
448: \bibliographystyle{abbrv}
449: \bibliography{refs}
450:
451: %\begin{thebibliography}{99}
452: %\end{thebibliography}
453:
454: \end{document}
455: