1: \section{Localization Algorithm and Implementation}
2: \label{sec-algorithm}
Our localization algorithm is a simple table-lookup scheme with three stages.
The first stage calculates a table with one entry per grid position. The
entry for a grid position $p$ is a 4-tuple consisting of the grid distances
between $p$ and four anchors. At least three anchors are necessary for
localization; we found that four anchors improve on three, and the scheme
generalizes easily to a larger number of anchors. The second stage
establishes the one-hop neighborhood of each sensor node and, for each
node $q$, uses this neighborhood information to calculate a 4-tuple of
grid distances between $q$ and the anchors. The third stage performs the
table lookup and then refines the result of the lookup.
The algorithm is shown in Figure~\ref{top-level}.
18:
19: \begin{figure}[ht]
20: \centering
21: \begin{tabbing}
22: xxx \= xxxxxxxx \= xxx \= xxx \= \kill
23: \>\bf\em{stage 1: }\\
24: \>
25:
26: calculate grid distance table for the grid \\ \>\>positions\\
27: \\
28: \>\bf\em{stage 2:}\\
29: \>
30: each node $p$ sends a group of 30 messages\\
31: \>
32: each node forwards all the RSSI readings to\\
33: \>\> the base station \\
34: \>
35: for each sender $p$, base station calculates \\
36: \>\>a score for all receivers \\
37: \>
38: decide one-hop neighborhood for all nodes \\
39: \\
40:
\>\bf\em{stage 3:}\\
42: \>calculate distance between each node $q$ \\
43: \>\>and the anchors\\
\>for each node $q$, look up the table for\\
\>\> a match \\
46: \>for each unoccupied position, find the \\
47: \>\> most likely node\\
48: \>send out the assignments
49:
50: \end{tabbing}
\caption{Localization algorithm}
52: \label{top-level}
53: \end{figure}
54:
The implementation of this algorithm within one segment can be
either centralized or distributed. We describe the centralized
implementation first. Each stage of the algorithm can also be implemented
in different ways. For example, in the second stage, different ranging
techniques can be used to help establish the one-hop neighborhood.
60:
61: \subsection{Table Establishment and Anchors}
Using formula~(\ref{eqn-1}), defined in Section~\ref{prelim},
63: it is simple to calculate the distances from any grid position $p$
64: to the anchors; therefore when the grid is designed or deployed the
65: table can be calculated and disseminated as needed for the
66: subsequent stages.
67:
In order for table lookup to be unambiguous, no two tuples
in the table may be identical (otherwise two different grid points would
have the same lookup key). This constrains anchor placement.
If the anchors are placed in a line, the segment has considerable symmetry
and different positions have identical tuples, which must
be avoided. In our experiment, we set up the segment as in \cite{BKA05}
and found that if the four anchors
form a parallelogram, then no table entries are identical. We
simulated different parallelogram placements in one segment and
found that performance is slightly better when the anchors are placed
close to the segment border. Similar observations have been made for
other localization schemes \cite{SGAMS03}.
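
The uniqueness requirement can be checked mechanically once a grid distance is fixed. The following is only a sketch: formula~(\ref{eqn-1}) is not repeated here, so Manhattan distance is used as a stand-in, and the function names, the $5 \times 5$ grid, and the anchor coordinates are all illustrative assumptions rather than the configuration used in our experiments.

```python
from collections import defaultdict
from itertools import product


def manhattan(p, q):
    """Stand-in grid distance (an assumption; substitute the paper's formula)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def build_table(width, height, anchors, dist=manhattan):
    """Map every grid position to its tuple of distances to the anchors."""
    return {
        p: tuple(dist(p, a) for a in anchors)
        for p in product(range(width), range(height))
    }


def find_collisions(table):
    """Return {tuple: [positions]} for tuples shared by two or more positions."""
    by_tuple = defaultdict(list)
    for pos, tup in table.items():
        by_tuple[tup].append(pos)
    return {t: ps for t, ps in by_tuple.items() if len(ps) > 1}


# Hypothetical placements on a 5 x 5 grid.
collinear = [(0, 2), (1, 2), (3, 2), (4, 2)]      # all anchors on one row
parallelogram = [(0, 0), (3, 0), (1, 4), (4, 4)]  # anchors near the border

ambiguous = find_collisions(build_table(5, 5, collinear))
unambiguous = find_collisions(build_table(5, 5, parallelogram))
print(len(ambiguous) > 0, len(unambiguous) == 0)
```

Under these assumptions the collinear placement maps mirror-image positions to the same tuple, while the parallelogram placement yields a distinct tuple for every position.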
80:
81:
82: \subsection{One-hop Neighborhood using RSSI}
83:
We use the Received Signal Strength Indicator (RSSI) for distance estimation
in our implementation to obtain, for each node, an estimate of its
one-hop neighborhood. Our grid design simplifies this task, because it
defines the one-hop distance between neighbors
to be about nine meters. We assume that the deployment error
for each node is within a defined tolerance (see Section~\ref{sec-experiment}).
90:
RSSI ranging works as follows. A sensor node sends
out a radio message with a certain signal strength, and one field
of the message records the transmit signal strength. The receiver of
the message measures the signal strength of the received message.
Given a model of how signal strength decreases with distance (in our
case obtained empirically), the original and received
signal strengths can be compared and the distance between the
sender and receiver can be estimated. The advantage of RSSI ranging
is that it is very simple and needs no additional
hardware and little computing power. The disadvantage
is that its accuracy is poor~\cite{AS04}; radio strength can be affected by the
environment, the relative angles of emplacement for a pair of sensors,
and manufacturing variances in radio devices.
104:
For RSSI ranging in our
implementation, we used a group of messages to deal with variation
(we did not use advanced filters in our experiments, preferring
the simplest of techniques). In our experiments, we noticed
that even with a group size of 30, the mean RSSI reading over the same
distance varies considerably, as has been previously
observed for similar hardware \cite{WC02}. Put another way,
the mean received signal strength reading can be the same over different
distances. To reduce the inaccuracy of RSSI ranging and obtain better
one-hop neighborhood information, we do not estimate
the real distance between a single sender/receiver pair;
instead, we forward statistics of
the RSSI readings to the base station.
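
As a minimal sketch of what such statistics might look like, the snippet below reduces one message group to a count, mean, minimum, and maximum. The record layout, the names \texttt{LinkStats} and \texttt{summarize}, and the sample readings are assumptions made for illustration; they are not the message format of our implementation.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LinkStats:
    sender: int
    receiver: int
    count: int    # messages actually received out of the group
    avg: float    # mean RSSI reading over the received messages
    lo: int       # minimum reading
    hi: int       # maximum reading


def summarize(sender, receiver, readings):
    """Reduce one group of RSSI readings to the statistics forwarded upstream."""
    if not readings:
        return None  # nothing was heard from this sender
    return LinkStats(sender, receiver, len(readings),
                     mean(readings), min(readings), max(readings))


# Made-up readings for a group in which only four messages arrived.
stats = summarize(sender=3, receiver=7, readings=[340, 345, 338, 351])
print(stats)
```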
118:
119: Experience of other researchers \cite{WC02,PH03,ZHKS04}
120: suggests that no elementary model may exist for how
121: signal strength behaves as a function of distance.
122: Therefore, instead of trying to find an analytical model for
123: RSSI, we used a machine learning approach in our investigation.
124: Machine learning for RSSI-based localization was previously used
125: in \cite{ELM04}, where Bayesian networks were trained.
126:
We use fuzzy membership functions \cite{D90} to
calculate a score for each receiver from the average RSSI reading, the
number of received messages, the maximum and minimum RSSI readings, and
the rank of the RSSI readings among all receivers. We assign one fuzzy
membership function to each property. Following the classification in
\cite{D90}, we choose the membership functions
based on reliability concerns with respect to the particular problem.
In our implementation, even very simple function forms (linear and
triangular functions) proved sufficient for this part.
Finally, we decide the one-hop
neighborhood of the sender by checking the scores of the receivers.
The score is itself another fuzzy membership function.
For this function, we use the {\em distance approach},
one of the {\em semantic approaches} to fuzzy
membership function design. This approach concentrates on the
practical meaning and interpretation of membership and is well suited
to multi-attribute decision making \cite{LH}. It
considers the requirements on different attributes and assigns
operations to the connectives of fuzzy logic. For instance, let $f_A$ and $f_B$
be the fuzzy membership functions for attributes $A$ and $B$; then the
fuzzy membership function for ``$A$ and $B$'' is $\min(f_A,f_B)$,
and that for ``very $A$'' is $f_A^2$. Empirical
rules such as ``if very $A$ and $B$, then $C$ is likely'' can thus be
translated into functions.
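
As a purely illustrative calculation (the membership values are made up and do not come from our data), suppose $f_A = 0.8$ and $f_B = 0.6$. Then
\[ f_{\mbox{\scriptsize very } A} = f_A^2 = 0.64, \qquad
\min(f_A^2, f_B) = \min(0.64, 0.6) = 0.6, \]
so the rule ``if very $A$ and $B$, then $C$ is likely'' would give $C$ a membership of $0.6$.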
151:
152: After choosing the forms of the fuzzy membership functions, we need
153: to decide the parameters for these functions. We conducted extensive
154: tests for calibration. We performed tests with different spacing
155: between nodes, relative angles, weather and terrain conditions.
156: From the training data we collected from the tests,
157: we determined the parameters
158: to best distinguish 1-hop neighbors from more distant neighbors.
159: In Section~\ref{sec-analysis}, we
160: will show how well these functions work.
161:
We now give an example of how to choose the parameters of a
fuzzy membership function. The fuzzy membership function $f_{avg}$, which
expresses how much an average RSSI reading $x$ is ``numerically like'' a
1-hop reading, is a piecewise-linear function with three pieces.
166:
167: \[f_{avg}(x) = \left\{\begin{array}{ll}
168: 1 & x<a\\
169: \frac{b-x}{b-a} & a \leq x < b \\
170: 0 & x \geq b
171: \end{array}
172: \right. \]
173:
In our tests, a smaller distance generally produced
a smaller reading, so a smaller reading
implies a more likely 1-hop reading.
We therefore chose a function of this form.

Here, $a$ and $b$ are the parameters of the fuzzy membership function,
and we need to determine them from our data. We
want them to be thresholds such that only a small fraction of the 2-hop
readings are smaller than $a$ and most of the 1-hop readings are smaller
than $b$. We therefore chose
the $10^{th}$ percentile of the average readings over 2-hop distances for
$a$, and the $95^{th}$ percentile of the average readings over 1-hop distances
for $b$.
187:
188:
189:
190: \begin{figure}[ht]
191: \centering
192: \includegraphics {distribution.eps}
\caption{Distribution of 1-hop and 2-hop readings}
194: \label{distributionChart}
195: \end{figure}
196:
Figure~\ref{distributionChart} shows the distributions
of 1-hop and 2-hop readings. For our training data, these
percentiles give $a = 343$ and $b = 361$.
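
The calibration of $a$ and $b$ and the evaluation of $f_{avg}$ can be sketched as follows. The nearest-rank percentile convention and the function names are assumptions (the text does not fix a percentile convention); only the thresholds $343$ and $361$ are taken from our training data.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct (one convention among several)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


def calibrate(one_hop_avgs, two_hop_avgs):
    """a: 10th percentile of 2-hop averages; b: 95th percentile of 1-hop averages."""
    return percentile(two_hop_avgs, 10), percentile(one_hop_avgs, 95)


def f_avg(x, a, b):
    """Membership of average reading x in 'numerically like a 1-hop reading'."""
    if x < a:
        return 1.0
    if x < b:
        return (b - x) / (b - a)
    return 0.0


# Thresholds reported in the text for the training data.
A, B = 343, 361
print(f_avg(330, A, B), f_avg(352, A, B), f_avg(370, A, B))  # 1.0 0.5 0.0
```

A reading halfway between the two thresholds, such as $352$, therefore receives a membership of $0.5$.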
200:
An advantage of using machine learning with fuzzy membership
functions is that we need no strong assumption about
the underlying distribution of the RSSI readings over a given distance.
204:
One observation from the experiment is that the
readings for a fixed distance might not always be
normally distributed, as is assumed in \cite{MLRT04,NN04}.
We ran a $\chi^2$ goodness-of-fit test on the two
distributions. For 2-hop readings, the hypothesis that
the distribution is normal cannot be rejected at a
statistically significant level. For
1-hop readings, the test yields $\chi^2 = 30.11$ with
15 degrees of freedom. The critical value at $p=0.01$ is
30.58, so the statistic falls just short of that level,
but it exceeds the critical value of 27.49 at $p=0.025$;
we can therefore reject the hypothesis that the distribution
of 1-hop readings is normal at the 0.025 level.
218:
Similar to $f_{avg}$, the fuzzy membership function
$f_{rel}$ describes how much an average RSSI reading $x$
is ``relatively like'' a 1-hop reading, judged against all average RSSI
readings for the same sender. Let $s$ denote the sender of the messages and
$R_s$ the set of nodes that report
an average RSSI reading for $s$. If $|R_s|<2$, then
$f_{rel}$ is always 1. When $|R_s|$ is at least 2, let $max_s$
denote the mean of the two strongest average RSSI readings
reported by $R_s$.
228:
229: \[f_{rel}(x) = \left\{\begin{array}{ll}
230: 1 & x < 1.05max_s\\
231: 1-\frac{x-1.05 max_s}{0.1 max_s} & 1.05max_s \leq x < 1.15 max_s \\
232: 0 & x \geq 1.15max_s
233: \end{array}
234: \right. \]
235:
236: If a receiver
237: receives most of the messages in the message group
238: from a sender, the average RSSI reading is more reliable
239: than that from just a few messages. Meanwhile, from the training
240: data we collected, we observe that 1-hop neighbors generally receive
241: more messages than the 2-hop neighbors. Also with
242: reduced power of radio signals and a message rate of 10 messages/s,
243: we have not observed obvious message collisions, so it is not necessary
244: to schedule the messages.
245:
The function $f_{num}$ determines whether an
average RSSI reading $x$ is ``like in volume'' to a 1-hop reading by
checking the number of messages on which $x$ is based.
Let $x_{num}$ denote this number, $s$ the sender
of the messages, and $most_s$ the largest number of
messages from $s$ received by any single receiver.
253:
254: \[f_{num}(x) = \left\{\begin{array}{ll}
255: 0 & x_{num} < 0.65most_s\\
\frac{x_{num}-0.65 most_s}{0.25 most_s} & 0.65most_s \leq x_{num} < 0.9 most_s \\
257: 1 & x_{num} \geq 0.9most_s
258: \end{array}
259: \right. \]
260:
After evaluating the fuzzy membership functions for the
different attributes, we use a fuzzy rule to obtain a
score for each sender/receiver pair. Based on the training data
we collected, we chose the following rule for judging whether the average
reading of a sender/receiver pair is like a 1-hop reading:
it is ``numerically like'', or it is ``relatively like'' and very ``like
in volume''. The score is therefore calculated as follows.

\[score(x) = \max(f_{avg}(x),\min(f_{rel}(x),f_{num}^2(x)))\]
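
The three membership functions and the scoring rule can be put together in a short sketch. The function signatures, the use of \texttt{None} to signal $|R_s|<2$, and the sample inputs are assumptions for illustration; the default thresholds are the values of $a$ and $b$ from our training data.

```python
def f_avg(x, a=343, b=361):
    """'Numerically like' a 1-hop reading."""
    if x < a:
        return 1.0
    if x < b:
        return (b - x) / (b - a)
    return 0.0


def f_rel(x, max_s):
    """'Relatively like' a 1-hop reading; max_s is None when |R_s| < 2."""
    if max_s is None:
        return 1.0
    lo, hi = 1.05 * max_s, 1.15 * max_s
    if x < lo:
        return 1.0
    if x < hi:
        return 1.0 - (x - lo) / (0.1 * max_s)
    return 0.0


def f_num(x_num, most_s):
    """'Like in volume': based on how many messages of the group were received."""
    lo, hi = 0.65 * most_s, 0.9 * most_s
    if x_num < lo:
        return 0.0
    if x_num < hi:
        return (x_num - lo) / (0.25 * most_s)
    return 1.0


def score(x, x_num, max_s, most_s):
    """'Numerically like', or 'relatively like' and very 'like in volume'."""
    return max(f_avg(x), min(f_rel(x, max_s), f_num(x_num, most_s) ** 2))


# Made-up inputs: average reading 365 from 28 of 30 messages, max_s = 340.
print(round(score(365, 28, 340, 30), 3))
```

In this made-up case $f_{avg}$ is 0, so the score comes entirely from the second branch of the rule.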
270:
271: \subsection{Table Look-up and Refinement}
272:
After establishing the one-hop neighborhood of every sensor node, the
algorithm constructs shortest paths from each node to all the anchors
in the segment.
It thus obtains a 4-tuple for each sensor node $q$, consisting of the
grid distances between $q$ and the anchors. Next, a table lookup
is attempted: if the tuple matches the table entry of an
unoccupied position $p$, then
the algorithm assigns sensor node $q$ to position $p$. Due to inaccuracies
in the one-hop neighborhood determination, the lookup can fail to
match any table entry. To deal with lookup misses, we propose the
following refinement: for each unoccupied position $p$,
calculate a score for each remaining node. If a node $q$ has the highest score
among the remaining nodes and the score is above a threshold $T$, assign
$q$ to position $p$.
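
The lookup and refinement steps can be sketched as below. The refinement score is not specified above, so the \texttt{similarity} function (inverse of one plus the $L_1$ difference between tuples) is a hypothetical placeholder, as are the function names, the threshold value, and the sample data.

```python
def similarity(t1, t2):
    """Hypothetical refinement score: 1 for identical tuples, lower as they differ."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(t1, t2)))


def assign(table, node_tuples, threshold, score=similarity):
    """Assign nodes to grid positions: exact lookup first, then refinement."""
    by_tuple = {tup: pos for pos, tup in table.items()}  # assumes unique tuples
    placed = {}      # position -> node
    remaining = []   # nodes that missed the lookup or lost a duplicate match
    for node, tup in node_tuples.items():
        pos = by_tuple.get(tup)
        if pos is not None and pos not in placed:
            placed[pos] = node  # the first node looked up wins the position
        else:
            remaining.append(node)
    for pos, tup in table.items():
        if pos in placed or not remaining:
            continue
        best = max(remaining, key=lambda n: score(node_tuples[n], tup))
        if score(node_tuples[best], tup) > threshold:
            placed[pos] = best
            remaining.remove(best)
    return placed


# Made-up table fragment and node tuples.
table = {(0, 0): (0, 3, 5, 8), (1, 0): (1, 2, 4, 7), (0, 1): (1, 4, 4, 7)}
nodes = {"n1": (0, 3, 5, 8), "n2": (1, 2, 4, 6), "n3": (9, 9, 9, 9)}
placed = assign(table, nodes, threshold=0.4)
print(placed)
```

Here \texttt{n1} matches exactly, \texttt{n2} is placed by refinement, and \texttt{n3} stays unassigned because its best score is below the threshold.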
287:
Here the algorithm assigns nodes to positions instead of
assigning positions to nodes. This makes little difference within one
segment; in a large-scale network, however,
the base station can receive readings from nodes in adjacent segments.
The base station is then likely to have more nodes than positions,
so assigning nodes to positions is more likely to yield a better result.
294:
295: Note that it is possible that two or more nodes have tuples that
296: match the same table entry due to inaccurate RSSI readings.
297: Using our algorithm, only one will be assigned
298: to that grid position depending on which one is the first
299: to be looked up in the table. (Of course, RSSI ranging errors lower
300: the quality of our solution, as they would any RSSI-based
301: solution to localization.)
302:
303: It is useful to observe that our decision to match nodes to grid
304: positions, that is, to find a bijection between nodes and grid points,
305: may not be appropriate for some applications. Indeed, when two
306: nodes have the same table lookup results, it can be argued that both
307: should receive the same grid position. The general question of what
308: is a good metric for applications depending on localization quality
309: is outside the scope of our research.
310:
311: \subsection{Distributed Implementation within a Segment}
312: From our description of the algorithm, it is not hard to change the
313: implementation to be fully distributed. In the centralized implementation,
314: the sensor nodes just need to send out a group of messages and forward
315: statistics of the RSSI readings to the base station.
316: There is no message exchange between the sensor nodes.
317: After forwarding the statistics, the sensor nodes will just wait for
318: a grid position assignment.
319:
In a fully distributed implementation, each sensor node sends the
statistics of the RSSI readings to the sender of the RSSI messages.
Prior to deployment, each node is programmed with the localization
table as a read-only constant (with current technology, nodes have far
more read-only memory than working RAM).
After a sender receives the statistics from its receivers, it uses the fuzzy
membership functions to assign a score to each receiver and determine
which receivers are its one-hop neighbors. In stage 3, BFS spanning trees
rooted at the anchors can be constructed using a distributed algorithm
(which could be similar to routing protocols that construct spanning trees).
The depth of a node $q$ in the tree rooted at anchor $a_i$ is the distance
between $q$ and $a_i$. Each node $q$ then looks up its 4-tuple
in the table. If there is a match, $q$ takes the matching position; if not,
$q$ assigns a score to every position and picks the position with the highest score.
The difference between the centralized and distributed implementations in
stage 3 is that the distributed implementation assigns positions to nodes, so
two nodes may conclude that they are at the same position (as
noted previously, this is acceptable for some applications).
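
The depth computation can be illustrated with an ordinary breadth-first search over the one-hop neighbor graph. This is a centralized simulation under assumed names and a toy topology, not a distributed protocol; a distributed version would exchange messages between neighbors instead of sharing one adjacency map.

```python
from collections import deque


def hop_counts(neighbors, root):
    """BFS from root over the one-hop neighbor graph; returns {node: depth}."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors.get(u, ()):
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth


def node_tuples(neighbors, anchors):
    """Tuple of BFS depths to every anchor; unreachable nodes are skipped."""
    per_anchor = [hop_counts(neighbors, a) for a in anchors]
    return {
        q: tuple(d[q] for d in per_anchor)
        for q in neighbors
        if all(q in d for d in per_anchor)
    }


# Toy topology: four nodes in a line, with two anchors for brevity.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
tuples = node_tuples(line, anchors=[0, 3])
print(tuples)
```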
338:
A large-scale grid wireless sensor network
can be heterogeneous, enabling faster communication and data processing.
Some sensor nodes are more powerful and have a larger communication range.
These nodes form the backbone of the network,
so routing to the base station needs fewer hops. GPS could be installed
on these nodes as well, making them ideal anchors. If they
have much more computing power than the other sensor nodes, the
algorithm could run on them. On the other hand,
if the network has no such powerful nodes, the
distributed implementation is more desirable. Another advantage
of the distributed implementation is that it needs no multi-hop messaging
or flooding, which reduces radio traffic.
351: