1: \section{Localization Algorithm and Implementation} 
2: \label{sec-algorithm}
Our localization algorithm is based on table lookup and proceeds in
three stages. The first stage calculates a table with one entry per
grid position. The entry for a grid position $p$ is a 4-tuple
consisting of the grid distances between $p$ and four anchors.
At least three anchors are necessary for localization; we found four
anchors to be an improvement over three, and the scheme generalizes
easily to a larger number of anchors. The second stage establishes the
one-hop neighborhood of each sensor node and, for each node $q$,
calculates a 4-tuple consisting of the grid distances between $q$ and
the anchors using the one-hop neighborhood information. The third stage
performs the table lookup and then refines the result of the lookup.
The algorithm is shown in Figure~\ref{top-level}.
18: 
19: \begin{figure}[ht]
20: \centering
21: \begin{tabbing}
22: xxx \= xxxxxxxx \= xxx \= xxx \= \kill
23: \>\bf\em{stage 1: }\\
24: \>
25: 
26: calculate grid distance table for the grid \\ \>\>positions\\
27: \\
28: \>\bf\em{stage 2:}\\
29: \>
30: each node $p$ sends a group of 30 messages\\
31: \>
32: each node forwards all the RSSI readings to\\
33: \>\> the base station \\
34: \>
35: for each sender $p$, base station calculates \\
36: \>\>a score for all receivers \\
37: \>
38: decide one-hop neighborhood for all nodes \\
39: \\
40: 
\>\bf\em{stage 3:}\\
\>calculate distance between each node $q$ \\
\>\>and the anchors\\
\>for each node $q$, look up in the table for\\
45: \>\> a match  \\
46: \>for each unoccupied position, find the \\
47: \>\> most likely node\\
48: \>send out the assignments
49: 
50: \end{tabbing}
51: \caption{localization algorithm}
52: \label{top-level}
53: \end{figure}
54: 
The implementation of this algorithm within one segment can be
either centralized or distributed; we describe the centralized
implementation first. Each stage can also be implemented in different
ways. For example, in the second stage, different ranging techniques
can be used to help establish the one-hop neighborhood.
60: 
61: \subsection{Table Establishment and Anchors}
62: Using formula (\ref{eqn-1}), defined in Section \ref{prelim}, 
63: it is simple to calculate the distances from any grid position $p$ 
64: to the anchors;  therefore when the grid is designed or deployed the 
65: table can be calculated and disseminated as needed for the 
66: subsequent stages.
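
The table construction can be sketched as follows. This is a minimal
illustration under stated assumptions: the helper names
\texttt{manhattan} and \texttt{build\_table} and the $4 \times 3$ segment
are hypothetical, and the Manhattan distance is only a stand-in for the
grid distance of formula (\ref{eqn-1}), which should be substituted for it.

```python
from itertools import product

def manhattan(p, q):
    """Stand-in grid distance (assumption); replace with formula (eqn-1)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def build_table(width, height, anchors, dist=manhattan):
    """Map each grid position to its tuple of grid distances to the anchors."""
    return {
        (x, y): tuple(dist((x, y), a) for a in anchors)
        for x, y in product(range(width), range(height))
    }

# Hypothetical 4 x 3 segment with the four anchors at its corners.
anchors = [(0, 0), (3, 0), (0, 2), (3, 2)]
table = build_table(4, 3, anchors)
print(table[(1, 1)])  # (2, 3, 2, 3)
```

Because the table depends only on the grid layout and the anchor
positions, it can be computed once and reused by the later stages.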
67: 
For table lookup to be unambiguous, no two tuples in the table may be
identical (otherwise two different grid points would have the same
lookup). This constrains anchor placement. If the anchors are placed in
a line, the segment is highly symmetric and different positions have
identical tuples, which must be avoided. In our experiments, we set up
the segment as in \cite{BKA05} and found that if the four anchors
form a parallelogram, then no table entries are identical. We
simulated different parallelogram placements in one segment and found
that performance is slightly better when the anchors are placed close
to the segment border. Similar observations have been made for
other localization schemes \cite{SGAMS03}.
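
The uniqueness requirement can be checked mechanically for a candidate
anchor placement. The sketch below is illustrative only: the function
names, the $5 \times 5$ segment, the anchor coordinates, and the use of
Manhattan distance as the grid distance are all assumptions, not the
setup of \cite{BKA05}.

```python
from collections import defaultdict
from itertools import product

def grid_distance(p, q):
    """Stand-in grid distance (assumption): Manhattan distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def ambiguous_tuples(width, height, anchors):
    """Return {tuple: positions} for every tuple shared by 2+ positions."""
    seen = defaultdict(list)
    for pos in product(range(width), range(height)):
        seen[tuple(grid_distance(pos, a) for a in anchors)].append(pos)
    return {t: ps for t, ps in seen.items() if len(ps) > 1}

# Hypothetical 5 x 5 segment.
in_a_line = [(0, 2), (1, 2), (3, 2), (4, 2)]      # all anchors on row y = 2
parallelogram = [(0, 0), (4, 0), (0, 4), (4, 4)]  # anchors at the corners

print(len(ambiguous_tuples(5, 5, in_a_line)) > 0)  # True
print(ambiguous_tuples(5, 5, parallelogram))       # {}
```

For the collinear placement, positions mirrored across the anchor row,
such as $(0,1)$ and $(0,3)$, receive the same tuple, whereas this
particular corner placement yields no duplicates.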
80: 
81: 
82: \subsection{One-hop Neighborhood using RSSI}
83: 
We use the Received Signal Strength Indicator (RSSI) for distance
estimation in our implementation to obtain, for each node, an estimate
of its one-hop neighborhood. This task is simplified by our design of
the grid, which defines the one-hop distance between neighbors
to be about nine meters. We assume that the deployment error
for each node is within a defined tolerance (see Section~\ref{sec-experiment}).
90: 
RSSI ranging works as follows. A sensor node sends out a radio message
with a certain signal strength, and one field of the message records
the transmitted signal strength. The receiver of the message measures
the signal strength of the received message. Given a model of how
signal strength decreases with distance (in our case obtained
empirically), the original and received signal strengths can be
compared and the distance between the sender and receiver can be
estimated. The advantage of RSSI ranging is that it is very simple and
needs no additional hardware and little computing power. The
disadvantage is that its accuracy is poor~\cite{AS04}; radio signal
strength can be affected by the environment, the relative angles of
emplacement of a pair of sensors, and manufacturing variances in radio
devices.
104: 
For RSSI ranging in our implementation, we used a group of messages to
deal with variation (we did not use advanced filters in our
experiments, preferring the simplest of techniques). In our
experiments, we noticed that even with a group size of 30, the mean
RSSI reading over the same distance varies considerably, as has been
previously observed for similar hardware \cite{WC02}. Put another way,
the mean received signal strength readings can be the same over
different distances. To reduce the inaccuracy of RSSI ranging and
obtain better one-hop neighborhood information, we do not estimate the
actual distance between a single sender/receiver pair; instead, we
forward statistics of the RSSI readings to the base station.
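
The kind of per-sender statistics a receiver might forward can be
sketched as follows. The function name \texttt{summarize}, the choice of
fields, and the sample readings are hypothetical; the sketch assumes that
at least one message of the group was received.

```python
from statistics import mean

def summarize(readings):
    """Summarize the RSSI readings one receiver collected from one sender.

    Assumes `readings` is non-empty (at least one message was received).
    """
    return {
        "count": len(readings),   # number of received messages
        "mean": mean(readings),   # average RSSI reading
        "min": min(readings),
        "max": max(readings),
    }

# Hypothetical readings from the messages of one group that arrived.
stats = summarize([350, 342, 347, 345])
print(stats["count"], stats["min"], stats["max"])  # 4 342 350
```

Forwarding such a summary rather than every raw reading keeps the
amount of data sent to the base station independent of the group size.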
118:  
The experience of other researchers \cite{WC02,PH03,ZHKS04}
suggests that there may be no elementary model of how
signal strength behaves as a function of distance.
Therefore, instead of trying to find an analytical model for
RSSI, we used a machine learning approach in our investigation.
Machine learning for RSSI-based localization was previously used
in \cite{ELM04}, where Bayesian networks were trained.
126: 
We use fuzzy membership functions \cite{D90} to calculate a score for
each receiver from the average RSSI reading, the number of received
messages, the maximum and minimum RSSI readings, and the rank of the
RSSI readings among all receivers. Each property is assigned one fuzzy
membership function. Following the classification in \cite{D90}, we
choose the membership functions based on reliability concerns with
respect to the particular problem. In our implementation, even very
simple function forms (linear and triangular functions) are sufficient
for this part. Finally, we decide the one-hop neighborhood of the
sender by checking the scores of the receivers. The score is in fact
another fuzzy membership function. For this function, we use the
{\em Distance Approach}, one of the {\em semantic approaches} to fuzzy
membership function design. This approach concentrates on the
practical meaning and interpretation of membership and is well suited
to multi-attribute decision making \cite{LH}. It considers
requirements on different attributes and assigns operations for fuzzy
logic. For instance, let $f_A$ and $f_B$ be fuzzy membership functions
for attributes $A$ and $B$; then the fuzzy membership function for
``$A$ and $B$'' is $\min(f_A,f_B)$, and that for ``very $A$'' is
$f_A^2$. We can thus translate empirical rules such as ``if very $A$
and $B$, then $C$ is likely'' into functions.
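
As a worked illustration with hypothetical membership values
$f_A = 0.7$ and $f_B = 0.6$, the premise ``very $A$ and $B$'' evaluates to
\[\min(f_A^2, f_B) = \min(0.49, 0.6) = 0.49 .\]
Squaring $f_A$ makes ``very $A$'' a stricter requirement than ``$A$''
alone, since $f_A^2 \leq f_A$ for any membership value in $[0,1]$. If, as
one simple reading of the rule, the strength of the premise is taken as
the likelihood of $C$, then $C$ would receive a value of $0.49$ here.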
151: 
After choosing the forms of the fuzzy membership functions, we need
to decide their parameters. We conducted extensive calibration tests
with different node spacings, relative angles, weather, and terrain
conditions. From the training data collected in these tests, we
determined the parameters that best distinguish 1-hop neighbors from
more distant neighbors. Section~\ref{sec-analysis} shows how well
these functions work.
161: 
We give an example here of how to choose parameters for
fuzzy membership functions. The fuzzy membership function $f_{avg}$,
which expresses how ``numerically like'' a 1-hop reading an average
RSSI reading $x$ is, is a piecewise combination of three simple linear
functions:
166: 
167: \[f_{avg}(x) = \left\{\begin{array}{ll}
168: 1 & x<a\\
169: \frac{b-x}{b-a} & a \leq x < b \\
170: 0 & x \geq b
171: \end{array}
172: \right. \]
173: 
In our tests, a smaller distance generally yields a smaller reading,
so a smaller reading is more likely to be a 1-hop reading.
This is why we chose a function of this form.

Here, $a$ and $b$ are the parameters of the fuzzy membership function,
which we need to determine from our data. We want to choose them as
thresholds such that only a small fraction of the 2-hop readings are
smaller than $a$ and most of the 1-hop readings are smaller than $b$.
We therefore chose the $10^{th}$ percentile of the average readings
over 2-hop distances for $a$, and the $95^{th}$ percentile of the
average readings over 1-hop distances for $b$.
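
The percentile rule can be sketched as follows. The nearest-rank
definition of a percentile, the function names, and the training values
are assumptions made for illustration; the values are not measured data.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def choose_thresholds(one_hop_means, two_hop_means):
    """a: 10th percentile of 2-hop means; b: 95th percentile of 1-hop means."""
    return percentile(two_hop_means, 10), percentile(one_hop_means, 95)

# Hypothetical training data (mean RSSI reading per message group).
one_hop = [330, 335, 338, 341, 344, 347, 350, 352, 355, 360]
two_hop = [345, 352, 356, 359, 362, 365, 368, 371, 375, 380]
a, b = choose_thresholds(one_hop, two_hop)
print(a, b)  # 345 360
```

The rule only yields a usable ramp when $a < b$, that is, when the
1-hop and 2-hop distributions overlap as they do in this toy data.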
187: 
188: 
189: 
190: \begin{figure}[ht]
191: \centering
192: \includegraphics {distribution.eps}
193: \caption{distribution for 1-hop and 2-hop readings}
194: \label{distributionChart}
195: \end{figure}
196: 
Figure~\ref{distributionChart} shows the distributions
of 1-hop and 2-hop readings. For our training data, this yields
$a = 343$ and $b = 361$.
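
With these parameters, $f_{avg}$ can be written directly as a small
function. This is a sketch; the default arguments are the values of $a$
and $b$ quoted above, and the sample inputs are hypothetical.

```python
def f_avg(x, a=343, b=361):
    """Membership of average reading x in "numerically like a 1-hop reading"."""
    if x < a:
        return 1.0
    if x < b:
        return (b - x) / (b - a)  # linear ramp from 1 at a down to 0 at b
    return 0.0

print(f_avg(340), f_avg(352), f_avg(365))  # 1.0 0.5 0.0
```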
200: 
An advantage of using machine learning with fuzzy membership
functions is that we make no strong assumption about
the underlying distribution of the RSSI readings over a given distance.
204: 
Running a $\chi^2$ goodness-of-fit test on these two distributions, we
observed that the reading distribution for a fixed distance is not
always normal, as is assumed in \cite{MLRT04,NN04}. For 2-hop readings,
the hypothesis that the distribution is normal cannot be rejected at a
statistically significant level. For 1-hop readings, however, the test
yields $\chi^2 = 30.11$ with 15 degrees of freedom. The critical value
for $p=0.01$ is 30.58, so the statistic falls just short of the $0.01$
level, but it exceeds the critical value of 27.49 for $p=0.025$; we can
therefore reject the hypothesis that the distribution of 1-hop
readings is normal at the $0.025$ level.
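
For reference, the statistic used in such a goodness-of-fit test can be
computed as sketched below. The bin counts are hypothetical and serve
only to show the arithmetic; in practice the expected counts would come
from a normal distribution fitted to the readings, and the result would
be compared with the tabulated critical value for the appropriate
degrees of freedom.

```python
def chi_square(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical bin counts, only to show the arithmetic.
stat = chi_square([10, 20, 30], [20, 20, 20])
print(stat)  # 10.0
```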
218: 
Similar to $f_{avg}$, the fuzzy membership function $f_{rel}$
describes how ``relatively like'' a 1-hop reading an average RSSI
reading $x$ is, by comparing it with all average RSSI readings
for the same sender. Let $s$ denote the sender of the messages, and
let $R_s$ denote the set of nodes that report
an average RSSI reading for $s$. If $|R_s|<2$,
$f_{rel}$ is always 1. When $|R_s|$ is at least 2, let $max_s$
denote the mean of the two strongest average RSSI readings
reported by $R_s$.
228: 
229: \[f_{rel}(x) = \left\{\begin{array}{ll}
230: 1 &  x < 1.05max_s\\
231: 1-\frac{x-1.05 max_s}{0.1 max_s} & 1.05max_s \leq x < 1.15 max_s \\
232: 0 & x \geq 1.15max_s
233: \end{array}
234: \right. \]
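
A sketch of $f_{rel}$ follows. It assumes, in line with the earlier
observation that smaller distances give smaller readings, that the two
strongest readings are the two numerically smallest ones; the function
signature and the sample readings are hypothetical.

```python
def f_rel(x, reported):
    """Membership of average reading x in "relatively like a 1-hop reading".

    reported: average readings reported for the same sender (the set R_s).
    """
    if len(reported) < 2:
        return 1.0
    # Assumption: a smaller reading is a stronger signal, so the two
    # strongest readings are the two smallest values.
    max_s = sum(sorted(reported)[:2]) / 2
    lo, hi = 1.05 * max_s, 1.15 * max_s
    if x < lo:
        return 1.0
    if x < hi:
        return 1.0 - (x - lo) / (0.1 * max_s)
    return 0.0

readings = [338, 342, 380, 395]  # hypothetical; max_s = 340
print(f_rel(340, readings), f_rel(395, readings))  # 1.0 0.0
```

With these readings the ramp runs from about 357 to 391, so a reading
of 374 lies halfway along it and receives a membership of about 0.5.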
235: 
If a receiver receives most of the messages in the message group
from a sender, the average RSSI reading is more reliable
than one based on just a few messages. Moreover, in the training
data we collected, 1-hop neighbors generally receive
more messages than 2-hop neighbors. In addition, with
reduced radio power and a message rate of 10 messages/s,
we did not observe obvious message collisions, so it is not necessary
to schedule the messages.
245: 
The function $f_{num}$ determines how ``like in volume'' to a 1-hop
reading an average RSSI reading $x$ is, by
checking the number of messages on which $x$ is based.
Let $x_{num}$ denote this number, let $s$ denote the sender
of the messages, and let $most_s$ denote
the largest number of messages from $s$ received by
any single receiver.
253: 
254: \[f_{num}(x) = \left\{\begin{array}{ll}
255: 0 &  x_{num} < 0.65most_s\\
\frac{x_{num}-0.65 most_s}{0.25 most_s} & 0.65most_s \leq x_{num} < 0.9 most_s \\
257: 1 & x_{num} \geq 0.9most_s
258: \end{array}
259: \right. \]
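
A sketch of $f_{num}$ follows, taking the argument of the middle branch
to be the message count $x_{num}$, as in the branch conditions. The
function signature and the sample counts are hypothetical.

```python
def f_num(x_num, most_s):
    """Membership of a reading based on x_num messages in "like in volume".

    most_s: largest number of messages from the sender received by any
    single receiver.
    """
    if x_num < 0.65 * most_s:
        return 0.0
    if x_num < 0.9 * most_s:
        return (x_num - 0.65 * most_s) / (0.25 * most_s)
    return 1.0

# Hypothetical counts with most_s = 20: the ramp runs from 13 to 18.
print(f_num(12, 20), f_num(19, 20))  # 0.0 1.0
```

A receiver that got 15 of at most 20 messages sits two fifths of the way
up the ramp, giving a membership of about 0.4.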
260: 
After calculating the fuzzy membership functions for the
different attributes, we use a fuzzy rule to obtain a
score for each sender/receiver pair. Based on the training data
we collected, we chose the following rule for judging whether an
average reading of a sender/receiver pair is like a 1-hop reading:
it is ``numerically like'', or it is ``relatively like'' and very
``like in volume''. Interpreting ``or'' as $\max$, the score is
calculated as follows.
268: 
\[score(x) = \max(f_{avg}(x),\min(f_{rel}(x),f_{num}^2(x)))\]
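
The scoring rule can be sketched as follows, taking the three membership
values as inputs. The cut-off used to turn scores into a neighborhood
decision (\texttt{threshold}, here 0.5), the function
\texttt{one\_hop\_neighbors}, and the sample values are hypothetical; the
text above does not fix how the scores are checked.

```python
def score(avg, rel, num):
    """Combine membership values: avg OR (rel AND very num)."""
    return max(avg, min(rel, num ** 2))

def one_hop_neighbors(scores, threshold=0.5):
    """Receivers whose score reaches the (hypothetical) threshold."""
    return sorted(r for r, s in scores.items() if s >= threshold)

scores = {
    "n1": score(0.9, 0.2, 0.1),  # numerically like a 1-hop reading
    "n2": score(0.1, 1.0, 0.9),  # relatively like and very like in volume
    "n3": score(0.1, 1.0, 0.5),  # too few messages: 0.5 ** 2 = 0.25
}
print(one_hop_neighbors(scores))  # ['n1', 'n2']
```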
270: 
271: \subsection{Table Look-up and Refinement}
272: 
After establishing the one-hop neighborhood of every sensor node, the
algorithm constructs shortest paths from each node to all the anchors
in the segment.
This gives a 4-tuple for each sensor node $q$, consisting of the
grid distances between $q$ and the anchors. Next, a table lookup
is attempted: if the tuple matches the table entry of an
unoccupied position $p$, the algorithm assigns sensor node $q$ to
position $p$. Due to inaccuracies in the one-hop neighborhood
determination, the lookup can fail to match any table entry. To deal
with lookup misses, we propose the following refinement: for each
unoccupied position $p$, calculate a score for each remaining node.
If a node $q$ has the highest score among the remaining nodes and the
score is above a threshold $T$, assign $q$ to position $p$.
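
The lookup and refinement steps can be sketched as follows. The scoring
function used for refinement is not specified here, so the sketch takes
it as a parameter; \texttt{closeness}, the position and node names, and
the tuples are hypothetical stand-ins, and table tuples are assumed to be
unique.

```python
def assign(table, node_tuples, score, T):
    """Assign nodes to grid positions: exact lookup first, then refinement."""
    by_tuple = {t: p for p, t in table.items()}  # tuples assumed unique
    assignment = {}                               # position -> node
    remaining = []
    for node, t in node_tuples.items():
        pos = by_tuple.get(t)
        if pos is not None and pos not in assignment:
            assignment[pos] = node                # first match wins
        else:
            remaining.append(node)
    for pos in table:                             # refinement for misses
        if pos in assignment or not remaining:
            continue
        best = max(remaining, key=lambda n: score(node_tuples[n], table[pos]))
        if score(node_tuples[best], table[pos]) > T:
            assignment[pos] = best
            remaining.remove(best)
    return assignment

def closeness(t, entry):
    """Hypothetical refinement score: higher when the tuples differ less."""
    return 1.0 / (1.0 + sum(abs(u - v) for u, v in zip(t, entry)))

table = {"p1": (0, 3, 2, 5), "p2": (1, 2, 3, 4), "p3": (2, 1, 4, 3)}
nodes = {"n1": (0, 3, 2, 5), "n2": (1, 2, 3, 4), "n3": (2, 1, 5, 3)}
print(assign(table, nodes, closeness, T=0.4))
# {'p1': 'n1', 'p2': 'n2', 'p3': 'n3'}
```

Here \texttt{n3} misses the exact lookup by one hop on its third
coordinate, and the refinement places it at the only unoccupied
position because its score of 0.5 exceeds $T$.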
287: 
Here the algorithm assigns nodes to positions instead of
assigning positions to nodes. This makes little difference within one
segment; in a large-scale network, however,
the base station can receive readings from nodes in adjacent segments.
It then becomes likely that the base station has more nodes than positions,
so assigning nodes to positions is more likely to yield a better result.
294: 
Note that it is possible for two or more nodes to have tuples that
match the same table entry due to inaccurate RSSI readings.
With our algorithm, only one of them is assigned
to that grid position, namely the one that is looked up
first in the table. (Of course, RSSI ranging errors lower
the quality of our solution, as they would any RSSI-based
solution to localization.)
302: 
303: It is useful to observe that our decision to match nodes to grid 
304: positions, that is, to find a bijection between nodes and grid points, 
305: may not be appropriate for some applications.  Indeed, when two 
306: nodes have the same table lookup results, it can be argued that both
307: should receive the same grid position.  The general question of what
308: is a good metric for applications depending on localization quality 
309: is outside the scope of our research.
310: 
311: \subsection{Distributed Implementation within a Segment}
312: From our description of the algorithm, it is not hard to change the 
313: implementation to be fully distributed. In the centralized implementation,
314: the sensor nodes just need to send out a group of messages and forward
315: statistics of the RSSI readings to the base station. 
316: There is no message exchange between the sensor nodes.
317: After forwarding the statistics, the sensor nodes will just wait for
318: a grid position assignment.
319: 
In a fully distributed implementation, each sensor node sends the
statistics of the RSSI readings to the sender of the RSSI messages.
Prior to deployment, each node is programmed with the localization
table as a read-only constant (in current technology, there is far
more read-only memory than working RAM).
After a sender receives the statistics from the receivers, it uses the
fuzzy membership functions to assign a score to each receiver and
determines which receivers are its one-hop neighbors. In stage 3, BFS
spanning trees rooted at the anchors can be constructed using a
distributed algorithm (which could be similar to routing protocols
that construct spanning trees).
The depth of a node $q$ in the tree rooted at anchor $a_i$ is the distance
between $q$ and $a_i$. Each node $q$ then looks up its 4-tuple
in the table. If there is a match, $q$ takes the matching position;
if not, $q$ assigns a score to every position and picks the position
with the highest score.
The difference between the centralized and distributed implementations in
stage 3 is that the distributed implementation assigns positions to
nodes, so it is possible for two nodes to conclude that they are at the
same position (as noted previously, this is acceptable for some
applications).
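
The relation between BFS depth and grid distance can be illustrated with
the sketch below. It is a centralized simulation of what the distributed
tree construction would compute, not a distributed protocol; the function
names and the four-node topology are hypothetical, and only two anchors
are used to keep the example short.

```python
from collections import deque

def hop_counts(neighbors, root):
    """BFS depth of every reachable node in the tree rooted at `root`."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth

def distance_tuples(neighbors, anchors):
    """Tuple of BFS depths to each anchor, for nodes reachable from all."""
    depths = [hop_counts(neighbors, a) for a in anchors]
    return {
        q: tuple(d[q] for d in depths)
        for q in neighbors
        if all(q in d for d in depths)
    }

# Hypothetical one-hop neighborhoods on a line a1 - q1 - q2 - a2.
neighbors = {"a1": ["q1"], "q1": ["a1", "q2"], "q2": ["q1", "a2"], "a2": ["q2"]}
print(distance_tuples(neighbors, ["a1", "a2"])["q1"])  # (1, 2)
```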
338: 
A large-scale grid wireless sensor network can be heterogeneous,
which enables faster communication and data processing:
some sensor nodes are more powerful and have a larger communication range.
These nodes form the backbone of the network,
so routing to the base station needs fewer hops. GPS could be installed
at these nodes as well, making them ideal as anchors. If they
have much more computing power than the other sensor nodes, the
algorithm could run on these nodes. On the other hand,
if there are no such powerful nodes in the network, the
distributed implementation is more desirable. Another advantage
of the distributed implementation is that no multi-hop messaging
or flooding is needed, which reduces radio traffic.
351: