1: \section{Localization Algorithm and Implementation}

2: \label{sec-algorithm}

Our localization algorithm is based on a simple table lookup.
The entire procedure has three stages.

5: The first stage is to calculate

6: a table, where each grid position has an entry. The entry for a

7: grid position $p$ is a 4-tuple, consisting of grid distances between

8: $p$ and four anchors.  At least three anchors are necessary for

9: localization;  we found four anchors to be an improvement over

10: three anchors, and the scheme can easily be generalized to a

11: larger number of anchors.  The second stage establishes the

12: one-hop neighborhood for each sensor node, and for each

13: node $q$, calculates a 4-tuple, consisting of grid distances

14: between $q$ and the anchors using the

15: one-hop neighborhood information. The third stage performs the table lookup

16: and then refines the result of the lookup.

Our algorithm is shown in Figure~\ref{top-level}.

18:

\begin{figure}[ht]
\centering
\begin{tabbing}
xxx \= xxxxxxxx \= xxx \= xxx \= \kill
\>{\bf\em stage 1:}\\
\>calculate grid distance table for the grid \\
\>\>positions\\
\\
\>{\bf\em stage 2:}\\
\>each node $p$ sends a group of 30 messages\\
\>each node forwards all the RSSI readings to\\
\>\>the base station \\
\>for each sender $p$, base station calculates \\
\>\>a score for all receivers \\
\>decide one-hop neighborhood for all nodes \\
\\
\>{\bf\em stage 3:}\\
\>calculate distance between each node $q$ \\
\>\>and the anchors\\
\>for each node $q$, look up in the table for\\
\>\>a match \\
\>for each unoccupied position, find the \\
\>\>most likely node\\
\>send out the assignments
\end{tabbing}
\caption{Localization algorithm}
\label{top-level}
\end{figure}

54:

The implementation of this algorithm in one segment can be
either centralized or distributed. We describe the centralized
implementation first. Each stage of the algorithm can also be implemented
in different ways; for example, in the second stage, different ranging
techniques can be used to help establish the one-hop neighborhood.

60:

61: \subsection{Table Establishment and Anchors}

62: Using formula (\ref{eqn-1}), defined in Section \ref{prelim},

63: it is simple to calculate the distances from any grid position $p$

64: to the anchors;  therefore when the grid is designed or deployed the

65: table can be calculated and disseminated as needed for the

66: subsequent stages.

67:

68: In order for table lookup to be unambiguous, no pair of tuples

69: in the table should be identical (otherwise two different grid points could

70: have the same lookup). This constrains anchor placement.

If the anchors are placed in a line, the segment is highly symmetric
and different positions have identical tuples, which must
be avoided. In our experiment, we set up the segment as in \cite{BKA05}
and found that if the four anchors
form a parallelogram, then no table entries are identical. We have
simulated different parallelogram placements in one segment and
found that the performance is slightly better when the anchors are
placed close to the segment border. Similar observations have been
made for other localization schemes \cite{SGAMS03}.
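
The uniqueness requirement on the table can be checked mechanically once a
placement is chosen. The following is a minimal sketch, not our
implementation: \texttt{grid\_distance} is a hypothetical stand-in
(Manhattan distance) for formula (\ref{eqn-1}), and the $4 \times 4$
segment and anchor coordinates are illustrative only.

```python
from itertools import product


def grid_distance(p, q):
    """Hypothetical stand-in for the grid distance formula (Manhattan)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def build_table(width, height, anchors, dist=grid_distance):
    """Map every grid position to its tuple of distances to the anchors."""
    return {
        (x, y): tuple(dist((x, y), a) for a in anchors)
        for x, y in product(range(width), range(height))
    }


def ambiguous_tuples(table):
    """Return every tuple shared by more than one position, with the positions."""
    by_tuple = {}
    for pos, tup in table.items():
        by_tuple.setdefault(tup, []).append(pos)
    return {tup: ps for tup, ps in by_tuple.items() if len(ps) > 1}


# Anchors on one interior row: positions mirrored across that row collide.
collinear = build_table(4, 4, [(0, 1), (1, 1), (2, 1), (3, 1)])
# Anchors on the corners of the segment (a rectangle, hence a parallelogram).
corners = build_table(4, 4, [(0, 0), (3, 0), (0, 3), (3, 3)])
```

Under this assumed distance, \texttt{ambiguous\_tuples(collinear)} reports
the mirrored positions, while \texttt{ambiguous\_tuples(corners)} is empty.
With the actual formula the same check applies after replacing
\texttt{grid\_distance}.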

80:

81:

82: \subsection{One-hop Neighborhood using RSSI}

83:

We use the Received Signal Strength Indicator (RSSI) for distance estimation
in our implementation to obtain, for each node, an estimate of its
one-hop neighborhood.  This task is simplified by our design of the
grid, which defines the one-hop distance between neighbors
to be about nine meters.  We assume that the deployment error
for each node is within a defined tolerance (see Section~\ref{sec-experiment}).

90:

The RSSI ranging works as follows. A sensor node sends
out a radio message with a certain signal strength, and one field
of this message records the transmitted signal strength.  The receiver of
this message can measure the signal strength of the received message.
Given a model of how signal strength decreases with distance (in our
case obtained empirically), the original and received
signal strengths can be compared and the distance between the
sender and receiver can be estimated.  The advantage of RSSI ranging
is that it is very simple and needs no additional
hardware and little computing power. The disadvantage of RSSI ranging
is that its accuracy is poor~\cite{AS04};  radio strength can be affected by the
environment, the relative angles of emplacement for a pair of sensors,
and manufacturing variances in radio devices.

104:

For RSSI ranging in our
implementation, we used a group of messages to deal with variation
(we did not use advanced filters in our experiments, preferring to
use the simplest of techniques).  In our experiments, we noticed
that even with a group size of 30, the mean RSSI reading over the
same distance varies considerably, as has been previously
observed for similar hardware \cite{WC02}.  Put another way,
the mean received signal strength readings can be the same over different
distances.  To reduce the inaccuracy of RSSI ranging and get better
one-hop neighborhood information, instead of estimating
the real distance between a single sender and receiver pair,
we forward statistics of
the RSSI readings to the base station.
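
As an illustration of what such statistics might look like, the sketch below
condenses the readings that one receiver logged for one sender into the
quantities used later for scoring. The function name \texttt{summarize},
the dictionary layout, and the sample readings are assumptions made for
this example.

```python
from statistics import mean


def summarize(readings):
    """Summarize the RSSI readings one receiver logged for one sender."""
    if not readings:
        return None  # nothing was received from this sender
    return {
        "count": len(readings),  # number of received messages
        "mean": mean(readings),  # average RSSI reading
        "min": min(readings),
        "max": max(readings),
    }


# Hypothetical readings for one sender/receiver pair (5 messages received).
stats = summarize([340, 344, 338, 351, 347])
```

Forwarding one such record per sender/receiver pair keeps the message to
the base station small regardless of the group size.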

118:

119: Experience of other researchers \cite{WC02,PH03,ZHKS04}

120: suggests that no elementary model may exist for how

121: signal strength behaves as a function of distance.

122: Therefore, instead of trying to find an analytical model for

123: RSSI, we used a machine learning approach in our investigation.

124: Machine learning for RSSI-based localization was previously used

125: in \cite{ELM04}, where Bayesian networks were trained.

126:

We use fuzzy membership functions \cite{D90} to
calculate a score for each receiver from the average RSSI reading, the
number of received messages, the maximum and minimum RSSI readings, and
the rank of the RSSI readings among all receivers. We assign one
fuzzy membership function to each property. Following the classification in
\cite{D90}, we choose the membership functions
based on reliability concerns with respect to the particular problem.
In our implementation, even very simple function forms (linear and
triangular functions) are sufficient for this part.
Finally, we decide the one-hop
neighborhood for the sender by checking the scores of the receivers.
The score is in fact another fuzzy membership function.
For this function, we use the {\em Distance Approach},
one of the {\em semantic approaches} to fuzzy
membership function design. This approach concentrates on the
practical meaning and interpretation of membership, which makes it
well suited to multi-attribute decision making \cite{LH}. It
considers the requirements on the different attributes and assigns
operations to the connectives of fuzzy logic. For instance, let $f_A$ and $f_B$
be fuzzy membership functions for attributes $A$ and $B$; then the
fuzzy membership function for ``$A$ and $B$'' is $\min(f_A,f_B)$,
and for ``very $A$'' it is $f_A^2$. We can therefore translate empirical
rules such as ``if very $A$ and $B$, then $C$ is likely'' into
functions.
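
The operators just described are small enough to sketch directly. The
function names below are illustrative; taking the maximum for ``or'' is the
usual companion of the minimum for ``and'' and is an addition made for this
example.

```python
def fuzzy_and(*degrees):
    """Membership of "A and B": the minimum of the memberships."""
    return min(degrees)


def fuzzy_or(*degrees):
    """Membership of "A or B": the maximum (the usual companion of min)."""
    return max(degrees)


def very(degree):
    """The hedge "very A": square the membership."""
    return degree ** 2


def rule_very_a_and_b(f_a, f_b):
    """Degree to which "if very A and B, then C is likely" supports C."""
    return fuzzy_and(very(f_a), f_b)
```

For example, memberships $f_A = 0.5$ and $f_B = 0.9$ give
$\min(0.5^2, 0.9) = 0.25$.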

151:

152: After choosing the forms of the fuzzy membership functions, we need

153: to decide the parameters for these functions. We conducted extensive

154: tests for calibration.  We performed tests with different spacing

155: between nodes, relative angles, weather and terrain conditions.

156: From the training data we collected from the tests,

157: we determined the parameters

158: to best distinguish 1-hop neighbors from more distant neighbors.

159: In Section~\ref{sec-analysis}, we

160: will show how well these functions work.

161:

We give an example here of how to choose the parameters for the
fuzzy membership functions. The fuzzy membership function $f_{avg}$,
which describes how much an average RSSI reading $x$ is
``numerically like'' a 1-hop reading, is a combination
of three simple linear functions.

166:

167: \[f_{avg}(x) = \left\{\begin{array}{ll}

168: 1 & x<a\\

169: \frac{b-x}{b-a} & a \leq x < b \\

170: 0 & x \geq b

171: \end{array}

172: \right. \]

173:

174: In our tests, generally, if the distance is

175: smaller, the reading is smaller. So a smaller reading

176: implies a more likely 1-hop reading.

177: Thus we chose the function in this form.

178:

Here, $a$ and $b$ are the parameters of this fuzzy membership function,
which we need to determine from our data. We
want to choose them as thresholds such that only a small fraction of
the 2-hop readings are smaller than $a$ and most of the 1-hop readings
are smaller than $b$. We therefore chose
the $10^{th}$ percentile of average readings over 2-hop distances for
$a$, and the $95^{th}$ percentile of average readings over 1-hop distances
for $b$.
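
The percentile rule can be sketched as follows. This is an illustration
under stated assumptions: the nearest-rank definition of a percentile is
one of several in common use, and the sample readings are synthetic, not
our training data.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling
    return ordered[rank - 1]


def choose_thresholds(one_hop_means, two_hop_means):
    """a: 10th percentile of 2-hop means; b: 95th percentile of 1-hop means."""
    a = percentile(two_hop_means, 10)
    b = percentile(one_hop_means, 95)
    return a, b


# Synthetic average readings, for illustration only.
one_hop = list(range(320, 360, 2))   # 20 values: 320, 322, ..., 358
two_hop = list(range(345, 405, 3))   # 20 values: 345, 348, ..., 402
a, b = choose_thresholds(one_hop, two_hop)
```

The sketch only makes sense when the two distributions overlap in such a
way that $a < b$, as they do for these synthetic values.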

187:

188:

189:

190: \begin{figure}[ht]

191: \centering

192: \includegraphics {distribution.eps}

193: \caption{distribution for 1-hop and 2-hop readings}

194: \label{distributionChart}

195: \end{figure}

196:

197: Figure \ref{distributionChart} shows the distributions

198: of 1-hop and 2-hop readings. So for our training data,

199: $a = 343$, and $b = 361$.
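
As a worked example with these values, an illustrative average reading of
$x = 352$ falls on the linear part of $f_{avg}$:
\[f_{avg}(352) = \frac{361 - 352}{361 - 343} = \frac{9}{18} = 0.5\]
Any reading below 343 has membership 1, and any reading of 361 or more has
membership 0.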

200:

An advantage of using machine learning with fuzzy membership
functions is that we make no strong assumptions about
the underlying distribution of the RSSI readings over a given distance.

204:

We also ran a $\chi^2$
goodness-of-fit test on these two distributions and
found that the readings for a fixed
distance are not always normally distributed,
as is assumed in \cite{MLRT04,NN04}. For 2-hop readings,
the test does not reject the
hypothesis that the distribution is normal. For
1-hop readings, however, the test yields $\chi^2 = 30.11$ with
15 degrees of freedom. This is just below the critical value of
30.58 for $p=0.01$ but well above the critical value of 25.00 for
$p=0.05$, so we can reject, at the 0.05 level, the
hypothesis that the distribution of 1-hop readings is
normal.

218:

Similar to $f_{avg}$, we have a fuzzy membership function
$f_{rel}$, which describes how much an average RSSI reading $x$
is ``relatively like'' a 1-hop reading by comparing it with all average
RSSI readings for the same sender. Let $s$ denote the sender of the
messages, and let $R_s$ denote the set of nodes which report
an average RSSI reading for $s$. If $|R_s|<2$,
$f_{rel}$ is always 1. When $|R_s|$ is at least 2, let $max_s$
denote the mean of the 2 strongest average RSSI readings
reported by $R_s$.

228:

229: \[f_{rel}(x) = \left\{\begin{array}{ll}

230: 1 &  x < 1.05max_s\\

231: 1-\frac{x-1.05 max_s}{0.1 max_s} & 1.05max_s \leq x < 1.15 max_s \\

232: 0 & x \geq 1.15max_s

233: \end{array}

234: \right. \]
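
A minimal sketch of $f_{rel}$ follows. It assumes that a smaller reading
means a stronger signal, so the 2 strongest readings are the 2
smallest; the function signature and the sample values in the usage note
are illustrative.

```python
def f_rel(x, sender_means):
    """Membership of average reading x in "relatively like a 1-hop reading".

    sender_means holds the average readings reported by all receivers of
    one sender. Assumes that a smaller reading means a stronger signal.
    """
    if len(sender_means) < 2:
        return 1.0
    strongest_two = sorted(sender_means)[:2]
    max_s = sum(strongest_two) / 2
    if x < 1.05 * max_s:
        return 1.0
    if x < 1.15 * max_s:
        return 1.0 - (x - 1.05 * max_s) / (0.1 * max_s)
    return 0.0
```

With reported averages of 300, 320, 350 and 400, $max_s = 310$, so the
function falls from 1 at a reading of 325.5 to 0 at 356.5.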

235:

236: If a receiver

237: receives most of the messages in the message group

238: from a sender, the average RSSI reading is more reliable

239: than that from just a few messages. Meanwhile, from the training

240: data we collected, we observe that 1-hop neighbors generally receive

241: more messages than the 2-hop neighbors. Also with

242: reduced power of radio signals and a message rate of 10 messages/s,

243: we have not observed obvious message collisions, so it is not necessary

244: to schedule the messages.

245:

Function $f_{num}$ is used to determine whether an
average RSSI reading $x$ is ``like in volume'' to a 1-hop reading by
checking the number of messages on which $x$ is based.
Let $x_{num}$ denote this number, $s$ denote the sender
of the messages, and $most_s$ denote
the largest number of messages from $s$
received by any single receiver.

253:

\[f_{num}(x) = \left\{\begin{array}{ll}
0 &  x_{num} < 0.65most_s\\
\frac{x_{num}-0.65 most_s}{0.25 most_s} & 0.65most_s \leq x_{num} < 0.9 most_s \\
1 & x_{num} \geq 0.9most_s
\end{array}
\right. \]
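
As a worked example with illustrative numbers, suppose the best receiver of
$s$ gets all 30 messages of a group, so $most_s = 30$ and the two thresholds
are $0.65 \cdot 30 = 19.5$ and $0.9 \cdot 30 = 27$. An average reading $x$
based on $x_{num} = 24$ messages then has
\[f_{num}(x) = \frac{24 - 19.5}{0.25 \cdot 30} = \frac{4.5}{7.5} = 0.6\]
A reading based on 19 or fewer messages has membership 0, and one based on
27 or more has membership 1.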

260:

After calculating the fuzzy membership functions for the
different attributes, we use a fuzzy rule to get a
score for the sender/receiver pair. Based on the training data
we collected, we chose the following rule for judging whether an average
reading of a sender/receiver pair is like a 1-hop reading:
it is ``numerically like'', or it is ``relatively like'' and very ``like
in volume''. The score is therefore calculated as follows.

268:

\[\mathit{score}(x) = \max(f_{avg}(x),\min(f_{rel}(x),f_{num}^2(x)))\]
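
The three membership functions and the rule combine into a few lines of
code. The sketch below is illustrative rather than our implementation: the
helper \texttt{falling} and the argument layout are assumptions, the
defaults for $a$ and $b$ are the training values reported above, and the
caller is assumed to supply $max_s$ and $most_s$ (the case $|R_s|<2$ is not
handled).

```python
def falling(x, lo, hi):
    """1 below lo, 0 at or above hi, linear in between."""
    if x < lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def f_avg(x, a=343, b=361):
    """"Numerically like" a 1-hop reading."""
    return falling(x, a, b)


def f_rel(x, max_s):
    """"Relatively like" a 1-hop reading, given max_s for the sender."""
    return falling(x, 1.05 * max_s, 1.15 * max_s)


def f_num(x_num, most_s):
    """"Like in volume": 0 below 65% of most_s, 1 from 90% upward."""
    return 1.0 - falling(x_num, 0.65 * most_s, 0.9 * most_s)


def score(x, x_num, max_s, most_s):
    """max(f_avg, min(f_rel, f_num squared)), as in the rule above."""
    return max(f_avg(x), min(f_rel(x, max_s), f_num(x_num, most_s) ** 2))
```

For instance, \texttt{score(352, 10, 300, 30)} evaluates to 0.5: the reading
is only half ``numerically like'', and the second branch contributes
nothing because too few messages were received.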

270:

271: \subsection{Table Look-up and Refinement}

272:

After establishing the one-hop neighborhood for every sensor node, the
algorithm constructs shortest paths from each node to all the anchors
in the segment.
This gives a 4-tuple for each sensor node $q$, consisting of the
grid distances between $q$ and the anchors.  Next, a table lookup
is attempted:  if the tuple matches the table entry of an
unoccupied position $p$, then
the algorithm assigns sensor node $q$ to position $p$.  Due to inaccuracies
in the one-hop neighborhood determination, the lookup can fail to
match any table entry.  To deal with lookup misses, we propose the
following refinement: for an unoccupied position $p$,
calculate a score for each remaining node. If a node $q$ has the highest score
among the remaining nodes and the score is above a threshold $T$, assign
$q$ to position $p$.
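
The lookup and refinement steps can be sketched as follows. This is a
minimal illustration under stated assumptions: hop counts are obtained by
breadth-first search over the one-hop neighborhood graph, and the
refinement score is left as a caller-supplied function because its form is
not fixed here. All names and the toy segment are hypothetical.

```python
from collections import deque


def hop_counts(neighbors, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def assign_by_lookup(neighbors, anchors, table):
    """Assign nodes to positions whose table tuple equals the node's tuple.

    Returns (position -> node, list of nodes left without a position).
    """
    per_anchor = [hop_counts(neighbors, a) for a in anchors]
    by_tuple = {tup: pos for pos, tup in table.items()}
    assignment, unmatched = {}, []
    for node in neighbors:
        tup = tuple(d.get(node) for d in per_anchor)
        pos = by_tuple.get(tup)
        if pos is not None and pos not in assignment:
            assignment[pos] = node  # the first node looked up keeps the position
        else:
            unmatched.append(node)
    return assignment, unmatched


def refine(assignment, unmatched, positions, score, threshold):
    """Give each unoccupied position its best remaining node, if above threshold."""
    remaining = list(unmatched)
    for pos in positions:
        if pos in assignment or not remaining:
            continue
        best = max(remaining, key=lambda n: score(n, pos))
        if score(best, pos) > threshold:
            assignment[pos] = best
            remaining.remove(best)
    return assignment, remaining


# Toy segment: three positions in a row with anchors at both ends (two
# anchors instead of four, only to keep the example small).
table = {(0, 0): (0, 2), (1, 0): (1, 1), (2, 0): (2, 0)}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
assignment, unmatched = assign_by_lookup(neighbors, ["A", "C"], table)
```

In the toy segment every node finds its position by lookup alone, so
\texttt{refine} has nothing left to do.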

287:

Here the algorithm assigns nodes to positions instead of
assigning positions to nodes.  This makes little difference within one
segment; in a large-scale network, however,
the base station can receive readings from nodes in adjacent segments.
It then becomes likely that the base station has more nodes than positions,
so assigning nodes to positions is more likely to yield a better result.

294:

295: Note that it is possible that two or more nodes have tuples that

296: match the same table entry due to inaccurate RSSI readings.

297: Using our algorithm, only one will be assigned

298: to that grid position depending on which one is the first

299: to be looked up in the table.  (Of course, RSSI ranging errors lower

300: the quality of our solution, as they would any RSSI-based

301: solution to localization.)

302:

303: It is useful to observe that our decision to match nodes to grid

304: positions, that is, to find a bijection between nodes and grid points,

305: may not be appropriate for some applications.  Indeed, when two

306: nodes have the same table lookup results, it can be argued that both

307: should receive the same grid position.  The general question of what

308: is a good metric for applications depending on localization quality

309: is outside the scope of our research.

310:

311: \subsection{Distributed Implementation within a Segment}

312: From our description of the algorithm, it is not hard to change the

313: implementation to be fully distributed. In the centralized implementation,

314: the sensor nodes just need to send out a group of messages and forward

315: statistics of the RSSI readings to the base station.

316: There is no message exchange between the sensor nodes.

317: After forwarding the statistics, the sensor nodes will just wait for

318: a grid position assignment.

319:

320: In a fully distributed implementation, each sensor node will send the

321: statistics of the RSSI readings to the sender of the RSSI messages.

322: Prior to deployment, each node's programming includes the localization

323: table as a read-only constant (in current technology, there is far

324: more read-only memory than working RAM).

325: After a sender gets statistics from the receivers, it will use fuzzy

326: membership functions to assign scores to each receiver and determine

327: which receivers are the one-hop neighbors. In stage 3, BFS spanning trees

328: rooted at the anchors can be constructed using a distributed algorithm

329: (which could be similar to routing protocols that construct spanning trees).

330: The depth of a node $q$ in the tree rooted at anchor $a_i$ is the distance

between $q$ and $a_i$.  Then each node $q$ looks up its 4-tuple
in the table. If there is a match, $q$ takes the matching position; if not,
$q$ assigns a score to every position and picks the position with the
highest score.
The difference between the centralized and distributed implementations in
stage 3 is that the distributed implementation assigns positions to nodes.
It is therefore possible for two nodes to think they are at the same
position (as noted previously, this is acceptable for some applications).
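
The node-side decision in the distributed variant can be sketched as
follows. The score used for the fallback is a hypothetical choice made for
this example (the negated sum of absolute tuple differences); the names and
the toy table are likewise illustrative.

```python
def closeness(t1, t2):
    """Hypothetical score: negated sum of absolute tuple differences."""
    return -sum(abs(a - b) for a, b in zip(t1, t2))


def choose_position(own_tuple, table, score=closeness):
    """Run on a node: exact table match if any, else best-scoring position."""
    for pos, tup in table.items():
        if tup == own_tuple:
            return pos
    return max(table, key=lambda pos: score(own_tuple, table[pos]))


# Toy table: three positions in a row, distances to two anchors at the ends.
table = {(0, 0): (0, 2), (1, 0): (1, 1), (2, 0): (2, 0)}
exact = choose_position((0, 2), table)      # tuple found in the table
fallback = choose_position((0, 3), table)   # no match, closest entry wins
```

Here both nodes settle on position $(0,0)$, which illustrates how two nodes
can end up believing they occupy the same position.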

338:

In a large-scale grid wireless sensor network, the network
can be heterogeneous, enabling faster communication and data processing.
Some sensor nodes are more powerful and have a larger communication range.
These nodes form the backbone of the network,
so routing to the base station needs fewer hops.  GPS could be installed

344: at these nodes as well, making such nodes ideal as anchors.  If they

345: have much more computing power than other sensor nodes, the

346: algorithm could run on these nodes.  On the other hand,

347: if there are no such powerful nodes in the network, the

348: distributed implementation is more desirable.  Another advantage

349: of the distributed implementation is that no multi-hop messaging

350: or flooding is needed, which reduces radio traffic.

351: