
1: \documentclass[12pt]{article}

2: %\documentclass[twocolumn]{article}

3: \addtolength{\hoffset}{-0.6in}

4: \addtolength{\textwidth}{1.2in}

5: \addtolength{\voffset}{-0.6in}

6: \addtolength{\textheight}{1.2in}

7:

8: %

9: %--useful packages--

10: %\usepackage{doublespace}

11: \usepackage{latexsym}

12: \usepackage{amsmath}

13: \usepackage{amssymb}

14: \usepackage{amsfonts}

15: %\usepackage{times}

16: \usepackage{graphicx}

17: \usepackage{epsfig}

18: \usepackage{array}

19: %\usepackage{flafter}

20: %\usepackage[all]{xy}

21:

22: %--------------------- Some LaTeX definitions ------------

23:

24: % CUSTOMIZATIONS

25: \newtheorem{definition}{Definition}

26: \newtheorem{theorem}{Theorem}

27:

28: \newcommand{\eq}[1]{equation~\ref{eq#1}}

29: \newcommand{\Eq}[2]{

30: \begin{equation}

31: #1

32: \label{eq#2}

33: \end{equation} }

34:

35: \newcommand{\Eqn}[1]{\[ #1 \]}

36:

\newcommand{\Ex}[1]{Example~\ref{ex:#1}}
\newcommand{\ex}[1]{example~\ref{ex:#1}}

39:

40:

41: % ------------------ Defining the \Example environment ---------------------

42: %Syntax:

43: %  \Example{title}{body}{label}

44: %  \Ex{label} ...  will produce Example xxx ..

45: %  ... \ex{label} ..... will produce: ... example xxx ....

46: %  the numbering of the examples (xxx) is done according to the section

47: %  number.

48:

49: \newtheorem{ExampleDef}{Example}[section]

50:

51: \newcommand{\Example}[3]{

52:   \begin{list}{}{

53:       \setlength{\leftmargin}{1em}}     % Indent everything by this amount

54:     \item                               % Group everything in one item

55:     \small                              % Use a smaller font size

56:     \begin{ExampleDef} \rm              % Theorems are italic - select roman

57:       {\bf \hspace{-1ex}: #1}           % The name, use \\[1ex] to break line

58:       #2                                % The actual stuff

59:       \hfill {\large \boldmath $\Box$}  % The box

60:       \label{ex:#3}                      % Label the example

61:     \end{ExampleDef}

62:   \end{list}}

63:

64: %----------------------------------------------------------------------------

65: \setlength{\parskip}{8pt}

66:

67:

68: \begin{document}

69: %\singlespace

70: \begin{center}

71: {\Large {\bf $K$-Winners-Take-All Computation\\

72: with Neural Oscillators\par}}

73: \vspace{1.0em}

74: {\large Wei Wang and Jean-Jacques E. Slotine\footnote{To whom correspondence

75: should be addressed.} \par}

76: {Nonlinear Systems Laboratory \\

77: Massachusetts Institute of Technology \\

78: Cambridge, Massachusetts, 02139, USA

79: \\ wangwei@mit.edu, \ jjs@mit.edu

80: \par}

81: \vspace{2em}

82: \end{center}

83:

84: \begin{abstract}

85: Artificial spike-based computation, inspired by models of computation

86: in the central nervous system, may present significant performance

87: advantages over traditional methods for specific types of large scale

88: problems. This paper describes very simple network architectures for

89: $k$-winners-take-all and soft-winner-take-all computation using neural

90: oscillators. Fast convergence is achieved from arbitrary initial

conditions, which makes the networks particularly suitable for tracking
time-varying inputs.

93: \end{abstract}

94:

95: %\doublespace

96:

97: %

98: \section{Introduction} \label{sec:introduction}

99: %

100: The discovery of synchronized oscillations in the visual cortex and

101: other brain regions has triggered significant research in artificial

102: spike-based computation~\cite{hopfield03, gerstner, gray,

103: jin02, llinas03, llinas98, llinas02, maass03, singer, thorpe01,

104: malsburg95, deliang02}. While neurons in the central nervous

system are about six orders of magnitude ``slower'' than silicon-based

106: elements, in both elementary computation time and signal transmission

107: speed, their performance in networks often compares very favorably

108: with their artificial counterparts even when reaction speed is

109: concerned. In a sense, evolution may have been forced to develop

110: extremely efficient computational schemes given available hardware

111: limitations.

112:

113: In a recent paper~\cite{wei03-1}, we proposed new models for two

114: common instances of such neural computation, winner-take-all and

115: coincidence detection, featuring fast convergence and $O(n)$ network

116: complexity.  We saw that both computations could be achieved using a

117: similar architecture, using global feedback inhibition in the first

118: case, and global excitation in the second.  In this paper, we further

119: extend this computational architecture to $k$-winners-take-all and

120: soft-winner-take-all.

121:

Fast winner-take-all (WTA) computation in~\cite{wei03-1} is based on

123: the FitzHugh-Nagumo model, a well-known simplified version of the

124: classical Hodgkin-Huxley model. Compared to previous WTA

125: networks~\cite{arbib, fang, grossberg73, jin02,lazzaro, yuille89}, it

126: has significant computational advantages. The network's initial states

127: can be set arbitrarily, and convergence is guaranteed in at most two

128: spiking periods, with a high computation resolution. The network's

129: complexity is linear in the number of inputs and its size can be

130: adjusted at any time during the computation. As this paper shows, by

131: modifying the starting point of the global inhibitory neuron's

132: charging mode, $k$-Winners-Take-All ($k$-WTA) can be computed instead.

133: By running the charging mode independently, soft-Winner-Take-All

134: (soft-WTA) can be computed. Both extensions inherit the advantages of

135: the original WTA network.

136:

After a brief review of the basic WTA network in
Section~\ref{sec:wta}, $k$-WTA computation and soft-WTA computation
are studied in Sections~\ref{sec:kwta} and~\ref{sec:soft}, respectively.  Brief
concluding remarks are offered in Section~\ref{sec:conclusion}.

141:

142: %

143: \section{Winner-Take-All Network} \label{sec:wta}

144: %

145: The WTA network in~\cite{wei03-1} is based

146: on the FitzHugh-Nagumo (FN) model~\cite{fitzhugh,nagumo,murray}:

147: \begin{equation*} \label{eq:f-n}

148: \begin{cases}

149:   \ \dot{v} = v(\alpha - v)(v-1)-w+I  \\

150:   \ \dot{w} = \beta v - \gamma w

151: \end{cases}

152: \end{equation*}

153: For appropriate parameter choices, there exists

154: a unique equilibrium point for any given value of $I$, which

155: is stable except for a finite range $\ I_l \le I \le I_h\ $

156: where the system tends to a limit cycle.  The steady-state

157: value of $v$ at the stable equilibrium point increases

158: with $I$.
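
As a quick check of the uniqueness claim (a sketch only; the shorthand
$f$ and $g$ is introduced here for illustration and $\gamma > 0$ is assumed),
set $\dot{v} = \dot{w} = 0$. The second equation gives
$w = (\beta/\gamma)\, v$, so that equilibria satisfy
\begin{equation*}
I = g(v) := \frac{\beta}{\gamma}\, v - f(v), \qquad f(v) = v(\alpha - v)(v-1) .
\end{equation*}
Since $f'(v) = -3v^2 + 2(\alpha+1)v - \alpha$ attains its maximum
$(\alpha^2 - \alpha + 1)/3$ at $v = (\alpha+1)/3$, a sufficient
condition for $g$ to be strictly increasing is
\begin{equation*}
\frac{\beta}{\gamma} \ > \ \frac{\alpha^2 - \alpha + 1}{3} ,
\end{equation*}
in which case each value of $I$ yields exactly one equilibrium, whose
$v$ coordinate increases with $I$. For instance, the values
$\alpha = 5.32, \beta = 3, \gamma = 0.1$ used in the simulations of
Section~\ref{sec:kwta} give $\beta/\gamma = 30$ against a bound of
approximately $8.0$.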

159:

160: \begin{figure}[h]

161: \begin{center}

162: \epsfig{figure=wta_structure.eps,height=40mm,width=60mm}

163: \caption{Diagram of the WTA network. There are $n$ FN neurons

164: receiving external inputs and a global inhibitory neuron

165: monitoring the whole network.}

166: \label{fig:structure}

167: \end{center}

168: \end{figure}

169:

170: The network structure is illustrated in Figure~\ref{fig:structure}

171: where $n$ FN neurons receive external stimulating inputs $I_i$ and

172: a global inhibition $z$. The dynamics of the FN neurons

173: ($i=1, \ldots, n$) are

174: \begin{equation*} \label{eq:fn-in-wta}

175: \begin{cases}

176:   \ \dot{v}_i = v_i (\alpha - v_i) (v_i-1)-w_i + I_i - z  \\

177:   \ \dot{w}_i = \beta v_i - \gamma w_i

178: \end{cases}

179: \end{equation*}

180: The dynamics of the global inhibitory neuron is

181: \begin{equation} \label{eq:inhibition-in-wta}

182: \dot{z} =

183: \begin{cases}

  - k_c \, (z - z_0) & \text{charging mode} \\
  - k_d \, z & \text{discharging mode}

186: \end{cases}

187: \end{equation}

188: which starts charging if there is any FN neuron spiking in the network

189: (i.e., if $\exists i$, $v_i \ge v_0$ for some given threshold $v_0$)

190: and switches to discharging if the state $z$ is saturated. With a fast

191: charging rate $k_c$ and a slow discharging rate $k_d$, the network

192: computes the largest input (corresponding to the only spiking FN

193: neuron) in at most two periods. Initial conditions can be set

194: arbitrarily and the computation resolution is very high. Detailed

195: analysis and discussions can be found in~\cite{wei03-1}.
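
Within a single mode, (\ref{eq:inhibition-in-wta}) is linear and can be
integrated explicitly. As a sketch, with $t_0$ denoting the instant at
which the current mode was entered (a symbol introduced here for
illustration),
\begin{equation*}
z(t) =
\begin{cases}
  z_0 + \bigl(z(t_0) - z_0\bigr)\, e^{-k_c (t - t_0)} & \text{charging mode} \\
  z(t_0)\, e^{-k_d (t - t_0)} & \text{discharging mode}
\end{cases}
\end{equation*}
so that charging approaches $z_0$ with time constant $1/k_c$, while
discharging decays towards zero with time constant $1/k_d$. Choosing
$k_c \gg k_d$ thus makes the charging phase nearly instantaneous
relative to the discharging phase.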

196:

197: %

198: \section{$K$-Winners-Take-All Network} \label{sec:kwta}

199: %

200: $K$-WTA, a common variation of WTA computation where the output

201: indicates for each neuron whether its input is among the $k$ largest,

202: has been studied in such fields as competitive learning, pattern

203: recognition and pattern classification~\cite{badel, fukai, urahama,

204: wolfe, juicheng}.  As Maass argued in~\cite{maass00-1}, in principle a

205: $k$-WTA network can replace a two-layer threshold circuit to perform

206: most standard nonlinear computational operations.

207:

208: Most $k$-WTA studies are based on steady-state stability

209: analysis. Many models define the winners as the neurons with the

210: largest initial states~\cite{majani, wolfe} or require initial

211: conditions to be set precisely~\cite{juicheng}, making the networks

212: not well suited to time-varying inputs.  Others adopt particular

213: design methodologies~\cite{perfetti, seiler} but the network size or

the number of winners is limited. $K$-WTA has also been implemented in
analog VLSI circuits~\cite{urahama}, which extend the elegant WTA
model in~\cite{lazzaro} but also inherit its limited resolution.

217:

218: The neural network described in Section~\ref{sec:wta} can be easily

219: extended to $k$-WTA computation, where an FN neuron spikes if and only

220: if its input is among the $k$ largest. Indeed, as the global

221: inhibition force decreases, the FN neurons enter the oscillation

222: region {\it rank-ordered} by their inputs. Thus, while for WTA

223: computation, the global inhibition neuron is charged after the first

224: arrival, for $k$-WTA computation the charging moment is simply

225: modified to capture the $k^{\rm th}$ arrival instead.
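
The rank ordering can be made explicit with a rough quasi-static
sketch. Assume that the global neuron discharges from a value $z(t_0)$
at time $t_0$, that no local inhibition acts on neuron $i$ before it
spikes, and that neuron $i$ starts oscillating as soon as its net input
$I_i - z$ reaches the lower threshold $I_l$ (the symbols $t_0$ and
$t_i$ are introduced here for illustration, and the relaxation time of
the FN neurons is ignored). Then $z(t) = z(t_0)\, e^{-k_d (t - t_0)}$
and, provided $0 < I_i - I_l < z(t_0)$, the arrival time is
\begin{equation*}
t_i = t_0 + \frac{1}{k_d} \, \ln \frac{z(t_0)}{I_i - I_l} ,
\end{equation*}
which is strictly decreasing in $I_i$: neurons with larger inputs reach
the oscillation region earlier.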

226:

227: To this effect, we augment the dynamics of the FN neuron with an

228: additional state variable $u_i$ (for simplicity, we shall still call

229: FN neuron such a generalized element)

230: \begin{equation*} \label{eq:fn-in-kwta}

231: \begin{cases}

232:   \ \dot{v}_i = v_i (\alpha - v_i) (v_i-1)-w_i + I_i - u_i - z  \\

233:   \ \dot{w}_i = \beta v_i - \gamma w_i \\

234:   \ \dot{u}_i = k_u \ (\zeta_i u_0 - u_i)

235: \end{cases}

236: \end{equation*}

237: where $u_0$ is a constant saturation value and $k_u$ the

charging/discharging rate.  The variable $\zeta_i$ takes two values:
it switches to $0$ whenever $z$ approaches its saturation value
$z_0$, and otherwise switches to $1$ when $v_i$ exceeds a given threshold
$v_0$. This makes the dynamics of $u_i$ a local self-inhibition, which
starts charging when the basic FN neuron spikes and discharges whenever
the global inhibitory neuron spikes. Note that the specific form of

244: the dynamics of $u_i$ can be more general, as long as the value of

245: $u_i$ varies between $0$ and $u_0$, and the transition periods are

246: very fast (which is satisfied here by choosing a large $k_u$).
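
With $\zeta_i$ held fixed, the $u_i$ equation is linear. As a sketch,
writing $t_0$ for the instant of the last switch of $\zeta_i$ (a symbol
introduced here for illustration),
\begin{equation*}
u_i(t) = \zeta_i u_0 + \bigl(u_i(t_0) - \zeta_i u_0\bigr)\, e^{-k_u (t - t_0)} ,
\end{equation*}
so that $u_i$ relaxes to $u_0$ when $\zeta_i = 1$ and to $0$ when
$\zeta_i = 0$, with time constant $1/k_u$, and remains in $[0, u_0]$
whenever $u_i(t_0)$ does.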

247:

248: The dynamics of the global inhibitory neuron is the same

249: as~(\ref{eq:inhibition-in-wta}), except that we start its charging

250: mode if any $k$ FN neurons in the network spike. Such a moment can be

251: captured by determining that $\ \sum_{i=1}^n u_i\ $ approaches $k

252: u_0$. Thus, if any FN neuron spikes, it excites only the corresponding

253: local inhibitory portion but has no effect on the rest of the

254: network. If there are $k$ local inhibitions turned on, the global

255: inhibitory neuron is charged, which then releases all the local

256: inhibitions and starts a new period.
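
One possible way to state the detection condition, given here only as
an illustrative assumption, relies on the fact that after the fast
transitions each $u_i$ is close to either $0$ or $u_0$, so that
$\sum_{i=1}^n u_i / u_0$ approximates the number of neurons that have
spiked in the current period. The charging mode could then be started
when
\begin{equation*}
\sum_{i=1}^n u_i \ \ge \ (k - \epsilon)\, u_0 ,
\end{equation*}
where the tolerance $0 < \epsilon < 1$ is a hypothetical design
parameter not specified in the text above.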

257:

258: Compared to the WTA network in Section~\ref{sec:wta}, the basic

259: principle underlying the $k$-WTA network described above is the same,

260: exploiting the simple properties of the FN model.  Thus, most of the

261: computational advantages of the WTA network~\cite{wei03-1} are

262: inherited by the $k$-WTA extension.  In particular

263: \begin{itemize}

264: \item The initial conditions of the network can be set arbitrarily.

265:

\item With appropriate parameters, the computation can be completed in at
most two periods, where the first period is affected by the initial

268: conditions but the $k$ spiking neurons during the following periods

269: are guaranteed to be those with the largest inputs. If the initial

270: inhibitions are strong, the computation is completed in one period.

271:

272: \item Since initial conditions are immaterial and the computation

273: speed is very fast, the $k$-WTA network is able to track time-varying

274: inputs. Moreover, since the network complexity is $O(n)$, individual

275: FN neurons can be added or removed at any time during the computation.

276:

277: \item The inputs $I_i$ should be lower-bounded by $I_l$, the lower

threshold of the FN oscillation region. They should also be

279: upper-bounded to set inhibition saturations, although the upper bound

280: value is not restricted.

281:

282: \item  The computation resolution also follows that of the WTA network. It can

283: be improved by decreasing the discharging rate $k_d$, as well as the relaxation

284: time of the FN neurons.

285:

286: \item FN neurons receiving equal inputs behave identically, which

287: means that the $k$-WTA computation may generate more than $k$ winners

288: in this particular case.

289:

290: \end{itemize}
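
The dependence of the resolution on $k_d$ can be illustrated by a rough
quasi-static estimate, assuming that the global inhibition decays as
$z(t) = z(t_0)\, e^{-k_d (t - t_0)}$ and that a neuron starts
oscillating when its net input $I_i - z$ reaches $I_l$ (the symbols
$t_0$, $\Delta t$ and $\delta$ are introduced here for illustration).
Two neurons with inputs $I_1 > I_2 > I_l$ then reach the oscillation
region at instants separated by
\begin{equation*}
\Delta t = \frac{1}{k_d} \, \ln \frac{I_1 - I_l}{I_2 - I_l}
\ \approx \ \frac{\delta}{k_d \, (I_2 - I_l)} ,
\qquad \delta = I_1 - I_2 \ll I_2 - I_l .
\end{equation*}
The separation grows as $k_d$ decreases, so that closer inputs can be
told apart, provided $\Delta t$ remains large compared to the
relaxation time of the FN neurons.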

291:

292: \Example{}{The result is illustrated in simulation in

293: Figure~\ref{fig:k_normal}, with $n=10$ and $k=3$.  The parameters of

294: the FN neurons are set as $\ \alpha = 5.32, \beta = 3, \gamma = 0.1\

295: $, with spiking threshold $\ v_0 = 5\ $. The parameters of the local

296: inhibition are $\ u_0 = 160, k_u = 100\ $.  The inputs $I_i$ are

297: chosen randomly from $20$ to $125$.  The parameters of the global

298: neuron are $\ z_0 = 240, k_c = 100, k_d = 1/40\ $. All initial

299: conditions are chosen arbitrarily.  The three spiking neurons after

300: the first charging of the global neuron are those with the three

301: largest inputs.

302:

303: Note that the output frequency is determined mainly by the global

304: neuron's dynamics and the value of the $k^{\rm th}$ largest input. It

305: can be increased by increasing the global neuron's discharging rate

306: after the first winner spikes so as to facilitate the other

307: winners' spiking.}{kwta}

308:

309: \begin{figure}[p]

310: \begin{center}

311: \epsfig{figure=normal_1.eps,height=60mm,width=110mm}

312: \epsfig{figure=normal_2.eps,height=60mm,width=110mm}

313: \epsfig{figure=normal_3.eps,height=40mm,width=110mm}

314: \caption{$k$-WTA computation result of Example~\ref{ex:kwta} with $n=10$ and $k=3$.

315: The plots are the time developments of

316: (a) $v_i$ of the neurons with the three largest inputs;

317: (b) $u_i$ of the neurons with the three largest inputs;

318: (c) $v_i$ of the other seven neurons;

319: (d) $u_i$ of the other seven neurons;

320: (e) global inhibition $z$.

321: The computation is completed in less than two periods.}

322: \label{fig:k_normal}

323: \end{center}

324: \end{figure}

325:

326:

327: \Example{}{Figure~\ref{fig:varying} illustrates a simulation result

328: with $n=3$ and $k=2$. The parameters are the same as those in

329: Example~\ref{ex:kwta}.  The inputs keep varying and switch winning

330: positions several times. The spiking neurons always track the two

331: largest inputs.}{varying}

332:

333: \begin{figure}[h]

334: \begin{center}

335: \epsfig{figure=varying.eps,height=60mm,width=110mm}

336: \caption{$k$-WTA computation result of Example~\ref{ex:varying} with $n=3$ and $k=2$.

337: The inputs are not constant. The spiking neurons always track the two largest inputs.

The plots are (a) states $v_i$ versus time; (b) inputs $I_i$ versus time. Note that

339: each $v_i$ in plot (a) corresponds to the $I_i$ in plot (b) with the same line type

340: (solid, dashed or dotted).}

341: \label{fig:varying}

342: \end{center}

343: \end{figure}

344:

345:

346: %

347: \section{Soft-Winner-Take-All} \label{sec:soft}

348: %

349: Soft-WTA~\cite{maass00-1} (or softmax) is another variation of WTA

350: computation, where the outputs reflect the rank of all inputs

351: according to their size. Although soft-WTA is a very powerful

352: primitive~\cite{maass00-1, maass00-2} in that it can be used to

353: compute any continuous function, its ``neural'' implementation is

354: complex. Recently, \cite{yuille02} studied soft-WTA as an optimization

355: problem not based on a biologically plausible mechanism;

356: \cite{indiveri} presented a hardware model of selective visual

357: attention which lets the attention switch between the selected inputs,

358: but whose switching order does not completely reflect the input

359: ranks. In this section, we develop a simple neural network which

360: computes soft-WTA very fast and generates spiking outputs rank-ordered

361: by their inputs.

362:

363: Let $k=n\ $ in the $k$-WTA network described in Section~\ref{sec:kwta}.

364: Then we get a pre-ordered spiking sequence in each stable period, since

365: all the FN neurons enter the oscillation region and then spike

366: rank-ordered by their inputs. However, such an $n^{\rm th}$ arrival

367: moment may not be measurable if the number of inputs $n$ is unknown or

368: time-varying. To avoid this problem and make the solution more

369: general, we let the charging mode of the global inhibitory neuron

370: start only if the inhibition $z$ is lower than a given bound

371: $z_{low}$. The spikings of all the FN neurons in the network are

372: guaranteed by the condition that

\[
z_{low} \ < \ I_l + I_{min}
\]

376: where $I_l$ is the lower bound of the oscillation region of the

377: FN model and $I_{min}$ is the minimum input value.

378:

379: \Example{}{Figure~\ref{fig:soft} illustrates the result in simulation

380: with $n=10$.  The parameters are the same as those in

381: Example~\ref{ex:kwta}.  The inputs $I_i$ are distributed uniformly

382: between $80$ and $120$. The inhibition lower bound is $\ z_{low} = 60\

383: $.  Initial conditions are chosen arbitrarily.  The computation is

384: completed in the second period, during and after which the spiking

385: times of the FN neurons are ranked by their inputs.}{soft}

386:

387: \begin{figure}[h]

388: \begin{center}

389: \epsfig{figure=soft.eps,height=60mm,width=110mm}

390: \caption{Soft-WTA computation result of Example~\ref{ex:soft} with

$n=10$.  The plots are (a) states $v_i$ versus time; (b) global

392: inhibition $z$ versus time.  The initial conditions are chosen

393: arbitrarily and the computation is completed in the second period.}

394: \label{fig:soft}

395: \end{center}

396: \end{figure}

397:

398: The simple soft-WTA network presented above inherits the main

399: computational advantages of our WTA and $k$-WTA networks. Initial

400: conditions can be arbitrary, the computation is completed in at most

401: two periods, network complexity is linear and neurons can be added or

402: removed at any time. We expect it to be effective in many applications

403: such as selective attention, associative memory and competitive

404: learning, and also to provide an efficient desynchronization mechanism

for perceptual binding~\cite{gray, singer, malsburg95, deliang02}.

406:

407: %

408: \section{Concluding Remarks} \label{sec:conclusion}

409: %

410: Basic neural computations such as winner-take-all,

411: $k$-winners-take-all, soft-winner-take-all, and coincidence detection

can all be implemented using a common architecture and biologically
plausible neuron models. Fast and robust convergence is guaranteed,

414: and time-varying inputs can be tracked. Further research will study

415: models of higher level brain functions, such as perception, based on

416: the WTA networks and on general nonlinear synchronization mechanisms

417: derived in~\cite{slotine03, wei03-2}.

418:

419: \vspace{1.5em}

420:

421: \noindent {\large{\bf Acknowledgments:}} This work was supported in

422: part by a grant from the National Institutes of Health. The authors

423: benefited from stimulating discussions with Matthew Tresch.

424:

425: \input{ref.tex}

426:

427:

428: \end{document}


436: