q-bio0401001/kwta.tex
1: \documentclass[12pt]{article}
2: %\documentclass[twocolumn]{article}
3: \addtolength{\hoffset}{-0.6in}
4: \addtolength{\textwidth}{1.2in}
5: \addtolength{\voffset}{-0.6in}
6: \addtolength{\textheight}{1.2in}
7: 
8: %
9: %--useful packages--
10: %\usepackage{doublespace}
11: \usepackage{latexsym}
12: \usepackage{amsmath}
13: \usepackage{amssymb}
14: \usepackage{amsfonts}
15: %\usepackage{times}
16: \usepackage{graphicx}
17: \usepackage{epsfig}
18: \usepackage{array}
19: %\usepackage{flafter}
20: %\usepackage[all]{xy}
21: 
22: %--------------------- Some LaTeX definitions ------------
23: 
24: % CUSTOMIZATIONS
25: \newtheorem{definition}{Definition}
26: \newtheorem{theorem}{Theorem}
27: 
28: \newcommand{\eq}[1]{equation~\ref{eq#1}}
29: \newcommand{\Eq}[2]{
30: \begin{equation}
31: #1
32: \label{eq#2}
33: \end{equation} }
34: 
35: \newcommand{\Eqn}[1]{\[ #1 \]}
36: 
37: \newcommand{\Ex}[1]{Example~\ref{ex:#1}}
38: \newcommand{\ex}[1]{example~\ref{ex:#1}}
39: 
40: 
41: % ------------------ Defining the \Example environment ---------------------
42: %Syntax:
43: %  \Example{title}{body}{label}
44: %  \Ex{label} ...  will produce Example xxx ..
45: %  ... \ex{label} ..... will produce: ... example xxx ....
46: %  the numbering of the examples (xxx) is done according to the section
47: %  number.
48: 
49: \newtheorem{ExampleDef}{Example}[section]
50: 
51: \newcommand{\Example}[3]{
52:   \begin{list}{}{
53:       \setlength{\leftmargin}{1em}}     % Indent everything by this amount
54:     \item                               % Group everything in one item
55:     \small                              % Use a smaller font size
56:     \begin{ExampleDef} \rm              % Theorems are italic - select roman
57:       {\bf \hspace{-1ex}: #1}           % The name, use \\[1ex] to break line
58:       #2                                % The actual stuff
59:       \hfill {\large \boldmath $\Box$}  % The box
60:       \label{ex:#3}                      % Label the example
61:     \end{ExampleDef}
62:   \end{list}}
63: 
64: %----------------------------------------------------------------------------
65: \setlength{\parskip}{8pt}
66: 
67: 
68: \begin{document}
69: %\singlespace
70: \begin{center}
71: {\Large {\bf $K$-Winners-Take-All Computation\\
72: with Neural Oscillators\par}}
73: \vspace{1.0em}
74: {\large Wei Wang and Jean-Jacques E. Slotine\footnote{To whom correspondence 
75: should be addressed.} \par} 
76: {Nonlinear Systems Laboratory \\
77: Massachusetts Institute of Technology \\
78: Cambridge, Massachusetts, 02139, USA 
79: \\ wangwei@mit.edu, \ jjs@mit.edu 
80: \par}
81: \vspace{2em}
82: \end{center}
83: 
84: \begin{abstract}
85: Artificial spike-based computation, inspired by models of computation
86: in the central nervous system, may present significant performance
87: advantages over traditional methods for specific types of large scale
88: problems. This paper describes very simple network architectures for
89: $k$-winners-take-all and soft-winner-take-all computation using neural
90: oscillators. Fast convergence is achieved from arbitrary initial
91: conditions, which makes the networks particularly suitable to track
92: time-varying inputs.
93: \end{abstract}
94: 
95: %\doublespace
96: 
97: %
98: \section{Introduction} \label{sec:introduction}
99: %
100: The discovery of synchronized oscillations in the visual cortex and
101: other brain regions has triggered significant research in artificial
102: spike-based computation~\cite{hopfield03, gerstner, gray, 
103: jin02, llinas03, llinas98, llinas02, maass03, singer, thorpe01,
104: malsburg95, deliang02}. While neurons in the central nervous
105: system are about six orders of magnitude ``slower'' than silicon-based
106: elements, in both elementary computation time and signal transmission
107: speed, their performance in networks often compares very favorably
108: with their artificial counterparts even when reaction speed is
109: concerned. In a sense, evolution may have been forced to develop
110: extremely efficient computational schemes given available hardware
111: limitations.
112: 
113: In a recent paper~\cite{wei03-1}, we proposed new models for two
114: common instances of such neural computation, winner-take-all and
115: coincidence detection, featuring fast convergence and $O(n)$ network
116: complexity.  We saw that both computations could be achieved using a
117: similar architecture, using global feedback inhibition in the first
118: case, and global excitation in the second.  In this paper, we further
119: extend this computational architecture to $k$-winners-take-all and
120: soft-winner-take-all.
121: 
122: Fast winner-take-all (WTA) computation in~\cite{wei03-1} is based on
123: the FitzHugh-Nagumo model, a well-known simplified version of the
124: classical Hodgkin-Huxley model. Compared to previous WTA
125: networks~\cite{arbib, fang, grossberg73, jin02,lazzaro, yuille89}, it
126: has significant computational advantages. The network's initial states
127: can be set arbitrarily, and convergence is guaranteed in at most two
128: spiking periods, with a high computation resolution. The network's
129: complexity is linear in the number of inputs and its size can be
130: adjusted at any time during the computation. As this paper shows, by
131: modifying the starting point of the global inhibitory neuron's
132: charging mode, $k$-Winners-Take-All ($k$-WTA) can be computed instead.
133: By running the charging mode independently, soft-Winner-Take-All
134: (soft-WTA) can be computed. Both extensions inherit the advantages of
135: the original WTA network.
136: 
137: After a brief review of the basic WTA network in
138: Section~\ref{sec:wta}, $k$-WTA computation and soft-WTA computation
139: are studied in Sections~\ref{sec:kwta} and~\ref{sec:soft}.  Brief
140: concluding remarks are offered in Section~\ref{sec:conclusion}.
141: 
142: %
143: \section{Winner-Take-All Network} \label{sec:wta}
144: %
145: The WTA network in~\cite{wei03-1} is based
146: on the FitzHugh-Nagumo (FN) model~\cite{fitzhugh,nagumo,murray}:
147: \begin{equation*} \label{eq:f-n}
148: \begin{cases}  
149:   \ \dot{v} = v(\alpha - v)(v-1)-w+I  \\  
150:   \ \dot{w} = \beta v - \gamma w
151: \end{cases}  
152: \end{equation*}
153: For appropriate parameter choices, there exists
154: a unique equilibrium point for any given value of $I$, which
155: is stable except for a finite range $\ I_l \le I \le I_h\ $ 
156: where the system tends to a limit cycle.  The steady-state 
157: value of $v$ at the stable equilibrium point increases
158: with $I$.
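
As a sketch of what ``appropriate parameter choices'' can mean (the
symbols $f$, $g$ and $v^\ast$ are introduced here only for
illustration), set $\dot{v} = \dot{w} = 0$. Assuming $\gamma > 0$,
this gives $w = (\beta/\gamma)\, v$ and
\begin{equation*}
I \ = \ g(v) \ := \ \frac{\beta}{\gamma}\, v - f(v), \qquad
f(v) \ := \ v(\alpha - v)(v-1),
\end{equation*}
so that $\ g(v) = v^3 - (\alpha+1)\, v^2 + (\alpha + \beta/\gamma)\, v$.
The equilibrium is unique for every $I$ if $g$ is strictly increasing,
which holds in particular if
\begin{equation*}
g'(v) \ = \ 3 v^2 - 2(\alpha+1)\, v + \alpha + \frac{\beta}{\gamma} \ > \ 0
\ \ \ \forall v
\quad \Longleftrightarrow \quad
\alpha^2 - \alpha + 1 \ < \ \frac{3 \beta}{\gamma} .
\end{equation*}
Under this condition the steady-state value $v^\ast$ increases with
$I$, since $I = g(v^\ast)$ and $g$ is increasing. For instance, the
values $\ \alpha = 5.32, \beta = 3, \gamma = 0.1\ $ used in the
simulations of Section~\ref{sec:kwta} give
$\ \alpha^2 - \alpha + 1 \approx 23.98 < 90$.
Linearizing around the equilibrium, the Jacobian has determinant
$\gamma\, g'(v^\ast) > 0$ and trace $f'(v^\ast) - \gamma$, with
$f'(v) = -3 v^2 + 2(\alpha+1)\, v - \alpha$. The equilibrium is
therefore unstable exactly when $f'(v^\ast) > \gamma$, which can hold
only on a bounded interval of $v^\ast$, and hence of $I$.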
159: 
160: \begin{figure}[h]
161: \begin{center}
162: \epsfig{figure=wta_structure.eps,height=40mm,width=60mm}
163: \caption{Diagram of the WTA network. There are $n$ FN neurons
164: receiving external inputs and a global inhibitory neuron 
165: monitoring the whole network.}
166: \label{fig:structure}
167: \end{center}
168: \end{figure}
169: 
170: The network structure is illustrated in Figure~\ref{fig:structure}
171: where $n$ FN neurons receive external stimulating inputs $I_i$ and
172: a global inhibition $z$. The dynamics of the FN neurons 
173: ($i=1, \ldots, n$) are
174: \begin{equation*} \label{eq:fn-in-wta}
175: \begin{cases}  
176:   \ \dot{v}_i = v_i (\alpha - v_i) (v_i-1)-w_i + I_i - z  \\  
177:   \ \dot{w}_i = \beta v_i - \gamma w_i
178: \end{cases}  
179: \end{equation*}
180: The dynamics of the global inhibitory neuron are
181: \begin{equation} \label{eq:inhibition-in-wta}
182: \dot{z} = 
183: \begin{cases}  
184:   \  - k_c \ (z - z_0)  \ \ \ \ \mathrm{charging}\ \mathrm{mode} \\  
185:   \  - k_d \ z   \ \ \ \ \ \ \ \ \ \ \ \ \mathrm{discharging}\ \mathrm{mode}
186: \end{cases}  
187: \end{equation}
188: which starts charging if there is any FN neuron spiking in the network
189: (i.e., if $\exists i$, $v_i \ge v_0$ for some given threshold $v_0$)
190: and switches to discharging if the state $z$ is saturated. With a fast
191: charging rate $k_c$ and a slow discharging rate $k_d$, the network
192: computes the largest input (corresponding to the only spiking FN
193: neuron) in at most two periods. Initial conditions can be set
194: arbitrarily and the computation resolution is very high. Detailed
195: analysis and discussions can be found in~\cite{wei03-1}.
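
For reference, both modes of~(\ref{eq:inhibition-in-wta}) are linear
and can be integrated in closed form. Assuming that a mode is entered
at a time $t_0$ (a symbol introduced here for illustration) and
remains active,
\begin{equation*}
z(t) =
\begin{cases}
  \ z_0 + \left( z(t_0) - z_0 \right) e^{-k_c (t - t_0)} & \mathrm{charging}\ \mathrm{mode} \\
  \ z(t_0)\, e^{-k_d (t - t_0)} & \mathrm{discharging}\ \mathrm{mode}
\end{cases}
\end{equation*}
The two time constants are thus $1/k_c$ and $1/k_d$, and a fast
charging rate with a slow discharging rate means $1/k_c \ll 1/k_d$.
With the values $\ k_c = 100, k_d = 1/40\ $ used in the simulations of
Section~\ref{sec:kwta}, these are $0.01$ and $40$ time units, a ratio
of $4000$.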
196: 
197: %
198: \section{$K$-Winners-Take-All Network} \label{sec:kwta}
199: %
200: $K$-WTA, a common variation of WTA computation where the output
201: indicates for each neuron whether its input is among the $k$ largest,
202: has been studied in such fields as competitive learning, pattern
203: recognition and pattern classification~\cite{badel, fukai, urahama, 
204: wolfe, juicheng}.  As Maass argued in~\cite{maass00-1}, in principle a
205: $k$-WTA network can replace a two-layer threshold circuit to perform
206: most standard nonlinear computational operations.
207: 
208: Most $k$-WTA studies are based on steady-state stability
209: analysis. Many models define the winners as the neurons with the
210: largest initial states~\cite{majani, wolfe} or require initial
211: conditions to be set precisely~\cite{juicheng}, making the networks
212: not well suited to time-varying inputs.  Others adopt particular
213: design methodologies~\cite{perfetti, seiler} but the network size or
214: the number of winners is limited. $K$-WTA has also been implemented in
215: analog VLSI circuits~\cite{urahama}, which extend the elegant WTA
216: model in~\cite{lazzaro} but inherit its low resolution limit as well.
217: 
218: The neural network described in Section~\ref{sec:wta} can be easily
219: extended to $k$-WTA computation, where an FN neuron spikes if and only
220: if its input is among the $k$ largest. Indeed, as the global
221: inhibition force decreases, the FN neurons enter the oscillation
222: region {\it rank-ordered} by their inputs. Thus, while for WTA
223: computation, the global inhibition neuron is charged after the first
224: arrival, for $k$-WTA computation the charging moment is simply
225: modified to capture the $k^{\rm th}$ arrival instead.
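
A rough calculation illustrates the rank ordering. Assume, for this
sketch only, that the global neuron starts discharging from $z(t_0)$
at time $t_0$, that no other inhibition acts on the FN neurons, and
that neuron $i$ enters the oscillation region when its net input
$I_i - z$ reaches $I_l$, with $\ 0 < I_i - I_l < z(t_0)\ $ for all $i$.
Since $\ z(t) = z(t_0)\, e^{-k_d (t - t_0)}\ $ in the discharging mode
of~(\ref{eq:inhibition-in-wta}), the entry time $t_i$ (a symbol
introduced here) satisfies
\begin{equation*}
t_i \ = \ t_0 + \frac{1}{k_d}\, \ln \frac{z(t_0)}{I_i - I_l} ,
\end{equation*}
which decreases as $I_i$ increases. For two inputs $I_i > I_j$,
\begin{equation*}
t_j - t_i \ = \ \frac{1}{k_d}\, \ln \frac{I_i - I_l}{I_j - I_l} \ > \ 0 ,
\end{equation*}
so that a smaller discharging rate $k_d$ spreads the arrivals of
nearby inputs further apart in time.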
226: 
227: To this effect, we augment the dynamics of the FN neuron with an
228: additional state variable $u_i$ (for simplicity, we shall still call
229: such a generalized element an FN neuron):
230: \begin{equation*} \label{eq:fn-in-kwta}
231: \begin{cases}  
232:   \ \dot{v}_i = v_i (\alpha - v_i) (v_i-1)-w_i + I_i - u_i - z  \\  
233:   \ \dot{w}_i = \beta v_i - \gamma w_i \\
234:   \ \dot{u}_i = k_u \ (\zeta_i u_0 - u_i) 
235: \end{cases}  
236: \end{equation*}
237: where $u_0$ is a constant saturation value and $k_u$ the
238: charging/discharging rate.  The variable $\zeta_i$ takes two values:
239: it switches to $0$ whenever $z$ approaches its saturation value
240: $z_0$, and otherwise switches to $1$ when $v_i$ exceeds the threshold
241: $v_0$. This makes the dynamics of $u_i$ a local self-inhibition, which
242: starts charging if the basic FN neuron spikes and discharges whenever
243: the global inhibitory neuron spikes. Note that the specific form of
244: the dynamics of $u_i$ can be more general, as long as the value of
245: $u_i$ varies between $0$ and $u_0$, and the transition periods are
246: very fast (which is satisfied here by choosing a large $k_u$).
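
To make the time scale explicit: between two switches of $\zeta_i$,
the dynamics of $u_i$ are linear, and starting from $u_i(t_0)$ at a
time $t_0$ (a symbol introduced here for illustration) one gets
\begin{equation*}
u_i(t) \ = \ \zeta_i u_0 + \left( u_i(t_0) - \zeta_i u_0 \right) e^{-k_u (t - t_0)} .
\end{equation*}
Thus $u_i$ relaxes to $u_0$ when $\zeta_i = 1$ and to $0$ when
$\zeta_i = 0$, with time constant $1/k_u$. After a time $5/k_u$ the
remaining gap is $e^{-5} \approx 0.7\%$ of its initial value. With
$k_u = 100$, as in the simulations below, this takes $0.05$ time
units, which is short compared with the discharging time constant
$1/k_d = 40$ of the global neuron used there.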
247: 
248: The dynamics of the global inhibitory neuron are the same
249: as~(\ref{eq:inhibition-in-wta}), except that its charging mode
250: starts once any $k$ FN neurons in the network have spiked. Such a moment can be
251: captured by determining that $\ \sum_{i=1}^n u_i\ $ approaches $k
252: u_0$. Thus, if any FN neuron spikes, it excites only the corresponding
253: local inhibitory portion but has no effect on the rest of the
254: network. If there are $k$ local inhibitions turned on, the global
255: inhibitory neuron is charged, which then releases all the local
256: inhibitions and starts a new period.
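
One hypothetical way to make ``approaches $k u_0$'' concrete (the
threshold and the tolerance $\delta$ below are assumptions of this
sketch, not part of the model above) is to start the charging mode
when
\begin{equation*}
\sum_{i=1}^n u_i \ \ge \ \left( k - \frac{1}{2} \right) u_0 .
\end{equation*}
Suppose that, outside the fast transitions, each $u_i \in [0, u_0]$
lies within $\delta u_0$ of either $0$ or $u_0$, and that $m$ of them
are close to $u_0$. Then
\begin{equation*}
m (1 - \delta)\, u_0 \ \le \ \sum_{i=1}^n u_i \ \le \
\left( m + (n - m)\, \delta \right) u_0 ,
\end{equation*}
so that the test is satisfied for every $m \ge k$ and for no
$m \le k-1$ whenever $\delta < 1/(2n)$.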
257: 
258: Compared to the WTA network in Section~\ref{sec:wta}, the basic
259: principle underlying the $k$-WTA network described above is the same,
260: exploiting the simple properties of the FN model.  Thus, most of the
261: computational advantages of the WTA network~\cite{wei03-1} are
262: inherited by the $k$-WTA extension.  In particular
263: \begin{itemize}
264: \item The initial conditions of the network can be set arbitrarily. 
265: 
266: \item With appropriate parameters, the computation can be completed in
267: at most two periods, where the first period is affected by the initial
268: conditions but the $k$ spiking neurons during the following periods
269: are guaranteed to be those with the largest inputs. If the initial
270: inhibitions are strong, the computation is completed in one period.
271: 
272: \item Since initial conditions are immaterial and the computation
273: speed is very fast, the $k$-WTA network is able to track time-varying
274: inputs. Moreover, since the network complexity is $O(n)$, individual
275: FN neurons can be added or removed at any time during the computation.
276: 
277: \item The inputs $I_i$ should be lower-bounded by $I_l$, the lower
278: threshold of the FN oscillation region. They should also be
279: upper-bounded to set inhibition saturations, although the upper bound
280: value is not restricted.
281: 
282: \item  The computation resolution also follows that of the WTA network. It can 
283: be improved by decreasing the discharging rate $k_d$, as well as the relaxation
284: time of the FN neurons. 
285: 
286: \item FN neurons receiving equal inputs behave identically, which
287: means that the $k$-WTA computation may generate more than $k$ winners
288: in this particular case.
289: 
290: \end{itemize}
291: 
292: \Example{}{The result is illustrated in simulation in
293: Figure~\ref{fig:k_normal}, with $n=10$ and $k=3$.  The parameters of
294: the FN neurons are set as $\ \alpha = 5.32, \beta = 3, \gamma = 0.1\
295: $, with spiking threshold $\ v_0 = 5\ $. The parameters of the local
296: inhibition are $\ u_0 = 160, k_u = 100\ $.  The inputs $I_i$ are
297: chosen randomly from $20$ to $125$.  The parameters of the global
298: neuron are $\ z_0 = 240, k_c = 100, k_d = 1/40\ $. All initial
299: conditions are chosen arbitrarily.  The three spiking neurons after
300: the first charging of the global neuron are those with the three
301: largest inputs.  
302: 
303: Note that the output frequency is determined mainly by the global
304: neuron's dynamics and the value of the $k^{\rm th}$ largest input. It
305: can be increased by increasing the global neuron's discharging rate
306: after the first winner spikes so as to facilitate the other
307: winners' spiking.}{kwta}
308: 
309: \begin{figure}[p]
310: \begin{center}
311: \epsfig{figure=normal_1.eps,height=60mm,width=110mm}
312: \epsfig{figure=normal_2.eps,height=60mm,width=110mm}
313: \epsfig{figure=normal_3.eps,height=40mm,width=110mm}
314: \caption{$k$-WTA computation result of Example~\ref{ex:kwta} with $n=10$ and $k=3$.
315: The plots are the time developments of 
316: (a) $v_i$ of the neurons with the three largest inputs;
317: (b) $u_i$ of the neurons with the three largest inputs; 
318: (c) $v_i$ of the other seven neurons;
319: (d) $u_i$ of the other seven neurons; 
320: (e) global inhibition $z$.
321: The computation is completed in less than two periods.}
322: \label{fig:k_normal}
323: \end{center}
324: \end{figure}
325: 
326: 
327: \Example{}{Figure~\ref{fig:varying} illustrates a simulation result
328: with $n=3$ and $k=2$. The parameters are the same as those in
329: Example~\ref{ex:kwta}.  The inputs keep varying and switch winning
330: positions several times. The spiking neurons always track the two
331: largest inputs.}{varying}
332: 
333: \begin{figure}[h]
334: \begin{center}
335: \epsfig{figure=varying.eps,height=60mm,width=110mm}
336: \caption{$k$-WTA computation result of Example~\ref{ex:varying} with $n=3$ and $k=2$.
337: The inputs are not constant. The spiking neurons always track the two largest inputs.
338: The plots are (a) states $v_i$ versus time; (b) inputs $I_i$ versus time. Note that
339: each $v_i$ in plot (a) corresponds to the $I_i$ in plot (b) with the same line type
340: (solid, dashed or dotted).}
341: \label{fig:varying}
342: \end{center}
343: \end{figure}
344: 
345: 
346: %
347: \section{Soft-Winner-Take-All} \label{sec:soft}
348: %
349: Soft-WTA~\cite{maass00-1} (or softmax) is another variation of WTA
350: computation, where the outputs reflect the rank of all inputs
351: according to their size. Although soft-WTA is a very powerful
352: primitive~\cite{maass00-1, maass00-2} in that it can be used to
353: compute any continuous function, its ``neural'' implementation is
354: complex. Recently, \cite{yuille02} studied soft-WTA as an optimization
355: problem not based on a biologically plausible mechanism;
356: \cite{indiveri} presented a hardware model of selective visual
357: attention which lets the attention switch between the selected inputs,
358: but whose switching order does not completely reflect the input
359: ranks. In this section, we develop a simple neural network which
360: computes soft-WTA very fast and generates spiking outputs rank-ordered
361: by their inputs.
362: 
363: Let $k=n\ $ in the $k$-WTA network described in Section~\ref{sec:kwta}.
364: Then we get a pre-ordered spiking sequence in each stable period, since
365: all the FN neurons enter the oscillation region and then spike
366: rank-ordered by their inputs. However, such an $n^{\rm th}$ arrival
367: moment may not be measurable if the number of inputs $n$ is unknown or
368: time-varying. To avoid this problem and make the solution more
369: general, we let the charging mode of the global inhibitory neuron
370: start only if the inhibition $z$ is lower than a given bound
371: $z_{low}$. Spiking of all the FN neurons in the network is
372: guaranteed by the condition that
373: \[
374: z_{low} \ < \ I_l + I_{min} 
375: \]
376: where $I_l$ is the lower bound of the oscillation region of the 
377: FN model and $I_{min}$ is the minimum input value.
378: 
379: \Example{}{Figure~\ref{fig:soft} illustrates the result in simulation
380: with $n=10$.  The parameters are the same as those in
381: Example~\ref{ex:kwta}.  The inputs $I_i$ are distributed uniformly
382: between $80$ and $120$. The inhibition lower bound is $\ z_{low} = 60\
383: $.  Initial conditions are chosen arbitrarily.  The computation is
384: completed in the second period, during and after which the spiking
385: times of the FN neurons are ranked by their inputs.}{soft}
386: 
387: \begin{figure}[h]
388: \begin{center}
389: \epsfig{figure=soft.eps,height=60mm,width=110mm}
390: \caption{Soft-WTA computation result of Example~\ref{ex:soft} with
391: $n=10$.  The plots are (a) states $v_i$ versus time; (b) global
392: inhibition $z$ versus time.  The initial conditions are chosen
393: arbitrarily and the computation is completed in the second period.}
394: \label{fig:soft}
395: \end{center}
396: \end{figure}
397: 
398: The simple soft-WTA network presented above inherits the main
399: computational advantages of our WTA and $k$-WTA networks. Initial
400: conditions can be arbitrary, the computation is completed in at most
401: two periods, network complexity is linear and neurons can be added or
402: removed at any time. We expect it to be effective in many applications
403: such as selective attention, associative memory and competitive
404: learning, and also to provide an efficient desynchronization mechanism
405: for perceptual binding~\cite{gray, singer, malsburg95, deliang02}.
406: 
407: %
408: \section{Concluding Remarks} \label{sec:conclusion}
409: %
410: Basic neural computations such as winner-take-all,
411: $k$-winners-take-all, soft-winner-take-all, and coincidence detection
412: can all be implemented using a common architecture and biologically
413: plausible neuron models. Fast and robust convergence is guaranteed,
414: and time-varying inputs can be tracked. Further research will study
415: models of higher level brain functions, such as perception, based on
416: the WTA networks and on general nonlinear synchronization mechanisms
417: derived in~\cite{slotine03, wei03-2}.
418: 
419: \vspace{1.5em}
420: 
421: \noindent {\large{\bf Acknowledgments:}} This work was supported in
422: part by a grant from the National Institutes of Health. The authors
423: benefited from stimulating discussions with Matthew Tresch.
424: 
425: \input{ref.tex}
426: 
427: 
428: \end{document}
429: 
430: 
431: 
432: 
433: 
434: 
435: 
436: