q-bio0401005/dma.tex
1: %\documentclass{article}
2: \documentclass[twocolumn,showpacs,preprintnumbers,amsmath,amssymb,rmp]{revtex4}
3: 
4: \usepackage{latexsym}
5: \usepackage{graphicx}
6: 
7: \begin{document}
8: \title{How to find decision makers in neural circuits?}
9: \author{Alexei A. Koulakov$^{1}$, Dmitry Rinberg$^{2}$, and Dmitry N. Tsigankov$^{1}$} 
10: 
11: \address{
12: \protect{$^1$}\hspace{-.0in}Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA \\
13: \protect{$^2$}Monell Chemical Senses Center, Philadelphia, PA, USA
14: }
15: 
16: \begin{abstract}
17: Neural circuits often face the problem of classifying stimuli into discrete 
18: groups and making decisions based on such classifications. Neurons of these 
19: circuits can be distinguished according to their correlations with different 
20: features of stimulus or response, which allows defining sensory or motor 
21: neuronal types. In this study we define a third class of neurons, which is 
22: responsible for making decisions. We suggest two descriptions of the contribution of 
23: units to decision making: first, as a spatial derivative of correlations between 
24: neural activity and the decision; second, as an impact of variability in a given 
25: neuron on the response. These two definitions are shown to be equivalent, when 
26: they can be compared. We also suggest an experimental strategy for determining 
27: contributions to decision making, which uses electric stimulation with 
28: time-varying random current. 
29: \end{abstract}
30: 
31: \pacs{n/a}
32: 
33: \maketitle
34: 
35: \section{INTRODUCTION}\label{introduction}
36: 
37: The nervous system is continuously confronted by megabytes of information, 
38: representing light, sound, smell, etc. This information is compiled by the 
39: brain into a set of decisions, representing behaviors of living organisms. 
40: The mechanisms involved in this reduction have been under investigation for 
41: many years (Glimcher, 2003; Romo and Salinas, 
42: 2003). In this study we address a question complementary to the issue of 
43: decision making (DM) mechanisms. We define neuronal units involved in making 
44: perceptual decisions. For this purpose we determine DM activity in surrogate 
45: networks, defined mathematically, in which we have complete control 
46: over stimuli, mechanisms, and responses. Such decision making analysis (DMA) 
47: has practical significance, since once the units involved in making a particular 
48: decision are located, further efforts can be concentrated on uncovering 
49: the underlying mechanisms. 
50: 
51: In this study the DM task is defined as the evaluation of a function in the 
52: multidimensional stimulus space (Figure 1A). This function has a discrete set 
53: of values, representing the repertoire of responses available to the 
54: organism. The decisions may, of course, be stochastic, reflecting the 
55: uncertainty inherent in behavior. This definition is suitable for 
56: experiments in which subjects perform poly-alternative forced-choice tasks, 
57: such as a saccadic response to the direction of stimulus motion 
58: (Shadlen and Newsome, 2001).
59: 
60: 
61: 
62: 
63: 
64: 
65: \begin{figure}[htbp]
66: \centerline{\includegraphics[width=3.0in]{dma1.eps}}
67: \caption{\textbf{A,} Definition of decision making task. Nervous system 
68: evaluates a function, whose values represent discrete decisions, in the 
69: many-dimensional sensory space. \textbf{B,} Some of the visual areas 
70: involved in motion-discrimination task. The areas on the left are more 
71: sensory (response is correlated with the sensory input), while those of the 
72: right are more motor (correlated with the response).
73: }
74: \label{fig1}
75: \end{figure}
76: 
77: 
78: Let us consider the motion-discrimination task in more detail. Figure 1B lists 
79: some visual areas that are involved in this task. The areas are arranged 
80: along a rough sensory-motor axis, so that the areas on the left are more 
81: ``sensory'', while those on the right are more ``motor''. This implies that 
82: the responses in these areas are more correlated with the stimulus or the response, 
83: respectively. Where on this sensory-motor axis should one position the DM 
84: elements? One could argue that the elements most correlated with the 
85: decision itself are the decision makers, following the analogy with the 
86: definition of sensory and motor elements. It is, however, difficult, if not 
87: impossible, to distinguish such a definition from the definition of purely 
88: motor units (Shadlen and Newsome, 2001). The latter 
89: relay the results of the decision making process, without involvement in the 
90: formation of the decision. An alternative approach is therefore needed to 
91: define the DM units. 
92: 
93: The DM components may be surmised to be located on the interface between 
94: sensory and motor areas. More precisely, the \textit{first} element in the sensory-motor 
95: chain, which carries significant correlation with the response, may be 
96: identified as the decision maker. In this study we develop this idea into a 
97: rigorous mathematical formulation and find a special correlation function, 
98: which determines the contributions of units to DM. This 
99: formalism allows us to answer two questions pertaining to the identities of 
100: DM units. First, we consider the case when not one but \textit{several} elements are 
101: involved in the same decision simultaneously. Our approach allows us to 
102: evaluate the relative importance of various units in such distributed DM. 
103: Second, we consider systems with loops in connectivity. For such systems 
104: the concept of `the first element' becomes more arbitrary and one has to 
105: proceed more carefully in defining contributions to DM. We succeed in doing 
106: so for our surrogate networks and define DM units for recurrent networks in 
107: a way that is consistent with the linear sensory-motor chains, thus 
108: satisfying the requirement of the correspondence principle. 
109: 
110: This paper is organized as follows. We first analyze simple linear chain models,
111: and networks, such as trees, which have similar properties. We then 
112: use this analysis to define decision makers in networks of arbitrary 
113: connectivity. Finally, we extend our study to cases in which electric 
114: stimulation can be applied to units, and show that DM components can be 
115: identified in a way consistent with our preceding analyses. 
116: 
117: \section{LINEAR CHAINS AND THEIR DERIVATIVES}
118: \label{linearchain}
119: 
120: The goal of this section is to formulate quantitative principles by which DM network elements can be identified. 
121: We approach this task by analyzing simple cases, which can be solved exactly without the use of a computer, and in which the identities of DM elements are clear. 
122: These cases allow us to emphasize the properties of the DM task that we are attempting to describe.
123: We proceed therefore to the analysis of the simplest network capable of making decisions.
124: 
125: 
126: \subsection{The `nematode' network}
127: \label{nematode}
128: 
129: In this subsection we consider the network, which we call `nematode', because of its resemblance to simple biological organisms, both in the layout and in the fundamental significance.
130: We first define the model; then show that it can make simple decisions; and, finally, define the positions of decision makers in the network.
131: 
132: 
133: \begin{figure}[htbp]
134: \centerline{\includegraphics[width=3.0in]{dma2.eps}}
135: \caption{\textbf{A,} a simple `nematode' network consists of a linear chain 
136: of units. All units in the chain, but 
137: the last, are linear. The last unit, shown by the square, is non-linear and 
138: returns zero or one depending on the sign of the response of the preceding 
139: unit. \textbf{B,} The average input-output relationship for the `nematode' 
140: is given by the sigmoid function (error function). The spread of the sigmoid 
141: is determined by the net noise in the chain.
142: }
143: \label{fig2}
144: \end{figure}
145: 
146: 
147: Consider a linear chain of units, whose response is characterized by a set 
148: of real numbers $x_i$, where $i=1,\ldots,N$ is the position of the element in 
149: the chain (Figure 2). The response of each element does not depend on time. This 
150: model is therefore static. This assumption is introduced here to simplify 
151: the analysis and can be relaxed as described below (section~\ref{trees}). Each unit 
152: performs a simple linear transformation between the unit's input and the 
153: output. Thus, for element number $i$
154: \begin{equation}
155: \label{eq1}
156: x_i =x_{i-1} +\eta _i 
157: \end{equation}
158: Here $\eta _i $ is noise associated with the element. In this work we assume 
159: that noise has zero mean, is individual to each unit, and, therefore, is 
160: uncorrelated between units, i.e.
161: \begin{equation}
162: \label{eq2}
163: \overline {\eta _i } =0, \qquad
164: \overline {\eta _i \eta _j } =
165: \begin{cases}
166:  \overline {\eta _i^2 }, & i=j \\
167:  0, & i\ne j
168: \end{cases}
169: \end{equation}
170: We further assume that noise has a Gaussian distribution. The chain of 
171: linear elements is thus completely specified by a set of noise variances 
172: $\overline {\eta _i^2 } $. The model described by (\ref{eq1}) and (\ref{eq2}) yields the 
173: following solution for the response of the last element in the chain 
174: \begin{equation}
175: \label{eq3}
176: x_N =x_0 +\eta _1 +\eta _2 +...+\eta _{N-1} +\eta _N .
177: \end{equation}
178: Thus, the response of the last element is just the sum of the input to the 
179: network $x_0$ and the noise contributions from all units, independent of the 
180: order of the units in the chain. 
181: 
182: The last element in the chain has non-linear response properties. Its 
183: response is defined by
184: \begin{equation}
185: \label{eq4}
186: d=H(x_N ),
187: \end{equation}
188: where $H(x)$ is the Heaviside~step~function, which is equal to one if 
189: the argument is positive and zero if it is negative. It follows that our `nematode' 
190: network is capable of making decisions based on the values of the input 
191: variable $x_0$, provided we interpret the variable $d$, which is equal to either 0 
192: or 1, as the result of the DM process, as defined in Figure 1A. The decisions 
193: are made stochastically and are dependent upon the instantiations of random 
194: variables $\eta _i $, which vary from trial to trial. 
195: 
196: \begin{figure}[htbp]
197: \centerline{\includegraphics[width=3.0in]{dma3.eps}}
198: \caption{
199: \textbf{A,} If signal-to-noise ratio is high, responses of all 
200: units are well correlated with the output, as shown on the right by the 
201: mutual information between the response of given unit and the output. 
202: \textbf{B,} For the case of low signal-to-noise ratio, the output is more 
203: correlated with the motor units (right) than with the sensory ones (left).
204: }
205: \label{fig3}
206: \end{figure}
207: 
208: 
209: Our model is completely defined by the set of noise variances 
210: $\overline {\eta _i^2 }$ of the individual units. Although decisions made by this chain 
211: are quite simple, the identities of decision makers are not so easy to find. 
212: The distribution of impact on DM along the chain should depend upon the 
213: distribution of noise variances $\overline {\eta _i^2 } $. Our next goal is 
214: to develop a sensible definition of contributions to DM based on the vector 
215: of variances $\overline {\eta _i^2 } $. Before doing so we describe general 
216: input-output properties of the chain. 
217: 
218: Since the decision made by the network varies from trial to trial, one can 
219: define the trial-averaged response of the system $\overline {d(x_0 )} $. 
220: As shown in Figure 2B it has a sigmoid shape, smeared by the total amount of 
221: noise in the system. One can, therefore, consider two cases, depending on 
222: whether the signal-to-noise ratio for the chain is large or small. These two 
223: regimes are shown in Figure 3A and B respectively. 
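
As a brief illustration of this sigmoid (a sketch under the stated assumptions, not a result quoted from the appendix): because the noise terms are Gaussian, independent, and of zero mean, $x_N$ in (\ref{eq3}) is Gaussian with mean $x_0$ and variance $\sigma^2$, where the symbol $\sigma^2$ is introduced here only as shorthand for the sum of the noise variances. The trial-averaged decision is then the probability that $x_N$ is positive,
\[
\sigma^2 =\sum_{i=1}^{N} \overline {\eta _i^2 } ,\qquad 
\overline {d(x_0 )} =P(x_N >0)=\frac{1}{2}\left[ 1+\mathrm{erf}\left( \frac{x_0 }{\sqrt{2\sigma^2}} \right) \right].
\]
In this notation the high and low signal-to-noise regimes correspond to $\left| {x_0 } \right|\gg \sigma$ and $\left| {x_0 } \right|\ll \sigma$, respectively.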
224: 
225: To analyze responses of units in these two cases we define their correlation 
226: with the decision. This correlation is defined for each element in the chain 
227: (Figure 3, right). As a measure of correlation we choose mutual information 
228: (MI) between the response of the $i$-th unit, $x_{i}$, and the decision, $d$. MI has the 
229: advantage of being dimensionless (it is measured in bits) and having clear 
230: intuitive properties, as described below. We will also show below in this 
231: section that MI has limitations as a measure of DM.
232: 
233: MI describes the information transmission from the $i$-th unit to the output of 
234: the system. Since the output can only have values 0 or 1, MI cannot exceed 
235: the value of one bit. We now consider two cases, depending on the network's 
236: signal-to-noise ratio. If network input $\left| {x_0 } \right|$ is large, as 
237: in Figure 3A, response of the system is well correlated with the input. 
238: Hence, activities of all units are well correlated with both input and 
239: output, and $MI(x_i ,d)\approx 1$ for all of the units. In the opposite 
240: limit, when the signal-to-noise ratio is small, $\left| {x_0 } \right|$ is 
241: smaller than noise, and the system's response is weakly correlated with the 
242: input (Figure 3B). In this case MI as a function of the unit's position displays the 
243: structure shown in Figure 3B (right). This structure, as shown below, holds the 
244: key to the definition of DM components and is discussed qualitatively here. 
245: The units that are close to the exit from the network show strong 
246: correlation with the decision, similarly to the high signal-to-noise ratio case. 
247: Their MI is therefore close to 1 bit. On the 
248: other hand, the more `sensory' units at the beginning of the chain are strongly 
249: correlated with the input. Since the input-output correlation is weak in the low 
250: signal-to-noise ratio case, the `sensory' units display virtually \textit{no} relation 
251: to the output and $MI(x_i ,d)\approx 0$ for such units (Figure 3B, right). 
252: Thus, MI, as a function of $i$ displays a transition from 0 to 1 in the
253: low signal-to-noise ratio case. 
254: 
255: How could one deduce identities of decision makers from these dependencies 
256: (Figure 3A and B)? One could suggest that the elements perfectly correlated 
257: with the output of the system, such as exit elements from the chain, are the 
258: ones that make the decision. However, such elements may be just the relay or 
259: `motor' units, in which case their contribution to DM is small. Indeed, when 
260: we type, our decisions are perfectly correlated with activities of finger 
261: muscles; but one could hardly blame our fingers for the content of the 
262: typing. Thus, despite their high correlation with the output, exit elements 
263: could not be called decision makers. Input elements, having no correlation 
264: with the decision, are responsible for DM to an even lesser degree. We thus 
265: need to analyze the dependence of MI on position in more detail and suggest 
266: another scheme for defining DM units.
267: 
268: Our discarding of motor units as decision makers can be further extended 
269: onto the entire high signal-to-noise ratio case (Figure 3A). We suggest that 
270: the deterministic regime is not descriptive from the point of view of DM 
271: analysis. First, in this regime all units become indistinguishable from 
272: motor units. The latter are not decision makers, as suggested above. Second, the 
273: dependence shown in Figure 3A (right) does not reveal the contributions of 
274: individual units to the decision. Since all units have the same correlation, 
275: it is hard, if not impossible, to differentiate them and assign different 
276: contributions. Third, the responses of units in this case are 
277: deterministically related to the input. Hence, units act as relays, 
278: passively transmitting information along the chain. It can be argued that 
279: the external environment, providing the input variable $x_0$, acts as the 
280: decision maker. We conclude that to find decision making activity one has to 
281: concentrate on the low signal-to-noise ratio case. 
282: 
283: We show below that the identities of decision making units can be deduced 
284: from the shape of transition in Figure 3B (right). To this end we analyze a 
285: set of examples of networks with various distributions of noise $\overline 
286: {\eta _i^2 } $. We start from the simplest example of a single noisy unit.
287: 
288: \subsubsection{Example 1: `Noisy' neuron.} 
289: 
290: Consider a chain in which noise is absent from all units but one, whose 
291: order number in the chain is $n$ (Figure 4). Since, according to our 
292: previous discussion, we need to consider the low signal-to-noise ratio case, 
293: we will assume that
294: \begin{equation}
295: \label{eq5}
296: x_0 =0,
297: \end{equation}
298: i.e. the network receives no input. Making the decision in this case is still possible, 
299: based on the values of noise inside the network. Since noise is only present in one neuron, from 
300: (\ref{eq3}) we conclude that 
301: \begin{equation}
302: \label{eq6}
303: x_N =\eta _n .
304: \end{equation}
305: The decision made by the network is
306: \begin{equation}
307: \label{eq7}
308: d=H(\eta _n ).
309: \end{equation}
310: Thus, the decision is causally linked to the processes controlling unit number 
311: $n$, which leads us to the conclusion that this neuron is the decision maker. 
312: 
313: Paradoxically, the noisiest unit in this simple formulation makes the 
314: largest impact. All noiseless elements, even nonlinear, are deterministic, 
315: and work as simple relays which transmit information from the previous node 
316: to the next one. The output of the circuit is linked to the processes 
317: controlling noise in neuron number $n$, rather than in any other neuron in 
318: the network.
319: 
320: One might be tempted to conclude that the non-linear element is actually the 
321: decision maker in this case. However, the non-linear element does 
322: not have a causal effect on the output of the circuit; therefore its role is 
323: just to relay response from neuron $n$ to the output. In this respect the 
324: non-linear element is not different from other noiseless elements. 
325: 
326: To link this example to our previous discussion (Figure 3B) we plot MI as a 
327: function of position in the chain in Figure 4 (top). As we discussed, MI is 
328: high for exit (`motor') units and low for input (`sensory') elements. Figure 
329: 4 also shows the derivative of MI with respect to position in the chain. It 
330: is clear that this derivative represents the decision making element. Thus, 
331: we conclude that it is not the correlation with the decision but its \textit{rate of change} 
332: along the network that is the indicator of DM.
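
This can be checked in a few lines. The following is a sketch that assumes MI is measured in bits and writes $\mathcal{H}(d)$ for the entropy of the decision, a notation introduced here and not to be confused with the Heaviside function $H(x)$. With $x_0 =0$ and noise present only on unit $n$, Eq. (\ref{eq1}) gives $x_i =0$ for $i<n$ and $x_i =\eta _n $ for $i\ge n$. Since $\eta _n $ is Gaussian with zero mean, both decisions are equally likely and $\mathcal{H}(d)=1$ bit. A constant $x_i $ carries no information about $d$, while $x_i =\eta _n $ determines $d$ completely, so that
\[
MI(x_i ,d)=\mathcal{H}(d)-\mathcal{H}(d\,|\,x_i )=
\begin{cases}
 0, & i<n \\
 1, & i\ge n
\end{cases}
\qquad 
MI(x_i ,d)-MI(x_{i-1} ,d)=\delta _{in} ,
\]
where $\delta _{in} $ is the Kronecker delta. The discrete derivative is thus nonzero only at the noisy unit.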
333: 
334: \begin{figure}[htbp]
335: \centerline{\includegraphics[width=3.0in]{dma4.eps}}
336: \caption{
337: The example of the `noisy' neuron (marked by an asterisk). Top 
338: panel, mutual information between given unit and the decision. Bottom panel, 
339: derivative of mutual information. The derivative represents the 
340: decision making unit in this case. 
341: }
342: \label{fig4}
343: \end{figure}
344: 
345: 
346: \subsubsection{Example 2: Uniformly distributed noise.} Our next example shows that the conclusion about derivative of MI is 
347: basically correct, but has to be slightly amended to be numerically precise. 
348: Consider the chain in which all elements are noisy and the variance of noise 
349: is the same for each element. In this case 
350: \begin{equation}
351: \label{eq8}
352: x_N =\eta _1 +\eta _2 +...+\eta _{N-1} +\eta _N 
353: \end{equation}
354: i.e. all units contribute to decision \textit{equally}. This is because Eq. (\ref{eq8}) does not 
355: distinguish the order in which contributions from the units are added, and 
356: all contributions are of equal strength on average. Can this conclusion be 
357: confirmed by the derivative of MI?
358: 
359: Figure 5A shows MI as a function of position in the chain for this case. 
360: This dependence is obtained in Appendix A. It increases smoothly from 0 to 1 
361: resulting in a non-zero derivative at all units. This is consistent with 
362: (\ref{eq8}) and the notion that all units participate in the decision. However, 
363: (\ref{eq8}) suggests that all units participate in decision \textit{equally}. The derivative of MI 
364: turns out to be slightly non-uniform, as seen in Figure 5A. This can be 
365: corrected if not MI itself but a non-linear function of MI, denoted $F(MI)$, 
366: is considered. This non-linear function is calculated in Appendix A and is 
367: shown in Figure 5B. The new correlator $F(MI)$ has the same basic properties 
368: as the MI. It rises from 0 to 1 monotonically when passing through the array 
369: (Figure 5C). But, in addition, its derivative turns out to be \textit{uniform}, as shown in 
370: Figure 5C (bottom). This is consistent with equal participation of all units 
371: in DM in the uniformly distributed noise case and Eq.~(\ref{eq8}). Thus, we 
372: conclude that for this case the contributions to DM are given by the rate of 
373: increase of $F(MI)$ when moving through the array
374: \begin{equation}
375: \label{eq9}
376: DM_i =F\left( {MI_i } \right)-F\left( {MI_{i-1} } \right).
377: \end{equation}
378: Here $i$ is the index along the chain. Eq.~(\ref{eq9}) is the main result of this 
379: paper. It represents our definition of contributions to DM for networks of 
380: simple connectivity, such as chains. 
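
As a short illustration of (\ref{eq9}) in this example, assume $x_0 =0$ and write $\sigma^2$ for the common noise variance (a symbol introduced here). A correlator that rises from 0 to 1 across $N$ units with uniform increments must satisfy
\[
F\left( {MI_i } \right)=\frac{i}{N},\qquad DM_i =\frac{1}{N},\qquad \sum_{i=1}^{N} DM_i =1.
\]
It may be worth noting that the squared correlation coefficient between $x_i $ and $x_N $, denoted here by $\rho _i^2 $, has the same linear form, since $\overline {x_i x_N } =\overline {x_i^2 } =i\sigma^2$ and $\overline {x_N^2 } =N\sigma^2$:
\[
\rho _i^2 =\frac{\left( \overline {x_i x_N } \right)^2}{\overline {x_i^2 } \;\overline {x_N^2 } }=\frac{i}{N}.
\]
Whether $F(MI)$ coincides with this quantity in general is a matter for Appendix A and is not assumed here.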
381: 
382: Three points should be made about the definition (\ref{eq9}). First, it reproduces 
383: the result obtained in the previous example of the `noisy' neuron. Indeed, the 
384: mutual information rises from 0 to 1 on the `noisy' neuron in Figure 4. But 
385: $F(MI)$ coincides with MI at these values, as follows from its plot in 
386: Figure 5B. Thus, the derivative of $F(MI)$ is also given by a single spike 
387: at the position of the `noisy' neuron, as in Figure 4 (bottom). Second, Eq. (\ref{eq9}) 
388: implies that, from the point of view of DM, not mutual information, but another 
389: correlator, given by $F(MI)$, is more relevant. Function $F$ deviates from 
390: linear function only slightly (Figure 5B), and for practical purposes the 
391: distinction between the MI and $F(MI)$ could be ignored. However, we retain 
392: it throughout the manuscript to ensure mathematical rigor. Third, when 
393: deducing (\ref{eq9}) we did not postulate that contributions to DM are 
394: proportional to the variance of noise. Instead, we suggested that Eq. (\ref{eq8}) 
395: implies that all units contribute equally, independently of their order in the 
396: chain. This simple qualitative statement is powerful enough to constrain our 
397: quantitative reasoning and lead to a measure of DM in the form of the function 
398: $F(MI)$ and definition (\ref{eq9}). We do not know yet if the derivative of 
399: $F(MI)$ is proportional to the variance of noise, square root of this 
400: variance, or any other characteristic of noise in each element. All of these 
401: parameters give the same results in the uniform noise case. We need to have 
402: a difference between units to measure relative strength of their 
403: contributions. This is achieved by the next example.
404: 
405: 
406: \begin{figure}[htbp]
407: \centerline{\includegraphics[width=3.0in]{dma5.eps}}
408: \centerline{\includegraphics[width=3.0in]{dma6.eps}}
409: \caption{
410: The example with uniformly distributed noise. 
411: \textbf{A}, mutual information between response of given unit and the 
412: decision. The dependence has a non-uniform increase, suggesting that mutual 
413: information is not a good measure of decision making. \textbf{B}, if one 
414: applies a non-linear function (solid curve) to the mutual information in 
415: \textbf{A}, one obtains a uniformly increasing correlator in \textbf{C}. 
416: This non-linear function, called $F(MI)$, is close to linear, shown by the 
417: dotted line. \textbf{C}, the new correlator $F(MI)$ (top panel) has a 
418: uniform derivative (bottom panel). Thus, derivative of $F(MI)$ is a sensible 
419: measure of decision making in the case of uniform noise. 
420: }
421: \label{fig5}
422: \end{figure}
423: 
424: 
425: \subsubsection{Example 3: `Loud' neuron.} In this example the variances of noise on all neurons are the same, 
426: similarly to the previous case. However, here we amend the network 
427: definition given by (\ref{eq1}). We do so for only one neuron. We assume that the 
428: link between units 5 and 6 is characterized by a very large strength $K\gg 1$. 
429: Thus, for neuron number 6 (Figure 6) instead of (\ref{eq1}) we have
430: \begin{equation}
431: \label{eq10}
432: x_6 =Kx_5 +\eta _6 
433: \end{equation}
434: Therefore this example is the same as the previous, except that the single 
435: network connection is changed. What are the DM units in this case?
436: 
437: The network's output is given by 
438: \begin{equation}
439: \label{eq11}
440: x_N =K(\eta _1 +\eta _2 +...+\eta _5 )+\eta _6 +...+\eta _{11} 
441: \end{equation}
442: Thus, units 1 through 5 contribute equally to decision. In addition, their 
443: contributions are multiplied by a large factor $K$. Units 6 through 11 also 
444: contribute equally, but their contribution is much smaller than that of the 
445: former group. We conclude that units 1 through 5 are much stronger 
446: decision makers than units 6 through 11. This conclusion is supported by the 
447: derivative of $F(MI)$, as shown in Figure 6 (bottom).
448: 
449: \begin{figure}[htbp]
450: \centerline{\includegraphics[width=3.0in]{dma7.eps}}
451: \caption{
452: The `loud' neuron example. The link between units 5 and 6 
453: is strengthened. Compare to Figure 5A.
454: }
455: \label{fig6}
456: \end{figure}
457: 
458: 
459: Thus, changing one link in the chain produces a large effect on the 
460: distribution of DM. The units downstream from the link contribute less to 
461: decisions, while the units upstream contribute much more. What is the measure of 
462: decision making that could differentiate these two types of units? 
463: 
464: Calculations in Appendix A show that the derivative of $F(MI)$ is proportional 
465: to $K^2$ for units 1 through 5. This is easy to understand qualitatively, 
466: since MI increases along the chain even for negative ($K<0$) links. This 
467: would \textit{not} be possible if, for example, the contributions from units 1 to 5 were multiplied by $K$. 
468: Thus, an even power of $K$ is required, which is shown in Appendix 
469: A to be $K^2$. 
470: 
471: \subsubsection{Alternative definition of DM.} So far we have used definition (\ref{eq9}), which is quite complex, since it 
472: involves calculation of a nonlinear function $F(MI)$. Is it possible to 
473: reproduce the results derived above in a simpler way? It turns out that the 
474: role of a given unit in DM is proportional to its contribution to the 
475: variability of the output $\overline {x_N^2 } $. This leads us to a 
476: definition of DM alternative to (\ref{eq9}). 
477: 
478: Let us introduce the new definition using the examples, considered above. 
479: From (\ref{eq11}) in the `loud' neuron case we derive 
480: \begin{equation}
481: \label{eq12}
482: \overline {x_N^2 } =K^2(\overline {\eta _1^2 } +...+\overline {\eta _5^2 } 
483: )+\overline {\eta _6^2 } +...+\overline {\eta _{11}^2 } .
484: \end{equation}
485: We could conjecture that the contributions to DM from different units are 
486: weighted proportionally to the corresponding summands in (\ref{eq12}). Indeed, if 
487: we assume
488: \[
489: DM_{1..5} =\overline {\eta _{1..5}^2 } K^2
490: \]
491: \begin{equation}
492: \label{eq13}
493: DM_{6..11} =\overline {\eta _{6..11}^2 } ,
494: \end{equation}
495: by choosing appropriate values of variance of noise and gain, we can 
496: reproduce the results of all three of our previous examples. Thus, in the 
497: case of the `noisy' neuron the variance of noise is only present in one unit, 
498: rendering this unit the decision maker, according to (\ref{eq13}). In the case of 
499: uniform noise, when $K=1$ and all $\overline {\eta _{1..11}^2 } $ are the 
500: same, (\ref{eq13}) gives uniform contributions to DM. In the case of the `loud' neuron, 
501: (\ref{eq13}) gives the correct factor $K^2$ describing the advantage of upstream 
502: neurons. Thus, the contributions to DM are proportional to the variance of 
503: noise on a given element, multiplied by the square of the gain from this 
504: element to the output. We can rewrite (\ref{eq13}) in a more compact form to 
505: emphasize this latter statement
506: \begin{equation}
507: \label{eq14}
508: DM_i =\overline {\eta _i^2 } \frac{d\overline {x_N^2 } }{d\overline {\eta 
509: _i^2 } }.
510: \end{equation}
511: One could verify (\ref{eq14}), by applying it to (\ref{eq12}) and obtaining 
512: relationships (\ref{eq13}). This justifies (\ref{eq14}) in the three examples considered 
513: above.
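
This verification can be written out explicitly; the numerical values below are an illustrative choice and are not taken from the examples above. Differentiating (\ref{eq12}) with respect to each variance gives
\[
\frac{d\overline {x_N^2 } }{d\overline {\eta _i^2 } }=
\begin{cases}
 K^2, & i=1,\ldots ,5 \\
 1, & i=6,\ldots ,11
\end{cases}
\]
so that (\ref{eq14}) reproduces (\ref{eq13}). If, for instance, all variances are equal to a common value $\overline {\eta ^2 } $ and $K=3$, then $\overline {x_N^2 } =(5\cdot 9+6)\,\overline {\eta ^2 } =51\,\overline {\eta ^2 } $, and each upstream unit accounts for $9/51\approx 0.18$ of the output variance, while each downstream unit accounts for $1/51\approx 0.02$.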
514: 
515: Eq. (\ref{eq14}) also applies to linear chains in general. In Appendix A we derive 
516: (\ref{eq14}) from previous definition (\ref{eq9}) for arbitrary distribution of 
517: connection strengths and noise. Thus, (\ref{eq14}) can be considered an 
518: alternative definition to (\ref{eq9}). The equivalence between (\ref{eq9}) and (\ref{eq14}) is 
519: demonstrated graphically in Figure 7. 
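
For a general linear chain the gain entering (\ref{eq14}) can be written out explicitly. The following sketch assumes $x_0 =0$ and uses link strengths $K_i $ and gains $G_i $, a notation introduced here by analogy with (\ref{eq10}), so that $x_i =K_i x_{i-1} +\eta _i $. Unrolling this recursion and using (\ref{eq2}) gives
\[
x_N =\sum_{i=1}^{N} G_i \,\eta _i ,\qquad G_i =\prod_{j=i+1}^{N} K_j ,\qquad 
\overline {x_N^2 } =\sum_{i=1}^{N} G_i^2 \,\overline {\eta _i^2 } ,
\]
with $G_N =1$ (an empty product), and therefore, by (\ref{eq14}),
\[
DM_i =G_i^2 \,\overline {\eta _i^2 } .
\]
Setting $N=11$, $K_6 =K$, and all other $K_j =1$ recovers (\ref{eq13}).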
520: 
521: Why should one consider an alternative definition? This is because (\ref{eq9}) 
522: cannot be applied to networks of arbitrary connectivity, such as circuits 
523: containing loops. Definition (\ref{eq14}) however applies to all topologies, 
524: including the linear chain examples, considered here. 
525: 
526: \begin{figure}[htbp]
527: \centerline{\includegraphics[width=3.0in]{dma8.eps}}
528: \caption{
529: Equivalence of two definitions. The top panel shows 
530: distribution of noise variance (asterisk diameter) and of $F(MI)$ (bars). 
531: The bottom panel displays the derivative of $F(MI)$, defined by (\ref{eq9}). The 
532: derivative is numerically the same as the variance of noise. Both can be 
533: used as measures of decision making.
534: }
535: \label{fig7}
536: \end{figure}
537: 
538: 
539: \subsection{Conclusions from `nematode' study}
540: \label{conclusions}
541: 
542: Let us review our findings. First, we arrived at the definition of DM 
543: activity using the information-theoretical approach (\ref{eq9}). According to this 
544: definition, DM is the rate of change of correlation with decision along the 
545: chain. In other words, the \textit{first} element or elements, which correlate with the 
546: decision, are the decision makers. This approach has its pros and cons. 
547: Indeed, the viewpoint expressed by (\ref{eq9}) has a potential to be transferred 
548: to other systems, which contain non-linear elements. Eq. (\ref{eq9}) has an 
549: information-theoretical origin; hence its applicability may be broader than 
550: our simple system. 
551: %We emphasize, however, that this point is not confirmed 
552: %here, it will be further explored elsewhere. 
Another advantage of (\ref{eq9}) is 
that it relies on characteristics measurable in single-electrode 
recording experiments, such as the response of a single unit and its correlation 
with the behavioral decision. Thus, (\ref{eq9}) could be used experimentally. The 
disadvantage of the information-theoretical approach is that it is not clear 
how to apply it to systems with loops, as we have mentioned above. Since 
biological networks almost always contain loops, this significantly limits 
the applicability of the information-theoretical formula (\ref{eq9}). 
561: 
Our second step was to derive an alternative definition (\ref{eq14}). The latter 
is \textit{equivalent} to the former definition (\ref{eq9}) for the linear-chain (`nematode') example, as 
we have demonstrated on simple examples and have shown more rigorously in 
Appendix A. The alternative definition (\ref{eq14}) can be understood on the basis 
of the following two observations. First, the example of the `noisy' neuron 
shows that variability is the source of decisions. Thus,
568: 
\underline {\textbf{Conclusion 1:}} With all other conditions fixed, an 
increase in variability and noise in a single unit leads to a larger 
contribution to DM from this unit. 
572: \begin{equation}
573: \label{eq15}
574: DM_i \sim \overline {\eta _i^2 } 
575: \end{equation}
576: 
Second, the example of the `loud' neuron shows that not only variability and 
noise are important, but also how much of this variability reaches the motor 
units. DM is hence a property of network connectivity too. Thus, we arrive 
at the next rule.

\underline {\textbf{Conclusion 2:}} The stronger the pathway from a given 
unit to the motor output, the larger the contribution of this unit to DM. 
584: \begin{equation}
585: \label{eq16}
586: DM_i \sim \frac{d\overline {x_N^2 } }{d\overline {\eta _i^2 } }
587: \end{equation}
588: These two rules are combined into the definition (\ref{eq14}). Although (\ref{eq14}) and 
589: (\ref{eq16}) assume that the output element is unique, this requirement will be 
590: removed below, when we consider arbitrary topology networks.
591: 
What are the features of (\ref{eq14})? It can be used for a network of arbitrary 
topology, since it does not contain a derivative along the chain, as (\ref{eq9}) 
does. Definition (\ref{eq14}) can also be used operationally to measure the 
contribution of each neuron to the decision experimentally. To do that one 
needs to vary noise at the given unit and measure the variability of the 
responses. The details are discussed in Section~\ref{stimulation}
below.
599: 
A special note should be made about normalization in (\ref{eq14}). Throughout this 
work we adopt the convention that DM contributions are evaluated for all 
units and then normalized proportionally to (\ref{eq14}), so that the total sum of 
all contributions is equal to one (or 100{\%}). We will assume this to hold 
below without mentioning it explicitly. Finally, we give another 
definition of DM contributions, which could be useful when noise in the 
system is the same for all units. In this case the only difference between 
units is due to the difference in their position in the network. We therefore 
call this quantity \textit{topological} DM. 
609: \begin{equation}
610: \label{eq17}
611: TDM_i =\frac{\partial \sigma ^2(x_N )}{\partial \overline {\eta _i^2 } }
612: \end{equation}
As seen from, e.g., (\ref{eq12}), it does not depend on the levels of noise, and can 
be obtained from (\ref{eq14}) by assuming that $\overline {\eta _i^2 } =1$ for all 
units. It therefore describes how strongly each element of the circuit 
affects the output. This quantity is sometimes helpful in describing the 
network's topology.
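The relation between (\ref{eq15}), (\ref{eq16}), and (\ref{eq17}) can be illustrated by a 
minimal sketch. It assumes, for illustration only, that the fluctuating part 
of the output is a linear superposition of independent noise sources, 
$x_N =\sum\nolimits_i {g_i \eta _i } $, where the gains $g_i $ are hypothetical 
symbols introduced here and not used elsewhere in the paper. Under this 
assumption
\[
\sigma ^2(x_N )=\sum\limits_i {g_i^2 \overline {\eta _i^2 } } ,\qquad 
TDM_i =\frac{\partial \sigma ^2(x_N )}{\partial \overline {\eta _i^2 } }=g_i^2 ,
\]
and the normalized contribution of unit $i$ is
\[
DM_i =\frac{g_i^2 \overline {\eta _i^2 } }{\sum\nolimits_j {g_j^2 \overline 
{\eta _j^2 } } }.
\]
In this sketch the factor $\overline {\eta _i^2 } $ reflects Conclusion 1, the 
factor $g_i^2 $ reflects Conclusion 2, and the topological DM retains only 
the latter.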
618: 
Lastly, we discuss the notion of noise and variability in our approach. Is 
it really noise that leads networks to decisions? Not necessarily. 
Imagine that we have studied a chain-like network (Figure 8A) and performed 
the DM analysis described above. We found that the network contains two 
decision makers, which are equally important. A more thorough investigation 
may suggest that these units are entry points for inputs from an external 
network, which in effect is responsible for DM. For example, these hidden 
pathways may be inputs from other sensory modalities or regulatory inputs of 
another type. Thus, DMA may help identify entry points from other, less 
studied, parts of the network.
629: 
630: \begin{figure}[htbp]
631: \centerline{\includegraphics[width=4.0in]{dma10.eps}}
632: \caption{
633: Hidden pathway. The intensity of red shows 
634: contribution to decision making for each unit. \textbf{A,} analysis for an 
635: incomplete connectivity reveals two decision makers. \textbf{B,} a more 
636: thorough study may show that this results from other inputs to the network.
637: }
638: \label{fig8}
639: \end{figure}
640: 
641: 
642: \subsection{Trees}
643: \label{trees}
644: 
Our studies indicate that the information-theoretical analysis [definition (\ref{eq9})] can be 
further extended to tree-like topologies (Figure 9). To this end we define a 
column-vector $\vec {f}$ such that $f_i =F(MI_i )$. Then (\ref{eq9}) is equivalent to
650: \begin{equation}
651: \label{eq18}
652: \overrightarrow {DM} =(\hat {I}-\hat {S})\vec {f}.
653: \end{equation}
Here $\hat {S}$ is the structure matrix defined as follows. An element 
$S_{ij} $ of the structure matrix is equal to 1 if there is a connection 
from unit number $i$ to $j$, and to 0 otherwise. The matrix $\hat {I}-\hat {S}$ 
thus evaluates the differences between connected elements in (\ref{eq9}). The 
structure matrix is related to the connectivity matrix, which contains the 
network's weights, through 
$S_{ij} =\left| {\mathrm{sign}(C_{ij} )} \right|$. Connectivity matrices for some 
networks are shown in Figures 9 and 10.
661: 
662: \begin{figure}[htbp]
663: \centerline{\includegraphics[width=4.0in]{dma11.eps}}
664: \caption{
665: Mutual information approach can be extended to 
666: connectivities other than linear chains (\textbf{A}). Thus, decision makers 
667: on trees (\textbf{B}) can also be found. Arbitrary network can be specified 
668: by connectivity matrices, which are provided for illustration purposes. The 
669: non-zero entries in a connectivity matrix indicate a connection between two 
670: elements numbered on the left. An entry value describes the strength of 
671: connection and does not have to be unitary or positive.
672: }
673: \label{fig9}
674: \end{figure}
675: 
676: 
The information-theoretical approach can be extended even further to cases in 
which signals propagate along the network in time, resulting in delays between 
signal and response. In this case $\vec {f}$ should be understood as a sum of 
correlations over all times preceding 
the decision. This compensates for the presence of delays. So far it is 
not known whether Eq.~(\ref{eq18}) [or (\ref{eq9})] can be used for topologies other 
than trees. Definition (\ref{eq14}), however, can be used with networks of 
arbitrary connectivity. This is the topic of the next section.
686: 
687: \section{DYNAMIC MODELS}
688: \label{dynamic}
689: 
All previous examples, except the one mentioned at the end of the last 
section, were static, i.e. variables did not depend on time. The deficiency 
of this approach is that it is not clear how to treat networks with loops. 
To apply our analysis to the cases with loops, and, in general, to 
networks with \textit{arbitrary} connectivity (Figure 10), we consider time-dependent models 
here. This allows us to observe the propagation of noise around the loop 
explicitly and to make accurate conclusions about contributions to DM. 
697: 
698: We limit ourselves to linear dynamical systems, where the single nonlinear 
699: element is the last one, transforming an analog system output to a binary 
700: response. As the first step we consider temporal dynamics in the 
701: discrete-time approximation, which contains all essential features of our 
approach. Later in the section we extend the discrete model to the 
continuous-time case and show their equivalence.
704: 
705: \begin{figure}[htbp]
706: \centerline{\includegraphics[width=4.0in]{dma12.eps}}
707: \caption{
708: The network topologies considered by the dynamic models. 
709: Arbitrary connectivities, such as cycles (left) or feedforward networks 
710: (right) can be considered. 
711: }
712: \label{fig10}
713: \end{figure}
714: 
715: 
716: \subsection{Discrete-time model}
717: \label{discrete}
718: 
719: In this section we consider a system of $N$ elements, whose activity at each 
720: instant is described by an $N$-dimensional column-vector $\vec {x}(t)$. Time 
721: has discrete values separated by an interval $\tau $. Therefore this model 
722: is called the discrete-time model. The values of activity at two neighboring 
723: time-slices are related by the connection matrix $\hat {C}$
724: \begin{equation}
725: \label{eq19}
726: \vec {x}(t+\tau )=\hat {C}\vec {x}(t)+\vec {\eta }(t)+\vec {s}(t)
727: \end{equation}
728: Here $\vec {\eta }(t)$ is the vector describing noise added to activity 
729: vector on each time-slice. The variable $\vec {s}(t)$ describes sensory 
730: input into the system. The rules of temporal evolution of activities 
731: described by this equation are general enough to include almost all 
732: interesting phenomena and mimic modeling of real systems on digital 
733: computers. In appendix B we will prove that this model is equivalent to systems with 
734: continuously defined time. 
735: 
736: Noise is specified by the parameter $\vec {\eta }(t)$, which has a zero mean 
737: and is defined by the correlation matrix 
738: \begin{equation}
739: \label{eq20}
740: \overline {\eta _i (t_1 )\eta _j (t_2 )} ={\cal N}_{ij} \delta _{t_1 ,t_2 } 
741: \end{equation}
We assume here that values of noise at neighboring time-slices are not 
correlated, implying that we consider a system with white noise. This 
assumption can be easily relaxed and is used here to simplify the analysis. 
It becomes rigorously valid when the time-interval $\tau $ is longer than the 
correlation time of noise. Further, if noise is specific to each neuron, the 
same-time correlation matrix $\hat {{\cal N}}$ is diagonal
748: \begin{equation}
749: \label{eq21}
750: {\cal N}_{ij} =\overline {\eta _i^2 } \delta _{ij} 
751: \end{equation}
This takes place, e.g., when stochasticity is induced by the probabilistic nature 
of synaptic vesicle release, in which case every two neurons receive 
uncorrelated fluctuating inputs. 
755: 
756: Some time after presentation of the stimulus [$\vec {s}(t)\ne 0$] the system 
757: is forced to make a decision through the following process. First, a scalar 
758: quantity 
759: \begin{equation}
760: \label{eq22}
761: y=\vec {v}^T\cdot \vec {x}(t)
762: \end{equation}
763: is evaluated. Here time corresponds to the instant, when the choice is to be 
764: made. The output metrics vector $\vec {v}$ describes the way in which 
765: system's activity affects motor response. In the simplest case, which was 
766: considered in the previous section, when a single element number $n$ evokes 
767: responses, $v_i =\delta _{in} $. In a more complex situation, when multiple 
768: areas/neurons have direct influence on decision, vector $\vec {v}$ has more 
than one non-zero element. In the second step, the decision is made based on the 
sign of $y$
771: \begin{equation}
772: \label{eq23}
773: d=H(y)
774: \end{equation}
775: Thus, this model describes a two-alternative forced-choice task. 
776: 
777: Our system is completely defined by the following set of parameters: $\hat 
778: {C}$, $\vec {s}(t)$, $\hat {{\cal N}}$, and $\vec {v}$. As we have shown in 
779: the previous section, the presence of the stimulus is not required to define 
780: DM elements [Eq. (\ref{eq5})]. We therefore set $\vec {s}(t)$ to zero and are left 
781: with three parameters $\hat {C}$, $\hat {{\cal N}}$, and $\vec {v}$. We now 
782: are ready to determine DM elements in our simple model.
783: 
784: To find decision makers we will use Eq. (\ref{eq14}). In this case it becomes
785: \begin{equation}
786: \label{eq24}
787: DM_i ={\cal N}_{ii} \frac{\partial \sigma ^2(y)}{\partial {\cal N}_{ii} }
788: \end{equation}
789: Therefore, we need to evaluate the variability on the output from the system 
790: $\sigma ^2(y)$. This is accomplished if we notice that $y=\vec {v}^T\cdot 
791: \vec {x}(t)=\vec {x}^T(t)\cdot \vec {v}$ and 
792: \begin{equation}
793: \label{eq25}
794: \sigma ^2(y)=\vec {v}^T\cdot \overline {\vec {x}(t)\vec {x}^T(t)} \cdot \vec 
795: {v}=\vec {v}^T\hat {X}(t,t)\vec {v}
796: \end{equation}
797: Here we introduced the cross-correlation matrix defined as follows
798: \begin{equation}
799: \label{eq26}
800: \hat {X}(n,k)\equiv \overline {\vec {x}(n)\vec {x}^T(k)} =\overline {\left( 
801: {{\begin{array}{*{20}c}
802:  {x_1 } \hfill \\
803:  \vdots \hfill \\
804:  {x_N } \hfill \\
805: \end{array} }} \right)\left( {{\begin{array}{*{20}c}
806:  {x_1 } \hfill & \cdots \hfill & {x_N } \hfill \\
807: \end{array} }} \right)} 
808: \end{equation}
809: We replace here the time variable by the integers, specifying the time-slice 
810: number. The averaging in (\ref{eq25}) and (\ref{eq26}) is assumed over different 
811: instantiations of noise (trials). 
812: 
Due to the properties of noise in our model, this correlator does not depend 
on the absolute values of time ($n$ and $k$), but only on the difference ($n - k$). As 
follows from (\ref{eq25}), of particular interest is the same-time correlator $\hat 
{X}_0 \equiv \hat {X}(n,n)$, which determines fluctuations in $y$. We now 
derive an equation for the same-time correlator $\hat {X}_0 $.
818: 
819: Using (\ref{eq19}) we obtain
820: \begin{equation}
821: \label{eq27}
822: \begin{array}{l}
823:  \hat {X}_0 =\overline {\vec {x}(n+1)\vec {x}^T(n+1)} = \\ 
824:  =\overline {\left[ {\hat {C}\vec {x}(n)+\vec {\eta }(n)} \right]\left[ 
825: {\vec {x}^T(n)\hat {C}^T+\vec {\eta }^T(n)} \right]} \\ 
826:  \end{array}
827: \end{equation}
828: We then notice that the correlator $\overline {\vec {x}(n)\vec {\eta }^T(n)} 
829: $ is identically zero, since $\vec {x}(n)$ is a linear combination of values 
830: of noise at times $k<n$ [see Eq. (\ref{eq19})]. We thus deduce from Eq. (\ref{eq27}) that
831: \[
\hat {X}_0 =\overline {\hat {C}\vec {x}(n)\vec {x}^T(n)\hat {C}^T} +\overline {\vec 
{\eta }(n)\vec {\eta }^T(n)} ,
834: \]
835: which leads us, finally, to 
836: \begin{equation}
837: \label{eq28}
838: \hat {X}_0 -\hat {C}\hat {X}_0 \hat {C}^T=\hat {{\cal N}}
839: \end{equation}
840: This equation allows us to determine the same-time correlator $\hat {X}_0 $ 
841: from connectivity and noise cross-correlogram, defined in (\ref{eq20}), which is a 
842: diagonal matrix.
843: 
We would like to pause here and describe the properties of this equation. 
First of all, in the most generic case (\ref{eq28}) allows us to determine $\hat 
{X}_0 $ from $\hat {C}$ and $\hat {{\cal N}}$ uniquely. Indeed, (\ref{eq28}) is a 
system of $N^2$ linear equations for the $N^2$ unknown elements of $\hat {X}_0 $, 
arranged in the matrix form. Hence, this system, in most cases, can be solved 
uniquely. On the other hand, with one exception, $\hat {X}_0 $ cannot 
be expressed explicitly in terms of matrices $\hat {C}$ and $\hat {{\cal 
N}}$. Thus, one has to either appeal to the representation of $\hat {X}_0 $ 
in terms of eigenvectors and eigenvalues of $\hat {C}$, or use a computer to 
arrange the elements of matrix $\hat {X}_0 $ in vector form and solve the resulting 
linear system.
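The vector form can be written compactly; the following is a sketch in 
notation that is ours and is introduced only for illustration. Let 
$\mathrm{vec}(\cdot )$ stack the columns of a matrix into a single column, let 
$\otimes $ denote the Kronecker product, and let $\hat {I}_{N^2} $ be the 
$N^2\times N^2$ unit matrix. The standard identity 
$\mathrm{vec}(\hat {P}\hat {Q}\hat {R})=(\hat {R}^T\otimes \hat {P})\,\mathrm{vec}(\hat {Q})$ 
turns (\ref{eq28}) into
\[
\left( {\hat {I}_{N^2} -\hat {C}\otimes \hat {C}} \right)\mathrm{vec}(\hat {X}_0 
)=\mathrm{vec}(\hat {{\cal N}}).
\]
The eigenvalues of $\hat {C}\otimes \hat {C}$ are the pairwise products $\mu _a 
\mu _b $ of the eigenvalues of $\hat {C}$, so this system has a unique solution 
if and only if no such product equals one. The condition holds, in 
particular, when all $\left| {\mu _a } \right|<1$, and it is violated by a 
closed loop of unit-strength connections, for which $\mu =1$ is an 
eigenvalue.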
855: 
856: The contribution to DM from a given element can be determined from Eq. (\ref{eq25})
857: \begin{equation}
858: \label{eq29}
859: DM_i ={\cal N}_{ii} \frac{\partial \sigma ^2(y)}{\partial {\cal N}_{ii} 
860: }=\vec {v}^T\frac{\partial \hat {X}_0 }{\partial \ln {\cal N}_{ii} }\vec 
861: {v}
862: \end{equation}
863: The topological DM contributions are
864: \begin{equation}
865: \label{eq30}
866: TDM_i =\vec {v}^T\frac{\partial \hat {X}_0 }{\partial {\cal N}_{ii} }\vec 
867: {v}
868: \end{equation}
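The meaning of (\ref{eq29}) and (\ref{eq30}) becomes more transparent if the noise is 
traced explicitly. The following sketch assumes that $\vec {s}=0$, that time is 
measured in units of $\tau $, and that a stationary state exists (all 
eigenvalues of $\hat {C}$ are smaller than one in absolute value, which 
includes feedforward networks). Iterating (\ref{eq19}) then gives
\[
\vec {x}(n)=\sum\limits_{k\ge 0} {\hat {C}^k\vec {\eta }(n-1-k)} ,
\]
so that, using (\ref{eq20}) and (\ref{eq21}),
\[
\sigma ^2(y)=\sum\limits_i {{\cal N}_{ii} \sum\limits_{k\ge 0} {\left[ {\left( 
{\vec {v}^T\hat {C}^k} \right)_i } \right]^2} } ,\qquad TDM_i 
=\sum\limits_{k\ge 0} {\left[ {\left( {\vec {v}^T\hat {C}^k} \right)_i } 
\right]^2} .
\]
Here $( {\vec {v}^T\hat {C}^k} )_i $ is the total gain with which noise 
injected at unit $i$ reaches the output $k$ time-steps later. Under these 
assumptions, gains of pathways of equal length add before squaring, whereas 
contributions of different lengths add as variances, because noise on 
different time-slices is uncorrelated. 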
Using Eqs. (\ref{eq28}) and (\ref{eq29}) one can analyze a variety of network 
connectivities. Some new effects emerging for non-tree systems are described 
next.
872: 
873: \subsection{Case 1: fan-out hub effect}
874: \label{fanout}
875: 
We now consider the network shown in Figure 11A, in which all elements have the 
same variance of noise and all connections have unitary strength. Figure 11A 
shows two pathways from unit \textbf{2} to the exit unit, \textbf{6}. The 
resulting network gain from unit \textbf{2} to unit \textbf{6} is thus equal 
to two. The gain of all other units at the exit is one. The contribution to DM 
from unit \textbf{2} is thus four times larger than that from other units. This 
is because noise at this unit is multiplied by a factor of two, and the 
variance of noise, by a factor of four. We conclude that there may be some 
special elements in the network, which occupy hub-like positions, gaining large 
influence due to the abundance of their outputs. It should be noted that fan-in 
hubs are not special from the point of view of DM in any way. 
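The factor of four can be checked by hand on a smaller network. The 
four-unit example used here is hypothetical and is not the network of Figure 
11A: unit 1 projects to units 2 and 3, both of which project to the exit 
unit 4, all links have unit strength, and every unit has the same noise 
variance $\overline {\eta ^2} $. Applying (\ref{eq19}) three times with $\vec {s}=0$ 
and time measured in units of $\tau $ gives
\[
x_4 (n)=\eta _4 (n-1)+\eta _2 (n-2)+\eta _3 (n-2)+2\eta _1 (n-3),
\]
\[
\sigma ^2(x_4 )=(1+1+1+4)\,\overline {\eta ^2} =7\,\overline {\eta ^2} .
\]
The normalized contributions in this example are therefore $DM_1 =4/7$ and 
$DM_2 =DM_3 =DM_4 =1/7$, i.e. the fan-out unit contributes four times more 
than any other unit.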
887: 
888: 
889: 
890: \begin{figure}[htbp]
891: \centerline{\includegraphics[width=4.0in]{dma13.eps}}
892: %\centerline{\includegraphics[width=1.6in]{dma14.eps}}
893: \caption{
894: Two cases, in which the identities of decision makers 
895: can be found using discrete-time approach. Variance of noise on all elements 
896: is the same; all network links have unitary strength. The degree of 
897: decision making is shown by the intensity of red. \textbf{A}, The fan-out 
898: effect. \textbf{B}, the temporal integrator.
899: }
900: \label{fig11}
901: \end{figure}
902: 
903: 
904: \subsection{Case 2: temporal integrator}
905: \label{integrator}
906: 
Let us now examine a network with a loop. Figure 11B shows such an example 
with unitary link strength and uniform noise variance, as in the previous case. 
The presence of the loop affects DM drastically: our discrete-time model marks 
units belonging to the loop as decision makers\footnote{Rigorously 
speaking, the set of equations (\ref{eq28}) and (\ref{eq29}) does not have a valid 
solution for the loop with all connections equal to unity. One needs to treat 
one of the connections as a parameter, $\alpha <1$, solve the equations, and 
consider the limit $\alpha \to 1$.}. This is easy to understand, since 
noise, generated by each unit on each time-step, cannot leave the loop and, 
therefore, builds up there without limit. Therefore the variance of noise 
in the output of element number three grows proportionally to time, 
$\overline {\left[ {x_3 (t)-\overline {x_3 } (t)} \right]^2} =\overline 
{\eta ^2} t\to \infty $. Here averaging is assumed over instantiations of 
noise (trials). Thus, the loop becomes the crucial decision maker. This case is 
somewhat analogous to our previous `noisy' neuron example.
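The limiting procedure mentioned in the footnote can be illustrated on the 
simplest possible loop, a single unit with a self-connection of strength 
$\alpha $. This one-unit reduction is our simplification and is not the 
network of Figure 11B. For it, (\ref{eq28}) becomes a scalar equation,
\[
X_0 -\alpha ^2X_0 =\overline {\eta ^2} ,\qquad X_0 =\frac{\overline {\eta ^2} 
}{1-\alpha ^2},
\]
which has no finite solution at $\alpha =1$ and diverges in the limit $\alpha 
\to 1$. If instead $\alpha =1$ and the activity starts from zero, then after 
$n$ time-steps
\[
x(n)=\sum\limits_{k=0}^{n-1} {\eta (k)} ,\qquad \overline {x^2(n)} =n\,\overline 
{\eta ^2} ,
\]
which is the linear growth of variance with time quoted above.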
922: 
What is the possible role of loops in biological networks? Why would one 
introduce such unreliable components? Loops, similar to the one shown in Figure 11B, 
have many useful properties. For instance, they can act as parametric memory 
926: systems. Indeed, imagine that responses of all units in the loop have the 
927: same values, equal to $x$. This could be accomplished by manipulating the 
928: sensory inputs. Assume that no more inputs are received from the outside of 
929: the system. It follows that, in the absence of noise on each element, this 
930: value of response will reverberate around the loop forever. This is because 
931: all links have unitary strength. Loops can thus memorize a graded value, 
932: such as $x$, functioning as parametric memory elements. 
933: 
934: Suppose, in addition, that a non-zero input $s$ is applied to element number 
935: \textbf{1} at all times. Since this element acts as a summator, its response 
936: on the next step is $x_1 (1)=x+s$. The signal $s$ propagates around the 
937: loop, and in four steps it reaches the first element again, at which time 
938: its response is $x_1 (5)=x+2s$. In four more steps $x_1 (9)=x+3s$. Thus, not 
939: only noise, but also signal can build up in the system. Therefore, a loop 
940: can operate as a temporal integrator. The integration is not perfect if one 
941: of the links has a non-unitary strength, in which case integrator becomes 
942: leaky (Robinson, 1989). 
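The leaky case can be worked out explicitly. As an assumption made only for 
this sketch, let the link that closes the loop onto element \textbf{1} have 
strength $\alpha <1$, while the other three links remain unitary and noise is 
absent. Sampling element \textbf{1} once per circuit, $z_m \equiv x_1 (4m+1)$, 
one finds
\[
z_0 =\alpha x+s,\qquad z_m =\alpha z_{m-1} +s,
\]
\[
z_m =\alpha ^{m+1}x+s\,\frac{1-\alpha ^{m+1}}{1-\alpha }.
\]
For $\alpha \to 1$ this reproduces $x+s$, $x+2s$, $x+3s$ at $m=0,1,2$. For 
$\alpha <1$ the memory of the initial value $x$ decays and the response 
saturates at $s/(1-\alpha )$ instead of growing indefinitely.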
943: 
Temporal integrators play a special role in DM, since they act as accumulators 
of sensory information, which puts them into a special position with respect 
to other areas (Gold and Shadlen, 2002). An example is 
area LIP in primate parietal cortex, which is involved in DM in the 
direction-discrimination task 
(Shadlen and Newsome, 2001; Roitman and Shadlen, 2002; Mazurek et al., 2003). 
950: 
951: \subsection{Continuous-time model}
952: \label{continuous}
953: 
954: We finally consider a model, in which time runs continuously. This model has 
955: potential relevance to real-life networks. The responses of units satisfy 
956: the following equation
957: \begin{equation}
958: \label{eq31}
959: \frac{d\vec {x}(t)}{dt}=-\hat {A}\vec {x}(t)+\vec {\eta }(t)+\vec {s}(t).
960: \end{equation}
961: The network connectivity matrix $\hat {A}$ can be related to connection 
962: matrix from the discrete-time model in (\ref{eq19}) through $\hat {C}=e^{-\hat 
963: {A}\tau }$ (see Appendix B). Noise is defined by its cross-correlation
964: \begin{equation}
965: \label{eq32}
966: \overline {\eta _i (t_1 )\eta _j (t_2 )} ={\cal N}_{ij} \delta (t_1 -t_2 ),
967: \end{equation}
968: where 
969: \begin{equation}
970: \label{eq33}
971: \hat {{\cal N}}=\left( {{\begin{array}{*{20}c}
972:  {\overline {\eta _1^2 } } \hfill & \hfill & \hfill & 0 \hfill \\
973:  \hfill & {\overline {\eta _2^2 } } \hfill & \hfill & \hfill \\
974:  \hfill & \hfill & {...} \hfill & \hfill \\
975:  0 \hfill & \hfill & \hfill & {\overline {\eta _N^2 } } \hfill \\
976: \end{array} }} \right)
977: \end{equation}
978: is a diagonal cross-correlogram of noise. Eqs. (\ref{eq31})-(\ref{eq33}) are analogous 
979: to the discrete-time case (\ref{eq19})-(\ref{eq21}). Similarly, we define the output 
980: scalar and the decision variable
981: \[
982: y=\vec {v}^T \cdot \vec {x}(t)
983: \]
984: \begin{equation}
985: \label{eq34}
986: d=H(y).
987: \end{equation}
988: Here $t$ is time when the system makes the decision.
989: 
990: Our model is thus defined by Eqs. (\ref{eq31})-(\ref{eq34}). We will now use definition 
991: (\ref{eq29}) to find decision makers. As in discrete-time case we need to know the 
992: variance of the output variable, $\sigma ^2(y)$, after which (\ref{eq29}) leads to 
993: \begin{equation}
994: \label{eq35}
995: DM_i =\overline {\eta _i^2 } \frac{\partial \sigma ^2(y)}{\partial \overline 
996: {\eta _i^2 } }
997: \end{equation}
998: Important for us is the time-dependent correlator
999: \begin{equation}
1000: \label{eq36}
1001: \hat {X}(t_1 ,t_2 )=\overline {\vec {x}(t_1 )\vec {x}^T(t_2 )} ,
1002: \end{equation}
which we now evaluate. The solution of (\ref{eq31}) is obtained using matrix 
exponentials 
\begin{equation}
\label{eq37}
\vec {x}(t)=\int\limits_{-\infty }^t {dt'e^{\hat {A}(t'-t)}\left[ {\vec 
{\eta }(t')+\vec {s}(t')} \right]} 
\end{equation}
If the external stimulus is zero or constant in time, due to (\ref{eq5}), the 
correlator at $t_1 >t_2 $ is
1012: \begin{equation}
1013: \label{eq38}
1014: \hat {X}(t_1 ,t_2 )=\int\limits_{-\infty }^{t_2 } {dt'e^{\hat {A}(t'-t_1 
1015: )}\hat {{\cal N}}e^{\hat {A}^T(t'-t_2 )}} 
1016: \end{equation}
1017: We seek $\hat {X}(t_1 ,t_2 )$ in the form
1018: \begin{equation}
1019: \label{eq39}
1020: \hat {X}=e^{\hat {A}(t_2 -t_1 )}\hat {X}_0 ,
1021: \end{equation}
1022: where $\hat {X}_0 $ is equal-time cross-correlation. To find equation for 
1023: $\hat {X}_0 $ we differentiate (\ref{eq38}) as follows
1024: \begin{equation}
1025: \label{eq40}
1026: \begin{array}{l}
1027:  \frac{\partial \hat {X}}{\partial t_2 }=\hat {A}e^{\hat {A}(t_2 -t_1 )}\hat 
1028: {X}_0 = \\ 
1029:  =e^{\hat {A}(t_2 -t_1 )}\hat {{\cal N}}-\hat {X}\hat {A}^T \\ 
1030:  \end{array}
1031: \end{equation}
We thus arrive at the following equation for $\hat {X}_0 $
1033: \begin{equation}
1034: \label{eq41}
1035: \hat {A}\hat {X}_0 +\hat {X}_0 \hat {A}^T=\hat {{\cal N}}
1036: \end{equation}
1037: This equation is the central tool for the continuous-time theory. The 
1038: contributions to DM from each unit are found by differentiating $\sigma 
1039: ^2(y)=\vec {v}^T\hat {X}_0 \vec {v}$ with respect to noise, as in Eq. (\ref{eq29})
1040: \begin{equation}
1041: \label{eq42}
1042: DM_i ={\cal N}_{ii} \vec {v}^T\frac{\partial \hat {X}_0 }{\partial {\cal 
1043: N}_{ii} }\vec {v}
1044: \end{equation}
1045: Once the same-time correlation matrix $\hat {X}_0 $ is found from Eq. 
1046: (\ref{eq41}), cross-correlation for arbitrary time is 
1047: \begin{equation}
1048: \label{eq43}
1049: \hat {X}(t_1 ,t_2 )=\left\{ {{\begin{array}{*{20}c}
1050:  {e^{\hat {A}(t_2 -t_1 )}\hat {X}_0 ,} \hfill & {t_1 \ge t_2 } \hfill \\
1051:  {\hat {X}_0 e^{\hat {A}^T(t_1 -t_2 )},} \hfill & {t_1 <t_2 } \hfill \\
1052: \end{array} }} \right.
1053: \end{equation}
1054: This equation suggests a helpful strategy for determining noise matrix $\hat 
1055: {{\cal N}}$. Indeed, (\ref{eq41}) and (\ref{eq43}) imply that
1056: \begin{equation}
1057: \label{eq44}
1058: \hat {{\cal N}}=\left. {\frac{\partial \hat {X}(t_1 ,t_2 )}{\partial t_1 }} 
1059: \right|_{t_1 =t_2 -\varepsilon } \left. {-\frac{\partial \hat {X}(t_1 ,t_2 
1060: )}{\partial t_1 }} \right|_{t_1 =t_2 +\varepsilon } 
1061: \end{equation}
Here $\varepsilon $ is an infinitesimally small positive number. In other 
words, the noise matrix is equal to the discontinuity in the time-derivative of 
the cross-correlation at $t_1 =t_2 $. Since the noise correlation matrix is 
diagonal, the non-zero elements are 
1066: \begin{equation}
1067: \label{eq45}
1068: \overline {\eta _i^2 } =\left. {\frac{\partial X_{ii} (t_1 ,t_2 )}{\partial 
1069: t_1 }} \right|_{t_1 =t_2 -\varepsilon } \left. {-\frac{\partial X_{ii} (t_1 
1070: ,t_2 )}{\partial t_1 }} \right|_{t_1 =t_2 +\varepsilon } 
1071: \end{equation}
Two comments are in order here. First, the noise term $\vec {\eta }(t)$ plays 
the role of input noise in (\ref{eq31}). It cannot be measured directly. Equation 
(\ref{eq44}) provides a way to single it out. Second, (\ref{eq44}) does not apply to the 
discrete-time model. Indeed, in the latter we either have $t_1 =t_2 $, or 
$t_1 =t_2 \pm 1$, etc., i.e. the condition $t_1 =t_2 \pm \varepsilon $ with 
$\varepsilon $ infinitesimally small is hard to enforce. It may happen that 
$\varepsilon \approx 1$ is acceptable due to the presence of slow components in 
the circuit, such as temporal integrators. However, in the general case (\ref{eq44}) 
cannot be applied to the discrete-time model. For instance, it fails 
dramatically for the `nematode' chain considered above.
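Equations (\ref{eq41}), (\ref{eq43}), and (\ref{eq44}) can be checked against each other on a 
one-unit example. This is only a sketch: it assumes $N=1$, a decay rate 
$\hat {A}=a>0$ (the symbol $a$ is introduced here for illustration), and noise 
variance $\overline {\eta ^2} $. Equation (\ref{eq41}) gives
\[
2aX_0 =\overline {\eta ^2} ,\qquad X_0 =\frac{\overline {\eta ^2} }{2a},
\]
and (\ref{eq43}) becomes
\[
X(t_1 ,t_2 )=\frac{\overline {\eta ^2} }{2a}\,e^{-a\left| {t_1 -t_2 } \right|}.
\]
The derivative with respect to $t_1 $ equals $+aX_0 $ just below $t_1 =t_2 $ 
and $-aX_0 $ just above it, so the discontinuity is $2aX_0 =\overline {\eta 
^2} $, in agreement with (\ref{eq44}) and (\ref{eq45}).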
1082: 
1083: Equations (\ref{eq41}), (\ref{eq42}), and (\ref{eq44}) represent a useful set of tools to find 
1084: DM components for various connectivities. We present here two possible 
1085: cases, in which decision makers can be found. They differ in what is known 
1086: about the system.
1087: 
1088: \underline {\textbf{Scenario 1:}} Assume we know the network connectivity 
1089: $\hat {A}$, output metrics vector $\vec {v}$, and autocorrelation for each 
1090: unit $X_{ii} (t_1 ,t_2 )$. The steps below allow finding the 
1091: decision makers.
1092: 
1093: \begin{enumerate}
\item Since the noise matrix is diagonal, as per (\ref{eq33}), it can be found from the autocorrelations using (\ref{eq45}). 
\item Solving (\ref{eq41}) allows determining $\partial \hat {X}_0 /\partial {\cal N}_{ii}$, the derivative of the equal-time cross-correlation with respect to noise in each element. 
1096: \item Decision makers are found from (\ref{eq42}).
1097: \item Normalize contributions to DM so that $\sum\limits_i {DM_i =1} $.
1098: \end{enumerate}
1099: 
Scenario 1 does not require simultaneous measurements from all units. It 
does, however, require knowledge of the network connectivity. The next 
scenario is complementary in this respect.
1103: 
1104: \underline {\textbf{Scenario 2:}} Suppose we have measured the full 
1105: crosscorrelation matrix $\hat {X}(t_1 ,t_2 )$ by simultaneous recordings 
1106: from all units. Suppose also that we know how the output of the system is 
1107: evaluated (vector $\vec {v})$. These are the steps to determine DM units.
1108: 
1109: \begin{enumerate}
1110: \item Use (\ref{eq44}) to find noise matrix $\hat {{\cal N}}$.
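\item Use the time dependence of the cross-correlation in (\ref{eq43}) to find the connection matrix $\hat {A}$; Eq. (\ref{eq41}) alone is symmetric and does not fix all $N^2$ elements of $\hat {A}$.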
\item Solve (\ref{eq41}) to calculate $\partial \hat {X}_0 /\partial {\cal N}_{ii}$ for each element.
1113: \item Use (\ref{eq42}) to find decision makers. 
1114: \item Normalize contributions to DM so that $\sum\limits_i {DM_i =1} $.
1115: \end{enumerate}
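
The step `solve (\ref{eq41}) for $\partial \hat {X}_0 /\partial {\cal N}_{ii}$', 
which appears in both scenarios, can be made explicit because (\ref{eq41}) is 
linear in $\hat {{\cal N}}$. The following is a sketch in notation of our own: 
$\hat {e}_i $ is the unit column-vector with a single non-zero component 
number $i$, and $\hat {X}^{(i)}$ is the solution of
\[
\hat {A}\hat {X}^{(i)}+\hat {X}^{(i)}\hat {A}^T=\hat {e}_i \hat {e}_i^T .
\]
For diagonal noise one then has $\hat {X}_0 =\sum\nolimits_i {{\cal N}_{ii} 
\hat {X}^{(i)}} $, so that $\partial \hat {X}_0 /\partial {\cal N}_{ii} =\hat 
{X}^{(i)}$ and (\ref{eq42}) gives $DM_i ={\cal N}_{ii} \vec {v}^T\hat {X}^{(i)}\vec 
{v}$, with $\sum\nolimits_i {DM_i } =\sigma ^2(y)$ before normalization. If 
all eigenvalues of $\hat {A}$ have positive real parts, setting $t_1 =t_2 $ 
in (\ref{eq38}) yields
\[
\vec {v}^T\hat {X}^{(i)}\vec {v}=\int\limits_0^\infty {dt\left[ {\left( {\vec 
{v}^Te^{-\hat {A}t}} \right)_i } \right]^2} \ge 0,
\]
i.e. each contribution is non-negative and is determined by how noise 
injected at unit $i$ propagates to the output.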
1116: 
1117: Both scenarios use extensive knowledge about the system, which renders them 
1118: useless in experimental conditions. In the next subsection we discuss a way 
1119: to bypass these limitations.
1120: 
Finally, we would like to provide the solution to (\ref{eq41}) using the eigenbasis of 
matrix $\hat {A}$. Since $\hat {A}$ is not necessarily symmetric, a 
distinction should be made between right and left eigenvectors. The latter 
turn out to be useful for our purposes. They are defined by 
1125: \begin{equation}
1126: \label{eq46}
1127: \vec {\xi }_\alpha ^+ \hat {A}=\lambda _\alpha \vec {\xi }_\alpha ^+ .
1128: \end{equation}
Here and below Greek indices denote numbers of eigenvalues, while Latin ones 
label spatial components of vectors and matrices. The solution of (\ref{eq41}) is
1131: 
1132: \begin{equation}
1133: \label{eq46_5}
1134: %X_{0ij} =\sum\limits_{{\begin{array}{*{20}c}
1135: % {\alpha \beta \gamma \delta } \hfill \\
1136: % {mn} \hfill \\
1137: %\end{array} }} {\frac{\xi _{i\alpha } \xi _{j\beta }^\ast \xi _{m\gamma 
1138: %}^\ast \xi _{n\delta } }{\lambda _\gamma +\lambda _\delta ^\ast }} \left( 
1139: %{G^{-1}} \right)_{\alpha \gamma } \left( {G^{-1}} \right)_{\beta \delta 
1140: %}^\ast {\cal N}_{mn} ,
X_{0ij} =\sum\limits_{\substack{\alpha \beta \gamma \delta \\ mn}} {\frac{\xi _{i\alpha } \xi _{j\beta }^\ast \xi _{m\gamma 
}^\ast \xi _{n\delta } }{\lambda _\gamma +\lambda _\delta ^\ast }} \left( 
{G^{-1}} \right)_{\alpha \gamma } \left( {G^{-1}} \right)_{\beta \delta 
}^\ast {\cal N}_{mn} ,
1145: \end{equation}
1146: where
1147: \begin{equation}
1148: \label{eq47}
1149: G_{\alpha \beta } =\sum\limits_i {\xi _{i\alpha }^\ast \xi _{i\beta } } 
1150: \end{equation}
is the Gram matrix of eigenvectors. Eq. (\ref{eq46_5}) is valid if the eigenvectors form
a complete basis in the $N$-dimensional space. As follows from (\ref{eq46_5}), eigenvalues of 
$\hat {A}$ with a small real part contribute to DM to a large degree. This 
justifies the use of principal component analysis when such eigenvalues are 
present. An example of such a principal component is the temporal integrator 
loop in Figure 11B, which has vanishing $\lambda $. 
1157: 
If the matrix $\hat {A}$ is symmetric, its eigenvalues are real and its 
eigenvectors are orthogonal. This leads to a unit Gram matrix. Then, Eq. 
(\ref{eq46_5}) becomes more compact
1161: \begin{equation}
1162: \label{eq48}
1163: X_{0ij} =\sum\limits_{
1164:  {\alpha \beta } 
1165:  {mn} } {\frac{\xi _{i\alpha } \xi _{j\beta }^\ast \xi _{m\alpha 
1166: }^\ast \xi _{n\beta } }{\lambda _\alpha +\lambda _\beta }} {\cal N}_{mn} 
1167: \end{equation}
1168: Similar equations, called Kubo formulas, are obtained for various 
1169: correlators in case of diffusion of particles in random media 
1170: (Efetov, 1997). The distinguishing feature of (\ref{eq48}) is that 
1171: a product of four eigenvectors enters the expression. Thus, propagation of 
1172: noise in this case can be accompanied by interference between different 
1173: pathways. An example of destructive interference of this kind is given 
1174: below, in section \ref{stimulation}. 
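
As an illustration of (\ref{eq48}), consider the hypothetical limit (an assumption 
made here only for the sake of the example) in which a single real mode, labeled 
$\alpha =0$, has an eigenvalue $\lambda _0 $ much smaller than all others. 
Retaining only the term $\alpha =\beta =0$ in the sum, one obtains approximately
\[
X_{0ij} \approx \frac{\xi _{i0} \xi _{j0}^\ast }{2\lambda _0 }
\sum\limits_{mn} {\xi _{m0}^\ast {\cal N}_{mn} \xi _{n0} } ,
\]
so that the correlations are dominated by the slow mode and grow as 
$1/\lambda _0 $ when $\lambda _0 $ becomes small.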
1175: 
1176: Eq. (\ref{eq48}) can be further simplified. Indeed, our model uses diagonal noise 
1177: matrices, i.e. $n=m$ in (\ref{eq48}). Suppose also that the output from the 
1178: network occurs through one exit element number $i$, which is specified by 
1179: taking $\vec {v}=\hat {e}_i $. In this case the use of Eq. (\ref{eq42}) gives
1180: \begin{equation}
1181: \label{eq49}
DM_n ={\cal N}_{nn} \sum\limits_{\alpha \beta } {\frac{\xi _{i\alpha } \xi 
1183: _{i\beta }^\ast \xi _{n\alpha }^\ast \xi _{n\beta } }{\lambda _\alpha 
1184: +\lambda _\beta }} .
1185: \end{equation}
From this equation we conclude that for element $n$ to contribute to DM, an 
eigenvector must exist that is non-zero on both unit number $n$ and the exit 
unit $i$. Thus, eigenvectors of $\hat {A}$ should be 
delocalized for elements to have a broader impact on the decision. This is not 
surprising in view of the analogy with the diffusion problem mentioned above. 
If the matrix $\hat {A}$ is not symmetric, the Gram matrix may be non-diagonal 
and (\ref{eq49}) cannot be used. However, the off-diagonal elements of $\hat {G}$ 
are usually smaller than the diagonal ones, due to uncorrelated sign changes of the 
terms in (\ref{eq47}) when $\alpha \ne \beta $. Therefore, (\ref{eq49}) may 
apply approximately. 
1196: 
1197: \section{Analysis using stimulation}
1198: \label{stimulation}
1199: 
1200: Stimulations with electric current add a new degree of freedom to DMA, thus 
1201: leading to more effective ways of finding decision makers. There are two 
1202: great advantages of the stimulation method. First, it only involves 
1203: stimulation of a single neuron, therefore no simultaneous multiple-electrode 
1204: measurements are required. Second, the knowledge of network connectivity is 
1205: not needed to solve the problem. In this section we study our simple 
1206: networks and find what stimulation strategies are consistent with our 
1207: earlier definitions, such as Eq. (\ref{eq14}). 
1208: 
We will use the continuous model for concreteness (section \ref{continuous}). Consider the 
output variable $y$. It is a linear function of the inputs. It also 
contains noise components, which vary from trial to trial and are 
acquired from all units to different degrees. Since 
the noise in each unit is Gaussian, the output variable is also described by a Gaussian 
distribution:
1215: \begin{equation}
1216: \label{eq50}
1217: \rho (y)=\frac{1}{\sqrt {2\pi \sigma ^2(y)} }
1218: e^{-(y-\bar {y}(s))^2/2\sigma ^2(y)}
1219: \end{equation}
In each trial a random value of $y$ is drawn from the distribution 
(\ref{eq50}). The response of the system is equal to 1 if $y$ is positive, and 
0 otherwise. The probability of obtaining a response equal to 1 for a given 
stimulus $s$ is given by the error function (Abramowitz and 
Stegun, 1972)
1225: \begin{equation}
1226: \label{eq50_5}
1227: p_1 (s)=\int\limits_0^\infty {\rho (y)dy} =\frac{1}{2}\left[ 
1228: {1+\mbox{erf}\left( {\frac{\bar {y}(s)}{\sigma (y)\sqrt 2 }} \right)} 
1229: \right],
1230: \end{equation}
1231: whereas the probability of zero response is 
1232: \begin{equation}
1233: \label{eq51}
1234: p_0 (s)=\int\limits_{-\infty }^0 {\rho (y)dy} =\frac{1}{2}\left[ 
1235: {1-\mbox{erf}\left( {\frac{\bar {y}(s)}{\sigma (y)\sqrt 2 }} \right)} \right].
1236: \end{equation}
1237: Both probabilities depend upon the mean response to stimulus $\bar {y}(s)$ 
1238: and the standard deviation $\sigma (y)$. Therefore the electric stimulation 
1239: strategies may be based on affecting either the former or the latter. We now 
1240: consider both of these strategies and show that affecting the mean response 
1241: may provide misleading results, while changing the variance of response 
1242: allows estimating contributions to DM consistently with our previous 
definitions. Thus, strategies of stimulation based on the standard deviation of 
the output variable are \textit{always} correct in our simple model, independently of the 
topology of the network. This may seem a trivial consequence of definition 
(\ref{eq14}), but we discuss it here in order to compare the two 
strategies and to optimize them.
1248: 
We start with stimulation strategies that affect the mean response 
$\bar {y}(s)$. In our simple model this may be accomplished by injecting a 
tonic input current into unit number $i$. Mathematically, this amounts to 
adding an extra stimulus $s_i $ to this unit in Eq. (\ref{eq31}). 
Note that in biological systems the stimulating current is alternating with 
constant amplitude (Salzman et al., 1992). The 
mean response is shifted by the stimulation, i.e.
1256: \begin{equation}
1257: \label{eq52}
1258: \Delta \bar {y}=\frac{\partial \bar {y}}{\partial s_i }s_i ,
1259: \end{equation}
1260: where $s_i $ is the magnitude of injected tonic current. This leads to 
1261: observable changes in the probability $p_1 $
1262: \begin{equation}
1263: \label{eq53}
\Delta p_1 (i)=\frac{\partial p_1 }{\partial \bar {y}}\frac{\partial \bar 
1265: {y}}{\partial s_i }s_i .
1266: \end{equation}
1267: Here $\Delta p_1 (i)$ is the change in probability of correct responses 
1268: after unit number $i$ is electrically stimulated. Can $\Delta p_1 (i)$ be a 
1269: measure of DM? 
1270: 
We notice that $\Delta p_1 (i)$ can be either positive or negative. This 
depends on the sign of the derivative $\partial \bar {y}/\partial s_i $, which 
is positive for an excitatory pathway from unit $i$ to the output and negative 
for an inhibitory pathway. Since a contribution to DM ought to be positive, we 
cannot simply assume that $DM_i \sim \Delta p_1 (i)$. The correct 
expression, which we provide here without derivation, is 
1277: \begin{equation}
1278: \label{eq54}
1279: DM_i \sim \overline {\eta _i^2 } \left[ {\Delta p_1 (i)} \right]^2.
1280: \end{equation}
This equation is understood in the proportional sense, since $DM_i $ should be 
normalized to ensure that $\sum\limits_i {DM_i } =1$. Our 
investigations show that this expression is accurate for trees and is 
consistent with both of our earlier definitions, (\ref{eq9}) and (\ref{eq14}). Remarkably, it 
employs quantities that can be measured in a single-electrode experiment. 
Indeed, the amplitude of noise $\overline {\eta _i^2 } $ can be found from 
the autocorrelation of the unit's response, using (\ref{eq44}); and $\Delta p_1 (i)$ is 
determined from behavioral changes in response to single-unit stimulation. 
This equation thus provides an approach potentially useful in practice. Does 
this relationship work for networks of arbitrary connectivity? 
1291: 
Figure 12B shows a counterexample, in which stimulation of a unit 
results in \textit{no} change in the probability of correct response (throughout this section we consider trials in which 1 is the correct response). 
This is because two pathways lead from this unit to the exit, one positive and one 
negative. They have equal strength and therefore compensate each other. 
On the other hand, unit number one \textit{does} participate in DM, because if a 
non-stationary stimulation/stimulus is applied, its effect on the decision 
is not zero. Thus, (\ref{eq54}) and the tonic stimulation method cannot be applied to 
arbitrary circuits, such as the one shown in Figure 12B, to accurately reveal 
decision makers.
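
The cancellation in this counterexample can be sketched as follows. The notation 
below is introduced only for this illustration and is not part of the model 
equations above: let $h_+ (\omega )$ and $h_- (\omega )$ denote hypothetical 
linear transfer functions of the excitatory and inhibitory pathways leading from 
the stimulated unit to the output. A tonic current probes only zero frequency, 
\[
\frac{\partial \bar {y}}{\partial s_i }=h_+ (0)-h_- (0)=0,
\]
when the two pathways have equal static strength. If we assume instead a random 
current with the correlator $\overline {s_i (t)s_i ({t}')} =\overline {s_i^2 } 
\delta (t-{t}')$, the output variance changes by
\[
\Delta \sigma ^2(y)=\overline {s_i^2 } \int\limits_{-\infty }^\infty 
{\frac{d\omega }{2\pi }\left| {h_+ (\omega )-h_- (\omega )} \right|^2} ,
\]
which is positive whenever the two pathways differ in their dynamics, even 
though their static strengths cancel.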
1301: 
1302: 
1303: \begin{figure}[htbp]
1304: \centerline{\includegraphics[width=3.0in]{dma15.eps}}
1305: \centerline{\includegraphics[width=3.0in]{dma16.eps}}
1306: %\centerline{\includegraphics[width=2.67in]{dma17.eps}}
1307: %\centerline{\includegraphics[width=1.76in]{dma18.eps}}
\caption{Finding decision makers using electric stimulation. 
\textbf{A} and \textbf{B}, tonic stimulation; \textbf{C}, random 
stimulation. \textbf{A}, tonic stimulation of trees results in a shift in 
probability, which leads to a correct estimate of decision making units. 
\textbf{B}, example of a circuit for which tonic stimulation leads to 
an incorrect estimate of decision making, since it does not shift 
the probability. \textbf{C}, stimulation with a random current leads to 
a correct estimate of decision making for networks with \textit{any} connectivity. 
\textbf{D}, for optimal performance in the random stimulation paradigm, the task 
should be set so that the probability of correct responses is close to 
$p_{optimal} \approx 0.84$.
1319: }
1320: \label{fig12}
1321: \end{figure}
1322: 
1323: 
1324: 
Is there a stimulation method for finding decision making components in 
arbitrary networks? Such a method follows directly from definition (\ref{eq14}) 
[or (\ref{eq35}), which is equivalent]. Indeed, when the stimulating current is 
temporal white noise, the output variable $y$ acquires a larger variance 
(Figure 12C). Hence, the derivative of the output variance entering (\ref{eq35}) can 
be determined operationally, by injecting a distracter current. More 
precisely, if the variance of the stimulating current applied to unit $i$ is 
$\overline {s_i^2 } $, the derivative entering definition (\ref{eq35}) is 
1333: \begin{equation}
1334: \label{eq55}
1335: \frac{\partial \sigma ^2(y)} {\partial \overline {\eta _i^2 } } = 
1336: \frac{\Delta \sigma^2 (y)} {\overline {s_i^2 } }.
1337: \end{equation}
1338: In practice one has no access to the variable $y$, so one cannot measure 
1339: directly the change in variance $\Delta \sigma^2 (y)$. Instead, one could 
1340: measure the change in the probability of correct responses under the 
1341: influence of distracting current. Indeed, from (\ref{eq50_5}) we obtain
1342: \begin{equation}
1343: \label{eq56}
1344: \Delta p_1 (i) = \frac{\partial p_1 } {\partial \sigma^2 (y) } \Delta \sigma^2 (y)
1345: \end{equation}
1346: Combining the last two equations we obtain for the important derivative
1347: \begin{equation}
1348: \label{eq57}
1349: \frac{\partial \sigma ^2(y)}{\partial \overline {\eta _i^2 } }=\frac{\Delta 
1350: p_1 (i)}{\overline {s_i^2 } }\left( {\frac{\partial p_1 }{\partial \sigma 
1351: ^2(y)}} \right)^{-1}
1352: \end{equation}
Since the probability of correct responses always decreases under the 
influence of distracters, the derivative $\partial p_1 /\partial \sigma 
^2(y)$ is a negative constant. It is the same for all units. We therefore 
arrive at the expression for contributions to DM, which follows from 
(\ref{eq35})
1358: \begin{equation}
1359: \label{eq58}
1360: DM_i \sim -\overline {\eta _i^2 } \frac{\Delta p_1 (i)}{\overline {s_i^2 } 
1361: }
1362: \end{equation}
1363: Here $\Delta p_1 (i)$ is the decrease in probability of correct responses 
1364: produced by electric stimulation with variance of the random current equal 
1365: to $\overline {s_i^2 } $. The variance of noise on each unit $\overline {\eta 
1366: _i^2 } $ can be found from autocorrelation using (\ref{eq45}). This procedure 
works for any topology in our simplified model. It should be noted that 
if the noise is not white, (\ref{eq45}) cannot 
be used directly and should be replaced by an expression reflecting the 
spectral characteristics of noise appropriate for the system under 
investigation. In particular, if noise is provided by other parts of the network, its 
dynamic features may be more complex. Therefore, (\ref{eq45}) may not apply 
directly to the `hidden pathway' example given at the end of section \ref{conclusions}.
1374: 
The procedure that we have just described permits further optimization. 
Indeed, imagine that the probability of correct responses is exactly 
$1/2$. Adding a distracting 
stimulation current will not change this probability, i.e. $\Delta p_1 
(i)=0$ no matter which unit is stimulated. In the opposite limiting case, when 
$p_1 \approx 1$, the effect of the distracter on performance is exponentially 
small. Hence, the behavioral response to stimulation has an optimum between $p_1 
=1/2$ and $1$. To find the optimum we observe from (\ref{eq56}) that, for the same 
variation $\Delta \sigma ^2(y)$, the magnitude of $\Delta p_1 $ is largest when 
the magnitude of $\partial p_1 /\partial \sigma ^2(y)$ is largest. We therefore 
plot the latter derivative as a function of $p_1 $ in Figure 12D. We indeed observe a 
maximum at a probability of correct responses close to
1388: \begin{equation}
1389: \label{eq59}
1390: p_{optimal} \approx 0.841
1391: \end{equation}
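The value (\ref{eq59}) can be checked analytically under the assumption, made 
here for illustration, that the task difficulty is tuned by changing the mean 
$\bar {y}(s)$ while the variance $\sigma ^2(y)$ is kept fixed. Differentiating 
(\ref{eq50_5}) and denoting $z=\bar {y}(s)/\sigma (y)$ we find
\[
\frac{\partial p_1 }{\partial \sigma ^2(y)}=-\frac{z}{2\sigma ^2(y)}
\frac{e^{-z^2/2}}{\sqrt {2\pi } }.
\]
The magnitude of this derivative is largest when 
$d\left( {ze^{-z^2/2}} \right)/dz=(1-z^2)e^{-z^2/2}=0$, i.e. at $z=1$, which 
gives
\[
p_{optimal} =\frac{1}{2}\left[ {1+\mbox{erf}\left( {\frac{1}{\sqrt 2 }} 
\right)} \right]\approx 0.841.
\]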
To summarize, the following scenario describes an algorithm for finding 
contributions to DM using random stimulation.
1394: 
\underline {\textbf{Scenario 3:}} Assume that we \textit{do not know} the network connectivity 
$\hat {A}$ and the output metrics vector $\vec {v}$, but that we know the autocorrelation 
$X_{ii} (t_1 ,t_2 )$ for each unit. The steps below allow finding the 
decision makers.
1399: 
1400: \begin{enumerate}
\item Prepare the stimulus so that the probability of correct responses is close to the value given by (\ref{eq59}).
\item Stimulate one unit with a random current, whose variance is $\overline {s_i^2 } $, and measure the decrease in the probability of correct responses $\Delta p_1 $.
\item Record the autocorrelation and evaluate the noise variance $\overline {\eta _i^2 } $ for this unit using (\ref{eq45}). 
\item Find the contribution to DM for this unit using equation (\ref{eq58}).
\item Repeat steps 1 through 4 for all units in the system.
\item Normalize the contributions to DM so that $\sum\limits_i {DM_i } =1$.
1407: \end{enumerate}
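
Steps 4 and 6 of this scenario can be combined into a single expression. Denoting by 
$w_i $ (an auxiliary symbol used only here) the unnormalized estimate given by 
the right-hand side of (\ref{eq58}), we have
\[
w_i =-\overline {\eta _i^2 } \frac{\Delta p_1 (i)}{\overline {s_i^2 } },
\qquad 
DM_i =\frac{w_i }{\sum\limits_j {w_j } }.
\]
Since $\Delta p_1 (i)$ is negative for a distracting current, each $w_i $ is 
non-negative, and the unknown common factor $\left| {\partial p_1 /\partial 
\sigma ^2(y)} \right|^{-1}$ cancels in the ratio.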
1408: 
1409: \section{Discussion}
1410: \label{discussion}
1411: 
In this work we defined decision makers in networks that behave in a well-defined fashion.
As with any definition, there is a certain degree of arbitrariness in our study, 
since this is the first mathematical study of this sort.
We had to make choices about the features of decision making that we 
attempted to describe, as well as about the way they were quantified. 
We demonstrated these features in a set of examples.
Future studies will show whether these features can serve as a basis for a more complete, 
model-independent theory.  
1420: 
1421: 
In this study we postulated that variability and noise, causally linked to decisions, 
are the chief descriptors of DM. Although this point may seem paradoxical, we 
suggest three arguments in its favor. First, variability may reflect 
additional information needed to make a decision in case of uncertainty. 
Such information may be supplied by inputs from other modalities, memories, or other relevant 
modulatory inputs conveying, e.g., the emotional condition of the subject 
or changing utility values (Figure 8). Second, many behaviors, such as C-start 
escape responses in fish (Eaton and Emberley, 1991) and 
other organisms (Glimcher, 2003), have a stochastic character. 
This makes the task of a pursuer more difficult.
Such unpredictable behaviors are reproduced in our model if the sensory input is weak, 
i.e. in the case of a small signal-to-noise ratio. Third, 
the goal of DM is to dissipate sensory information, as suggested in the 
introduction (Figure 1A), whereby an analog multi-dimensional stimulus space 
is reduced to a discrete space of several decisions. We argue that 
this transformation is facilitated by noise.
1438: 
We have studied the problem of finding decision making units in 
networks of various connectivities. This path took us from simple linear chains, for which 
the information-theoretical (IT) approach was found to be effective, to trees, and finally to 
an alternative definition of decision makers, based on the propagation of noise in networks. 
This latter definition is valid in networks of arbitrary topology. 
All these approaches are equivalent 
when they can be compared, but they cover progressively broader classes of networks.
As a practical application of the alternative definition we 
considered the problem of electric stimulation in the surrogate networks and 
showed a way of determining DM contributions for arbitrary networks using 
stimulation with random current. Our findings are summarized in Figures 13 and 14. 
1450: 
1451: 
1452: \begin{figure}[htbp]
1453: \centerline{\includegraphics[width=3.2in]{dma19.eps}}
1454: \caption{
1455: The cluster of problems covered in this study. 
1456: Solid/dashed arrows show the derivations performed here/yet to be confirmed 
1457: or denied. IT stands for information-theoretical.
1458: }
1459: \label{fig13}
1460: \end{figure}
1461: 
Although we studied networks of complex connectivity,
the model describing a single network element was quite simple. 
Not all of the units are linear, of course, since DM is a non-linear task 
(Figure 1A). However, our model is essentially based on linear elements. The 
motivation for this model is that it is easy to analyze. The study of 
simple models is a necessary step before the analysis proceeds any further. 
Once the methodological issues are resolved for simpler models, complex 
non-linear systems can be studied in the same paradigm. 
One important conclusion reached here is that a completely linear element 
can be a decision maker, despite the presence of non-linear units in the 
network. Thus, nonlinearity is not a necessary attribute of DM. This question 
would be impossible to answer for a more realistic system, since in practice 
all units contain nonlinearities. 
1475: 
1476: %In this study we obtained identities of decision makers using correlations 
1477: %with the response. An alternative approach would be to consider correlations 
1478: %with sensory inputs and to surmise that the points, where such correlations 
1479: %disappear are responsible for making decisions. It is clear, however, that 
1480: %some pathway between the unit and motor response is necessary for the unit 
1481: %to impact DM. If such pathway exists, our analysis can be applied. If the 
1482: %pathway does not exist, the approach, involving sensory correlations 
1483: %produces misleading results. On the other hand, it is possible that a 
1484: %combined sensory-motor approach would be more useful. This question warrants 
1485: %further investigation. 
1486: 
1487: \begin{figure}[htbp]
1488: \centerline{\includegraphics[width=3.2in]{dma20.eps}}
1489: \caption{Comparison between different approaches studied here.}
1490: \label{fig20}
1491: \end{figure}
1492: 
1493: 
1494: 
1495: 
1496: 
The decision making task, as formulated in Figure 1A, is similar to a general 
object discrimination task. The representation of the motor response in our model is 
not distinguishable mathematically from the representation of an abstract 
object/decision category (Horwitz and 
Newsome, 1998; Shadlen and Newsome, 2001). The latter does not necessarily 
lead to a motor command. Thus, our analysis may uncover the identities of 
units responsible for the categorization of sensory inputs. In terms of this 
analysis we emphasize the distinction between units representing the object 
category and the units in which this representation is actually formed. The 
former are analogous to motor units in the decision task, while the latter 
are similar to decision makers. As follows from this study, the analysis 
depends upon the topology of the network involved. For simple linear 
sensory chains our conclusion is that the \textit{first} unit, spatially or temporally, in 
which the representation of the object is correlated with the final outcome of 
the discrimination process is responsible for casting the stimulus into one 
of the abstract classes. In the case of recurrent networks a more detailed 
quantitative analysis is needed to draw conclusions about the identities of 
categorizing units. Thus, DMA may find a broader use in identifying units 
representing percepts of abstract objects. 
1516: 
Special care should be taken in distinguishing the DM task from the 
sensory discrimination task. It may occur that in the same experiment these 
tasks are performed by different populations of neurons. An example is given 
by Salinas and Romo (1998). They discovered a population of M1 
neurons responding differentially to two categories of tactile stimuli. Some 
of these neurons did not respond when the same behavior was guided by 
visual cues. This observation is consistent with these neurons performing 
sensory discrimination of tactile stimuli, 
while some other population makes decisions about the actual motor response.  
Our mathematical analysis is general enough to include both 
of these functions. Thus, if correlations with the motor response are 
studied, the analysis yields the decision makers, whereas if correlations 
with percepts are investigated, DMA should provide the identities of the 
discriminating elements.
1531: 
We suggest that DMA may be relevant to other biological systems. 
Possible applications may include the analysis of molecular networks, such 
as genetic regulatory or protein binding networks; finding decision makers 
in compartmental models of dendritic trees (Poirazi and Mel, 
2001); studies of neural networks and structural networks of connectivities 
between different brain areas; and the analysis of social networks. 
1538: 
1539: 
1540: 
1541: 
1542: \section{Conclusion}
1543: \label{gen_conclusion}
1544: 
In this study we define network elements responsible for making decisions. 
We obtain two equivalent definitions. According to the first, decisions are made 
by elements in which correlations with the decision are first formed. 
According to the second definition, decision making activity is measured by 
the impact of variability in a given unit on the response. We give examples of 
network motifs that are especially potent from the decision making perspective, such as 
fan-out hubs and recurrent loops. The latter can function as temporal 
integrators of sensory inputs. We also study how electric stimulation can 
reveal decision making components. We conclude that stimulation with 
time-varying random current produces correct results for all network 
topologies. 
1556: 
1557: 
1558: 
1559: 
1560: \appendix
1561: 
1562: \section{The linear chain model.}
1563: \label{appendixa}
1564: 
Here we solve a more general version of the linear chain model than the one considered 
in the text. The responses of neighboring neurons are related linearly 
1567: \begin{equation}
1568: x_i =C_{i-1} x_{i-1} +\eta _i
1569: \label{Eqa1}
1570: \end{equation}
1571: This is a generalization of (\ref{eq1}). The response of the $n$th unit is 
1572: \begin{equation}
1573: x_n =\sum\limits_{i=1}^n {\alpha _{ni}\eta _i } +\alpha _{n0}x_0
1574: \label{Eqa2}
1575: \end{equation}
1576: where coefficients $\alpha _{ni} =C_{n-1} C_{n-2} \ldots C_i $, $\alpha 
1577: _{nn} =1$. The external signal $x_0 $ is assumed to be zero in this 
1578: appendix, due to (\ref{eq5}). For the last element in the chain we have 
1579: \begin{equation}
1580: x_N =\sum\limits_{i=1}^N {\alpha _{Ni}\eta _i } .
1581: \label{Eqa3}
1582: \end{equation}
1583: 
1584: 
1585: Comparing (\ref{Eqa2}) and (\ref{Eqa3}) we conclude that 
1586: \begin{equation}
1587: x_N =\alpha _{Nn} x_n +\xi ,
1588: \label{Eqa4}
1589: \end{equation}
where $\xi $ is a variable that describes noise in the network downstream 
from unit $n$. It is, thus, uncorrelated with $x_n $. This is where 
the tree-like topology enters our solution, since in the presence of loops $x_n $ and 
$\xi $ would be correlated. Our goal now is to calculate MI between the decision 
variable $d=H(x_N )$ and $x_n $. We will use the definition for MI 
1595: \begin{equation}
1596: MI(d,x_n )=\sum\limits_{d=0,1} {\int\limits_{-\infty }^\infty {dx_n \rho 
1597: \left( {d,x_n } \right)\log _2 \left[ {\frac{\rho \left( {d,x_n } 
1598: \right)}{\rho (d)\rho (x_n )}} \right]} }
1599: \label{Eqa6}
1600: \end{equation}
1601: Here $\rho \left( d \right)=1/2$, since there is no signal;
1602: \begin{equation}
1603: \rho \left( {x_n } \right)=\exp \left( {-x_n^2 /2\overline {x_n^2 } } 
1604: \right)/\left( {2\pi \overline {x_n^2 } } \right)^{1/2}
1605: \label{Eqa8}
1606: \end{equation}
1607: and
1608: \begin{equation}
\rho (d,x_n )=\frac{\rho (x_n )}{2}\left[ {1\pm \mbox{erf}\left( 
1610: {\frac{\alpha _{Nn} x_n }{\sigma \left( \xi \right)\sqrt 2 }} \right)} 
1611: \right].
1612: \label{Eqa9}
1613: \end{equation}
The upper/lower sign is assumed for $d=1$ or $0$ in (\ref{Eqa9}), since $d=1$ corresponds to positive $x_N $; $\sigma (\xi )$ 
is the standard deviation of the Gaussian variable $\xi $ defined in (\ref{Eqa4}). The 
expression for MI (\ref{Eqa6}) results in 
1617: \begin{equation}
1618: \begin{array}{l}
1619:  MI_n =M(s_n ) \\ 
1620:  M(s_n )=\frac{1}{\sqrt \pi }\int\limits_{-\infty }^\infty {dze^{-z^2}\left[ 
1621: {1+\mbox{erf}\left( {zs_n } \right)} \right]} \log _2 \left[ 
1622: {1+\mbox{erf}\left( {zs_n } \right)} \right] \\ 
1623:  s_n =\sigma (\alpha _{Nn} x_n )/\sigma (\xi ). \\ 
1624:  \end{array}
1625: \label{Eqa11}
1626: \end{equation}
1627: MI is therefore a function of signal-to-noise ratio $s_n$. Inversely, 
1628: \begin{equation}
1629: s_n^2 =\frac{\alpha _{Nn}^2 \overline {x_n^2 } }{\overline {\xi ^2} }=\left[ 
1630: {M^{-1}(MI_n )} \right]^2
1631: \label{Eqa12}
1632: \end{equation}
1633: On the other hand, (\ref{Eqa4}) leads to 
1634: \begin{equation}
1635: \overline {x_N^2 } =\alpha _{Nn}^2 \overline {x_n^2 } +\overline {\xi ^2}
1636: \label{Eqa13}
1637: \end{equation}
1638: Solving (\ref{Eqa12}) and (\ref{Eqa13}) with respect to $\alpha _{Nn}^2 \overline {x_n^2 }$ we have 
1639: \begin{equation}
1640: \frac{\alpha _{Nn}^2 \overline {x_n^2 } }{\overline {x_N^2 } }=\frac{\left[ 
1641: {M^{-1}(MI_n )} \right]^2}{1+\left[ {M^{-1}(MI_n )} \right]^2}\equiv F(MI_n 
1642: )
1643: \label{Eqa14}
1644: \end{equation}
The function $M^{-1}$ here is the inverse of $M$ defined in (\ref{Eqa11}). The function $F(MI)$, 
calculated numerically from (\ref{Eqa11}) and (\ref{Eqa14}), is shown in Figure 5. Lastly, 
we recall that the variances $\alpha _{Nn}^2 \overline {x_n^2 } $ are related to 
the strength of noise $\overline {\eta _i^2 } $ through (\ref{Eqa2}). We have 
1649: \begin{equation}
1650: \displaystyle
1651: \alpha _{Nn}^2 \overline {x_n^2 } =\sum\limits_{i=1}^n {\alpha _{Ni}^2 
1652: \overline {\eta _i^2 } }
1653: \label{Eqa15}
1654: \end{equation}
1655: Eqs. (\ref{Eqa14}) and (\ref{Eqa15}) are used below to prove a variety of statements about 
1656: function $F(MI)$ used in the main text.
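
As a check on (\ref{Eqa11}) and (\ref{Eqa14}), the limit of weak correlations, 
$s_n \ll 1$, can be worked out explicitly (this expansion is a sketch added for 
illustration). Using $(1+\epsilon )\log _2 (1+\epsilon )\approx (\epsilon 
+\epsilon ^2/2)/\ln 2$ with $\epsilon =\mbox{erf}(zs_n )\approx 2zs_n /\sqrt 
\pi $, and noting that the term linear in $\epsilon $ is odd in $z$ and vanishes 
upon integration, we obtain
\[
M(s_n )\approx \frac{1}{2\ln 2}\frac{4s_n^2 }{\pi }\frac{1}{\sqrt \pi 
}\int\limits_{-\infty }^\infty {dz\,z^2e^{-z^2}} =\frac{s_n^2 }{\pi \ln 2}.
\]
Since $F=s_n^2 /(1+s_n^2 )\approx s_n^2 $ in this limit, we expect $F(MI)\approx 
\pi \ln 2\cdot MI\approx 2.18\,MI$ for small $MI$.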
1657: 
1658: \subsection{In the uniform noise example $F(MI)$ is a linear function of position in the chain.} 
1659: 
1660: In this case $C_1 =\ldots =C_{N-1} =1$, 
1661: and, consequently, $\alpha _{N1} =\ldots =\alpha _{NN} =1$. Noise variance 
1662: is the same on every node, i.e. $\overline {\eta _i^2 } \equiv \eta^2 $. 
1663: As follows from (\ref{Eqa15}) $\overline {x_n^2 } =\eta ^2n$, which results in 
1664: \begin{equation}
1665: F(MI_n )=n/N
1666: \label{Eqa16}
1667: \end{equation}
1668: It follows that contributions to DM defined by (\ref{eq9}) are the same for all 
1669: units.
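Explicitly, assuming that (\ref{eq9}) assigns to unit $n$ the increment of 
$F(MI)$ along the chain, as used in (\ref{Eqa23}), and that $F(MI_0 )=0$ because 
the external signal is absent, Eq.~(\ref{Eqa16}) gives
\[
DM_n =F(MI_n )-F(MI_{n-1} )=\frac{n}{N}-\frac{n-1}{N}=\frac{1}{N},
\qquad 
\sum\limits_{n=1}^N {DM_n } =1.
\]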
1670: 
1671: \subsection{In the `loud' neuron example the contributions of units upstream 
1672: from the strong link are larger by a factor of $K^2$ than 
1673: contribution from the downstream units.} 
1674: 
1675: 
In this case $\alpha _{N1} =\ldots =\alpha _{Nk} =K$, 
while $\alpha _{N\,k+1} =\ldots =\alpha _{NN} =1$, assuming that the link from unit $k$ to $k+1$ 
is strengthened, i.e. $C_k =K$ while all other $C_i =1$. In the example in the text $k=5$ [cf. (\ref{eq10})]. Eq. (\ref{Eqa15}) 
leads us to the values for variances of responses
1680: \begin{equation}
1681: \displaystyle
1682: \alpha _{Nn}^2 \overline {x_n^2 } =\left\{ {{\begin{array}{*{20}c}\displaystyle
1683:  {\displaystyle \eta ^2K^2n,} \hfill & {n\le k} \hfill \\
1684:  {\eta ^2K^2k+\eta ^2\left(\displaystyle {n-k} \right),} \hfill & {n>k} \hfill \\
1685: \end{array} }} \right.
1686: \label{Eqa17}
1687: \end{equation}
1688: Applying (\ref{Eqa14}) we obtain the expression for $F(MI)$
1689: \begin{equation}
1690: F(MI_n )=\left\{ {{\begin{array}{*{20}c}\displaystyle
1691:  {\displaystyle \frac{K^2n}{N-k+K^2k},} \hfill & {n\le k} \hfill \\
1692:  {\displaystyle \frac{n-k+K^2k}{N-k+K^2k},} \hfill & {n>k} \hfill \\
1693: \end{array} }} \right.,
1694: \label{Eqa18}
1695: \end{equation}
1696: which is a piece-wise linear function of $n$. Eq.~(\ref{eq9}) determines 
1697: contributions to DM as
1698: \begin{equation}
1699: DM_n =\left\{ {{\begin{array}{*{20}c}
1700:  {\displaystyle \frac{K^2}{N-k+K^2k},} \hfill & {n\le k} \hfill \\
1701:  {\displaystyle \frac{1}{N-k+K^2k},} \hfill & {n>k} \hfill \\
1702: \end{array} }} \right.
1703: \label{Eqa19}
1704: \end{equation}
This confirms that the upstream units ($n\le k$) are $K^2$ times more potent 
than the downstream ones ($n>k$). 
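
For a concrete illustration, take $k=5$ as in the text together with the 
hypothetical values $N=10$ and $K=2$ (chosen here only as an example). The 
common denominator in (\ref{Eqa19}) is $N-k+K^2k=10-5+20=25$, so that
\[
DM_n =\left\{ {{\begin{array}{*{20}c}
 {4/25=0.16,} \hfill & {n\le 5} \hfill \\
 {1/25=0.04,} \hfill & {n>5} \hfill \\
\end{array} }} \right.
\]
and the contributions sum to $5\times 0.16+5\times 0.04=1$, as required by the 
normalization.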
1707: 
1708: \subsection{Two definitions of contribution to DM using derivative of 
1709: $F(MI)$ (\ref{eq9}) and the impact of noise (\ref{eq14}) are equivalent. }
1710: 
1711: Let 
1712: us start by determining decision makers from definition (\ref{eq14}). According to 
1713: (\ref{Eqa3})
1714: \begin{equation}
1715: \overline {x_N^2 } =\sum\limits_{i=1}^N {\alpha _{Ni}^2 \overline {\eta _i^2 
1716: } } .
1717: \label{Eqa20}
1718: \end{equation}
1719: Definition (\ref{eq14}) gives
1720: \begin{equation}
1721: DM_i \propto \overline {\eta _i^2 } \frac{\partial \overline {x_N^2 } 
1722: }{\partial \overline {\eta _i^2 } }=\alpha _{Ni}^2 \overline {\eta _i^2 } .
1723: \label{Eqa21}
1724: \end{equation}
1725: After normalization we obtain
1726: \begin{equation}
1727: DM_i =\frac{\alpha _{Ni}^2 \overline {\eta _i^2 } }{\overline {x_N^2 } }.
1728: \label{Eqa22}
1729: \end{equation}
1730: Let us derive the same result from (\ref{eq9}). As follows from (\ref{Eqa14}) 
1731: \begin{equation}
1732: \begin{array}{l}
1733: \displaystyle F(MI_n )-F(MI_{n-1} )=\frac{1}{\overline {x_N^2 } }\left( {\alpha _{Nn}^2 
1734: \overline {x_n^2 } -\alpha _{N\mbox{ }n-1}^2 \overline {x_{n-1}^2 } } 
1735: \right) \\ 
1736:  \displaystyle =\frac{\alpha _{Nn}^2 }{\overline {x_N^2 } }\left( {\overline {x_n^2 } 
1737: -C_{n-1}^2 \overline {x_{n-1}^2 } } \right)=\frac{\alpha _{Nn}^2 \overline 
1738: {\eta _n^2 } }{\overline {x_N^2 } } \\ 
1739:  \end{array}
1740: \label{Eqa23}
1741: \end{equation}
1742: This proves the equivalence of (\ref{eq9}) and (\ref{eq14}), since the result is 
1743: identical to (\ref{Eqa22}).
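
The second line of (\ref{Eqa23}) relies on two relations that we spell out here 
as a sketch. It assumes that the feedforward chain obeys 
$x_n =C_{n-1} x_{n-1} +\eta _n$ with the noise $\eta _n$ uncorrelated with 
$x_{n-1}$; this is our reading of the model and is not restated in this 
appendix. Under this assumption
\[
\alpha _{N,n-1} =\alpha _{Nn} C_{n-1} ,
\qquad 
\overline {x_n^2 } =C_{n-1}^2 \overline {x_{n-1}^2 } +\overline {\eta _n^2 } .
\]
The first relation allows one to factor $\alpha _{Nn}^2 $ out of the 
parentheses, and the second reduces the remaining difference to 
$\overline {\eta _n^2 }$.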
1744: 
1745: \section{Connection between discrete- and continuous-time models.}
1746: \label{appendixb}
1747: 
In this section we show that the discrete-time model can be derived from the 
continuous-time model. Starting from equation (\ref{eq37}) for the unit responses 
in the continuous case, we obtain a relation between the solutions at two 
time points separated by the interval $\tau$, analogous to (\ref{eq19}) in 
the discrete-time description. We then show that in the limit $\tau \to 0$ the two descriptions are equivalent.
1753: 
1754: From (\ref{eq37}) we obtain
1755: \begin{equation}
1756: \vec {x}(t+\tau )=e^{-\hat {A}\tau }\vec {x}(t)+\int\limits_t^{t+\tau } 
1757: {e^{-\hat {A}(t+\tau -{t}')}\vec {\eta }({t}')d{t}'}.
1758: \label{Eqb1}
1759: \end{equation}
This equation can be rewritten as $\vec {x}(t+\tau )=\hat {C}\vec {x}(t)+{\vec {\eta }}'(t)$, 
where ${\vec {\eta }}'(t)$ denotes the integral term in (\ref{Eqb1}) and
1762: \begin{equation}
1763: \hat {C}=e^{-\hat {A}\tau }\approx \hat {I}-\hat {A}\tau .
1764: \label{Eqb2}
1765: \end{equation}
Thus it has the same form as (\ref{eq19}); the linear approximation in (\ref{Eqb2}) holds for small $\tau$. Using (\ref{eq32}), we obtain the new noise cross-correlation matrix
1767: \begin{equation}
1768: {\hat {{\cal N}}}'=\int\limits_0^\tau {e^{-\hat {A}{t}'}\hat {{\cal 
1769: N}}e^{-\hat {A}^T{t}'}d{t}'} .
1770: \label{Eqb3}
1771: \end{equation}
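The step leading to (\ref{Eqb3}) can be sketched as follows, assuming that 
(\ref{eq32}) specifies temporally white noise, 
$\overline {\vec {\eta }({t}')\vec {\eta }^T({t}'')} =\hat {{\cal N}}\delta ({t}'-{t}'')$ 
(our reading of that equation, which is not restated in this appendix). 
Averaging the outer product of the integral term in (\ref{Eqb1}) with itself, 
the delta function removes one of the two time integrals:
\[
{\hat {{\cal N}}}'=\int\limits_t^{t+\tau } {e^{-\hat {A}(t+\tau -{t}')}\hat {{\cal N}}
e^{-\hat {A}^T(t+\tau -{t}')}d{t}'} .
\]
The substitution ${t}'\to t+\tau -{t}'$ then maps the integration range onto 
$[0,\tau ]$ and gives (\ref{Eqb3}).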
The solution of the continuous-time problem therefore satisfies the equations 
of the discrete-time model for an arbitrary time interval $\tau $, but 
the new noise cross-correlation matrix ${\hat {{\cal N}}}'$ is in general 
non-diagonal. In the limit $\tau \to 0$ it becomes 
diagonal. Indeed, (\ref{Eqb3}) implies that in this limit
1777: \begin{equation}
1778: {\hat {{\cal N}}}'=\hat {{\cal N}}\tau \quad ,
1779: \label{Eqb4}
1780: \end{equation}
which is diagonal by the definition of the continuous-time model; here we kept 
only the terms linear in $\tau $. The matrix ${\hat {{\cal N}}}'$ therefore has the diagonal form required by our 
formulation of the discrete-time model. One can also derive (\ref{eq41}) from (\ref{eq28}) 
by substituting (\ref{Eqb2}) and (\ref{Eqb4}) and taking the limit $\tau \to 0$.
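
To see at which order the off-diagonal terms enter, one can carry the expansion 
of (\ref{Eqb3}) one step further. This is a worked sketch added for 
illustration; it assumes only that $\hat {A}$ and $\hat {{\cal N}}$ do not 
depend on time. Expanding the integrand to first order in ${t}'$, 
$e^{-\hat {A}{t}'}\hat {{\cal N}}e^{-\hat {A}^T{t}'}\approx \hat {{\cal N}}
-(\hat {A}\hat {{\cal N}}+\hat {{\cal N}}\hat {A}^T){t}'$, and integrating from 
$0$ to $\tau $ gives
\[
{\hat {{\cal N}}}'=\hat {{\cal N}}\tau -\frac{\tau ^2}{2}\left( {\hat {A}\hat {{\cal N}}
+\hat {{\cal N}}\hat {A}^T} \right)+O(\tau ^3) .
\]
The off-diagonal elements of ${\hat {{\cal N}}}'$ thus appear only at order 
$\tau ^2$, and relative to the diagonal part they are suppressed by a factor of 
order $\Vert \hat {A}\Vert \tau $.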
1785: 
1786: 
1787: 
1788: \begin{thebibliography}{99}
1789: \small
1790: 
1791: 
1792: \bibitem{1}
Abramowitz M, Stegun IA (1972) Handbook of mathematical functions with 
formulas, graphs, and mathematical tables, 10th printing, with 
corrections. Washington: U.S. Govt. Print. Off.
1796: 
1797: \bibitem{2}
1798: Eaton RC, Emberley DS (1991) How stimulus direction determines the 
1799: trajectory of the Mauthner-initiated escape response in a teleost fish. J 
1800: Exp Biol 161:469-487.
1801: 
1802: \bibitem{3}
Efetov K (1997) Supersymmetry in disorder and chaos. Cambridge; 
New York: Cambridge University Press.
1805: 
1806: \bibitem{4}
1807: Glimcher PW (2003) The neurobiology of visual-saccadic decision making. Annu 
1808: Rev Neurosci 26:133-179.
1809: 
1810: \bibitem{5}
1811: Gold JI, Shadlen MN (2002) Banburismus and the brain: decoding the 
1812: relationship between sensory stimuli, decisions, and reward. Neuron 
1813: 36:299-308.
1814: 
1815: \bibitem{6}
1816: Horwitz GD, Newsome WT (1998) Neurophysiology: sensing and categorizing. 
1817: Curr Biol 8:R376-378.
1818: 
1819: \bibitem{7}
1820: Mazurek ME, Roitman JD, Ditterich J, Shadlen MN (2003) A role for neural 
1821: integrators in perceptual decision making. Cereb Cortex 13:1257-1269.
1822: 
1823: \bibitem{8}
1824: Poirazi P, Mel BW (2001) Impact of active dendrites and structural 
1825: plasticity on the memory capacity of neural tissue. Neuron 29:779-796.
1826: 
1827: \bibitem{9}
1828: Robinson DA (1989) Integrating with neurons. Annu Rev Neurosci 12:33-45.
1829: 
1830: \bibitem{10}
1831: Roitman JD, Shadlen MN (2002) Response of neurons in the lateral 
1832: intraparietal area during a combined visual discrimination reaction time 
1833: task. J Neurosci 22:9475-9489.
1834: 
1835: \bibitem{11}
1836: Romo R, Salinas E (2003) Flutter discrimination: neural codes, perception, 
1837: memory and decision making. Nat Rev Neurosci 4:203-218.
1838: 
1839: \bibitem{12}
1840: Salinas E, Romo R (1998) Conversion of sensory signals into motor commands 
1841: in primary motor cortex. J Neurosci 18:499-511.
1842: 
1843: \bibitem{13}
1844: Salzman CD, Murasugi CM, Britten KH, Newsome WT (1992) Microstimulation in 
1845: visual area MT: effects on direction discrimination performance. J Neurosci 
1846: 12:2331-2355.
1847: 
1848: \bibitem{14}
1849: Shadlen MN, Newsome WT (2001) Neural basis of a perceptual decision in the 
1850: parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 
1851: 86:1916-1936.
1852: 
1853: \end{thebibliography}
1854: \end{document}
1855: