1: % arXiv:q-bio.NC/0405027
2:
3: \documentclass[twocolumn,showpacs,pre]{revtex4}
4: %\documentclass[showpacs,pre,preprint]{revtex4}
5:
6: \usepackage{graphicx}
7: \usepackage{mathrsfs}
8:
9: \begin{document}
10:
11: \def\bq{{\mathbf q}}
12: \def\br{{\mathbf r}}
13: \def\bu{{\mathbf u}}
14: \def\bv{{\mathbf v}}
15: \def\bw{{\mathbf w}}
16: \def\bx{{\mathbf x}}
17: \def\by{{\mathbf y}}
18: \def\bz{{\mathbf z}}
19: \def\bA{{\mathbf A}}
20: \def\bD{{\mathbf D}}
21: \def\bI{{\mathbf I}}
22: \def\bJ{{\mathbf J}}
23: \def\bQ{{\mathbf Q}}
24: \def\bS{{\mathbf S}}
25: \def\bV{{\mathbf V}}
26: \def\bW{{\mathbf W}}
27: \def\mD{{\mathcal{D}}}
28:
29: \title{General representation of collective neural dynamics \\
30: with columnar modularity}
31: \author{Myoung Won Cho}
32: \email{mwcho@postech.edu}
33: \author{Seunghwan Kim}
34: \email{swan@postech.edu}
35: \affiliation{
36: Asia Pacific Center for Theoretical Physics $\&$ NCSL,
37: Department of Physics, Pohang University of Science and Technology,
38: Pohang, Gyeongbuk, 790-784, Korea}
39: \date{\today}
40: \begin{abstract}
We present a mathematical framework for representing neural processes at the
cortical level.
The description of neural dynamics with columnar and functional modularity,
named the fibre bundle map (FBM) representation method, is based on both
neuroscience and informatics, yet it leads to the conventional formulas of
statistical physics.
An analogy between phenomena in the brain and in physical systems has been
proposed~\cite{Cho2004A,Cho2004B}.
In spite of the complex circuitry and nonlinear dynamics of neural systems,
neural behavior at high levels may be described by simple and general
rules, which are related to well-known theories in statistical physics.
The FBM method is useful for building and analyzing neural models because it
represents the essential ingredients of neural interactions by general formulas.
54: %We insist that the typical characters of self-organizing map is determined not
55: %by the detailed and complex interaction rules but by the topology of lattice
56: %and feature space.
57: %The collective neural phenomena can be understood and predicted through some
58: %parameters in a general energy form with symmetry transform invariance.
59: %Not only the similarity in formulas, the cortical dynamics can share the
60: %statistical properties with other physical systems, which validated in primary
61: %visual maps~\cite{Cho2004A}.
We apply our method to previously proposed models of visual map formation and
show how they can share statistical properties with vortex dynamics in
magnetism despite their different developmental mechanisms.
65: \end{abstract}
66: \pacs{87.10.+e,87.19.La,89.75.Fb}
67:
68: \maketitle
69:
70: \section{Introduction}
Although the detailed dynamics of single neurons are known,
it remains a challenge at the network level to explain how brains
perform higher cognitive functions.
Studies of physical models have focused on achieving biological realism in
neural computation models.
%such as the networks of coupled oscillators~\cite{Nishikawa2004,Aonishi1999},
However, despite the success of basic neural network models, which are based
on a framework of connections between simple cells, in small adaptive systems,
they are inherently problematic for understanding collective neural phenomena
and higher cognitive behavior in the real brain.
There are also attempts to describe neural processing at higher levels in
terms of the functional modularity of neurons or symbolic processing
architectures.
Before physiological evidence of repetitive cortical blocks was available,
modularity among neighboring neurons, the so-called {\em cell assemblies}
(CAs), had already been proposed in view of the high-dimensional attributes
and faculties of neurons~\cite{Hebb1949}.
Neurons with similar functional specializations tend to aggregate and to
organize hierarchically.
Although neural clusters have been classified and named in different ways, we
adopt the hierarchy {\em neuron} - {\em minicolumn} - ({\em hypercolumn}) -
{\em macrocolumn} - {\em cortex area} - {\em hemisphere}, where the minicolumn
is a candidate for ``the repeating pattern of circuitry'' or ``the iterated
modular unit''~\cite{Calvin1998}.
95:
In this paper, we present a mathematical framework, briefly introduced as the
fibre bundle map (FBM) method in Ref.~\cite{Cho2004A}, and show how to
represent neural processes at the cortical level in a general way.
Briefly speaking, the FBM representation is a mapping of the feature components
(often the synaptic weights) to topological spaces, the so-called {\em bundles}.
Pattern information is reordered in locally transformed coordinates, and a
few major components are extracted.
Other mathematical frameworks for representing neural processes in a reduced
space certainly exist.
Kohonen set up the mathematical preliminaries, the so-called feature maps or
{\em feature-based} representation, in vector space, which led to subsequent
models of artificial and physiological neural networks~\cite{Kohonen1984}.
Symbolic processing architectures also offer descriptions of neural
computation at the cognitive and rational bands.
Feature vector spaces and symbolic sets can be regarded as particular kinds of
FBM representation.
The FBM method, however, focuses on the manifold structure of frequent
inputs in feature space and on the corresponding symmetry group.
Indeed, the properties of the neural process are governed not by the detailed
neural interaction rules but by the algebraic structure of the dominant feature
components.
The FBM method represents the neural process by the general formulas of
statistical physics and helps one to comprehend collective neural phenomena
intuitively through knowledge of statistical mechanics and differential
geometry.
121:
As a class of abstract representations, the formulas in FBM models differ in
character from those in feature-based (often called ``low-dimensional''
feature vector) models.
In the feature-based representation, the change in the feature vector
$\Phi(\br)$ at position $\br$ is driven by its difference from the stimulus
vector $\Phi(\br')$, as in $\Delta\Phi(\br)\propto(\Phi(\br')-\Phi(\br))$,
with an energy of the form $|\Phi(\br)-\Phi(\br')|^2$ (or its higher powers).
129: %Whereas, the energy functions in the FBM representation consist of the inner
130: %products such as $\psi(\br)\psi'(\br)$.
In the FBM representation, by contrast, the interactions between neurons are
expressed by inner products rather than distances, and it is generally
assumed that the energy of the neural process can be expanded in a power
series, i.e.,
135: %and classified according to the number of coupling.
136: %It is assumed that the energy of neural process can be expanded in a power
137: \begin{eqnarray} \label{eq:expansion}
138: E[\psi]&=&E^{(0)}-\sum_iB(\br_i)\psi(\br_i) \nonumber \\
139: &-&\frac{1}{2!}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j) \\
140: &-&\frac{1}{3!}\sum_{i,j,k}F(\br_i,\br_j,\br_k)
141: \psi(\br_i)\psi(\br_j)\psi(\br_k) + \cdots. \nonumber
142: %&-&\frac{1}{4!}\sum_{\bx,\by,\bz,\bw}G(\bx,\by,\bz,\bw)
143: % \psi(\bz)\psi(\by)\psi(\bz)\psi(\bw)+\cdots, \nonumber
144: \end{eqnarray}
where the field variables $\psi(\br)$ denote the feature state of neurons at
cortical location $\br$.
We will show how this formula is derived and how the interaction functions are
determined in the simple cell (the so-called ``high-dimensional'' feature
vector representation) and the complex cell models.
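As a brief illustration of how the distance form and the inner-product form
are related (a sketch under the additional assumption, not required by the
models discussed here, that the feature vectors are normalized to
$|\Phi(\br)|^2=c$ for all $\br$), the distance-type energy can be expanded as
\begin{eqnarray}
|\Phi(\br)-\Phi(\br')|^2
&=&|\Phi(\br)|^2+|\Phi(\br')|^2-2\,\Phi(\br)\cdot\Phi(\br') \nonumber \\
&=&2c-2\,\Phi(\br)\cdot\Phi(\br'), \nonumber
\end{eqnarray}
so that in this special case it differs from an inner-product coupling of the
kind appearing in the quadratic term of Eq.(\ref{eq:expansion}) only by an
additive constant.
Without the normalization, the squared-norm terms contribute additional
on-site terms.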
150: %This formula can be derived concretely from the fundamental neural process as well.
151: %Moreover, in the simple cell (or the ``high-dimensional'' representation)
152: %models, the objective function for Hebbian modification obey this general form
153: %as a function the synaptic weights $\bW$ rather than the field variables $\psi$.
In the continuum limit, the energy can be approximated by
155: \begin{eqnarray} \label{eq:continuum}
156: E[\psi]=\int d\br\left\{\frac{v}{2}|(\nabla-i\bA)\psi|^2
157: +\frac{m^2}{2}|\psi|^2+\frac{g}{4!}|\psi|^4\right\},
158: \end{eqnarray}
where the odd-power terms are generally expected to vanish.
This is just the Ginzburg-Landau energy with gauge invariance, and it explains
the statistical properties of the emergent cortical maps in experiments and
simulations.
The energy in a continuum approximation can often be derived using only
minimal mathematical constraints such as symmetry.
165: %requirement of invariance under the symmetry transformations without the detailed cortical modification rules.
The energy forms in Eq.(\ref{eq:expansion}) and Eq.(\ref{eq:continuum}) suggest
an analogy between physical and neural systems, and
the characteristics of developed visual maps can be understood systematically
through the statistical properties of vortices in
magnetism~\cite{Cho2004A,Cho2004B}.
171: %Phase transitions can be predicted when the changes in parameters, whereas the
172: %parameters are obtained from the detailed interaction mechanisms.
173:
174: %The general energy form in cortical dynamics can be build via two different
175: %ways.
176: %One, the energy function and the pattern properties in cortical map formations
177: %can be inferred only using the topologic properties.
178: %the symmetry between the feature states (or called {\em gauge symmetry} in quantum mechanics).
179: %Considering the transform invariant properties, it is generally assumed that
180: %the energy of map formations takes the form at a continuum limit
181: %Another way is to build models through the detailed description of individual
182: %neural interactions.
183:
We apply the FBM representation method to developmental models of the visual
cortex.
The formation of cortical maps of orientation and ocular dominance columns is
one of the most studied problems in brain research.
A considerable number of different models have been proposed, some of which
have been compared with experimental findings and with one
another~\cite{Erwin1995,Swindale1996}.
191: %The theoretic analysis of pattern formation are reported within a few of models.
192: Miller {\em et al.} formulated {\em correlation-based} models describing how
193: ocular dominance and orientation columns develop in simple cell
194: models~\cite{Miller1989,Miller1992,Miller1994}.
Obermayer {\em et al.} presented a statistical-mechanical analysis of pattern
formation and compared its predictions quantitatively with experimental data
using Kohonen's {\em self-organizing feature map} (SOFM) approach.
Wolf {\em et al.} rederived the conditions for the emergence of a columnar
pattern in the SOFM algorithm~\cite{Wolf2000}.
Studies of the {\em elastic-net} model also show the bifurcation and
emergence of a columnar pattern~\cite{Durbin1990,Hoffsummer1995,Goodhill2000}.
Scherf {\em et al.} investigated pattern formation in ocular dominance
columns with a more detailed model, which covers the results of both the SOFM
algorithm and the elastic-net model~\cite{Scherf1999}.
Wolf and Geisel predicted, in a model-independent way, the influence of the
interactions between ocular dominance and orientation columns on pinwheel
stability and demonstrated it in simulations of the elastic-net
model~\cite{Wolf1998}.
Lateral (or neighbor) interaction models are also successful schemes based
on physiology~\cite{Swindale1980,Swindale1982,Cowan1991,Cho2004A}.
211:
Among the proposed visual map formation models, the Hamiltonian models with
spin variables belong to the class of FBM representation
models~\cite{Cho2004A,Cowan1991,Tanaka1989}.
Other developmental models written in the high- or low-dimensional feature
vector representation can be rewritten in the FBM representation.
The formulas in FBM models represent the essential ingredients of neural
interactions without paying much attention to particular neural control
mechanisms.
Moreover, recasting the iterative procedure of a model into the
form of Eq.(\ref{eq:expansion}) or Eq.(\ref{eq:continuum}) amounts to a
statistical analysis of the model itself.
The quadratic interaction function $D(\br_i,\br_j)$ is of central importance
in visual map formation, as in other physical systems.
The interaction functions in a neural process represent more than the
intracortical or recurrent connections in the so-called lateral activity
control.
In competitive Hebbian models, such as the elastic-net model and the SOFM
algorithm, the interaction functions also incorporate the feedforward
competition or normalization process.
Nevertheless, in the FBM representation the interaction matrices
$D(\br_i,\br_j)$ of the visual map formation models share a common shape, the
so-called Mexican hat type, that is, positive at short range and negative at
long range, in spite of their different developmental mechanisms.
Bifurcation to an inhomogeneous state and the emergence of a columnar
pattern become possible when there are strong negative interactions in
$D(\br_i,\br_j)$.
The development of a columnar pattern is also associated with a non-vanishing
vector $\bA$, the so-called {\em vector potential} in physics, in
Eq.(\ref{eq:continuum}).
The FBM representation method will show how developmental models with
different mechanisms lead to the successful formation of visual maps and
share the statistical properties of vortices in the spin Hamiltonian models.
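
To illustrate why a Mexican hat interaction favors a periodic pattern, consider
a hypothetical difference-of-Gaussians kernel in two dimensions.
This form, its amplitudes $a>b>0$ and its widths $\sigma_2>\sigma_1$ are
assumptions chosen only for illustration and are not taken from any of the
models cited above.
For a translation-invariant kernel depending on $r=|\br_i-\br_j|$, the
quadratic term of Eq.(\ref{eq:expansion}) is diagonal in Fourier space and is
lowered most by the mode at which the transform $\tilde D(q)$ is largest:
\begin{eqnarray}
D(r)&=&a\,e^{-r^2/2\sigma_1^2}-b\,e^{-r^2/2\sigma_2^2}, \nonumber \\
\tilde D(q)&=&2\pi a\sigma_1^2\,e^{-\sigma_1^2q^2/2}
-2\pi b\sigma_2^2\,e^{-\sigma_2^2q^2/2}. \nonumber
\end{eqnarray}
Setting $d\tilde D/d(q^2)=0$ gives a maximum at a non-zero wave number,
\begin{eqnarray}
q_*^2=\frac{2}{\sigma_2^2-\sigma_1^2}
\ln\left(\frac{b\,\sigma_2^4}{a\,\sigma_1^4}\right), \nonumber
\end{eqnarray}
provided that $b\,\sigma_2^4>a\,\sigma_1^4$, that is, provided the long-range
negative part is strong enough.
Otherwise $\tilde D(q)$ is largest at $q=0$ and the homogeneous mode is
favored.
In this sketch the expected pattern wavelength is of the order of $2\pi/q_*$.
Which mode actually develops also depends on the remaining terms of the energy.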
243: %Recently, we predicted the bifurcation of inhomogeneous solutions also in
244: %lateral interaction models, and derive the typical properties in observed
245: %patterns, such as the orthogonality and the correlation function~\cite{Cho2004A}.
246:
247: \section{Representation of neural state with columnar modularity}
The structures and connections in the cerebral cortex are more complex and
modular than those in artificial neural networks.
Neurons tend to be vertically arrayed in the cortex, forming cylinders known as
cortical columns.
Traditionally, six vertical layers have been distinguished and classified into
three different functional types.
The layer IV neurons ({\em IN} box) first receive the long-range input
currents and send them up vertically to layers II and III ({\em INTERNAL}
box), the so-called true association cortex.
Output signals are sent down to layers V and VI ({\em OUT} box) and then
on to the thalamus or other deep and distant neural structures.
Lateral connections also occur among the superficial (layer II and III)
pyramidal neurons.
In columnar (or horizontal) clustering, there are minicolumns, which
consist of about 100 neurons and are about 30~$\mu$m in diameter in monkeys,
and macrocolumns, which are 0.4--1.0~mm across and contain at most a few
hundred minicolumns.
At a coarser level, there are 52 cortical areas in each human
hemisphere; a Brodmann area averages 21~cm$^2$ and 250 million neurons grouped
into several million minicolumns~\cite{Calvin1998}.
268:
269: %\begin{figure}[t]
270: %\includegraphics[width=8cm]{3d-colmn}
271: %\caption{ \label{fig:3d-colmn}
272: % The 3-D structure of cortical columns.
273: % (Reprinted by permission from William H. Calvin, 2001,
274: % {\em The Cerebral Code}, The MIT Press, Copyright \copyright 1996 by William
275: % H. Calvin)
276: %}
277: %\end{figure}
278:
The columnar modules can be regarded as a kind of multi-layered neural network
and may have complex functional attributes.
Most neurons in the brain respond selectively to received activity, and the
preferred signals provide a useful representation
of the functional attributes of a small neural region.
A traditional representation of the neural state is the vector notation $\bv$,
whose components correspond to the activities of the neurons in the receptor
layer.
If a columnar module (or complex cell) at position $\br$ responds selectively
to a particular input vector $\bv$ and produces an output vector $\by$, its
functional attribute can be represented compactly as
290: \begin{eqnarray} \label{eq:associator}
291: w(\br)=\by\circ F\circ\bv^\top,
292: \end{eqnarray}
where $F$ is the nonlinear response or activation function of the complex cell.
% posterior probability function.
If the activation function is linear or ignored, this leads to a simple pattern
associator, the so-called {\em linear associator}.
Experiments that probe the response properties to external stimuli by
electrode penetration can be understood as measuring the product of the
associator $w(\br)$ and the input signal $\bv'$:
300: \begin{eqnarray} \label{eq:inner_product}
301: |w(\br)\circ\bv'|=|\by|\ F(\bv^\top\bv'),
302: \end{eqnarray}
where the activity of the output $|\by|$ corresponds to the measured
number of action potentials or the frequency of spikes.
In physiological experiments on complex cells in the primary visual
cortex~\cite{Hubel1962} or on object perception in the inferotemporal (IT)
cortex~\cite{Tsunoda2001}, the response property of a columnar module is
usually a combination of different patterns, so the functional form in
Eq.(\ref{eq:associator}) should be extended to a sum of associators.
When the output $\by$ coincides with the most preferred input $\bv$, as in
Hopfield networks~\cite{Hopfield1982}, or when only the most preferred input
is of concern, a vector notation suffices to represent the functional
attributes of columnar modules.
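
As a minimal worked case of Eq.(\ref{eq:associator}) and
Eq.(\ref{eq:inner_product}), assume that $F$ is the identity and read the
composition $\circ$ as an ordinary matrix product (both are simplifying
assumptions made here for illustration).
The associator is then the outer product $w=\by\,\bv^\top$, and
\begin{eqnarray}
w\,\bv'&=&\by\,(\bv^\top\bv'), \nonumber \\
|w\,\bv'|&=&|\by|\ |\bv^\top\bv'|, \nonumber
\end{eqnarray}
so the measured response is proportional to the overlap between the presented
input $\bv'$ and the preferred input $\bv$, and for inputs of fixed norm it is
largest when $\bv'$ is parallel to $\bv$.
For a module that responds to several patterns, indexed here by a hypothetical
label $k$, the same reading gives $w=\sum_k\by_k\bv_k^\top$ and
$w\,\bv'=\sum_k\by_k\,(\bv_k^\top\bv')$.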
314:
315: %\begin{figure}[t]
316: %\begin{minipage}[b]{4cm}
317: % \includegraphics[width=4cm]{intrinsic} (a) intrinsic type
318: %\end{minipage}
319: %\ \
320: %\begin{minipage}[b]{4cm}
321: % \includegraphics[width=3.3cm]{extrinsic} \\ \ \\ \ \\ (b) extrinsic type
322: %\end{minipage}
323: %\caption{ \label{fig:coding_type}
324: % For the response properties of neurons, two different encoding types are
325: % possible whether the synaptic connections are (a) between close neurons
326: % within columnar module or (b) with far aparted neurons cross cortex areas.
327: %}
328: %\end{figure}
329:
Fig.~\ref{fig:network} depicts a neural network with columnar modules.
A matrix $\bW$ denotes the feedforward synaptic weights across cortical areas,
such as the connections between LGN and primary visual cortex, and the input
vector to a columnar module is given by $\bv_i=\bW_i\bu$ (or
$v_i=\sum_\alpha W_{i\alpha}u_\alpha$).
In a complex cell model, the synaptic connections within a columnar module,
$w(\br)$, are expected to carry the functional attributes of the neurons,
whereas in a simple cell model the connections with the external cells, $\bW$,
are considered to carry them.
For example, ocular dominance in the primary visual cortex is determined by
whether a neuron in V1 is more strongly connected to left-eye or right-eye
(LGN) cells.
We call this the {\em extrinsic} information coding type, which is realized by
the connectivity between distant neurons across cortical areas, whereas the
{\em intrinsic} type is realized by the synaptic plasticity between nearby
neurons within a columnar module.
The neural attributes of the two coding types are represented by a common
formula in FBM models, but they rest on different grounds when actual models
are built.
The feedforward competition behavior should be related to the intrinsic coding
type.
Moreover, the extrinsic encoding type causes a problem in modeling huge
networks, because far too many connections are required when the meaning of
an activity is characterized only by where the current comes from.
We expect that the intrinsic type, which encodes information in spatial or
temporal correlations within a signal band, is essential in huge networks and
would be a prominent strategy in the real brain.
356:
357: \begin{figure}[t]
358: \includegraphics[width=8cm]{network}
359: \caption{ \label{fig:network}
A neural network model composed of columnar modules with functions $w^{(i)}$.
The input signal $\bv_i$ to a columnar module is driven by feedforward synapses
with weights $\bW$, that is, $\bv_i=\bW_i\bu$, and its output $\by_i$ is
interconnected to neighbors by intracortical connections $\bJ$.
364: %Information (or the functional attributes of neurons) are encoded in the
365: %connectivity within columnar modules ${\bf w}^{(I)}$ (intrinsic type) or in
366: %the feedforward synapses $\bf W$ (extrinsic type).
367: }
368: \end{figure}
369:
370: \section{Fibre bundle map representation}
371: %In simple cell models, the cortical models can get extremely complex
372: % the high-dimensional components the amount of receptor cells,
373: %To deal with this, a class of more abstract models has been developed.
374: %In the ``low-dimensional'' feature vector representation each component stands
375: %for a selected response property.
376: %For example, the features of orientation columns are denoted by Cartesian
377: %components
378: %$\Phi(\br)=\left(q(\br)\sin(2\phi(\br)), q(\br)\sin(2\phi(\br))\right)$ for
379: %preferred orientation $\phi(\br)$ and degree of preference for that orientation
380: %$q(\br)$ at each cortical location $\br$~\cite{Swindale1982}.
381: %In the FBM representation, however, they sometimes takes similar forms with
382: %the low-dimensional feature vector representation, the feature components are
383: %approximated with different standpoint.
384: % given pattern vector, we can extract the feature components,
385: %that are the center of the pattern $(x,y)$ and the maximal variance vector $(v_x,v_y)$.
386: %With the ocular dominance $z$, the feature vector with 5 components,
387: %\begin{eqnarray} \label{eq:visual_feature_vector}
388: % \Phi=(x,y,v_x,v_y,z)
389: %\end{eqnarray}
390: %is a usual representation of the orientation and the ocular dominance columns in visual cortex.
391: %so to say, a reduced dimensional
392: %The components in the FBM representation are composed of
393: %representation with the most prominent components on other basis.
394:
The FBM representation method is based on a mathematical framework, the
so-called {\em fibre bundle} in manifold theory~\cite{Martin1991,Nash1983}.
For a trivial fibre bundle, a total (or bundle) space $E$, which will depict
the neural attributes of a cortical area, is composed of a base space $B$ and a
fibre $F$, that is, $E=B\times F$.
In our case, cortical locations are the elements of the base space, and the
feature (often pattern, code or model) space is the fibre.
A structure (or symmetry) group $G$ is a group of homeomorphisms of the fibre
$F$, and it coincides with the fibre $F$ in a {\em principal fibre bundle}.
The principal fibre bundles admit {\em connexions} (or vector potentials in
physics), and it is for this reason that they are of basic importance in gauge
theories in physics.
The features of cortical cells or small cortical regions at each cortical
location $\br$ are represented by a set of field variables $\psi_a(\br)$,
combined as
410: \begin{eqnarray} \label{eq:representation}
411: \psi(\br)=|\psi(\br)|\exp(-i\phi_a(\br)\tau^a)=\psi_a(\br)\tau^a,
412: \end{eqnarray}
where $\phi_a(\br)$ is an arbitrary internal (feature) phase and the $\tau^a$
form the basis of a continuous (or Lie) group $G$.
The number of basis elements can be as large as the number of receptor cells,
but it is usually reduced according to the statistical structure of the inputs.
The frequent inputs usually occupy small regions of the total feature space,
and the major variance of the feature components occurs within an embedded
submanifold with high stimulus density (Fig.~\ref{fig:V}).
420: %, the bases are transformed according to the principal directions of external stimuli density at a point.
The reduction of the feature space is also related to the extraction of
features from inputs in learning rules.
Symmetry breaking between the transformed feature components is expected in
the course of experience and learning, and cortical dynamics can be
described with a few field components in a reduced feature space.
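
As a concrete instance of Eq.(\ref{eq:representation}), consider orientation
preference, for which the symmetry group is $U(1)$ with a single generator.
The following sketch writes the field for a preferred angle $\phi(\br)$ and a
degree of preference $q(\br)$; the symbol $q$ and the sign of the phase are
choices made here for illustration and may differ from the convention of
Eq.(\ref{eq:representation}):
\begin{eqnarray}
\psi(\br)&=&q(\br)\,e^{2i\phi(\br)} \nonumber \\
&=&q(\br)\cos 2\phi(\br)+i\,q(\br)\sin 2\phi(\br). \nonumber
\end{eqnarray}
The factor of 2 reflects the fact that the orientations $\phi$ and $\phi+\pi$
are equivalent, so that $\psi$ is single valued.
In this reduced description two real components,
$(q\cos 2\phi,\ q\sin 2\phi)$, stand in for the full set of synaptic weights.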
426:
427: \begin{figure}[t]
428: \includegraphics[width=8cm]{V}
429: \caption{ \label{fig:V}
430: Probabilistic external stimuli and a potential function with external source.
The transformed basis elements $\tau^{1'}$ and $\tau^{2'}$ are the principal
directions of the external stimulus density at a point.
433: }
434: \end{figure}
435:
436: %The interactions in cortical circuitry and the synaptic plasticity are more
The differential geometric concepts in the FBM representation furnish an
intuitive explanation for emergent cortical maps.
The self-organization of feature maps, achieved by locally gathering similar
preferences, means that the features vary smoothly between neighboring neurons
at each location.
In other words, the properties of ``organized'' and ``optimized'' feature maps
are related to those of ``continuous'' and ``flat'' variables on a manifold.
If the features do not differ among neighbors in a small region near
position $\br$, this can be expressed as $\nabla\psi(\br)=0$ (or
$\nabla\phi(\br)=0$).
If the phase angle is slightly tilted at position $\br$ and a vector
$\bA(\br)$ denotes the difference between neighboring phase angles, the
condition is written with the so-called {\em covariant derivative} as
$(\nabla-i\bA(\br))\psi(\br)=0$ (or $\nabla\phi(\br)-\bA(\br)=0$).
If the covariant derivative vanishes for all $\br$ (the field is then said to
be flat, or parallel translated, in manifold theory), the distribution of the
field variables $\psi(\br)$ is a minimizing solution of the integral
454: \begin{eqnarray} \label{eq:action}
455: S=\int d\br\ |(\nabla-i\bA)\psi|^2,
456: \end{eqnarray}
for the connexion $\bA$.
A non-vanishing connexion $\bA$ occurs when there is strong competitive
behavior or inhibitory lateral interaction between neurons, and it is related
to the emergence of a periodic pattern in cortical maps, such as the band
patterns in ocular dominance columns and the linear zones in orientation
preference columns, with the wavelength $\Lambda=2\pi/|\bA|$.
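As a minimal check of this wavelength, assume a constant connexion $\bA$, a
constant amplitude $\psi_0$ and the phase convention $\psi=\psi_0e^{i\phi}$
implied by the condition $\nabla\phi-\bA=0$ (the constants $\psi_0$ and
$\phi_0$ are introduced here only for illustration).
The plane wave
\begin{eqnarray}
\psi(\br)=\psi_0\,e^{i(\bA\cdot\br+\phi_0)} \nonumber
\end{eqnarray}
satisfies $(\nabla-i\bA)\psi=(i\bA-i\bA)\psi=0$, so the integrand of
Eq.(\ref{eq:action}) vanishes everywhere, and $\psi$ repeats under a
translation by $2\pi/|\bA|$ along the direction of $\bA$.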
Fig.~\ref{fig:macaque} shows the complete pattern of ocular dominance stripes
of a macaque monkey.
The orthogonality between the contour lines of a feature map and the boundary
of the cortical area is a property of the minimal solutions of
Eq.(\ref{eq:action}).
From the condition $\delta S/\delta\phi\sim 0$ or $\nabla^2\phi\sim 0$ for
$\psi=e^{2i\phi}$ with the preferred angle $\phi$, the normal component of
$\nabla\phi$ vanishes at the area boundary, since the integral along a narrow
rectangular loop over the area boundary, $\oint_C\nabla\phi\cdot d\hat{n}$,
vanishes due to the divergence theorem.
Such perpendicularity to the area boundary is also found in other static
field solutions, such as the magnetic field.
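
The boundary condition can also be sketched by an explicit variation, under
the simplifying assumptions (made here for illustration) that $\bA=0$,
$|\psi|=1$ and the field is free on the boundary $\partial\Omega$ of the
cortical area $\Omega$.
For $\psi=e^{2i\phi}$, Eq.(\ref{eq:action}) becomes
$S=4\int_\Omega d\br\,|\nabla\phi|^2$, and
\begin{eqnarray}
\delta S&=&8\int_\Omega d\br\ \nabla\phi\cdot\nabla\delta\phi \nonumber \\
&=&-8\int_\Omega d\br\ \delta\phi\,\nabla^2\phi
+8\oint_{\partial\Omega}dl\ \delta\phi\,(\hat{n}\cdot\nabla\phi), \nonumber
\end{eqnarray}
where $\hat{n}$ is the outward normal and $dl$ the line element of the
boundary.
Stationarity for arbitrary $\delta\phi$ requires $\nabla^2\phi=0$ inside
$\Omega$ and $\hat{n}\cdot\nabla\phi=0$ on $\partial\Omega$.
Since $\nabla\phi$ is then tangent to the boundary and the contour lines of
$\phi$ are perpendicular to $\nabla\phi$, the contour lines meet the boundary
at right angles.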
474:
475: \begin{figure}[t]
476: \begin{minipage}[b]{5cm}
477: \includegraphics[width=5cm]{evol03c}
478: \end{minipage}
479: \ \
480: \begin{minipage}[b]{0.65cm}
481: \includegraphics[width=0.65cm]{bar2}
482: \end{minipage}
483: \caption{ \label{fig:orientation_map}
484: The simulation result of orientation map formation.
485: The orientation maps have $U(1)$ (or $O(2)$) symmetry and the major
486: characteristics of the developed map can be predicted using only the symmetry
487: properties.
488: }
489: \end{figure}
490: \begin{figure}[t]
491: \includegraphics[width=6cm]{macaque}
492: \caption{\label{fig:macaque}
493: The complete pattern of ocular dominance stripes in the striate cortex of a
494: macaque monkey.
495: There is a strong tendency for the stripes to meet the margin of striate
496: cortex at steep or right angles.
497: (Reprinted by permission from S.LeVay, Copyright \copyright 1985 by the
498: Society for Neuroscience~\cite{LeVay1985}.)
499: }
500: \end{figure}
501:
502: The symmetry property also helps to predict the energy function of the cortical
503: map formation.
504: %The major features in cortical maps are universal and can be understood through
505: %the experience in other physical systems.
506: %Other orientation development models should also satisfy the energy form in spite
507: %of each different interaction rules employed.
508: For example, the features of orientation preference columns in the visual
509: cortex have $U(1)$ (or $O(2)$) symmetry.
If we rotate all the preferred angles by the same angle
($\phi\rightarrow\phi+\chi$, the so-called `global' gauge transformation), the
energy of orientation map formation should remain invariant.
The rotation angle $\chi$ may also depend on position $\br$ (the so-called
`local' gauge transformation), and the energy in a continuum approximation may
then take the form of Eq.(\ref{eq:continuum}) or Eq.(\ref{eq:action}) with
$\bA=\nabla\chi(\br)$.
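
The invariance under a local gauge transformation can be checked directly.
The following is a standard sketch, assuming the convention
$\psi\rightarrow\psi'=e^{i\chi(\br)}\psi$ together with
$\bA\rightarrow\bA'=\bA+\nabla\chi$ (the sign convention is an assumption made
here for illustration):
\begin{eqnarray}
(\nabla-i\bA')\psi'
&=&e^{i\chi}\left[\nabla\psi+i(\nabla\chi)\psi
-i\bA\psi-i(\nabla\chi)\psi\right] \nonumber \\
&=&e^{i\chi}(\nabla-i\bA)\psi. \nonumber
\end{eqnarray}
Hence $|(\nabla-i\bA')\psi'|^2=|(\nabla-i\bA)\psi|^2$ and $|\psi'|=|\psi|$, so
each term of Eq.(\ref{eq:continuum}) is unchanged.
In particular, starting from $\bA=0$, the transformation generates the
pure-gauge connexion $\bA'=\nabla\chi$.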
517:
518: %%The FBM approaches say that the symmetry structure between features determines the typical character of self-organizing feature maps.
519: %The structure of transition group between features can be presumed with
520: %several algebraic descriptions :
521: %(1) In the primary sensory areas, the transition functions can be inferred from
522: %the symmetry in external activities or patterns.
523: %The symmetry group corresponding with the feature space is clear and complete.
524: %(2) If two different features, $\psi_1(\br)$ and $\psi_2(\br)$, are occupied
525: %at common cortex, the imposed restriction for normalization is
526: %$|\psi_1(\br)|^2+|\psi_2(\br)|^2=const$ for all position $\br$.
527: %For example, the symmetry group of the orientation and ocular dominance columns
528: %in primary visual cortex is not $O(2)\times O(1)$ but $O(3)$.
529: %A typical character of two combined feature maps is that the contour lines of
530: %them meet at right angle, because of $\nabla\psi_1\cdot\nabla\psi_2\sim 0$ with
531: %the equilibrium conditions
532: %$\delta E/\delta\psi_\alpha\sim 0$ or $\nabla^2\psi_\alpha\sim 0$ for
533: %$\alpha=1$, $2$.
534: %The orthogonal property between the orientation and ocular dominance maps is
535: %reported also in animal experiments~\cite{Obermayer1993}.
536: %(3) Like the primary auditory cortex, the transition functions corresponding
537: %with the features are not consist complete group but be ordered sequentially.
538: %In this cases the extreme (the maximal or minimal) features tend to exist at
539: %the boundary of feature map.
540: %(4) Some measurements in biologic experiments, such as the correlation between
541: %activity, give information about the difference between codes.
542: %However there are no experimental evidence, we can guess the relative distance
543: %between codes and classify them according to their category, such as human
544: %faces, monkey faces or shapes in inferotemporal cortex.
545: %Such homomorphic representation of group structure is useful for the problems
546: %of functional area differentiations at macroscopic level.
547: %(5) At high cognitive area, it is not easy to infer the transition function
548: %group because the code space is embedded on very large and high dimensional
549: %manifold.
550: %If we cannot guess any more symmetry or relative distance between codes,
551: %symbolic sets will be available, where they are complete groups also.
552:
553: \section{Description of detailed neural interactions}
554: %If we consider the selective response to input signals due to the connections
555: %between neurons within a cluster, the attributes of a block of neurons can be
556: %represented following the basic neural network architectures, what Kohonen
557: %(1977) labeled {\em heteroassociation}.
558: %The cortical modification models at low level (or single-cell models) suggest
559: %more physical features of neural interactions and the biologic foundation of
560: %more abstract models.
The description of neural dynamics at a high level should also be based on
both neuroscience and informatics.
One important principle of neural plasticity is the Hebbian rule: when two
neurons on either side of a connection are simultaneously active, the weight
of that connection is increased~\cite{Hebb1949}.
The simple Hebbian plasticity rule for a single neuron with inputs $\bu$ and
weights $\bW$ takes the form
568: \begin{eqnarray} \label{eq:Hebbian}
569: \Delta\bW(t)\propto y(t)\bu(t)
570: \end{eqnarray}
for the output $y=f(\bW\bu)$, where $f$ is the activation function of a simple
cell.
In intracortically connected networks, the input becomes the sum of the
currents from the afferent input and from neighboring cells.
The output of the neuron at the $i$-th site becomes
575: \begin{eqnarray} \label{eq:recursive}
576: y_i=f(v_i+\sum_j J_{ij}y_j)
577: \end{eqnarray}
for $v_i=W_{i\alpha}u_\alpha$ (summed over the repeated index $\alpha$) and
the recurrent weight matrix $\bJ$.
In an energy model, the synaptic plasticity rule is regarded as the negative
gradient of an energy (often called an objective, error or cost function)
defined as a function of $\bW$:
582: \begin{eqnarray}
583: \Delta\bW\propto-\frac{\partial E[\bW]}{\partial\bW}.
584: \end{eqnarray}
Because of the nonlinearity of the activation function and the recursive form
of Eq.(\ref{eq:recursive}), the energy is usually approximated in a
model-dependent way.
For example, with the assumption $y_i=f(v_i+\sum_j J_{ij}v_j)$ and a series
expansion of the activation function, $f(v)=\sum_\ell a_{\ell+1}v^\ell$, the
energy is given by
591: \begin{eqnarray} \label{eq:simple_energy}
592: E[\bW]=-\sum_\ell\frac{a_\ell}{\ell}D_{i_1\cdots i_\ell}^{(\ell)}
593: Q^{(\ell)}_{\alpha_1\cdots\alpha_\ell}
594: W_{i_1\alpha_1}\cdots W_{i_\ell\alpha_\ell},
595: \end{eqnarray}
596: where
597: \begin{eqnarray}
598: D^{(\ell)}_{i_1\cdots i_\ell}&=&(\delta_{i_1i_2}+J_{i_1i_2})\cdots
599: (\delta_{i_{\ell-1}i_\ell}+J_{i_{\ell-1}i_\ell}) \nonumber \\
600: &=&D^{(2)}_{i_1i_2}\cdots D^{(2)}_{i_{\ell-1}i_\ell}
601: \end{eqnarray}
602: is the functional tensor of rank $\ell$ and
603: \begin{eqnarray}
604: Q^{(\ell)}_{\alpha_1\cdots\alpha_\ell}=
605: \langle u_{\alpha_1}\cdots u_{\alpha_\ell}\rangle_\mD
606: \end{eqnarray}
607: is the input correlation tensor of rank $\ell$.
608: $\langle\ \cdot\ \rangle_\mD$ denotes the average over input data set $\mD$.
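As a brief consistency check, and assuming (this is not stated above) that
$\bJ$ and $\bQ^{(2)}$ are symmetric, the $\ell=2$ term of
Eq.(\ref{eq:simple_energy}), denoted here by $E^{(2)}$, reproduces the
input-averaged Hebbian rule of Eq.(\ref{eq:Hebbian}):
\begin{eqnarray}
-\frac{\partial E^{(2)}}{\partial W_{k\gamma}}
&=&a_2\sum_{j,\beta}(\delta_{kj}+J_{kj})Q^{(2)}_{\gamma\beta}W_{j\beta}
\nonumber \\
&=&a_2\Big\langle\Big(v_k+\sum_jJ_{kj}v_j\Big)u_\gamma\Big\rangle_\mD
=\langle y_ku_\gamma\rangle_\mD, \nonumber
\end{eqnarray}
where $y_k=a_2(v_k+\sum_jJ_{kj}v_j)$ is the linear part of the assumed output
and $Q^{(2)}_{\gamma\beta}W_{j\beta}=\langle u_\gamma v_j\rangle_\mD$ has been
used.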
This energy, which is based on the basic Hebbian rule, is further adjusted
according to the characteristics of the synaptic plasticity
rule~\cite{Fregnac1998}.
For example, the covariance plasticity rule replaces the input correlation
function $\bQ^{(\ell)}$ of rank $\ell$ with the input covariance function
613: \begin{eqnarray}
614: C^{(\ell)}_{\alpha_1\cdots\alpha_\ell}=\langle
615: (u_{\alpha_1}-\langle u_{\alpha_1}\rangle_\mD)\cdots
616: (u_{\alpha_\ell}-\langle u_{\alpha_\ell}\rangle_\mD)\rangle_\mD.
617: \end{eqnarray}
In the FBM representation of the simple cell model, the feedforward synaptic
weights $\bW_{i\alpha}$ are replaced by field variables $\psi_\alpha(\br_i)$,
so that the energy in Eq.(\ref{eq:simple_energy}) takes the form of the energy
in Eq.(\ref{eq:expansion}).
For an efficient description of the dynamics, the energy is decomposed into
functions of transformed field variables.
Because of the anisotropy in the input correlation $\bQ$ (and often in the
neighbor activity $\bD$), the symmetry between components is broken and the
effective dynamics can be described with a few dominant components.
The consequence of such anisotropy in the neighbor activity between feature
components has been explored for the case of orientation and ocular dominance
columns~\cite{Cho2004B}.
630: %For the orientation preference columns, the prominent pattern are the oriented
631: %images with low frequency.
632:
In a complex cell model, the features of neurons relate to the synapses within
a columnar module.
The columnar module is a kind of adaptive neural network system, and the
modulation of its functional attributes involves intricate changes in synaptic
weights.
An effective assumption is that the output of a columnar module is one of
the proper states of the functional and changes following afferent
signals.
For currents from the input and from neighboring cells, and a linear
activation function, the change in the proper state (the output) of a columnar
module is then
643: \begin{eqnarray}
644: \Delta\by_i\propto\bv_i+\sum_jJ_{ij}\by_j
645: \end{eqnarray}
for the input $\bv_i$ to the columnar module at position $\br_i$, and the
energy averaged over inputs is given by
648: \begin{eqnarray}
649: E[\by]=-\sum_i\langle\bv_i\rangle_\mD\by_i
650: -\frac{1}{2}\sum_{i,j}J_{ij}\by_i\by_j.
651: \end{eqnarray}
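As a quick check, assuming a symmetric kernel $J_{ij}=J_{ji}$ (an assumption
not stated above), the negative gradient of this energy recovers the
input-averaged update rule:
\begin{eqnarray}
-\frac{\partial E[\by]}{\partial\by_k}
&=&\langle\bv_k\rangle_\mD+\frac{1}{2}\sum_j(J_{kj}+J_{jk})\by_j \nonumber \\
&=&\langle\bv_k\rangle_\mD+\sum_jJ_{kj}\by_j. \nonumber
\end{eqnarray}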
In the FBM representation, the multivariable output is replaced by field
variables:
654: \begin{eqnarray}
655: E[\psi]&=&-\sum_iB_i\psi_i-\frac{1}{2}\sum_{i,j}J_{ij}\psi_i\psi_j
656: \end{eqnarray}
657: or
658: \begin{eqnarray}
659: \lefteqn{E[\psi]=-\sum_i B(\br_i)\psi^\dagger(\br_i)} \\
660: &&-\frac{1}{4}\sum_{i,j}J(\br_i,\br_j)\left\{\psi(\br_i)^\dagger\psi(\br_j)
661: +\psi(\br_i)\psi(\br_j)^\dagger\right\}, \nonumber
662: \end{eqnarray}
663: where a functional vector $B(\br_i)=\langle \bv_i\rangle_\mD$ is the linear
664: average over inputs.
665: %The term for neighbor interactions in the FBM methods takes the exchange energy form
666: %Indeed the mathematical frameworks and formulas in the FBM methods resemble those
667: %in statistical quantum field theory.
If we regard $\psi^\dagger$ and $\psi$ as creation and annihilation operators,
the term $\psi(\br_i)J(\br_i,\br_j)\psi^\dagger(\br_j)$ can be read as
describing an activity that is created at position $\br_j$, propagated by the
kernel $J$, and annihilated at position $\br_i$.
672:
A series of physiological experiments showed that synaptic plasticity comes
from a redistribution of the available synaptic efficacy, not from an increase
in the efficacy~\cite{Markram1996,Fregnac1998}.
In other words, neural plasticity at the network level can be understood as
increasing the probability of responding to environmental experience while the
total synaptic strength remains bounded.
With the expectation of an automatic normalization of synaptic weights,
%to a single neuron for simple cell model (or within a columnar module for complex cell model),
the norm of the field variables $|\psi|$ is usually constrained to be constant.
In this sense, the neural dynamics with functional modularity may be described
by a slight shift in the internal phase per activity following afferent
signals.
Sometimes the normalization constraint is not imposed explicitly but is built
into the plasticity rule, as in subtractive normalization~\cite{Oja1982}.
687: For the energy function of the form
688: \begin{eqnarray}
689: E[\psi]=a\psi^2-b\psi^4,
690: \end{eqnarray}
the synaptic weights can be stabilized by the relaxation of $|\psi|^2$ to its
equilibrium value.
693:
Another important mechanism expected in neural computation is the enhancement
of neural activity according to how well it corresponds to the input.
A possible enhancement mechanism is the restriction of the summed activity by
subtractive normalization.
With a simple nonlinear form $x+\eta x^2$, the external source term with
enhanced afferent signals becomes
700: %depending on the conformity is that
701: \begin{eqnarray} \label{eq:enhancement}
702: \langle\bv_i'\rangle_\mD\psi_i&=&\left\langle \bv\frac{\rho_i(1+\eta\bv\psi_i)}
703: {(1/\rho)\sum_j\rho_j(1+\eta\bv\psi_j)}\right\rangle_\mD\psi_i \nonumber \\
704: &\simeq&\langle\bv_i\rangle_\mD\psi_i+\frac{1}{2}\sum_j S_{ij}\psi_i\psi_j
705: \end{eqnarray}
for $\bv_i=\rho_i\bv$ and $\rho=\sum_i\rho_i$, with the stimulus strength
$\rho_i$ at position $\br_i$.
The scattering function for an input data set $\mD$ is defined as
709: \begin{eqnarray} \label{eq:scattering}
710: S_{ij}=2\eta\langle v_iv_j\rangle_\mD(\delta_{ij}-1)
711: \end{eqnarray}
712: for the enhancement (or competition) parameter $\eta$.
713: In the FBM representation, the scattering function describes the feedforward
714: competition process in the competitive Hamiltonian models, such as the
715: elastic-net model and the SOFM algorithm.
716: %For hard competition with large $\eta$, the network accomplishes the
717: %``winner-take-all'' process.
718: %In fact, a priori enhancement of afferent signals is achieved when the
719: %conformity between neural feature and input signal is determined by the
720: %connectivity with the incentive cells (or extrinsic coding type).
In fact, for an intrinsic coding type, the network cannot tell a priori which
neurons best match the input signal, and the winner has to be determined after
lateral inhibitory activity.
The competitive Hebbian models require a normalization control of the response
or an a priori decision of the winner, and they describe the feature vectors
in visual cortex through the connectivity between visual cortex and the
retinas (or LGNs)~\cite{Scherf1999}.
728: %Another important role of synaptic normalization in cortical map development is the competition.
729: % in afferent signals is equivalent to those of the lateral inhibitory activity.
The lateral activity function $J(\br_i,\br_j)$, the connectivity between
neurons (or columnar modules) at positions $\br_i$ and $\br_j$ within a
cortical area, is of two types according to the control
mechanism~\cite{Kohonen1995}.
In the case of lateral feedback control (which Kohonen called the
activity-to-activity kernel), the lateral activity function $\bJ$ is taken to
be excitatory at short distances and inhibitory at long distances, the
so-called Mexican hat type (Fig.\ref{fig:control}a).
In the case of lateral control of plasticity (or the activity-to-plasticity
kernel), by contrast, the lateral interaction is nonnegative and may take a
Gaussian form (Fig.\ref{fig:control}b).
The competitive Hebbian models adopt lateral control of plasticity, which
means that $\bJ$ has no negative values, and the scattering function $\bS$
arising from afferent signal enhancement plays the role of inhibitory
activity.
743:
744: %We can consider also the interactions with higher powers and take the general
745: %energy form as Eq.(\ref{eq:expansion}).
746: %Note that the actual forms of interaction functions depend on interaction
747: %mechanisms and the quadratic interaction term $D(\bx,\by)$ need not always be
748: %the neighborhood function $J(\bx,\by)$.
749:
750: \begin{figure}[t]
751: \begin{minipage}[b]{4cm}
752: \includegraphics[width=4cm]{activity} \\
753: (a) Lateral feedback control of activity
754: %\\ \ \\
755: \end{minipage}
756: \
757: \begin{minipage}[b]{4cm}
758: \includegraphics[width=4cm]{plasticity} \\
759: (b) Lateral control of plasticity
760: \end{minipage}
761: \caption{ \label{fig:control}
762: The two types of neighbor interaction functions and control mechanisms.
(a) The lateral interaction models adopt lateral activity control with an
activation kernel, usually the so-called ``Mexican hat'' function (positive
feedback at close distances and negative feedback at longer distances).
(b) Plasticity control with a nonnegative kernel requires feedforward
competition (or feedforward normalization of activity over the network).
The elastic-net model assumes nearest-neighbor interactions (an elastic
force), whereas the SOFM algorithm takes a neighborhood function of Gaussian
form with hard competition (or winner-take-all activity).
771: }
772: \end{figure}
773:
We now bring the concepts of thermodynamics into neural dynamics.
In some classes of neural network models, such as the Boltzmann machine, the
input-output relationship is assumed to be stochastic.
777: Once a stochastic neural network has converged to an equilibrium state, the
778: probability distribution characterizing $\psi$ is expected to obey the
779: Boltzmann distribution
780: \begin{eqnarray}
781: P[\psi]=\frac{\exp(-E[\psi])}{Z}
782: \end{eqnarray}
783: for the partition function
784: \begin{eqnarray}
785: Z=\sum_\psi\exp(-E[\psi]).
786: \end{eqnarray}
In neural processing architectures, the notions of entropy and free energy
have already been put into practice for the purposes of informatics.
Compared to deterministic firing models, an expected advantage of stochastic
neural network models is the ability to escape from poor locally optimal
configurations through probabilistic evolution.
Moreover, there are several reasons why stochastic behavior should be an
indispensable process in neural networks.
In view of learning rules, it is natural that neural states are occupied
by features corresponding to frequent inputs (the {\em coarse coding}
principle).
On the other hand, it is efficient for a neural network to avoid being
occupied by only a few features, so that an object is coded by a small
population that is active for an event (the {\em sparse coding} principle).
%Besides the competitive or inhibitory activity, thermodynamic behavior in neural
%networks tends to achieve the sparseness.
The cost function in an unsupervised learning algorithm is usually similar to
the Helmholtz free energy,
804: \begin{eqnarray} \label{eq:Helmholtz}
805: F=E-TS,
806: \end{eqnarray}
807: where the parameter $T$ is considered just as a positive constant that
808: determines the importance of the second term relative to the first.
809: %In a Hebbian development model, the energy term $E$ functions neurons to
810: %possess features corresponding frequent inputs in addition to the neighbor
811: %ordering, whereas the entropy term compels neurons to avoid occupying a
812: %common feature state.
In learning rules, the energy term measures how well the code describes the
input data or carries the information:
815: \begin{eqnarray} \label{eq:E}
816: %E&=&(1/N)\sum_{i}\langle P(\psi_i|\bv)\rangle_\mD \nonumber \\
817: E&=&(1/N)\sum_{i}\sum_{\bv\in\mD}
818: P(\psi_i|\bv)P(\bv|\mD) \nonumber \\
819: &=&(1/N)\sum_{i}P(\psi_i|\mD) \\
820: &=&\sum_{\psi}P(\psi)P(\psi|\mD), \nonumber
821: \end{eqnarray}
822: %In a network with the input-output stochastic relationship,
where the distribution $P(\psi|\mD)$ is the probability, averaged over the
inputs $\bv\in\mD$, that an input generates the output $\psi$.
In Hebbian development models, this energy term can be regarded as an
external source term, that is, the average over the product of the feature
state and the external signals: $-B\psi$ in a complex cell model (or
$-\psi\bQ^{(2)}\psi$ in a simple cell model).
829: %However, if the neural dynamics is described by $E=V(\psi)$, the solutions
830: %indicate the collapse of whole neurons to single feature state with the maximal
831: %probable experience.
832: %We can expect that the observed cortex maps {\em in vivo} are aparted from the
833: %equilibrium state because the relaxation process in neuron systems is very
834: %slow, and they will reach to single state finally.
835: %The {\em minimum description length} (MDL) principle~\cite{Rissanen1989}, for
836: %example, finds a method of coding each input data that minimizes the total cost
837: %of communicating the input data to a receiver.
838: %The energy is described as $P(\psi)$ is the probability of the feature state
839: %$\psi$ in the cortex area or the prior probability of the model $\psi$.
840: In learning rules, the entropy term assesses the sparseness of the code by
841: assigning a cost depending on how the activity is distributed.
842: According to Shannon's coding theorem, the amount of information is defined by
843: \begin{eqnarray} \label{eq:S}
844: S=-K\sum_{\psi} P(\psi)\ln P(\psi)
845: \end{eqnarray}
846: where $K$ is a positive constant and $-\ln P(\psi)$ is the cost of the code,
847: the number of bits required to communicate the code.
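The link between Eq.(\ref{eq:Helmholtz}) and the Boltzmann distribution above
can be sketched as follows, under the assumption (made here only for
illustration) that the energy term is the ensemble average
$\sum_\psi P(\psi)E[\psi]$ and that the entropy is given by Eq.(\ref{eq:S}).
Minimizing $F$ with respect to $P(\psi)$ under the normalization
$\sum_\psi P(\psi)=1$, imposed with a Lagrange multiplier $\lambda$, gives
\begin{eqnarray}
0&=&\frac{\partial}{\partial P(\psi)}\Big[\sum_{\psi'}P(\psi')E[\psi']
+TK\sum_{\psi'}P(\psi')\ln P(\psi') \nonumber \\
&&+\lambda\Big(\sum_{\psi'}P(\psi')-1\Big)\Big] \nonumber \\
&=&E[\psi]+TK\big(\ln P(\psi)+1\big)+\lambda, \nonumber
\end{eqnarray}
so that
\begin{eqnarray}
P(\psi)=\frac{\exp(-E[\psi]/TK)}{\sum_{\psi'}\exp(-E[\psi']/TK)}, \nonumber
\end{eqnarray}
which reduces to the Boltzmann distribution written above when $TK=1$.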
848: %The expression for $E$ and $S$ in Eq.(\ref{eq:E}) and Eq.(\ref{eq:S}) remind
849: %the Helmholtz's free energy in density matrix formulation,
850: %\begin{eqnarray} F[\rho]=\mbox{Tr}\ \rho\left\{H+k_BT\ln\rho\right\} \end{eqnarray}
851: %for the density matrix $\rho$ with $K$ being identified as Boltzmann's constant $k_B$.
The connections between information theory and statistical mechanics have been
rigorously investigated~\cite{Jaynes1957A,Jaynes1957B,Grandy1997}.
854: %From the point of the learning algorithm, the probabilistic decision neural
855: %networks such as {\em Boltzmann machines}~\cite{Hinton1983} have been suggested.
However, it is difficult to apply statistical mechanics to phenomena in the
real brain, because neural processes, which comprise dynamics at various
spatial and temporal scales, are essentially dynamical, nonequilibrium
phenomena.
For example, the relaxation process in cortical map formation is very slow and
the observed maps often do not satisfy the equilibrium criteria.
Map formation in visual cortex is concentrated in the several weeks or months
after birth, during the so-called critical period.
In observed orientation preference maps, the non-uniform directions of the
gradient ($\nabla\phi_\parallel\neq\mbox{const}$, although
$|\nabla\phi_\parallel|\simeq\mbox{const}$ for the longitudinal component
$\phi_\parallel$) and the non-vanishing singular points (or pinwheels) indicate
that the system may be frozen during the relaxation process~\cite{Cho2004A}.
869: % the perpendicularity with
870: %area boundary ($\nabla^2\phi_\parallel\sim 0$ for the longitudinal component
871: %$\phi_\parallel$) is achieved, but except
872: %The relaxation process in cortical dynamics is very slow and the observed maps
873: %in adult are expected to be stopped in process.
874: %We think the neural networks should be treated as aparted from but in
875: %relaxation to the equilibrium state.
876: %we can guess how far they apart from the equilibrium state.
877:
878: \section{Application to visual map formation models} \label{sec:application}
According to studies of the statistical structure of natural images, the
spatially localized and oriented response properties of visual neurons are
considered to result from the efficient coding of natural
images~\cite{Olshausen1996}.
Oriented bar or grid patterns are the most probable activity, and a feature
with $O(2)$ (or $U(1)$) symmetry components is a meaningful representation in
orientation columns.
With ocular dominance columns, the total feature can be expanded to $O(3)$
symmetry components under the restriction of synaptic normalization within
each column.
Therefore, the conventional spin vector $(S^x,S^y,S^z)$ can serve as a useful
representation of the feature states, with the preferred orientation
$\phi=(1/2)\tan^{-1}(S^x/S^y)$.
892: %Among the phenomena at the cortical level, the primary visual map formation is
893: %one of the most investigated problems with various proposed models, most of
894: %which are described in the high- or low-dimensional feature vector representation.
The proposed models of visual map formation are based on a wide variety of
mechanisms.
%which factor causes the bifurcation to a inhomogeneous state.
We rewrite four widely used visual map development models in terms of the FBM
representation, classified by their effective interaction terms:
899: \begin{itemize}
900: \item[(A)] Lateral interaction models : $\bD = \bJ$
901: \item[(B)] Recursive interaction models : $\bD = (\bI-\bJ)^{-1}$
902: \item[(C)] Elastic-net model : $\bD = \bJ + \bS$
903: \item[(D)] SOFM algorithm : $\bD = \bS\bJ$.
904: \end{itemize}
Although the lateral activity function $J(\br_i,\br_j)$ in the elastic-net
model and the SOFM algorithm is taken to be nonnegative, the two-point
interaction function $D(\br_i,\br_j)\simeq D(|\br_i-\br_j|)$ takes the Mexican
hat type in all cases owing to the scattering function $S(\br_i,\br_j)$.
909: %The competitive Hebbian models require the feedforward competition and the
910: %nonnegative neighborhood function (plasticity control kernel) with the
911: %enhancement of activity in Eq.(\ref{eq:enhancement}).
Periodic patterns, such as linear zones in orientation preference columns or
parallel bands in ocular dominance columns, can develop when $D(\br_i,\br_j)$
has sufficiently many negative values that its Fourier transform
$\tilde{D}(q)$ has a maximum at a nonzero wavenumber $q^\ast$, which sets the
wavelength $\Lambda=2\pi/q^\ast$.
917: %We show that the elastic-net model and the lateral interaction models can be
918: %described by common energy form in Eq.(\ref{eq:linear}) with
919: %$D(r)=h_+(r)-h_-(r)$ for positive functions $h_+$ and $h_-$.
920: %This result means that two models have equivalent effective interactions and
921: %share statistical properties, in spite of their different control mechanisms.
922: %The emergence of columnar patterns are possible when $h_-$ is relatively larger
923: %than $h_+$.
924: %The lateral interaction models consider $h_-$ as the lateral inhibitory
925: %interaction term whereas the elastic-net model suggest it as the correlated
926: %external stimuli term.
927:
928: \subsection{Lateral Interaction Models}
A simple cell model of visual map development uses a high-dimensional
feature vector coding for the strength of the connection from each cortical
location to each retinal (or LGN) location.
The synaptic plasticity depends on the average over the activities of competing
inputs, which are the left and right eyes for ocular dominance columns or
ON-center and OFF-center cells for orientation preference columns.
For a linear activation function, $f(v)=v$, the energy in
Eq.(\ref{eq:simple_energy}) becomes
937: \begin{eqnarray}
938: E[\bW]=-\frac{1}{2}\sum_{i,j}\sum_{\alpha,\beta}
939: (\delta_{ij}+J_{ij})Q^{(2)}_{\alpha\beta}\bW_{i\alpha}\bW_{j\beta}.
940: \end{eqnarray}
This energy is decomposed using (globally) transformed synaptic weights, the
sum and the difference:
943: %The synaptic weights are represented by their transformation into the sum and difference :
944: \begin{eqnarray}
945: \bW_+=\bW_R+\bW_L & \mbox{and} & \bW_-=\bW_R-\bW_L
946: \end{eqnarray}
for ocular dominance columns, or similarly $\bW_\pm=\bW_{ON}\pm\bW_{OFF}$ for
orientation columns.
In a pixel-based representation for orientation columns, oriented patterns of
low frequency compose the dominant feature space, and the energy is
likewise decomposed using locally transformed weights.
Therefore, the energy as a function of the field variables $\psi$ in a
transformed and reduced feature space is
954: \begin{eqnarray}
955: E[\psi]=-\frac{1}{2}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j)
956: \end{eqnarray}
957: for $\bD=\bI+\bJ$.
The input correlation matrix $\bQ^{(2)}$ is omitted, either because the
dominant features are regarded as occurring equally often in the inputs or
because the two-point activity function $\bD$ absorbs it.
The term $-\frac{1}{2}\sum_i\psi(\br_i)^2$, related to the self-relaxation
term, does not influence the typical spacing of an emergent columnar
pattern.
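To see this, one may sketch the argument in Fourier space, assuming (for
illustration only) a single real component $\psi$ on a periodic lattice of $N$
sites and a translation-invariant, isotropic kernel
$D(\br_i,\br_j)=D(|\br_i-\br_j|)$.
With $\psi(\br_i)=N^{-1/2}\sum_\bq\tilde{\psi}(\bq)e^{i\bq\cdot\br_i}$ and
$\tilde{D}(\bq)=\sum_\br D(|\br|)e^{-i\bq\cdot\br}$, one finds
\begin{eqnarray}
E[\psi]&=&-\frac{1}{2}\sum_\bq\tilde{D}(\bq)|\tilde{\psi}(\bq)|^2, \nonumber \\
\sum_i\psi(\br_i)^2&=&\sum_\bq|\tilde{\psi}(\bq)|^2. \nonumber
\end{eqnarray}
The identity part of $\bD=\bI+\bJ$ adds the same constant to $\tilde{D}(\bq)$
for every $\bq$, so at fixed norm it shifts the energy of all modes equally,
and the mode of lowest energy is the one that maximizes $\tilde{J}(\bq)$.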
In a complex cell model, the simplest energy is given by the sum of the
neighbor interaction and external stimulus terms,
966: \begin{eqnarray} \label{eq:linear}
967: E[\psi]=-\frac{1}{2}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j)-\sum_iB(\br_i)\psi(\br_i)
968: %E[\psi]=-\frac{1}{2}\sum_{i,j}D_{ij}\psi_i\psi_j-\sum_iB_i\psi_i
969: \end{eqnarray}
970: for $\bD=\bJ$.
The external stimulus $B(\br_i)$ is considered to be constant or vanishing.
Therefore, the form of the lateral activity function $\bJ$ determines the
typical appearance of the developed feature map in both cases.
In lateral interaction models, $\bJ$ is taken as the {\em activation kernel},
or Mexican hat function (positive feedback in the center, negative in the
surroundings).
For example, a well-known Mexican hat function, the so-called difference of
Gaussians (DOG) filter, is described as
979: \begin{eqnarray}
980: J(\br_i,\br_j)=\varepsilon\left(e^{-|\br_i-\br_j|^2/2\sigma_1^2}
981: -ke^{-|\br_i-\br_j|^2/2\sigma_2^2}\right)
982: \end{eqnarray}
983: where $k$ is the strength of inhibitory activity.
Another example of a Mexican hat function, modified from a wavelet, is given by
985: \begin{eqnarray} \label{eq:wavelet}
986: J(\br_i,\br_j)=\varepsilon\left(1-k\frac{|\br_i-\br_j|^2}{\sigma_l^2}\right)
987: e^{-|\br_i-\br_j|^2/2\sigma_l^2}
988: \end{eqnarray}
989: for the lateral cooperation range $\sigma_l$.
If the strength of inhibitory activity $k$ is larger than the threshold
$k_c=1/4$, $\tilde{D}(q)$ has a maximum at the nonzero wavenumber
$q^\ast=(1/\sigma_l)\sqrt{4-1/k}$~\cite{Cho2004A}.
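The threshold and the wavenumber quoted here can be recovered with a short
calculation, sketched under the assumptions of a two-dimensional continuum
Fourier transform, $\varepsilon>0$, and $\bD=\bJ$ with the kernel of
Eq.(\ref{eq:wavelet}).
Writing $\br=\br_i-\br_j$ and $x=q^2\sigma_l^2$,
\begin{eqnarray}
\tilde{J}(q)&=&\int d^2r\,J(\br)\,e^{-i\bq\cdot\br}
=2\pi\varepsilon\sigma_l^2\,(1-2k+kx)\,e^{-x/2}, \nonumber \\
\frac{d\tilde{J}}{dx}&=&\pi\varepsilon\sigma_l^2\,(4k-1-kx)\,e^{-x/2}.
\nonumber
\end{eqnarray}
For $k>0$ the derivative is positive below $x^\ast=4-1/k$ and negative above
it, so $\tilde{J}$ has a maximum at $q^\ast=(1/\sigma_l)\sqrt{4-1/k}$, and
this maximum lies at a nonzero wavenumber only if $k>1/4$.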
993:
994: \subsection{Recursive interaction models}
995: For a linear activation function, the output in Eq.(\ref{eq:recursive}) becomes
996: \begin{eqnarray}
997: y_i=v_i+\sum_jJ_{ij}v_j+\sum_{j,k}J_{ij}J_{jk}v_k+\cdots,
998: \end{eqnarray}
which is the sum of the recursively propagated recurrent inputs.
The energy as a function of the synaptic weights is then
1001: \begin{eqnarray} \label{eq:correlation-based}
1002: E[\bW]=-\frac{1}{2}\sum_{i,j}\sum_{\alpha,\beta}D_{ij}Q^{(2)}_{\alpha\beta}
1003: \bW_{i\alpha}\bW_{j\beta},
1004: \end{eqnarray}
1005: where the two-point interaction function is
1006: \begin{eqnarray}
1007: \bD=\bI+\bJ+\bJ^2+\cdots=(\bI-\bJ)^{-1}
1008: \end{eqnarray}
1009: and the real parts of the eigenvalues of $\bJ$ are expected to be less than $1$.
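The closed form can also be read off directly, as a sketch for the linear
case.
With $f(v)=v$, Eq.(\ref{eq:recursive}) reads
\begin{eqnarray}
\sum_j(\delta_{ij}-J_{ij})y_j=v_i
\ \ \Longrightarrow\ \
y_i=\sum_j\left[(\bI-\bJ)^{-1}\right]_{ij}v_j=\sum_jD_{ij}v_j, \nonumber
\end{eqnarray}
provided that $\bI-\bJ$ is invertible.
The geometric series for $\bD$ written above converges when all eigenvalues of
$\bJ$ have modulus less than $1$.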
Eq.(\ref{eq:correlation-based}) is a simply modified form of Miller's
correlation-based learning models~\cite{Dayan2001}.
In the original representation by Miller {\em et al.}, the input stimulus term
is described by an arbor function, which expresses the location and the overall
size of the receptive fields~\cite{Miller1989,Miller1994}.
In Miller's analysis as well, the two-point interaction function $\bD$ takes
the Mexican hat type and the wavelength of the visual pattern is determined by
the peak of $\tilde{D}(q)$~\cite{Miller1998}.
1018:
1019: %The ocular dominance development model by Miller {\em et al.} uses a
1020: %high-dimensional feature vector coding for the strength of the connection from
1021: %each cortical location to each retinal (or LGN) location.
1022: %The correlation-based models is based on the synaptic plasticity depending on
1023: %the correlations among the activities of competing inputs, which are left and
1024: %right eyes for ocular dominance columns or ON-center and OFF-center cells for
1025: %orientation preference columns
1026:
1027:
1028: \subsection{Elastic-Net Model}
1029: The elastic-net model is described by an iterative procedure
1030: with the update rule :
1031: \begin{eqnarray} \label{eq:elastic}
1032: \Delta\Phi(\br_i)&=&\alpha\sum_{|\br_i-\br_j|=a}(\Phi(\br_i)-\Phi(\br_j))
1033: \nonumber \\
1034: &+&\beta(\bV-\Phi(\br_i))\frac{e^{-|\bV-\Phi(\br_i)|^2/2\sigma_s^2}}
1035: {\sum_j e^{-|\bV-\Phi(\br_j)|^2/2\sigma_s^2}}, \ \ \ \
1036: \end{eqnarray}
1037: where a feature vector in the low-dimensional representation is
1038: \begin{eqnarray}
1039: \Phi(\br)&=&(r_x,r_y,q\sin(2\phi(\br)),q\cos(2\phi(\br)),z(\br)) \nonumber \\
1040: &=&(\br,\psi(\br)) \nonumber
1041: \end{eqnarray}
for the retinal location $\br=(r_x, r_y)$, the preferred orientation
$\phi(\br)$, the degree of preference $q$ for that orientation, and the ocular
dominance $z$~\cite{Durbin1990,Erwin1995}.
1045: At each iteration, a stimulus vector $\bV=(\br_v,\bv)$ is chosen at random
1046: according to a given probability distribution.
1047: The first term in Eq.(\ref{eq:elastic}) denotes the elastic force or the
1048: excitatory interactions between the nearest-neighbors, and the second term
1049: implies the normalized stimuli distributed around an activity center.
1050: Functional Taylor expansion of the right hand side after dropping all nonlinear
1051: terms leads to
1052: \begin{eqnarray}
1053: \lefteqn{\Delta\psi(\br_i)=\alpha \sum_{|\br_i-\br_j|=a}
1054: \big\{\psi(\br_i)-\psi(\br_j)\big\}} \\
1055: &&-\beta\psi(\br_i)+\frac{\beta a^4}{4\pi^2\sigma_s^6}
1056: \sum_j\langle\bv_i\bv_j\rangle_\mD\big\{\psi(\br_i)-\psi(\br_j)\big\} \nonumber
1057: \end{eqnarray}
1058: where the stimulus at position $\br_i$,
1059: \begin{eqnarray} \label{eq:scatter}
1060: \bv_i=\bv e^{-|\br_i-\br_v|^2/2\sigma_s^2}
1061: \end{eqnarray}
is distributed in a Gaussian form with the activity center $\br_v$ and the
feedforward cooperation range $\sigma_s$.
The correlation between the external stimuli at positions $\br_i$ and $\br_j$
is
1066: \begin{eqnarray} \label{eq:correlation}
1067: \langle\bv_i\bv_j\rangle_\mD&=&\langle v^2\rangle_\mD
1068: \sum_{\br_v}\ e^{-|\br_i-\br_v|^2/2\sigma_s^2}
1069: \ e^{-|\br_j-\br_v|^2/2\sigma_s^2} \nonumber \\
1070: &\simeq&(\pi\sigma_s^2/a^2)\langle v^2\rangle_\mD
1071: e^{-|\br_i-\br_j|^2/4\sigma_s^2}.
1072: \end{eqnarray}
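The second line of Eq.(\ref{eq:correlation}) follows from a standard Gaussian
integral, sketched here under the assumption of a square lattice of spacing
$a$, so that $\sum_{\br_v}\to a^{-2}\int d^2r_v$.
Completing the square,
\begin{eqnarray}
|\br_i-\br_v|^2+|\br_j-\br_v|^2
=2\left|\br_v-\frac{\br_i+\br_j}{2}\right|^2
+\frac{1}{2}|\br_i-\br_j|^2, \nonumber
\end{eqnarray}
and using $\int d^2u\,e^{-|\bu|^2/\sigma_s^2}=\pi\sigma_s^2$ gives
\begin{eqnarray}
\frac{1}{a^2}\int d^2r_v\,e^{-|\br_i-\br_v|^2/2\sigma_s^2}
e^{-|\br_j-\br_v|^2/2\sigma_s^2}
=\frac{\pi\sigma_s^2}{a^2}e^{-|\br_i-\br_j|^2/4\sigma_s^2}. \nonumber
\end{eqnarray}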
1073: %as follows the results of Hoffs\"{u}mmer {\em et al.}~\cite{Hoffsummer1995}.
1074: Therefore, the effective energy of the elastic-net model can be represented
1075: through the form in Eq.(\ref{eq:linear}), where the two-point interaction
1076: function is given by
1077: \begin{eqnarray}
1078: \bD=-\beta\bI+\bJ+\bS.
1079: \end{eqnarray}
1080: The lateral activity function becomes
1081: $J(\br_i,\br_j)=\alpha\delta(|\br_i-\br_j|-a)$ or the Laplacian operator in a
1082: continuum limit.
The scattering function coincides with the form in Eq.(\ref{eq:scattering})
for $\eta=\beta a^4/8\pi^2\sigma_s^6$, and is given explicitly by
1085: \begin{eqnarray}
1086: %S(\br_i,\br_j)&=&\frac{2\eta}{N}(\delta_{ij}-1)\langle v_iv_j\rangle_\mD \\
1087: % &=&\beta\frac{\langle v^2\rangle_\mD}{2\pi\sigma_s^2}
1088: % \left(2\pi\delta_{ij}-e^{-|\br_i-\br_j|^2/4\sigma_s^2}\right) \nonumber
1089: S(\br_i,\br_j)\simeq\frac{\beta}{\sigma_s^2}\langle v^2\rangle_\mD
1090: \left(\delta_{ij}-\frac{a^2}{4\pi\sigma_s^2}e^{-|\br_i-\br_j|^2/4\sigma_s^2}\right).
1091: \end{eqnarray}
This result means that the scattering function $S(\br_i,\br_j)$ can act as a
kernel with inhibitory activity even though the lateral activity function
$J(\br_i,\br_j)$ is nonnegative.
There are also interaction terms of higher power, but the two-point interaction
function $D(\br_i,\br_j)$ determines the major characteristics of the developed
feature maps.
Transforming it to Fourier space, we obtain
1099: \begin{eqnarray}
1100: \tilde{D}(\bq)&=&-\beta+\tilde{J}(\bq)+\tilde{S}(\bq) \\
1101: &\simeq&-\beta-\alpha q^2+\frac{\beta}{\sigma_s^2}
1102: \langle v^2\rangle_\mD\left(1-e^{-q^2\sigma_s^2}\right). \nonumber
1103: \end{eqnarray}
1104: It has a maximum at
1105: \begin{eqnarray}
1106: q^\ast=\frac{1}{\sigma_s}\sqrt{\ln
1107: \left(\frac{\beta}{\alpha}\langle v^2\rangle_\mD\right)},
1108: \end{eqnarray}
which agrees with the analytic results obtained from different
approaches~\cite{Hoffsummer1995,Scherf1999}.
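The location of this stationary point can be recovered directly from the
expression for $\tilde{D}(\bq)$ above; the following is a sketch that assumes
$\alpha>0$, $\beta>0$ and $q\neq 0$.
Setting the derivative with respect to $q$ to zero,
\begin{eqnarray*}
\frac{d\tilde{D}}{dq}&\simeq&-2\alpha q
+2\beta\langle v^2\rangle_\mD\ q\ e^{-q^2\sigma_s^2}=0,
\end{eqnarray*}
gives $e^{-q^{\ast 2}\sigma_s^2}=\alpha/\beta\langle v^2\rangle_\mD$, which is
equivalent to the expression for $q^\ast$ above.
A real, nonzero $q^\ast$ therefore requires
$\beta\langle v^2\rangle_\mD>\alpha$; otherwise the only stationary point is at
$q=0$.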
1111: %The maximum is positive for any $\sigma_s<\sigma_s^\ast$ where
1112: %\begin{eqnarray}
1113: % \sigma_s^\ast=\sqrt{\langle v^2\rangle_\mD
1114: % -\alpha-\alpha\ln\left(\frac{\beta}{\alpha}\langle v^2\rangle_\mD\right)}.
1115: %\end{eqnarray}
1116: %The sequence bifurcation model.
1117:
1118: \subsection{Self-Organizing Feature Map Algorithm}
In Eq.(\ref{eq:linear}), the interaction term $\psi J\psi$ denotes the exchange
of spontaneous spikes, created without external activity.
Spontaneous firing can occur in coupled nonlinear oscillators with small
dynamic fluctuations, and has been observed in several experiments~\cite{
Llinas2003,Creutzfeldt1995,Steriade1993,Tsodyks1999,Sanchez2000,Wilson1981}.
However, several experiments suggest that the organization of feature maps
becomes possible after exposure to external activity.
In this case, the probability of spontaneous firing is small ($J\ll v$), so
that most intracellular interactions would be mediated by indirect currents
from external activities.
For interactions provoked by external activities, we can take the
effective energy as
1131: \begin{eqnarray} \label{eq:H_SOM}
1132: E[\psi]=-\left(\sum B\psi+\frac{1}{2}\sum\psi S\psi\right)
1133: \left(\frac{1}{2}\sum \psi J\psi\right).
1134: \end{eqnarray}
If $B(\br)\psi(\br)$ is constant for all positions $\br$, the first term with
$\psi J\psi$ reproduces the lateral interaction models.
Kohonen's SOFM algorithm focuses on the lateral currents induced by normalized
feedforward stimuli, and the effective interaction term is given by
1139: %ignores this term or assumes that $B(\br_i)=\langle v(\br_i)\rangle_\mD$ vanishes, it focus on
1140: \begin{eqnarray}
1141: D(\br_i,\br_j)&=&\frac{1}{2}\sum_\br S(\br_i,\br)J(\br,\br_j).
1142: \end{eqnarray}
Moreover, the SOFM algorithm requires hard competition, the so-called
``winner-take-all'' (WTA) case.
As $\sigma_s$ approaches zero (or $\eta$ becomes large), the activity is
localized only around the winning neuron and the scattering function in Fourier
space becomes $\tilde{S}(\bq)\simeq\beta \langle v^2\rangle_\mD q^2$, the
Laplacian operator.
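This limit can be sketched from the elastic-net analysis above, assuming that
the expression
$\tilde{S}(\bq)\simeq(\beta/\sigma_s^2)\langle v^2\rangle_\mD
(1-e^{-q^2\sigma_s^2})$ carries over to the present case.
Expanding the exponential for $q\sigma_s\ll 1$,
\begin{eqnarray*}
\tilde{S}(\bq)&\simeq&\frac{\beta}{\sigma_s^2}\langle v^2\rangle_\mD
\left(q^2\sigma_s^2-\frac{1}{2}q^4\sigma_s^4+\cdots\right) \\
&=&\beta\langle v^2\rangle_\mD\ q^2
\left(1-\frac{1}{2}q^2\sigma_s^2+\cdots\right),
\end{eqnarray*}
so the leading term is $\beta\langle v^2\rangle_\mD q^2$, with corrections of
relative order $q^2\sigma_s^2$ that vanish as $\sigma_s\to 0$.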
The lateral activity function in the SOFM approach takes the Gaussian form
$J(\br_i,\br_j)=e^{-|\br_i-\br_j|^2/2\sigma_l^2}$ for the lateral cooperation
range $\sigma_l$ (lateral plasticity control).
1152: Therefore we obtain the two-point interaction function
1153: \begin{eqnarray} \label{eq:FBM_Dq}
1154: \tilde{D}(\bq)=\frac{1}{2}\tilde{S}(\bq)\tilde{J}(\bq)
1155: =\pi \sigma_l^2\beta\langle v^2\rangle_\mD\ q^2e^{-q^2\sigma_l^2/2}
1156: \end{eqnarray}
1157: in Fourier space or
1158: \begin{eqnarray}
1159: D(\br_i,\br_j)=\beta\langle v^2\rangle_\mD \left(1-\frac{|\br_i-\br_j|^2}
1160: {2\sigma_l^2}\right)e^{-|\br_i-\br_j|^2/2\sigma_l^2}
1161: \end{eqnarray}
1162: in real space.
1163: This is the Mexican hat function in Eq.(\ref{eq:wavelet}) with $k=0.5$.
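The prefactor in Eq.(\ref{eq:FBM_Dq}) can be traced as follows, assuming a
two-dimensional continuum Fourier transform without additional normalization
factors (a convention we infer from the prefactor; it is not stated
explicitly).
The Gaussian lateral activity function transforms as
\begin{eqnarray*}
\tilde{J}(\bq)&=&\int d^2r\ e^{-r^2/2\sigma_l^2}e^{-i\bq\cdot\br}
=2\pi\sigma_l^2\ e^{-q^2\sigma_l^2/2},
\end{eqnarray*}
and multiplying by
$\frac{1}{2}\tilde{S}(\bq)\simeq\frac{1}{2}\beta\langle v^2\rangle_\mD q^2$
gives
\begin{eqnarray*}
\frac{1}{2}\tilde{S}(\bq)\tilde{J}(\bq)&\simeq&
\pi\sigma_l^2\beta\langle v^2\rangle_\mD\ q^2e^{-q^2\sigma_l^2/2}.
\end{eqnarray*}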
Eq.(\ref{eq:FBM_Dq}) has a maximum at
1165: \begin{eqnarray}
1166: q^\ast=\sqrt{2}/\sigma_l,
1167: \end{eqnarray}
which agrees with previous analytic
results~\cite{Wolf2000,Scherf1999,Obermayer1992} and is always positive if
$\sigma_l>0$.
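The stationary point can be checked from Eq.(\ref{eq:FBM_Dq}); this is a sketch
that treats $q=|\bq|$ as a continuous variable and assumes
$\beta\langle v^2\rangle_\mD>0$.
Writing $\tilde{D}(\bq)=Cq^2e^{-q^2\sigma_l^2/2}$ with
$C=\pi\sigma_l^2\beta\langle v^2\rangle_\mD$,
\begin{eqnarray*}
\frac{d\tilde{D}}{dq}&=&Cq\left(2-q^2\sigma_l^2\right)e^{-q^2\sigma_l^2/2}=0,
\end{eqnarray*}
which for $q>0$ yields $q^\ast=\sqrt{2}/\sigma_l$.
At this point
\begin{eqnarray*}
\tilde{D}(q^\ast)&=&\frac{2C}{e\,\sigma_l^2}
=\frac{2\pi}{e}\beta\langle v^2\rangle_\mD,
\end{eqnarray*}
a positive value that does not depend on $\sigma_l$.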
Kohonen's SOFM algorithm leads to robust learning rules because it always
succeeds in producing an array of different feature detectors or a columnar
pattern.
1174:
1175: \section{Discussion}
Physical models of neural networks based on neuroscience attempt to
interpret both physiologic phenomena and computational architectures.
In order to study the functions of the real brain, we need more adaptable
theories than the basic neural architecture of connectionism.
In this paper, we show that the neural process at the cortical level can be
described by using the conventional expressions of statistical physics.
As we showed for visual map formation~\cite{Cho2004A,Cho2004B}, the collective
neural dynamics can closely resemble well-known phenomena in physical systems.
1184: %More extended computational architectures are also possible in the neural
1185: %models with functional modularity because of the higher dimensional attributes
1186: %of the processing elements in networks.
1187: %The neural dynamic models at higher levels also have to base on neuroscience
1188: %and target to interpret both the physiologic phenomena and the computational
1189: %architectures.
1190: %The representation of neural dynamics at the cortical level is suited to understand
1191: %the statistical and collective phenomena of neurons - multi-functional map
1192: %organization or map differentiations, and cooperative computation, etc.
1193: %More effective description of neural dynamics and computations with functional
1194: %and columnar modularity have been suggested because the connections and
1195: %interactions between neurons are very huge and complex in real brain.
1196:
Under the assumption of a neural network composed of columnar modules, we
classify the types of synaptic connection and anticipate different functional
characters in computational processing.
(1) The connectivity between nearby neurons within a columnar module realizes
the functional attributes of neurons and associative memory.
(2) Through the connectivity between columnar modules within a cortical area,
denoted by the lateral activity function or recurrent weight matrix $\bJ$, the
network laterally controls the output activity between neighbors.
(3) Via the connectivity between distant neurons across cortical areas, neurons
receive driving activity from the external environment or from other functional
cortical areas.
The columnar modules in turn become elements (or nodes) with high-dimensional
attributes in a network of neural networks.
If the recurrent weight matrix $\bJ$ depends on position, the connectivity
between columnar modules also contributes to information coding.
According to the Hebbian rule, the connectivity between columnar modules within
or beyond cortical areas would also be strengthened if there is much
communication between them, and some models include an updating rule for the
recurrent weight matrix $\bJ$, such as the Goodall rule~\cite{Goodall1960}.
We consider that the enhancement of connectivity between columnar modules
serves efficient communication between neurons rather than information coding.
Treating the minicolumn as the columnar module and the processing element of
the network is optional.
The formation of structure in a minicolumn is also due to the functional
grouping of neurons with similar interests, and is expected to be confirmed by
more fundamental processes at the cellular or molecular level.
1222:
1223: %Like the linear analysis in other models, the direct interaction term is
1224: %important in the determination of dominance feature of self-organizing map.
1225: %Higher power interaction terms are possible if considering (1) the normalization
1226: %of synaptic strength over networks, (2) indirect interactions between neurons,
1227: %and (3) thermodynamic perturbative coupling by $g\psi^4$ term.
1228: %The statistical property of the cortical map is revealed in the general energy
1229: %formula at continuum limit such as Eq.(\ref{eq:continuum}).
1230: %Using the Landau theory, the prediction of phase transition in neural systems
1231: %would be possible also.
1232:
1233: %Indeed, the interactions in neurons resemble those in the physical particles.
1234: %Neurons (or electrons) receive spikes (or photons) from neighbor neurons (or
1235: %electrons) and send them to neighbors again.
1236: %After collision with spikes (or photons), the preferred state of neurons
1237: %({\em or} the momentum or intrinsic phase of electrons) moves slightly to the
1238: %driven stimuli.
1239: %sensation, perception and memorization
1240:
Extraction of the significant features in the input data is the purpose of an
unsupervised learning rule and is also expected to be a principal
characteristic of artificial and physiologic neural networks.
The FBM representation method suggests how neurons find features in afferent
signals and build knowledge at the cortical level.
The abstract representation of features in the FBM representation, and the
symmetry breaking between feature components as it proceeds, are related to the
learning process in the neural network.
For example, different views of an object form a submanifold in pattern space,
and the patterns of the object can be abstracted and decomposed in the
transformed and reduced feature space.
1252: % such as angle or distance from viewpoint.
1253:
From the viewpoint of dynamics, the essential factors in the neural process are
(1) the statistical structure of inputs, (2) attractive or repulsive
interactions between neighboring neurons, and (3) the stochastic behavior of
neurons.
In this paper, we did not fully apply thermodynamics to the neural process.
Some models do contain a thermodynamic approach.
The basic ingredients of Tanaka's Potts spin models are those of the lateral
interaction models, but he took a probabilistic evolution rather than an energy
gradient flow~\cite{Tanaka1989,Tanaka1990A,Tanaka1990B,Tanaka1991A,Tanaka1991B}.
Piepenbrock presented a model that uses the effect of stochastic behavior in
a neural network as a competition process~\cite{Rao2002}.
Even without lateral inhibitory activity or feedforward competition, the
thermodynamic effect can make a network develop a columnar structure through
thermal excitation at low temperature.
We expect that the stochastic behavior of neurons can be the connection between
the physical neural dynamic models and the neural network models originating
from learning theory, and an essential factor in understanding systematic
ordering-disordering or bifurcation problems in the real brain.
Moreover, we expect that theoretical experience in physics can offer a more
intuitive understanding of physiologic phenomena at higher levels and of
sophisticated mechanisms in computational architecture.
1275:
1276: This work was supported by the Ministry of Science and Technology and the
1277: Ministry of Education.
1278:
1279: \bibliography{fbm}
1280:
1281: \end{document}
1282: