
1: % arXiv:q-bio.NC/0405027

2:

3: \documentclass[twocolumn,showpacs,pre]{revtex4}

4: %\documentclass[showpacs,pre,preprint]{revtex4}

5:

6: \usepackage{graphicx}

7: \usepackage{mathrsfs}

8:

9: \begin{document}

10:

11: \def\bq{{\mathbf q}}

12: \def\br{{\mathbf r}}

13: \def\bu{{\mathbf u}}

14: \def\bv{{\mathbf v}}

15: \def\bw{{\mathbf w}}

16: \def\bx{{\mathbf x}}

17: \def\by{{\mathbf y}}

18: \def\bz{{\mathbf z}}

19: \def\bA{{\mathbf A}}

20: \def\bD{{\mathbf D}}

21: \def\bI{{\mathbf I}}

22: \def\bJ{{\mathbf J}}

23: \def\bQ{{\mathbf Q}}

24: \def\bS{{\mathbf S}}

25: \def\bV{{\mathbf V}}

26: \def\bW{{\mathbf W}}

27: \def\mD{{\mathcal{D}}}

28:

29: \title{General representation of collective neural dynamics \\

30:   with columnar modularity}

31: \author{Myoung Won Cho}

32: \email{mwcho@postech.edu}

33: \author{Seunghwan Kim}

34: \email{swan@postech.edu}

35: \affiliation{

36:   Asia Pacific Center for Theoretical Physics $\&$ NCSL,

37:   Department of Physics, Pohang University of Science and Technology,

38:   Pohang, Gyeongbuk, 790-784, Korea}

39: \date{\today}

40: \begin{abstract}

41: We present a mathematical framework for representing neural processes at the

42: cortical level.

43: The description of neural dynamics with columnar and functional modularity,

44: named the fibre bundle map (FBM) representation method, is based on both

45: neuroscience and informatics, yet it leads to the conventional formulas of

46: statistical physics.

47: An analogy between phenomena in the brain and in physical systems

48: has been proposed~\cite{Cho2004A,Cho2004B}.

49: In spite of the complex circuitry and the nonlinear dynamics in neural systems,

50: the neural behavior at high levels may be described by simple and general

51: rules, which are related to established theory in statistical physics.

52: The FBM method helps in building and analyzing neural models by

53: representing essential ingredients of neural interactions by general formulas.

54: %We insist that the typical characters of self-organizing map is determined not

55: %by the detailed and complex interaction rules but by the topology of lattice

56: %and feature space.

57: %The collective neural phenomena can be understood and predicted through some

58: %parameters in a general energy form with symmetry transform invariance.

59: %Not only the similarity in formulas, the cortical dynamics can share the

60: %statistical properties with other physical systems, which validated in primary

61: %visual maps~\cite{Cho2004A}.

62: We apply our method to the proposed models of visual map formation and show how

63: they can share statistical properties with vortex dynamics in magnetism in

64: spite of their different development mechanisms.

65: \end{abstract}

66: \pacs{87.10.+e,87.19.La,89.75.Fb}

67:

68: \maketitle

69:

70: \section{Introduction}

71: Although the detailed dynamics of a single neuron have been revealed,

72: there still remains a challenge at the network level to explain how brains

73: perform higher cognitive functions.

74: Studies of physical models focus on achieving biological realism in

75: neural computation models.

76: %such as the networks of coupled oscillators~\cite{Nishikawa2004,Aonishi1999},

77: However, despite the success of the basic neural network models, based on the

78: connectional framework between simple cells, in applications to small

79: adaptive systems, they are inherently problematic for the understanding of

80: collective neural phenomena and higher cognitive behavior in the real brain.

81: There are also attempts to understand neural processing at higher

82: levels through the functional modularity of neurons or symbolic processing

83: architectures.

84: Before physiological evidence of repetitive cortical blocks was found, there were

85: proposals of modularity among neighboring neurons, the so-called {\em cell

86: assemblies} (CAs), motivated by the high-dimensional attributes and faculties of

87: neurons~\cite{Hebb1949}.

88: Neurons with similar functional specializations tend to aggregate

89: and to organize hierarchically.

90: Although neural clusters are classified and named in various ways, we adopt the

91: suggested hierarchy {\em neuron} - {\em minicolumn} - ({\em hypercolumn}) -

92: {\em macrocolumn} - {\em cortex area} - {\em hemisphere}, where the minicolumn is

93: a candidate for ``the repeating pattern of circuitry'' or ``the iterated

94: modular unit''~\cite{Calvin1998}.

95:

96: In this paper, we present a mathematical framework, briefly introduced as the

97: fibre bundle map (FBM) method in Ref.~\cite{Cho2004A}, and show how to

98: represent neural processes in general at the cortical level.

99: Briefly speaking, the FBM representation is a mapping of the feature components

100: (often the synaptic weights) to topological spaces, the so-called {\em bundles}.

101: The pattern information is reordered in locally transformed coordinates and a

102: few major components are extracted.

103: Of course, other mathematical frameworks exist for representing neural

104: processes in a reduced space.

105: Kohonen set up the mathematical preliminaries, the so-called feature maps or

106: {\em feature-based} representation, in vector space, which led to subsequent

107: models of artificial and physiological neural networks~\cite{Kohonen1984}.

108: Symbolic processing architectures also suggest the description of neural

109: computations at the cognitive and rational bands.

110: The feature vector space or the symbolic sets can be regarded as a kind of FBM

111: representation.

112: The FBM method, however, focuses on the manifold structure of frequent

113: inputs in feature space and its corresponding symmetry group.

114: Indeed, the properties of neural processes are governed not by the detailed

115: neural interaction rules but by the algebraic structure of dominant feature

116: components.

117: The FBM method represents the neural process by the general formulas of

118: statistical physics and helps one comprehend collective neural phenomena

119: intuitively through knowledge of statistical mechanics and differential

120: geometry.

121:

122: As a class of abstract representations, the formulas in the FBM models differ

123: in character from those in the feature-based (often so-called

124: ``low-dimensional'' feature vector) models.

125: In the feature-based representation, the change in the feature vector

126: $\Phi(\br)$ at position $\br$ is described by its difference from the stimulus

127: vector $\Phi(\br')$, such as $\Delta\Phi(\br)\propto(\Phi(\br')-\Phi(\br))$

128: with the energy of the form $|\Phi(\br)-\Phi(\br')|^2$ (or its higher powers).

129: %Whereas, the energy functions in the FBM representation consist of the inner

130: %products such as $\psi(\br)\psi'(\br)$.

131: In the FBM representation, by contrast, the interactions between neurons are

132: expressed by inner products rather than distances, and it is generally

133: assumed that the energy of the neural process can be expanded in a power series,

134: i.e.,

135: %and classified according to the number of coupling.

136: %It is assumed that the energy of neural process can be expanded in a power

137: \begin{eqnarray} \label{eq:expansion}

138:   E[\psi]&=&E^{(0)}-\sum_iB(\br_i)\psi(\br_i) \nonumber \\

139:   &-&\frac{1}{2!}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j) \\

140:   &-&\frac{1}{3!}\sum_{i,j,k}F(\br_i,\br_j,\br_k)

141:     \psi(\br_i)\psi(\br_j)\psi(\br_k) + \cdots. \nonumber

142:   %&-&\frac{1}{4!}\sum_{\bx,\by,\bz,\bw}G(\bx,\by,\bz,\bw)

143:   %  \psi(\bz)\psi(\by)\psi(\bz)\psi(\bw)+\cdots, \nonumber

144: \end{eqnarray}

145: where the field variables $\psi(\br)$ denote the feature state of neurons at

146: cortical location $\br$.

147: We will show how this formula is derived and how the interaction functions are

148: determined in the simple (or so-called ``high-dimensional'' feature vector

149: representation) and the complex cell models.

150: %This formula can be derived concretely from the fundamental neural process as well.

151: %Moreover, in the simple cell (or the ``high-dimensional'' representation)

152: %models, the objective function for Hebbian modification obey this general form

153: %as a function the synaptic weights $\bW$ rather than the field variables $\psi$.

154: In the continuum limit, the energy can be approximated by

155: \begin{eqnarray} \label{eq:continuum}

156:   E[\psi]=\int d\br\left\{\frac{v}{2}|(\nabla-i\bA)\psi|^2

157:     +\frac{m^2}{2}|\psi|^2+\frac{g}{4!}|\psi|^4\right\},

158: \end{eqnarray}

159: where the odd-power terms are generally expected to vanish.

160: This is just the Ginzburg-Landau energy with gauge invariance and explains the

161: statistical properties of the emergent cortical maps in experiments and

162: simulations.

163: The energy in a continuum approximation can often be derived using only minimal

164: mathematical constraints such as symmetry.

165: %requirement of invariance under the symmetry transformations without the detailed cortical modification rules.

166: The energy forms in Eq.(\ref{eq:expansion}) and Eq.(\ref{eq:continuum}) suggest

167: an analogy between physical and neural systems, and

168: the characteristics of developed visual maps can be understood systematically

169: through the statistical properties of vortices in

170: magnetism~\cite{Cho2004A,Cho2004B}.

171: %Phase transitions can be predicted when the changes in parameters, whereas the

172: %parameters are obtained from the detailed interaction mechanisms.

173:
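As a minimal illustration of how Eq.(\ref{eq:continuum}) can produce a bifurcation, consider a spatially uniform state with $\bA=0$ and assume $g>0$ (both are simplifying assumptions made here for illustration only).
The energy density then reduces to $f(|\psi|)=\frac{m^2}{2}|\psi|^2+\frac{g}{4!}|\psi|^4$, and its stationary points satisfy
\begin{eqnarray*}
  \frac{\partial f}{\partial|\psi|}
    &=&|\psi|\left(m^2+\frac{g}{6}|\psi|^2\right)=0, \\
  |\psi|^2&=&0 \quad\mbox{or}\quad |\psi|^2=-\frac{6m^2}{g}.
\end{eqnarray*}
For $m^2>0$ only $|\psi|=0$ is a minimum, whereas for $m^2<0$ the minimum moves to $|\psi|^2=-6m^2/g$ with an arbitrary phase.
Under these assumptions the sign change of $m^2$ marks the transition from a featureless state to a state with a finite feature amplitude.
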

174: %The general energy form in cortical dynamics can be build via two different

175: %ways.

176: %One, the energy function and the pattern properties in cortical map formations

177: %can be inferred only using the topologic properties.

178: %the symmetry between the feature states (or called {\em gauge symmetry} in quantum mechanics).

179: %Considering the transform invariant properties, it is generally assumed that

180: %the energy of map formations takes the form at a continuum limit

181: %Another way is to build models through the detailed description of individual

182: %neural interactions.

183:

184: We apply the FBM representation method to development models of the visual

185: cortex.

186: The formation of cortical maps of orientation and ocular dominance columns is one

187: of the most studied problems in brain research.

188: A considerable number of different models have been proposed, some of which have

189: been compared with experimental findings and compete with one

190: another~\cite{Erwin1995,Swindale1996}.

191: %The theoretic analysis of pattern formation are reported within a few of models.

192: Miller {\em et al.} formulated {\em correlation-based} models describing how

193: ocular dominance and orientation columns develop in simple cell

194: models~\cite{Miller1989,Miller1992,Miller1994}.

195: Obermayer {\em et al.} presented a statistical-mechanical analysis of pattern

196: formation and compared predictions quantitatively with experimental data using

197: Kohonen's {\em self-organizing feature map} (SOFM) approach.

198: Wolf {\em et al.} rederived the conditions for the emergence of a columnar

199: pattern in the SOFM algorithm~\cite{Wolf2000}.

200: The studies of the {\em elastic-net} model also show the bifurcation and

201: emergence of a columnar pattern~\cite{Durbin1990,Hoffsummer1995,Goodhill2000}.

202: Scherf {\em et al.} investigated the pattern formation in ocular dominance

203: columns with a more detailed model, which covers the results of the SOFM

204: algorithm and the elastic-net model~\cite{Scherf1999}.

205: Wolf and Geisel predicted the influence of the interactions between ocular

206: dominance and orientation columns on the pinwheel stability without model

207: dependency and demonstrated it in the simulations of the elastic-net

208: model~\cite{Wolf1998}.

209: The lateral (or neighbor) interaction models are also a successful scheme based

210: on physiology~\cite{Swindale1980,Swindale1982,Cowan1991,Cho2004A}.

211:

212: Among the proposed visual map formation models, the Hamiltonian models with spin

213: variables belong to the class of FBM representation

214: models~\cite{Cho2004A,Cowan1991,Tanaka1989}.

215: Other development models written in the high- or low-dimensional feature vector

216: representation can be rewritten in the FBM representation.

217: The formulas in FBM models represent essential ingredients of neural

218: interactions without paying much attention to particular neural control

219: mechanisms.

220: Moreover, recasting the iterative procedure of a model into the

221: form of Eq.(\ref{eq:expansion}) or Eq.(\ref{eq:continuum}) amounts to a

222: statistical analysis of the model itself.

223: The quadratic interaction function $D(\br_i,\br_j)$ is of consequence in

224: visual map formation, as in other physical systems.

225: The interaction functions in the neural process represent more than the intracortical

226: or recurrent connections in so-called lateral activity control.

227: In the competitive Hebbian models, such as the elastic-net model and the SOFM

228: algorithm, the interaction functions comprise the feedforward competition or

229: normalization process.

230: However, in the FBM representation the functional matrix $D(\br_i,\br_j)$ of

231: the visual map formation models has a common shape, the so-called Mexican hat type,

232: that is, positive at short range and negative at long range, in spite of

233: different development mechanisms.

234: The bifurcation to an inhomogeneous state and the emergence of a columnar

235: pattern are possible when there are strong negative interactions in

236: $D(\br_i,\br_j)$.

237: The development of a columnar pattern is also related to a non-vanishing

238: vector $\bA$, the so-called {\em vector potential} in physics, in

239: Eq.(\ref{eq:continuum}).

240: The FBM representation method will show how the development models with

241: different mechanisms lead to the successful formation of visual maps and

242: share the statistical properties of vortices in the spin Hamiltonian models.

243: %Recently, we predicted the bifurcation of inhomogeneous solutions also in

244: %lateral interaction models, and derive the typical properties in observed

245: %patterns, such as the orthogonality and the correlation function~\cite{Cho2004A}.

246:
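To make the role of the Mexican hat shape concrete, the following sketch assumes a translation-invariant interaction $D(\br_i,\br_j)=D(|\br_i-\br_j|)$ on a two-dimensional cortex and takes a difference of Gaussians as one hypothetical example of such a function.
The amplitudes $a_1,a_2>0$ and the ranges $\sigma_1<\sigma_2$ are illustrative parameters, not quantities taken from the models discussed above.
\begin{eqnarray*}
  D(r)&=&a_1e^{-r^2/2\sigma_1^2}-a_2e^{-r^2/2\sigma_2^2}, \\
  \tilde{D}(q)&=&2\pi\left(a_1\sigma_1^2e^{-\sigma_1^2q^2/2}
    -a_2\sigma_2^2e^{-\sigma_2^2q^2/2}\right), \\
  q_*^2&=&\frac{2}{\sigma_2^2-\sigma_1^2}
    \ln\frac{a_2\sigma_2^4}{a_1\sigma_1^4}.
\end{eqnarray*}
In Fourier space the quadratic term of Eq.(\ref{eq:expansion}) becomes, up to normalization, $-\frac{1}{2}\sum_{\bq}\tilde{D}(q)|\tilde{\psi}(\bq)|^2$, so the mode with the largest $\tilde{D}(q)$ is favored.
Here $q_*$ is the position of the maximum of $\tilde{D}(q)$; it is nonzero only if $a_2\sigma_2^4>a_1\sigma_1^4$, that is, only if the long-range negative part is strong enough, and the corresponding pattern wavelength would be $2\pi/q_*$.
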

247: \section{Representation of neural state with columnar modularity}

248: The structures and connections in cerebral cortex are more complex and modular

249: than those in artificial neural networks.

250: Neurons tend to be vertically arrayed in the cortex, forming cylinders known as

251: cortical columns.

252: Traditionally, six vertical layers have been distinguished and classified into

253: three different functional types.

254: The layer IV neurons ({\em IN} box) first receive the long-range input currents

255: and send them up vertically to layers II and III ({\em INTERNAL} box), which are

256: the so-called true association cortex.

257: Output signals are sent down to layers V and VI ({\em OUT} box), and sent

258: further to the thalamus or other deep and distant neural structures.

259: Lateral connections also occur in the superficial (layers II and III) pyramidal

260: neurons.

261: In columnar (or horizontal) clustering, there are minicolumns, which

262: consist of about 100 neurons and are about 30 $\mu$m in diameter in monkeys, and

263: macrocolumns, which are 0.4$\sim$1.0 mm in size and contain at most a few hundred

264: minicolumns.

265: At a coarser level of division, there are 52 cortical areas in each human

266: hemisphere; a Brodmann area averages 21 cm$^2$ and 250 million neurons grouped

267: into several million minicolumns~\cite{Calvin1998}.

268:

269: %\begin{figure}[t]

270: %\includegraphics[width=8cm]{3d-colmn}

271: %\caption{ \label{fig:3d-colmn}

272: %  The 3-D structure of cortical columns.

273: %  (Reprinted by permission from William H. Calvin, 2001,

274: %  {\em The Cerebral Code}, The MIT Press, Copyright \copyright 1996 by William

275: %   H. Calvin)

276: %}

277: %\end{figure}

278:

279: The columnar modules can be regarded as a kind of multi-layered neural network

280: and would have complex functional attributes.

281: Most neurons in the brain have the attribute of {\em selective response} to a

282: received activity, and the preferred signals become a useful representation

283: of the functional attributes in a small neural region.

284: A traditional representation of neural state is the vector notation $\bv$,

285: whose components correspond to the activity of each neuron in the receptor

286: layer.

287: If a columnar module (or complex cell) at position $\br$ responds selectively

288: to a particular input vector $\bv$ and produces an output vector $\by$, its

289: functional attribute can be represented compactly as

290: \begin{eqnarray} \label{eq:associator}

291:   w(\br)=\by\circ F\circ\bv^\top,

292: \end{eqnarray}

293: where $F$ is the nonlinear response or activation function of the complex cell.

294: % posterior probability function.

295: If the activation function is linear or ignored, this leads to a simple pattern

296: associator, the so-called {\em linear associator}.

297: Experiments that probe the response properties to external stimuli through

298: electrode penetration can be understood as measuring the product

299: between the associator $w(\br)$ and the input signal $\bv'$:

300: \begin{eqnarray} \label{eq:inner_product}

301:   |w(\br)\circ\bv'|=|\by|\ F(\bv^\top\bv'),

302: \end{eqnarray}

303: where the activity of the output $|\by|$ corresponds to the measurement of the

304: number of action potentials or the frequency of spikes.

305: In physiological experiments on complex cells in the primary visual

306: cortex~\cite{Hubel1962} or on object perception in the inferotemporal (IT)

307: cortex~\cite{Tsunoda2001}, the response property of columnar modules is usually

308: a combination of different patterns, so the functional form in

309: Eq.(\ref{eq:associator}) should be expanded into a summation of associators.

310: When the output $\by$ coincides with the most preferred input $\bv$, as in

311: Hopfield networks~\cite{Hopfield1982}, or when only the most preferred input is

312: of concern, a vector notation suffices to represent the functional

313: attributes of columnar modules.

314:
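As a minimal worked case of Eq.(\ref{eq:associator}) and Eq.(\ref{eq:inner_product}), assume a linear activation $F(x)=x$ and a normalized preferred input $|\bv|=1$ (both assumptions are made here for illustration only).
Then
\begin{eqnarray*}
  w(\br)&=&\by\,\bv^\top, \\
  w(\br)\,\bv'&=&\by\,(\bv^\top\bv'), \\
  |w(\br)\,\bv'|&=&|\by|\,|\bv'|\,|\cos\theta|,
\end{eqnarray*}
where $\theta$ is the angle between the preferred input $\bv$ and the presented input $\bv'$.
In this linear sketch the measured response is largest when the stimulus is parallel to the preferred input and vanishes when the two are orthogonal.
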

315: %\begin{figure}[t]

316: %\begin{minipage}[b]{4cm}

317: %  \includegraphics[width=4cm]{intrinsic} (a) intrinsic type

318: %\end{minipage}

319: %\ \

320: %\begin{minipage}[b]{4cm}

321: %  \includegraphics[width=3.3cm]{extrinsic} \\ \ \\ \ \\ (b) extrinsic type

322: %\end{minipage}

323: %\caption{ \label{fig:coding_type}

324: %  For the response properties of neurons, two different encoding types are

325: %  possible whether the synaptic connections are (a) between close neurons

326: %  within columnar module or (b) with far aparted neurons cross cortex areas.

327: %}

328: %\end{figure}

329:

330: Fig.~\ref{fig:network} depicts a neural network with columnar modules.

331: A matrix $\bW$ denotes the feedforward synaptic weights across cortical areas,

332: such as the connections between LGN and primary visual cortex, and the input

333: vector to a columnar module is given by $\bv_i=\bW_i\bu$ (or

334: $v_i=\sum_\alpha W_{i\alpha}u_\alpha$).

335: In a complex cell model, it is expected that the synaptic connections within a

336: columnar module $w(\br)$ determine the functional attributes of the neurons, whereas

337: in a simple cell model, the connections with the external cells $\bW$ are

338: considered to confer the functional attributes.

339: For example, the ocular dominance in primary visual cortex is determined by

340: whether a neuron in V1 is more connected to the left or right eye (or LGN)

341: cells.

342: We call this the {\em extrinsic} information coding type, which is realized by

343: the connectivity of distant neurons across cortical areas, whereas the {\em intrinsic}

344: type is realized by the synaptic plasticity between close neurons within a

345: columnar module.

346: The neural attributes of the two coding types are represented by a common formula in

347: FBM models, but the two types rest on different grounds when building actual models.

348: The feedforward competition behavior should be related to the intrinsic coding

349: type.

350: Moreover, the extrinsic encoding type causes a problem in modeling huge

351: networks because too many connections are required when the meaning of an

352: activity is characterized only by where the current comes from.

353: We expect that the intrinsic type, encoding information in spatial or temporal

354: correlations within a signal band, is essential in huge networks and would be

355: a prominent strategy in the real brain.

356:

357: \begin{figure}[t]

358: \includegraphics[width=8cm]{network}

359: \caption{ \label{fig:network}

360:   A neural network model with columnar modules with function $w^{(i)}$.

361:   Input signal to a columnar module $\bv_i$ is driven by feedforward synapses

362:   with weights $\bW$, that is, $\bv_i=\bW_i\bu$, and its output $\by_i$ is

363:   interconnected to neighbors by intracortical connections $\bJ$.

364:   %Information (or the functional attributes of neurons) are encoded in the

365:   %connectivity within columnar modules ${\bf w}^{(I)}$ (intrinsic type) or in

366:   %the feedforward synapses $\bf W$ (extrinsic type).

367: }

368: \end{figure}

369:

370: \section{Fibre bundle map representation}

371: %In simple cell models, the cortical models can get extremely complex

372: % the high-dimensional components the amount of receptor cells,

373: %To deal with this, a class of more abstract models has been developed.

374: %In the ``low-dimensional'' feature vector representation each component stands

375: %for a selected response property.

376: %For example, the features of orientation columns are denoted by Cartesian

377: %components

378: %$\Phi(\br)=\left(q(\br)\sin(2\phi(\br)), q(\br)\sin(2\phi(\br))\right)$ for

379: %preferred orientation $\phi(\br)$ and degree of preference for that orientation

380: %$q(\br)$ at each cortical location $\br$~\cite{Swindale1982}.

381: %In the FBM representation, however, they sometimes takes similar forms with

382: %the low-dimensional feature vector representation, the feature components are

383: %approximated with different standpoint.

384: % given pattern vector, we can extract the feature components,

385: %that are the center of the pattern $(x,y)$ and the maximal variance vector $(v_x,v_y)$.

386: %With the ocular dominance $z$, the feature vector with 5 components,

387: %\begin{eqnarray} \label{eq:visual_feature_vector}

388: %  \Phi=(x,y,v_x,v_y,z)

389: %\end{eqnarray}

390: %is a usual representation of the orientation and the ocular dominance columns in visual cortex.

391: %so to say, a reduced dimensional

392: %The components in the FBM representation are composed of

393: %representation with the most prominent components on other basis.

394:

395: The FBM representation method is based on a mathematical framework, the so-called

396: {\em fibre bundle} in manifold theory~\cite{Martin1991,Nash1983}.

397: For a trivial fibre bundle, a total (or bundle) space $E$, which depicts

398: the neural attributes at a cortical area, is composed of a base space $B$ and a

399: fibre $F$, that is, $E=B\times F$.

400: In our case, cortical locations are the elements of the base space, whereas the

401: feature (often pattern, code or model) space is the fibre.

402: A structure (or symmetry) group $G$ is a group of homeomorphisms of the fibre $F$, and

403: coincides with the fibre $F$ in a {\em principal fibre bundle}.

404: The principal fibre bundles admit {\em connexions} (or vector potentials in

405: physics), and it is for this reason that they are of basic importance in gauge

406: theories in physics.

407: The features of cortical cells or small cortical regions at each cortical

408: location $\br$ are represented by a set of field variables $\psi_a(\br)$

409: and

410: \begin{eqnarray} \label{eq:representation}

411:   \psi(\br)=|\psi(\br)|\exp(-i\phi_a(\br)\tau^a)=\psi_a(\br)\tau^a,

412: \end{eqnarray}

413: where $\phi_a(\br)$ is an arbitrary internal (feature) phase and $\tau^a$ is

414: the basis of a continuous (or Lie) group $G$.

415: The number of basis elements can be as large as the number of receptor cells, but is usually reduced

416: according to the statistical structure of inputs.

417: The frequent inputs usually occupy small regions in the total feature space and

418: the major variance of feature components occurs within an embedded submanifold

419: with high stimulus density (Fig.~\ref{fig:V}).

420: %, the bases are transformed according to the principal directions of external stimuli density at a point.

421: The reduction of feature space is related to the extraction of features from

422: inputs in learning rules as well.

423: Symmetry breaking between transformed feature components is expected in the

424: neural process of experience and learning, and cortical dynamics can be

425: described with a few field components in a reduced feature space.

426:

427: \begin{figure}[t]

428: \includegraphics[width=8cm]{V}

429: \caption{ \label{fig:V}

430:   Probabilistic external stimuli and a potential function with external source.

431:   The transformed bases $\tau^{1'}$ and $\tau^{2'}$ are the principal

432:   directions of external stimuli density at a point.

433: }

434: \end{figure}

435:

436: %The interactions in cortical circuitry and the synaptic plasticity are more

437: The differential geometric concepts in the FBM representation furnish an

438: intuitive explanation for emergent cortical maps.

439: The self-organization of feature maps, achieved by locally gathering similar

440: interests, means that features vary smoothly between neighboring neurons at

441: each location.

442: In other words, the properties of ``organized'' and ``optimized'' feature maps

443: are related to those of ``continuous'' and ``flat'' variables on a manifold.

444: If the features do not differ between neighbors in a small region near

445: position $\br$, this can be expressed as $\nabla\psi(\br)=0$ (or

446: $\nabla\phi(\br)=0$).

447: If there is a small tilt of the phase angle at position $\br$ and an arbitrary

448: vector $\bA(\br)$ denotes the difference between phase angles, the condition becomes

449: the vanishing of the so-called {\em covariant derivative}, $(\nabla-i\bA(\br))\psi(\br)=0$ (or

450: $\nabla\phi(\br)-\bA(\br)=0$).

451: If the covariant derivative vanishes (said to be flat, or parallel translated,

452: in manifold theory) for all $\br$, the distribution of the field variables

453: $\psi(\br)$ is a minimum of the integral

454: \begin{eqnarray} \label{eq:action}

455:   S=\int d\br\ |(\nabla-i\bA)\psi|^2,

456: \end{eqnarray}

457: for the connexion $\bA$.

458: A non-vanishing connexion $\bA$ occurs when there is strong competitive

459: behavior or inhibitory lateral interaction between neurons, and is related to

460: the emergence of a periodic pattern in cortical maps, such as the band patterns

461: in ocular dominance columns and the linear zones in orientation preference

462: columns, with the wavelength $\Lambda=2\pi/|\bA|$.

463: Fig.~\ref{fig:macaque} shows the complete pattern of ocular dominance stripes

464: of a macaque monkey.

465: The orthogonality between the contour lines of the feature map and the boundary

466: of the cortical area is a property of the minimal solutions of Eq.(\ref{eq:action}).

467: From the condition $\delta S/\delta\phi\sim 0$ or $\nabla^2\phi\sim 0$ for

468: $\psi=e^{2i\phi}$ with the preferred angle $\phi$, the normal component of

469: $\nabla\phi$ vanishes at the area boundary since the integral along a narrow

470: rectangular loop over the area boundary $\oint_C\nabla\phi\cdot d\hat{n}$

471: vanishes due to the divergence theorem.

472: Such perpendicularity with the area boundary is also manifested in other static

473: field solutions, such as the magnetic field.

474:
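A simple solution illustrating the relation $\Lambda=2\pi/|\bA|$ can be written down if the connexion $\bA$ is assumed to be constant in space and $\psi_0$ is taken to be a real constant amplitude (both assumptions are made here for illustration only):
\begin{eqnarray*}
  \psi(\br)&=&\psi_0\,e^{i\bA\cdot\br}, \\
  \nabla\psi&=&i\bA\,\psi, \\
  (\nabla-i\bA)\psi&=&0.
\end{eqnarray*}
The integrand of Eq.(\ref{eq:action}) then vanishes everywhere, so the lower bound $S=0$ is attained.
The real part ${\rm Re}\,\psi=\psi_0\cos(\bA\cdot\br)$ repeats after a displacement of $2\pi/|\bA|$ along $\bA$, which gives a stripe pattern perpendicular to $\bA$ with the wavelength quoted above.
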

475: \begin{figure}[t]

476: \begin{minipage}[b]{5cm}

477:   \includegraphics[width=5cm]{evol03c}

478: \end{minipage}

479: \ \

480: \begin{minipage}[b]{0.65cm}

481:   \includegraphics[width=0.65cm]{bar2}

482: \end{minipage}

483: \caption{ \label{fig:orientation_map}

484:   The simulation result of orientation map formation.

485:   The orientation maps have $U(1)$ (or $O(2)$) symmetry and the major

486:   characteristics of the developed map can be predicted using only the symmetry

487:   properties.

488: }

489: \end{figure}

490: \begin{figure}[t]

491: \includegraphics[width=6cm]{macaque}

492: \caption{\label{fig:macaque}

493:   The complete pattern of ocular dominance stripes in the striate cortex of a

494:   macaque monkey.

495:   There is a strong tendency for the stripes to meet the margin of striate

496:   cortex at steep or right angles.

497:   (Reprinted by permission from S.LeVay, Copyright \copyright 1985 by the

498:   Society for Neuroscience~\cite{LeVay1985}.)

499: }

500: \end{figure}

501:

502: The symmetry property also helps to predict the energy function of the cortical

503: map formation.

504: %The major features in cortical maps are universal and can be understood through

505: %the experience in other physical systems.

506: %Other orientation development models should also satisfy the energy form in spite

507: %of each different interaction rules employed.

508: For example, the features of orientation preference columns in the visual

509: cortex have $U(1)$ (or $O(2)$) symmetry.

510: When all the preferred angles are rotated by the same angle

511: ($\phi\rightarrow\phi+\chi$, the so-called `global' gauge transform), the energy of

512: orientation map formation should remain invariant.

513: The rotation angle $\chi$ can also depend on position $\br$, the so-called

514: `local' gauge transform, and the energy in a continuum approximation may take

515: the form in Eq.(\ref{eq:continuum}) or Eq.(\ref{eq:action}) with

516: $\bA=\nabla\chi(\br)$.

517:
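The invariance can be checked directly.
In the sketch below, $\chi(\br)$ denotes the local rotation of the phase of $\psi$ (for $\psi=e^{2i\phi}$ this is twice the rotation of the preferred angle), and the accompanying shift of $\bA$ follows the standard convention of gauge theory, which is assumed here rather than taken from the text:
\begin{eqnarray*}
  \psi'&=&e^{i\chi}\psi, \qquad \bA'=\bA+\nabla\chi, \\
  (\nabla-i\bA')\psi'
    &=&e^{i\chi}\left(\nabla+i\nabla\chi-i\bA-i\nabla\chi\right)\psi \\
    &=&e^{i\chi}(\nabla-i\bA)\psi.
\end{eqnarray*}
Hence $|(\nabla-i\bA')\psi'|^2=|(\nabla-i\bA)\psi|^2$ and $|\psi'|=|\psi|$, so each term of Eq.(\ref{eq:continuum}) is unchanged.
Starting from $\bA=0$, the transformed connexion is $\bA'=\nabla\chi$, in agreement with the form given above.
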

518: %%The FBM approaches say that the symmetry structure between features determines the typical character of self-organizing feature maps.

519: %The structure of transition group between features can be presumed with

520: %several algebraic descriptions :

521: %(1) In the primary sensory areas, the transition functions can be inferred from

522: %the symmetry in external activities or patterns.

523: %The symmetry group corresponding with the feature space is clear and complete.

524: %(2) If two different features, $\psi_1(\br)$ and $\psi_2(\br)$, are occupied

525: %at common cortex, the imposed restriction for normalization is

526: %$|\psi_1(\br)|^2+|\psi_2(\br)|^2=const$ for all position $\br$.

527: %For example, the symmetry group of the orientation and ocular dominance columns

528: %in primary visual cortex is not $O(2)\times O(1)$ but $O(3)$.

529: %A typical character of two combined feature maps is that the contour lines of

530: %them meet at right angle, because of $\nabla\psi_1\cdot\nabla\psi_2\sim 0$ with

531: %the equilibrium conditions

532: %$\delta E/\delta\psi_\alpha\sim 0$ or $\nabla^2\psi_\alpha\sim 0$ for

533: %$\alpha=1$, $2$.

534: %The orthogonal property between the orientation and ocular dominance maps is

535: %reported also in animal experiments~\cite{Obermayer1993}.

536: %(3) Like the primary auditory cortex, the transition functions corresponding

537: %with the features are not consist complete group but be ordered sequentially.

538: %In this cases the extreme (the maximal or minimal) features tend to exist at

539: %the boundary of feature map.

540: %(4) Some measurements in biologic experiments, such as the correlation between

541: %activity, give information about the difference between codes.

542: %However there are no experimental evidence, we can guess the relative distance

543: %between codes and classify them according to their category, such as human

544: %faces, monkey faces or shapes in inferotemporal cortex.

545: %Such homomorphic representation of group structure is useful for the problems

546: %of functional area differentiations at macroscopic level.

547: %(5) At high cognitive area, it is not easy to infer the transition function

548: %group because the code space is embedded on very large and high dimensional

549: %manifold.

550: %If we cannot guess any more symmetry or relative distance between codes,

551: %symbolic sets will be available, where they are complete groups also.

552:

553: \section{Description of detailed neural interactions}

554: %If we consider the selective response to input signals due to the connections

555: %between neurons within a cluster, the attributes of a block of neurons can be

556: %represented following the basic neural network architectures, what Kohonen

557: %(1977) labeled {\em heteroassociation}.

558: %The cortical modification models at low level (or single-cell models) suggest

559: %more physical features of neural interactions and the biologic foundation of

560: %more abstract models.

561: The description of neural dynamics at a high level should also be based both on

562: neuroscience and informatics.

563: One important principle for neural plasticity is the Hebbian rule : when the two

564: neurons on either side of a connection are simultaneously active, the

565: weight of that connection is increased~\cite{Hebb1949}.

566: The simple Hebbian plasticity rule for a single neuron with inputs $\bu$

567: and weights $\bW$ takes the form

568: \begin{eqnarray} \label{eq:Hebbian}

569:   \Delta\bW(t)\propto y(t)\bu(t)

570: \end{eqnarray}

571: for the output $y=f(\bW\bu)$ with the activation function $f$ of a simple cell.

572: In intracortically connected networks, the input becomes the sum of the

573: currents from the input and neighbor cells.

574: The output of the neuron at the $i$-th site becomes

575: \begin{eqnarray} \label{eq:recursive}

576:   y_i=f(v_i+\sum_j J_{ij}y_j)

577: \end{eqnarray}

578: for $v_i=W_{i\alpha}u_\alpha$ and the recurrent weight matrix $\bJ$.

579: In an energy model, the synaptic plasticity rule is regarded as the negative

580: gradient of an energy (often called an objective, error or cost function) defined as a

581: function of $\bW$ :

582: \begin{eqnarray}

583:   \Delta\bW\propto-\frac{\partial E[\bW]}{\partial\bW}.

584: \end{eqnarray}

585: Because of the nonlinearity of the activation function and the recursive form

586: in Eq.(\ref{eq:recursive}), the energy is usually approximated in a

587: model-dependent way.

588: For example, with the assumption of $y_i=f(v_i+\sum_j J_{ij}v_j)$ and a series

589: expansion of the activation function $f(v)=\sum_\ell a_{\ell+1}v^\ell$, the energy

590: is given by

591: \begin{eqnarray} \label{eq:simple_energy}

592:   E[\bW]=-\sum_\ell\frac{a_\ell}{\ell}D_{i_1\cdots i_\ell}^{(\ell)}

593:     Q^{(\ell)}_{\alpha_1\cdots\alpha_\ell}

594:     W_{i_1\alpha_1}\cdots W_{i_\ell\alpha_\ell},

595: \end{eqnarray}

596: where

597: \begin{eqnarray}

598:   D^{(\ell)}_{i_1\cdots i_\ell}&=&(\delta_{i_1i_2}+J_{i_1i_2})\cdots

599:    (\delta_{i_{\ell-1}i_\ell}+J_{i_{\ell-1}i_\ell}) \nonumber \\

600:   &=&D^{(2)}_{i_1i_2}\cdots D^{(2)}_{i_{\ell-1}i_\ell}

601: \end{eqnarray}

602: is the functional tensor of rank $\ell$ and

603: \begin{eqnarray}

604:   Q^{(\ell)}_{\alpha_1\cdots\alpha_\ell}=

605:     \langle u_{\alpha_1}\cdots u_{\alpha_\ell}\rangle_\mD

606: \end{eqnarray}

607: is the input correlation tensor of rank $\ell$.

608: $\langle\ \cdot\ \rangle_\mD$ denotes the average over input data set $\mD$.

609: This energy, based on the basic Hebbian rule, is further adjusted according to the

610: characteristics of the synaptic plasticity rule~\cite{Fregnac1998}.

611: For example, the covariance plasticity rule replaces the rank-$\ell$ input correlation

612: function $\bQ^{(\ell)}$ with the input covariance function

613: \begin{eqnarray}

614:   C^{(\ell)}_{\alpha_1\cdots\alpha_\ell}=\langle

615:     (u_{\alpha_1}-\langle u_{\alpha_1}\rangle_\mD)\cdots

616:     (u_{\alpha_\ell}-\langle u_{\alpha_\ell}\rangle_\mD)\rangle_\mD.

617: \end{eqnarray}
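
As a concrete instance of this replacement (a worked special case added here for illustration, not taken from the cited works), the rank-two covariance function expands to
\begin{eqnarray*}
  C^{(2)}_{\alpha\beta}
    &=&\langle u_\alpha u_\beta\rangle_\mD
      -\langle u_\alpha\rangle_\mD\langle u_\beta\rangle_\mD \\
    &=&Q^{(2)}_{\alpha\beta}
      -\langle u_\alpha\rangle_\mD\langle u_\beta\rangle_\mD,
\end{eqnarray*}
so that $C^{(2)}$ and $Q^{(2)}$ coincide whenever the inputs have zero mean over $\mD$.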

618: In the FBM representation of the simple cell model, the feedforward synaptic weights

619: $\bW_{i\alpha}$ are replaced by field variables $\psi_\alpha(\br_i)$, so that the

620: energy in Eq.(\ref{eq:simple_energy}) takes the form of the energy in

621: Eq.(\ref{eq:expansion}).

622: For an efficient description of the dynamics, the energy is decomposed into

623: functions of transformed field variables.

624: Because of the anisotropy in the input correlation $\bQ$ (often in the neighbor

625: activity $\bD$), the symmetry between components is broken and the effective

626: dynamics can be described with a few dominant components.

627: The consequence of the anisotropy in neighbor activity between feature

628: components is explored in the case of the anisotropy between orientation and

629: ocular dominance columns~\cite{Cho2004B}.

630: %For the orientation preference columns, the prominent pattern are the oriented

631: %images with low frequency.

632:

633: In a complex cell model, the features of neurons relate to the synapses within

634: a columnar module.

635: The columnar module is a kind of adaptive neural network system, and the

636: modulation of its functional attributes involves intricate changes in synaptic

637: weights.

638: An effective assumption is that the output of a columnar module is one of

639: the proper states of the functional and will change following afferent

640: signals.

641: For the currents from input and neighbor cells and a linear activation function,

642: the change in the proper state or the output of a columnar module is then

643: \begin{eqnarray}

644:   \Delta\by_i\propto\bv_i+\sum_jJ_{ij}\by_j

645: \end{eqnarray}

646: for the input $\bv_i$ to the columnar module at position $\br_i$ and the energy

647: averaged over inputs is obtained by

648: \begin{eqnarray}

649:   E[\by]=-\sum_i\langle\bv_i\rangle_\mD\by_i

650:     -\frac{1}{2}\sum_{i,j}J_{ij}\by_i\by_j.

651: \end{eqnarray}

652: In the FBM representation, the multivariable output is replaced by field

653: variables :

654: \begin{eqnarray}

655:   E[\psi]&=&-\sum_iB_i\psi_i-\frac{1}{2}\sum_{i,j}J_{ij}\psi_i\psi_j

656: \end{eqnarray}

657: or

658: \begin{eqnarray}

659:  \lefteqn{E[\psi]=-\sum_i B(\br_i)\psi^\dagger(\br_i)} \\

660:   &&-\frac{1}{4}\sum_{i,j}J(\br_i,\br_j)\left\{\psi(\br_i)^\dagger\psi(\br_j)

661:     +\psi(\br_i)\psi(\br_j)^\dagger\right\}, \nonumber

662: \end{eqnarray}

663: where a functional vector $B(\br_i)=\langle \bv_i\rangle_\mD$ is the linear

664: average over inputs.

665: %The term for neighbor interactions in the FBM methods takes the exchange energy form

666: %Indeed the mathematical frameworks and formulas in the FBM methods resemble those

667: %in statistical quantum field theory.

668: If we regard $\psi^\dagger$ and $\psi$ as creation and annihilation operators,

669: the term $\psi(\br_i)J(\br_i,\br_j)\psi^\dagger(\br_j)$ can be read as

670: describing an activity that is created at position $\br_j$,

671: transferred by the kernel $J$, and annihilated at position $\br_i$.

672:

673: A series of physiological experiments showed that the synaptic plasticity comes

674: from a redistribution of the available synaptic efficacy, not an increase in

675: the efficacy~\cite{Markram1996,Fregnac1998}.

676: In other words, neural plasticity at the network level can be understood

677: as the pursuit of a higher probability of response to environmental experience under a bounded total

678: synaptic strength.

679: With the expectation of an automatic normalization of synaptic weights,

680: %to a single neuron for simple cell model (or within a columnar module for complex cell model),

681: the norm of field variables $|\psi|$ is usually constrained to be constant.

682: In this sense, the neural dynamics with functional modularity may be described

683: by the slight shift in the internal phase per activity following afferent

684: signals.

685: Sometimes the normalization constraint is not imposed explicitly but is built into the

686: plasticity rule through subtractive normalization~\cite{Oja1982}.

687: For the energy function of the form

688: \begin{eqnarray}

689:   E[\psi]=a\psi^2-b\psi^4,

690: \end{eqnarray}

691: the stability of synaptic weight can be achieved due to the relaxation of

692: $|\psi|^2$ to its equilibrium value.
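
A minimal sketch of this relaxation, under the assumption (not stated above) that $\psi$ follows the gradient dynamics $\Delta\psi\propto-\partial E/\partial\psi$ and is real, reads
\begin{eqnarray*}
  \Delta\psi&\propto&-\frac{\partial E}{\partial\psi}
    =-2\psi\left(a-2b\psi^2\right),
\end{eqnarray*}
which has the nontrivial fixed point $\psi^2=a/2b$ when $a$ and $b$ have the same sign.
Linearizing the right hand side around this point gives the slope $-2a+12b\psi^2=4a$, so under this assumed dynamics the fixed point is attracting for $a<0$ (and hence $b<0$).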

693:

694: Another important mechanism expected in neural computation is the enhancement

695: of neural activity depending on correspondence to input.

696: A possible enhancement modulation is the restriction on the sum over the

697: activity by subtractive normalization.

698: With a simple nonlinear form $x+\eta x^2$, the external source term with

699: enhanced afferent signals becomes

700: %depending on the conformity is that

701: \begin{eqnarray} \label{eq:enhancement}

702:   \langle\bv_i'\rangle_\mD\psi_i&=&\left\langle \bv\frac{\rho_i(1+\eta\bv\psi_i)}

703:     {(1/\rho)\sum_j\rho_j(1+\eta\bv\psi_j)}\right\rangle_\mD\psi_i \nonumber \\

704:   &\simeq&\langle\bv_i\rangle_\mD\psi_i+\frac{1}{2}\sum_j S_{ij}\psi_i\psi_j

705: \end{eqnarray}

706: for $\bv_i=\rho_i\bv$ and $\rho=\sum_i\rho_i$ with the stimuli strength

707: $\rho_i$ at position $\br_i$.

708: The scattering function with an input data set $\mD$ is defined as

709: \begin{eqnarray} \label{eq:scattering}

710:   S_{ij}=2\eta\langle v_iv_j\rangle_\mD(\delta_{ij}-1)

711: \end{eqnarray}

712: for the enhancement (or competition) parameter $\eta$.

713: In the FBM representation, the scattering function describes the feedforward

714: competition process in the competitive Hamiltonian models, such as the

715: elastic-net model and the SOFM algorithm.

716: %For hard competition with large $\eta$, the network accomplishes the

717: %``winner-take-all'' process.

718: %In fact, a priori enhancement of afferent signals is achieved when the

719: %conformity between neural feature and input signal is determined by the

720: %connectivity with the incentive cells (or extrinsic coding type).

721: In fact, for an intrinsic coding type, the network cannot tell a priori which neurons

722: best match the input signal, and the winner has to be determined after

723: lateral inhibitory activity.

724: The competitive Hebbian models require a normalization control of the response or an a

725: priori decision of the winner, and depict the feature vectors in visual cortex

726: through the connectivity between visual cortex and retinas (or

727: LGNs)~\cite{Scherf1999}.

728: %Another important role of synaptic normalization in cortical map development is the competition.

729: % in afferent signals is equivalent to those of the lateral inhibitory activity.

730: The lateral activity function $J(\br_i,\br_j)$, the connectivity between

731: neurons (or columnar modules) at positions $\br_i$ and $\br_j$ within a cortical

732: area, has two types according to the control mechanisms~\cite{Kohonen1995}.

733: In the case of the lateral feedback control (which Kohonen called the

734: activity-to-activity kernel), the lateral activity function $\bJ$ is taken

735: to be excitatory at short distances and inhibitory at long distances, the

736: so-called Mexican hat type (Fig.\ref{fig:control}a).

737: In the case of the lateral control of plasticity (or the

738: activity-to-plasticity kernel), by contrast, the lateral interaction is nonnegative and may

739: take the Gaussian form (Fig.\ref{fig:control}b).

740: The competitive Hebbian models adopt the lateral control of plasticity, which

741: means there is no negative value in $\bJ$, and the scattering function $\bS$

742: from afferent signal enhancement plays the role of inhibitory activity.

743:

744: %We can consider also the interactions with higher powers and take the general

745: %energy form as Eq.(\ref{eq:expansion}).

746: %Note that the actual forms of interaction functions depend on interaction

747: %mechanisms and the quadratic interaction term $D(\bx,\by)$ need not always be

748: %the neighborhood function $J(\bx,\by)$.

749:

750: \begin{figure}[t]

751: \begin{minipage}[b]{4cm}

752:   \includegraphics[width=4cm]{activity} \\

753:  (a) Lateral feedback control of activity

754: %\\ \ \\

755: \end{minipage}

756: \

757: \begin{minipage}[b]{4cm}

758:   \includegraphics[width=4cm]{plasticity} \\

759:   (b) Lateral control of plasticity

760: \end{minipage}

761: \caption{ \label{fig:control}

762:   The two types of neighbor interaction functions and control mechanisms.

763:   (a) The lateral interaction models adopt lateral activity control and an

764:   activation kernel, usually the so-called ``Mexican hat'' function (positive

765:   feedback at close distance and negative at longer distance).

766:   (b) The plasticity control with a nonnegative kernel requires feedforward

767:   competition (or feedforward normalization of activity over networks).

768:   The elastic-net model assumes nearest-neighbor interactions (or an elastic

769:   force), whereas the SOFM algorithm takes a neighbor function of Gaussian

770:   form with hard competition (or winner-take-all activity).

771: }

772: \end{figure}

773:

774: Now we bring the concepts of thermodynamics into neural dynamics.

775: In some classes of neural network models, such as the Boltzmann machine, the

776: input-output relationship is assumed to be stochastic.

777: Once a stochastic neural network has converged to an equilibrium state, the

778: probability distribution characterizing $\psi$ is expected to obey the

779: Boltzmann distribution

780: \begin{eqnarray}

781:   P[\psi]=\frac{\exp(-E[\psi])}{Z}

782: \end{eqnarray}

783: for the partition function

784: \begin{eqnarray}

785:   Z=\sum_\psi\exp(-E[\psi]).

786: \end{eqnarray}

787: In neural processing architectures, the notion of entropy or free energy has

788: already been put into practice for the purposes of informatics.

789: Compared to deterministic firing models, an expected advantage of stochastic

790: neural network models is the escape from poor locally optimal configurations

791: through probabilistic evolution.

792: Moreover, there are several reasons why stochastic behavior should be an

793: indispensable process in neural networks.

794: In view of learning rules, it is natural that neural states are occupied

795: by features corresponding to frequent inputs (the {\em coarse coding}

796: principle).

797: On the other hand, it is efficient for a neural network to avoid occupation

798: by only a few features, so that an object is coded by a small population that

799: is active for an event (the {\em sparse coding} principle).

800: %Besides the competitive or inhibitory activity, thermodynamic behavior in neural

801: %networks tends to achieve the sparseness.

802: The cost function in unsupervised learning algorithms is usually

803: similar to the Helmholtz free energy

804: \begin{eqnarray} \label{eq:Helmholtz}

805:   F=E-TS,

806: \end{eqnarray}

807: where the parameter $T$ is considered just as a positive constant that

808: determines the importance of the second term relative to the first.

809: %In a Hebbian development model, the energy term $E$ functions neurons to

810: %possess features corresponding frequent inputs in addition to the neighbor

811: %ordering, whereas the entropy term compels neurons to avoid occupying a

812: %common feature state.

813: In learning rules, the energy term is a measure of how well the

814: code describes the input data or carries the information :

815: \begin{eqnarray} \label{eq:E}

816:   %E&=&(1/N)\sum_{i}\langle P(\psi_i|\bv)\rangle_\mD \nonumber \\

817:   E&=&(1/N)\sum_{i}\sum_{\bv\in\mD}

818:     P(\psi_i|\bv)P(\bv|\mD) \nonumber \\

819:   &=&(1/N)\sum_{i}P(\psi_i|\mD) \\

820:   &=&\sum_{\psi}P(\psi)P(\psi|\mD), \nonumber

821: \end{eqnarray}

822: %In a network with the input-output stochastic relationship,

823: where the distribution $P(\psi|\mD)$ is the average over the probability that

824: an input $\bv\in\mD$ generates the output $\psi$.

825: In Hebbian development models, this energy term can be considered as an

826: external source term, that is, the average over the product between the feature

827: state and external signals, $-B\psi$ in a complex cell model (or

828: $-\psi\bQ^{(2)}\psi$ in a simple cell model).

829: %However, if the neural dynamics is described by $E=V(\psi)$, the solutions

830: %indicate the collapse of whole neurons to single feature state with the maximal

831: %probable experience.

832: %We can expect that the observed cortex maps {\em in vivo} are aparted from the

833: %equilibrium state because the relaxation process in neuron systems is very

834: %slow, and they will reach to single state finally.

835: %The {\em minimum description length} (MDL) principle~\cite{Rissanen1989}, for

836: %example, finds a method of coding each input data that minimizes the total cost

837: %of communicating the input data to a receiver.

838: %The energy is described as $P(\psi)$ is the probability of the feature state

839: %$\psi$ in the cortex area or the prior probability of the model $\psi$.

840: In learning rules, the entropy term assesses the sparseness of the code by

841: assigning a cost depending on how the activity is distributed.

842: According to Shannon's coding theorem, the amount of information is defined by

843: \begin{eqnarray} \label{eq:S}

844:   S=-K\sum_{\psi} P(\psi)\ln P(\psi)

845: \end{eqnarray}

846: where $K$ is a positive constant and $-\ln P(\psi)$ is the cost of the code,

847: the number of bits required to communicate the code.

848: %The expression for $E$ and $S$ in Eq.(\ref{eq:E}) and Eq.(\ref{eq:S}) remind

849: %the Helmholtz's free energy in density matrix formulation,

850: %\begin{eqnarray} F[\rho]=\mbox{Tr}\ \rho\left\{H+k_BT\ln\rho\right\} \end{eqnarray}

851: %for the density matrix $\rho$ with $K$ being identified as Boltzmann's constant $k_B$.

852: The connections between information theory and statistical mechanics have been

853: rigorously investigated~\cite{Jaynes1957A,Jaynes1957B,Grandy1997}.

854: %From the point of the learning algorithm, the probabilistic decision neural

855: %networks such as {\em Boltzmann machines}~\cite{Hinton1983} have been suggested.

856: However, it is difficult to apply statistical mechanics to the

857: phenomena in the real brain,

858: since the neural process, which comprises dynamics at various spatial and temporal

859: levels, is essentially a dynamical and nonequilibrium phenomenon.

860: For example, the relaxation process in cortical map formation is very slow and

861: the observed maps often do not satisfy the equilibrium criteria.

862: The map formation in visual cortex is concentrated in the several weeks or

863: months after birth, during a so-called critical period.

864: In observed orientation preference maps, the non-uniform directions of the

865: gradient ($\nabla\phi_\parallel\neq\mbox{const}$, although

866: $|\nabla\phi_\parallel|\simeq\mbox{const}$ for the longitudinal component

867: $\phi_\parallel$) and the non-vanishing singular points (or pinwheels) indicate

868: that the system may be frozen during the relaxation process~\cite{Cho2004A}.

869: % the perpendicularity with

870: %area boundary ($\nabla^2\phi_\parallel\sim 0$ for the longitudinal component

871: %$\phi_\parallel$) is achieved, but except

872: %The relaxation process in cortical dynamics is very slow and the observed maps

873: %in adult are expected to be stopped in process.

874: %We think the neural networks should be treated as aparted from but in

875: %relaxation to the equilibrium state.

876: %we can guess how far they apart from the equilibrium state.

877:

878: \section{Application to visual map formation models} \label{sec:application}

879: According to studies of the statistical structure of natural images, the

880: spatially localized and oriented response properties of visual neurons

881: are considered to result from the efficient coding of natural

882: images~\cite{Olshausen1996}.

883: Oriented bar or grid patterns are the most probable activity and the feature

884: with $O(2)$ (or $U(1)$) symmetry components is a meaningful representation in

885: orientation columns.

886: With ocular dominance columns, the total feature can be expanded to $O(3)$

887: symmetry components with the restriction of synaptic normalization within

888: each column.

889: Therefore, the conventional spin vector $(S^x,S^y,S^z)$ can serve as a useful

890: representation of the feature states with the preferred orientation

891: $\phi=(1/2)\tan^{-1}(S^x/S^y)$.

892: %Among the phenomena at the cortical level, the primary visual map formation is

893: %one of the most investigated problems with various proposed models, most of

894: %which are described in the high- or low-dimensional feature vector representation.

895: The proposed models of visual map formation are based on a wide variety of mechanisms.

896: %which factor causes the bifurcation to a inhomogeneous state.

897: We rewrite four widely used visual map development models in terms of the FBM

898: representation, classified by their effective interaction terms :

899: \begin{itemize}

900: \item[(A)] Lateral interaction models : $\bD = \bJ$

901: \item[(B)] Recursive interaction models : $\bD = (\bI-\bJ)^{-1}$

902: \item[(C)] Elastic-net model : $\bD = \bJ + \bS$

903: \item[(D)] SOFM algorithm : $\bD = \bS\bJ$.

904: \end{itemize}

905: Although the lateral activity function $J(\br_i,\br_j)$ in the elastic-net model and

906: the SOFM algorithm is taken to be nonnegative, the two-point interaction

907: function $D(\br_i,\br_j)\simeq D(|\br_i-\br_j|)$ takes the Mexican hat type in

908: all cases owing to the scattering function $S(\br_i,\br_j)$.

909: %The competitive Hebbian models require the feedforward competition and the

910: %nonnegative neighborhood function (plasticity control kernel) with the

911: %enhancement of activity in Eq.(\ref{eq:enhancement}).

912: Periodic patterns, such as linear zones in orientation preference columns or

913: parallel bands in ocular dominance columns, can develop when there are enough

914: negative values in $D(\br_i,\br_j)$ that $\tilde{D}(q)$ in Fourier space has

915: a maximum at a non-vanishing wavenumber $q^\ast$, with the wavelength

916: $\Lambda=2\pi/q^\ast$.

917: %We show that the elastic-net model and the lateral interaction models can be

918: %described by common energy form in Eq.(\ref{eq:linear}) with

919: %$D(r)=h_+(r)-h_-(r)$ for positive functions $h_+$ and $h_-$.

920: %This result means that two models have equivalent effective interactions and

921: %share statistical properties, in spite of their different control mechanisms.

922: %The emergence of columnar patterns are possible when $h_-$ is relatively larger

923: %than $h_+$.

924: %The lateral interaction models consider $h_-$ as the lateral inhibitory

925: %interaction term whereas the elastic-net model suggest it as the correlated

926: %external stimuli term.

927:

928: \subsection{Lateral Interaction Models}

929: A simple cell model of the visual map development uses the high-dimensional

930: feature vector coding for the strength of the connection from each cortical

931: location to each retinal (or LGN) location.

932: The synaptic plasticity depends on the average over the activities of competing

933: inputs, which are left and right eyes for ocular dominance columns or ON-center

934: and OFF-center cells for orientation preference columns.

935: For a linear activation function, $f(v)=v$, the energy in

936: Eq.(\ref{eq:simple_energy}) becomes

937: \begin{eqnarray}

938:   E[\bW]=-\frac{1}{2}\sum_{i,j}\sum_{\alpha,\beta}

939:     (\delta_{ij}+J_{ij})Q^{(2)}_{\alpha\beta}\bW_{i\alpha}\bW_{j\beta}.

940: \end{eqnarray}

941: This energy is decomposed with the (globally) transformed synaptic weights into

942: the sum and difference :

943: %The synaptic weights are represented by their transformation into the sum and difference :

944: \begin{eqnarray}

945:   \bW_+=\bW_R+\bW_L & \mbox{and} & \bW_-=\bW_R-\bW_L

946: \end{eqnarray}

947: for ocular dominance columns or similarly $\bW_\pm=\bW_{ON}\pm\bW_{OFF}$ for

948: orientation columns.

949: In a pixel-based representation for orientation columns, oriented patterns with

950: low frequency compose a dominant feature space and the energy is decomposed

951: with the locally transformed weights as well.

952: Therefore, the energy as a function of the field variables $\psi$ in a

953: transformed and reduced feature space is

954: \begin{eqnarray}

955:   E[\psi]=-\frac{1}{2}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j)

956: \end{eqnarray}

957: for $\bD=\bI+\bJ$.

958: The input correlation matrix $\bQ^{(2)}$ is ignored, because the frequency of

959: the dominant features in the inputs is regarded as being the same, or its effect is absorbed into the two-point

960: activity function $\bD$.

961: The term $-\frac{1}{2}\sum_i\psi(\br_i)^2$, related to the self-relaxation

962: term, does not affect the typical spacing of an emergent columnar

963: pattern.

964: In a complex cell model, the simplest energy is given by the sum of the neighbor

965: interaction and external stimuli terms as

966: \begin{eqnarray} \label{eq:linear}

967:   E[\psi]=-\frac{1}{2}\sum_{i,j}D(\br_i,\br_j)\psi(\br_i)\psi(\br_j)-\sum_iB(\br_i)\psi(\br_i)

968:   %E[\psi]=-\frac{1}{2}\sum_{i,j}D_{ij}\psi_i\psi_j-\sum_iB_i\psi_i

969: \end{eqnarray}

970: for $\bD=\bJ$.

971: The external stimulus term $B(\br_i)$ is considered to be constant or vanishing.

972: Therefore, the form of the lateral activity function $\bJ$ determines the

973: typical appearance of the developed feature map in both cases.

974: In lateral interaction models, $\bJ$ is taken as the {\em activation kernel},

975: or Mexican hat function (positive feedback in the center, negative in the

976: surroundings).

977: For example, a well-known Mexican hat function, the so-called difference of

978: Gaussians (DOG) filter, is described as

979: \begin{eqnarray}

980:   J(\br_i,\br_j)=\varepsilon\left(e^{-|\br_i-\br_j|^2/2\sigma_1^2}

981:     -ke^{-|\br_i-\br_j|^2/2\sigma_2^2}\right)

982: \end{eqnarray}

983: where $k$ is the strength of inhibitory activity.

984: Another example of Mexican hat function modified from a wavelet is given by

985: \begin{eqnarray} \label{eq:wavelet}

986:   J(\br_i,\br_j)=\varepsilon\left(1-k\frac{|\br_i-\br_j|^2}{\sigma_l^2}\right)

987:     e^{-|\br_i-\br_j|^2/2\sigma_l^2}

988: \end{eqnarray}

989: for the lateral cooperation range $\sigma_l$.

990: If the strength of inhibitory activity $k$ is larger than the threshold $k_c$

991: ($=1/4$), $\tilde{D}(q)$ has a non-vanishing maximum point at

992: $q^\ast=(1/\sigma_l)\sqrt{4-1/k}$~\cite{Cho2004A}.
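
The threshold and the position of the maximum can be checked directly; the following is a sketch that assumes a two-dimensional continuum Fourier transform $\tilde{J}(q)=\int d^2r\,J(r)e^{-i\bq\cdot\br}$ and $\varepsilon,k>0$ (the constant contribution of $\bI$ in the simple cell case does not move the maximum).
Transforming Eq.(\ref{eq:wavelet}) under these assumptions gives
\begin{eqnarray*}
  \tilde{J}(q)&=&2\pi\varepsilon\sigma_l^2
    \left(1-2k+k\sigma_l^2q^2\right)e^{-\sigma_l^2q^2/2}, \\
  \frac{d\tilde{J}}{d(q^2)}&=&\pi\varepsilon\sigma_l^4
    \left(4k-1-k\sigma_l^2q^2\right)e^{-\sigma_l^2q^2/2}.
\end{eqnarray*}
The derivative changes sign from positive to negative at $\sigma_l^2q^2=4-1/k$, which is a positive value of $q^2$ only for $k>1/4$; for $k\leq 1/4$ the derivative is negative for all $q>0$ and the maximum stays at $q=0$.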

993:

994: \subsection{Recursive interaction models}

995: For a linear activation function, the output in Eq.(\ref{eq:recursive}) becomes

996: \begin{eqnarray}

997:   y_i=v_i+\sum_jJ_{ij}v_j+\sum_{j,k}J_{ij}J_{jk}v_k+\cdots,

998: \end{eqnarray}

999: which is the sum of the recurrent contributions of all orders.

1000: The energy as a function of the synaptic weights is obtained as

1001: \begin{eqnarray} \label{eq:correlation-based}

1002:   E[\bW]=-\frac{1}{2}\sum_{i,j}\sum_{\alpha,\beta}D_{ij}Q^{(2)}_{\alpha\beta}

1003:     \bW_{i\alpha}\bW_{j\beta},

1004: \end{eqnarray}

1005: where the two-point interaction function is

1006: \begin{eqnarray}

1007:   \bD=\bI+\bJ+\bJ^2+\cdots=(\bI-\bJ)^{-1}

1008: \end{eqnarray}

1009: and the real parts of the eigenvalues of $\bJ$ are expected to be less than $1$.
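
For illustration, consider the special case (an assumption made here, not required by the formula above) of a translation-invariant and symmetric kernel, $J_{ij}=J(\br_i-\br_j)=J(\br_j-\br_i)$.
The matrix $\bJ$ is then diagonal in Fourier space with real entries $\tilde{J}(\bq)$, and the geometric series can be summed mode by mode,
\begin{eqnarray*}
  \tilde{D}(\bq)=1+\tilde{J}(\bq)+\tilde{J}(\bq)^2+\cdots
    =\frac{1}{1-\tilde{J}(\bq)},
\end{eqnarray*}
where the series converges for $|\tilde{J}(\bq)|<1$.
Since $1/(1-x)$ increases monotonically for $x<1$, in this case $\tilde{D}(\bq)$ attains its maximum at the same wavevector as $\tilde{J}(\bq)$.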

1010: Eq.(\ref{eq:correlation-based}) is a simply modified form of Miller's

1011: correlation-based learning models~\cite{Dayan2001}.

1012: In the original representation by Miller {\em et al.}, the input stimuli term

1013: is described by an arbor function, expressing the location and the overall size

1014: of the receptive fields~\cite{Miller1989,Miller1994}.

1015: The two-point interaction function $\bD$ takes the Mexican hat type and the

1016: wavelength of the visual pattern is determined by the peak of $\tilde{D}(q)$ in

1017: the analysis by Miller as well~\cite{Miller1998}.

1018:

1019: %The ocular dominance development model by Miller {\em et al.} uses a

1020: %high-dimensional feature vector coding for the strength of the connection from

1021: %each cortical location to each retinal (or LGN) location.

1022: %The correlation-based models is based on the synaptic plasticity depending on

1023: %the correlations among the activities of competing inputs, which are left and

1024: %right eyes for ocular dominance columns or ON-center and OFF-center cells for

1025: %orientation preference columns

1026:

1027:

1028: \subsection{Elastic-Net Model}

1029: The elastic-net model is described by an iterative procedure

1030: with the update rule :

1031: \begin{eqnarray} \label{eq:elastic}

1032:   \Delta\Phi(\br_i)&=&\alpha\sum_{|\br_i-\br_j|=a}(\Phi(\br_i)-\Phi(\br_j))

1033:     \nonumber \\

1034:   &+&\beta(\bV-\Phi(\br_i))\frac{e^{-|\bV-\Phi(\br_i)|^2/2\sigma_s^2}}

1035:     {\sum_j e^{-|\bV-\Phi(\br_j)|^2/2\sigma_s^2}}, \ \ \ \

1036: \end{eqnarray}

1037: where a feature vector in the low-dimensional representation is

1038: \begin{eqnarray}

1039:   \Phi(\br)&=&(r_x,r_y,q\sin(2\phi(\br)),q\cos(2\phi(\br)),z(\br)) \nonumber \\

1040:   &=&(\br,\psi(\br)) \nonumber

1041: \end{eqnarray}

1042: for the retinal location $\br=(r_x, r_y)$, the preferred orientation

1043: $\phi(\br)$, the degree of preference $q$ for that orientation and the ocular

1044: dominance $z$~\cite{Durbin1990,Erwin1995}.

1045: At each iteration, a stimulus vector $\bV=(\br_v,\bv)$ is chosen at random

1046: according to a given probability distribution.

1047: The first term in Eq.(\ref{eq:elastic}) denotes the elastic force or the

1048: excitatory interactions between the nearest-neighbors, and the second term

1049: implies the normalized stimuli distributed around an activity center.

1050: Functional Taylor expansion of the right hand side after dropping all nonlinear

1051: terms leads to

1052: \begin{eqnarray}

1053: \lefteqn{\Delta\psi(\br_i)=\alpha \sum_{|\br_i-\br_j|=a}

1054:   \big\{\psi(\br_i)-\psi(\br_j)\big\}} \\

1055: &&-\beta\psi(\br_i)+\frac{\beta a^4}{4\pi^2\sigma_s^6}

1056:   \sum_j\langle\bv_i\bv_j\rangle_\mD\big\{\psi(\br_i)-\psi(\br_j)\big\} \nonumber

1057: \end{eqnarray}

1058: where the stimulus at position $\br_i$,

1059: \begin{eqnarray} \label{eq:scatter}

1060:   \bv_i=\bv e^{-|\br_i-\br_v|^2/2\sigma_s^2}

1061: \end{eqnarray}

1062: is distributed in a Gaussian form with the activity center $\br_v$ and the

1063: feedforward cooperation range $\sigma_s$.

1064: The correlation between the external stimuli at positions $\br_i$ and $\br_j$ is

1065: given by

1066: \begin{eqnarray} \label{eq:correlation}

1067:   \langle\bv_i\bv_j\rangle_\mD&=&\langle v^2\rangle_\mD

1068:     \sum_{\br_v}\ e^{-|\br_i-\br_v|^2/2\sigma_s^2}

1069:     \ e^{-|\br_j-\br_v|^2/2\sigma_s^2} \nonumber \\

1070:   &\simeq&(\pi\sigma_s^2/a^2)\langle v^2\rangle_\mD

1071:     e^{-|\br_i-\br_j|^2/4\sigma_s^2}.

1072: \end{eqnarray}
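
The approximation in the second line can be sketched as follows, assuming that the activity centers $\br_v$ lie on a square lattice of spacing $a$ so that $\sum_{\br_v}\simeq(1/a^2)\int d^2r_v$, and writing ${\mathbf R}=(\br_i+\br_j)/2$ for the midpoint (a symbol introduced only for this step).
Completing the square in the exponent,
\begin{eqnarray*}
  |\br_i-\br_v|^2+|\br_j-\br_v|^2
    =2|\br_v-{\mathbf R}|^2+\frac{1}{2}|\br_i-\br_j|^2,
\end{eqnarray*}
the sum over centers becomes
\begin{eqnarray*}
  \frac{1}{a^2}\,e^{-|\br_i-\br_j|^2/4\sigma_s^2}
    \int d^2r_v\ e^{-|\br_v-{\mathbf R}|^2/\sigma_s^2}
  =\frac{\pi\sigma_s^2}{a^2}\,e^{-|\br_i-\br_j|^2/4\sigma_s^2},
\end{eqnarray*}
which reproduces the prefactor $\pi\sigma_s^2/a^2$ and the doubled variance of the Gaussian.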

1073: %as follows the results of Hoffs\"{u}mmer {\em et al.}~\cite{Hoffsummer1995}.

1074: Therefore, the effective energy of the elastic-net model can be represented

1075: through the form in Eq.(\ref{eq:linear}), where the two-point interaction

1076: function is given by

1077: \begin{eqnarray}

1078:   \bD=-\beta\bI+\bJ+\bS.

1079: \end{eqnarray}

1080: The lateral activity function becomes

1081: $J(\br_i,\br_j)=\alpha\delta(|\br_i-\br_j|-a)$ or the Laplacian operator in a

1082: continuum limit.

1083: The scattering function coincides with the form in Eq.(\ref{eq:scattering})

1084: for $\eta=\beta a^4/8\pi\sigma_s^6$ or is obtained by

1085: \begin{eqnarray}

1086:   %S(\br_i,\br_j)&=&\frac{2\eta}{N}(\delta_{ij}-1)\langle v_iv_j\rangle_\mD  \\

1087:   %  &=&\beta\frac{\langle v^2\rangle_\mD}{2\pi\sigma_s^2}

1088:   %  \left(2\pi\delta_{ij}-e^{-|\br_i-\br_j|^2/4\sigma_s^2}\right) \nonumber

1089:   S(\br_i,\br_j)\simeq\frac{\beta}{\sigma_s^2}\langle v^2\rangle_\mD

1090:     \left(\delta_{ij}-\frac{a^2}{4\pi\sigma_s^2}e^{-|\br_i-\br_j|^2/4\sigma_s^2}\right).

1091: \end{eqnarray}

1092: This result means that the scattering function $S(\br_i,\br_j)$ can act as a

1093: kernel with inhibitory activity even though the lateral activity function

1094: $J(\br_i,\br_j)$ is nonnegative.

1095: There are also interaction terms of higher order, but the two-point interaction

1096: function $D(\br_i,\br_j)$ determines the major characteristics of developed

1097: feature maps.

1098: We transform it to Fourier space and obtain

1099: \begin{eqnarray}

1100:   \tilde{D}(\bq)&=&-\beta+\tilde{J}(\bq)+\tilde{S}(\bq) \\

1101:   &\simeq&-\beta-\alpha q^2+\frac{\beta}{\sigma_s^2}

1102:     \langle v^2\rangle_\mD\left(1-e^{-q^2\sigma_s^2}\right). \nonumber

1103: \end{eqnarray}

1104: It has a maximum at

1105: \begin{eqnarray}

1106:   q^\ast=\frac{1}{\sigma_s}\sqrt{\ln

1107:     \left(\frac{\beta}{\alpha}\langle v^2\rangle_\mD\right)},

1108: \end{eqnarray}

1109: which corresponds to the analytic results from different

1110: approaches~\cite{Hoffsummer1995,Scherf1999}.
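As a short check of this maximum, one may differentiate the approximate form of $\tilde{D}(\bq)$ with respect to $q^2$:
\begin{eqnarray}
  \frac{\partial\tilde{D}}{\partial(q^2)}&=&-\alpha
    +\beta\langle v^2\rangle_\mD\ e^{-q^2\sigma_s^2}=0 \nonumber \\
  &\Rightarrow&e^{-(q^\ast)^2\sigma_s^2}
    =\frac{\alpha}{\beta\langle v^2\rangle_\mD}. \nonumber
\end{eqnarray}
The second derivative, $-\beta\sigma_s^2\langle v^2\rangle_\mD\ e^{-q^2\sigma_s^2}$, is negative, so the stationary point is a maximum.
Under this reading, a real $q^\ast>0$ requires $\beta\langle v^2\rangle_\mD>\alpha$.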

1111: %The maximum is positive for any $\sigma_s<\sigma_s^\ast$ where

1112: %\begin{eqnarray}

1113: %  \sigma_s^\ast=\sqrt{\langle v^2\rangle_\mD

1114: %  -\alpha-\alpha\ln\left(\frac{\beta}{\alpha}\langle v^2\rangle_\mD\right)}.

1115: %\end{eqnarray}

1116: %The sequence bifurcation model.

1117:

1118: \subsection{Self-Organizing Feature Map Algorithm}

1119: In Eq.(\ref{eq:linear}), the interaction term $\psi J\psi$ denotes the exchange

1120: of spontaneous spikes, created without external activity.

1121: Spontaneous firings can occur in coupled nonlinear oscillators with small

1122: dynamic fluctuations, which have been observed in some experiments~\cite{

1123: Llinas2003,Creutzfeldt1995,Steriade1993,Tsodyks1999,Sanchez2000,Wilson1981}.

1124: However, several experiments suggested that the organization of feature maps is

1125: possible after exposure to external activity.

1126: In this case, the probability of spontaneous firing is small ($J\ll v$), so

1127: that most intracellular interactions would be achieved by indirect currents

1128: of external activities.

1129: With the interactions provoked by external activities, we can take the

1130: effective energy as

1131: \begin{eqnarray} \label{eq:H_SOM}

1132:   E[\psi]=-\left(\sum B\psi+\frac{1}{2}\sum\psi S\psi\right)

1133:     \left(\frac{1}{2}\sum \psi J\psi\right).

1134: \end{eqnarray}

1135: If $B(\br)\psi(\br)$ is constant for all positions $\br$, the first term with

1136: $\psi J\psi$ supports the lateral interaction models again.

1137: In Kohonen's SOFM algorithm, the focus is on the lateral currents induced by

1138: normalized feedforward stimuli, and the effective interaction term is given by

1139: %ignores this term or assumes that $B(\br_i)=\langle v(\br_i)\rangle_\mD$ vanishes, it focus on

1140: \begin{eqnarray}

1141:   D(\br_i,\br_j)&=&\frac{1}{2}\sum_\br S(\br_i,\br)J(\br,\br_j).

1142: \end{eqnarray}

1143: Moreover, the SOFM algorithm requires hard competition, the so-called

1144: ``winner take all'' (WTA) case.

1145: As $\sigma_s$ approaches zero (or large $\eta$), the activity is localized only

1146: around the winning neuron and the scattering function in Fourier

1147: space becomes $\tilde{S}(\bq)\simeq\beta \langle v^2\rangle_\mD q^2$, the

1148: Laplacian operator.
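This limit can be read off from the elastic-net expression for $\tilde{S}(\bq)$ obtained above, assuming that the same form carries over to the SOFM case.
Expanding the exponential for $q\sigma_s\ll 1$ gives
\begin{eqnarray}
  \tilde{S}(\bq)&\simeq&\frac{\beta}{\sigma_s^2}\langle v^2\rangle_\mD
    \left(1-e^{-q^2\sigma_s^2}\right) \nonumber \\
  &=&\beta\langle v^2\rangle_\mD\ q^2
    \left(1-\frac{q^2\sigma_s^2}{2}+\cdots\right), \nonumber
\end{eqnarray}
so the corrections to the $q^2$ term vanish as $\sigma_s\to 0$.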

1149: The lateral activity function in the SOFM approach takes the Gaussian form

1150: $J(\br_i,\br_j)=e^{-|\br_i-\br_j|^2/2\sigma_l^2}$ for the lateral cooperation

1151: range $\sigma_l$ (lateral plasticity control).
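For reference, the two-dimensional Fourier transform of this Gaussian, taken here in the continuum convention $\tilde{J}(\bq)=\int d^2r\,J(\br)e^{-i\bq\cdot\br}$ (an assumption about normalization, chosen to match the prefactor in the equation that follows), is
\begin{eqnarray}
  \tilde{J}(\bq)=2\pi\sigma_l^2\ e^{-q^2\sigma_l^2/2}. \nonumber
\end{eqnarray}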

1152: Therefore we obtain the two-point interaction function

1153: \begin{eqnarray} \label{eq:FBM_Dq}

1154:   \tilde{D}(\bq)=\frac{1}{2}\tilde{S}(\bq)\tilde{J}(\bq)

1155:     =\pi \sigma_l^2\beta\langle v^2\rangle_\mD\ q^2e^{-q^2\sigma_l^2/2}

1156: \end{eqnarray}

1157: in Fourier space or

1158: \begin{eqnarray}

1159:   D(\br_i,\br_j)=\beta\langle v^2\rangle_\mD \left(1-\frac{|\br_i-\br_j|^2}

1160:     {2\sigma_l^2}\right)e^{-|\br_i-\br_j|^2/2\sigma_l^2}

1161: \end{eqnarray}

1162: in real space.

1163: This is the Mexican hat function in Eq.(\ref{eq:wavelet}) with $k=0.5$.

1164: Eq.(\ref{eq:FBM_Dq}) has a maximum at

1165: \begin{eqnarray}

1166:   q^\ast=\sqrt{2}/\sigma_l,

1167: \end{eqnarray}

1168: which agrees with previous analytic

1169: results~\cite{Wolf2000,Scherf1999,Obermayer1992}, and $\tilde{D}(q^\ast)$ is always positive if

1170: $\sigma_l>0$.
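The location of this extremum can be verified directly.
Writing $x=q^2$ (a shorthand used only here), Eq.(\ref{eq:FBM_Dq}) is proportional to $xe^{-x\sigma_l^2/2}$, and
\begin{eqnarray}
  \frac{d}{dx}\left(xe^{-x\sigma_l^2/2}\right)&=&
    \left(1-\frac{x\sigma_l^2}{2}\right)e^{-x\sigma_l^2/2}=0 \nonumber \\
  &\Rightarrow&(q^\ast)^2=\frac{2}{\sigma_l^2}, \nonumber \\
  \tilde{D}(q^\ast)&=&\frac{2\pi}{e}\beta\langle v^2\rangle_\mD. \nonumber
\end{eqnarray}
Since $\tilde{D}(\bq)$ vanishes at $q=0$ and for $q\to\infty$ and is positive in between, the value at $q^\ast$ is the largest one, and it does not depend on $\sigma_l$.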

1171: Kohonen's SOFM algorithm leads to robust learning rules because it always

1172: succeeds in achieving an array of different feature detectors or a columnar

1173: pattern.

1174:

1175: \section{Discussion}

1176: Physical models of neural networks based on neuroscience attempt to

1177: interpret both physiological phenomena and computational architectures.

1178: In order to study the function of the real brain, we need more adaptable

1179: theories than the basic neural architecture with connectionism.

1180: In this paper, we show that the neural process at the cortical level can be

1181: described by using the conventional expressions in statistical physics.

1182: As we showed in visual map formations~\cite{Cho2004A,Cho2004B}, the collective

1183: neural dynamics can closely resemble well-known phenomena in physical systems.

1184: %More extended computational architectures are also possible in the neural

1185: %models with functional modularity because of the higher dimensional attributes

1186: %of the processing elements in networks.

1187: %The neural dynamic models at higher levels also have to base on neuroscience

1188: %and target to interpret both the physiologic phenomena and the computational

1189: %architectures.

1190: %The representation of neural dynamics at the cortical level is suited to understand

1191: %the statistical and collective phenomena of neurons - multi-functional map

1192: %organization or map differentiations, and cooperative computation, etc.

1193: %More effective description of neural dynamics and computations with functional

1194: %and columnar modularity have been suggested because the connections and

1195: %interactions between neurons are very huge and complex in real brain.

1196:

1197: Under the assumption of a neural network composed of columnar modules, we classify

1198: the synaptic connection types and anticipate different functional roles in

1199: computational processing.

1200: (1) In the connectivity between close neurons within a columnar module, the

1201: functional attributes of neurons and the associative memory are realized.

1202: (2) By the connectivity between columnar modules within a cortical area, denoted by

1203: the lateral activity function or recurrent weight matrix $\bJ$, the network

1204: laterally controls the output activity between neighbors.

1205: (3) Via the connectivity between distant neurons across cortical areas, neurons

1206: receive driven activity from the external environment or other functional cortical areas.

1207: The columnar modules in turn become elements (or nodes) with high-dimensional

1208: attributes in networks of neural networks.

1209: If the recurrent weight matrix $\bJ$ is specified depending on the positions,

1210: the connectivity between columnar modules also works in information coding.

1211: The connectivity between columnar modules within or beyond cortical areas would

1212: also be strengthened, according to the Hebbian rule, if there is much communication

1213: between them, and some models include an updating rule for the

1214: recurrent weight matrix $\bJ$, such as the Goodall rule~\cite{Goodall1960}.

1215: We consider that the enhancement of connectivity between columnar modules contributes

1216: to efficient communication between neurons rather than to information coding.

1217: Treating the minicolumn as a columnar module and as the processing element

1218: of the network is optional.

1219: The formation of structure in the minicolumn is also due to the functional grouping

1220: of neurons with similar interests, and is expected to be confirmed by more

1221: fundamental processes at the cellular or molecular level.

1222:

1223: %Like the linear analysis in other models, the direct interaction term is

1224: %important in the determination of dominance feature of self-organizing map.

1225: %Higher power interaction terms are possible if considering (1) the normalization

1226: %of synaptic strength over networks, (2) indirect interactions between neurons,

1227: %and (3) thermodynamic perturbative coupling by $g\psi^4$ term.

1228: %The statistical property of the cortical map is revealed in the general energy

1229: %formula at continuum limit such as Eq.(\ref{eq:continuum}).

1230: %Using the Landau theory, the prediction of phase transition in neural systems

1231: %would be possible also.

1232:

1233: %Indeed, the interactions in neurons resemble those in the physical particles.

1234: %Neurons (or electrons) receive spikes (or photons) from neighbor neurons (or

1235: %electrons) and send them to neighbors again.

1236: %After collision with spikes (or photons), the preferred state of neurons

1237: %({\em or} the momentum or intrinsic phase of electrons) moves slightly to the

1238: %driven stimuli.

1239: %sensation, perception and memorization

1240:

1241: Extraction of the significant features in the input data is the purpose of an

1242: unsupervised learning rule and is also expected to be a principal characteristic of

1243: artificial and physiological neural networks.

1244: The FBM representation method suggests how neurons find features from afferent

1245: signals and build knowledge at the cortical level.

1246: An abstract representation of features in the FBM representation and a symmetry

1247: breaking between feature components in progress are related to the learning

1248: process in the neural network.

1249: For example, different appearances of an object form a submanifold in pattern space

1250: and the patterns of the object can be abstracted and decomposed in the

1251: transformed and reduced feature space.

1252: % such as angle or distance from viewpoint.

1253:

1254: From the viewpoint of dynamics, the essential factors in the neural process are (1)

1255: the statistical structure of inputs, (2) attractive or repulsive interactions

1256: between neighboring neurons, and (3) the stochastic behavior of neurons.

1257: In this paper, we did not fully apply thermodynamics to the neural

1258: process.

1259: Some models do include a thermodynamic approach.

1260: The basic ingredients of Tanaka's Potts spin models are those of the lateral

1261: interaction models, but he used a probabilistic evolution rather than an energy

1262: gradient flow~\cite{Tanaka1989,Tanaka1990A,Tanaka1990B,Tanaka1991A,Tanaka1991B}.

1263: Piepenbrock presented a model which uses the effect of stochastic behavior in

1264: neural networks as a competition process~\cite{Rao2002}.

1265: Even though there is no lateral inhibitory activity or feedforward competition,

1266: the thermodynamic effect can make a network have a columnar structure with a

1267: thermal excitation at low temperature.

1268: We expect that the stochastic behavior of neurons can be the connection between the

1269: physical neural dynamic models and the neural network models originating from

1270: learning theory, and an essential factor in understanding systematic

1271: ordering-disordering or bifurcation problems in the real brain.

1272: Moreover, we expect that the theoretical experience in physics can offer a more

1273: intuitive appreciation of the physiological phenomena at higher levels and of the

1274: sophisticated mechanisms in computational architecture.

1275:

1276: This work was supported by the Ministry of Science and Technology and the

1277: Ministry of Education.

1278:

1279: \bibliography{fbm}

1280:

1281: \end{document}

1282: