\begin{abstract}
  Many biological learning systems,
  such as the mushroom body, hippocampus, and cerebellum,
  are built from sparsely connected networks of neurons.
  To better understand such networks,
  we study the function spaces induced by sparse random features and
  characterize which functions can and cannot be learned.
  A network with $d$ inputs per neuron is found to be equivalent
  to an additive model of order $d$,
  whereas a network with a degree distribution combines
  additive terms of different orders.
  % We prove uniform rates of convergence
  % for a large class of smooth random features,
  % not only sparse ones,
  % to their limiting kernels.
  % For particular weight distributions and nonlinearities,
  % we derive new induced kernels in the very sparse limit.
  We identify three specific advantages of sparsity:
  additive function approximation is a powerful
  inductive bias that limits the curse of dimensionality,
  sparse networks are stable to outlier noise in the inputs, and
  sparse random features are scalable.
  Thus, even simple brain architectures
  can be powerful function approximators.
  Finally, we hope that this work helps
  popularize kernel theories of networks among computational neuroscientists.
\end{abstract}