\begin{abstract}
Many biological learning systems,
such as the mushroom body, hippocampus, and cerebellum,
are built from sparsely connected networks of neurons.
To better understand such networks,
we study the function spaces induced by sparse random features and
characterize which functions can and cannot be learned.
A network with $d$ inputs per neuron is found to be equivalent
to an additive model of order $d$,
whereas with a degree distribution the network combines
additive terms of different orders.
% We prove uniform rates of convergence
% for a large class of smooth random features,
% not only sparse ones,
% to their limiting kernels.
% For particular weight distributions and nonlinearities,
% we derive new induced kernels in the very sparse limit.
We identify three specific advantages of sparsity:
additive function approximation is a powerful
inductive bias that limits the curse of dimensionality,
sparse networks are stable to outlier noise in the inputs, and
sparse random features are scalable.
Thus, even simple brain architectures
can be powerful function approximators.
Finally, we hope that this work helps
popularize kernel theories of networks among computational neuroscientists.
\end{abstract}
