\begin{abstract}
The history of deep learning has shown that
human-designed, problem-specific networks can greatly
improve the classification performance of general neural models.
In most practical cases, however, choosing the optimal
architecture for a given task remains a challenging problem.
Recent architecture-search methods
can automatically build neural models with
strong performance but do not
fully account for
the interaction between neural architecture and weights.

%
This work investigates
the problem of disentangling
the roles of the neural structure and its edge weights
by showing that
well-trained architectures may not
need any link-specific fine-tuning of the weights.
We compare the performance of such weight-free
networks (in our case, binary networks with
\{0, 1\}-valued weights) with random,
weight-agnostic, pruned, and
standard fully connected networks.
%
To find the optimal weight-agnostic network, we use
a novel and computationally efficient method that translates
the hard architecture-search problem into a tractable
optimization problem.
%
More specifically, we treat the optimal task-specific architecture
as the optimal configuration of a binary
network with \{0, 1\}-valued
weights, which can be found through an approximate gradient
descent strategy.
%
Theoretical convergence guarantees for the proposed algorithm are
obtained by bounding the error in the gradient approximation, and
its practical performance is evaluated
on two real-world data sets.
%
To measure structural similarities between different
architectures, we use a novel spectral
approach that highlights the intrinsic differences between
real-valued networks and weight-free architectures.
\end{abstract}