1e9fe6d00e373e66.tex
1: \begin{abstract}
2:     Structural support vector machines (SSVMs) are amongst the 
3:     best performing models for structured computer vision tasks, 
4:     such as semantic image segmentation or human pose estimation. 
5: 	%
6:     Training SSVMs, however, is computationally costly, because it 
7:     requires repeated calls to a structured prediction subroutine 
8:     (called \emph{max-oracle}), which has to solve an optimization 
9:     problem itself, \eg a graph cut. 
10: 	
11:     In this work, we introduce a new algorithm for SSVM training that
12:     is more efficient than earlier techniques when the max-oracle is 
13:     computationally expensive, as is frequently the case in computer
14:     vision tasks. The main idea is to (i) combine the recent stochastic 
15:     Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane 
16:     caching, and (ii) use an automatic selection rule for deciding whether 
17:     to call the exact max-oracle or to rely on an approximate one based
18:     on the cached hyperplanes. 
19:     
20:     We show experimentally that this strategy leads to faster convergence 
21:     to the optimum with respect to the number of required oracle calls, 
22:     and that this translates into faster convergence with respect to the 
23:     total runtime when the max-oracle is slow compared to the other steps 
24:     of the algorithm.
25: 
26:     A publicly available C++ implementation is provided.
27: \end{abstract}
28: