\begin{abstract}
%We discuss the trade-offs of different optimal transport metrics when used as learning losses.
%Their growing use can be attributed to the substantial gains in speed achieved by Cuturi's relaxed transport metric, the Sinkhorn Distance.
%We focus on one-dimensional distributions (\eg~power spectral densities), for which the optimal Earth Mover's Distance ($\EMD$) can be calculated efficiently.
%We derive a closed-form solution for its gradient that allows a non-iterative calculation.
%However, we reveal convergence issues for gradient descent learning, which are confirmed in synthetic tests.
%We counter this by suggesting a relaxed form $\EMD^\rho$ with equivalent complexity that converges faster.
%We also show that the $\EMD$ gradient affects the entire output space, which provides considerable advantages over non-transport metrics (\eg~Mean Squared Error).
%For problems with smooth output spaces, it provides a significant boost in convergence speed, as we demonstrate on a polysomnography data set.
%In this case, the model converges within the first quarter of the epoch, demonstrating that generalization is achieved using little data.
%\end{abstract}