\begin{abstract}
   We consider a deep matrix factorization model of covariance matrices trained with the Bures-Wasserstein distance.
   While recent works have advanced the study of the optimization problem for overparametrized low-rank matrix approximation, much of the emphasis has been on discriminative settings and the square loss. In contrast, our model considers a different loss and connects with the generative setting.
   We characterize the critical points and minimizers of the Bures-Wasserstein distance over the space of rank-bounded matrices. The Hessian of this loss can theoretically blow up at low-rank matrices, which makes it challenging to analyze the convergence of gradient-based optimization methods.
   We establish convergence results for gradient flow using a smooth perturbative version of the loss, as well as convergence results for finite-step-size gradient descent under certain assumptions on the initial weights.
\end{abstract}
