\begin{abstract}
Bilinear pooling has achieved great success in fine-grained visual recognition (FGVC).
Recent methods have shown that matrix power normalization can stabilize the second-order information in bilinear features, but some problems, \emph{e.g.}, redundant information and over-fitting, remain to be resolved.
In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that simultaneously normalizes a bilinear representation in terms of square-root, low-rank, and sparsity.
These three regularizers not only stabilize the second-order information, but also compact the bilinear features and promote model generalization.
In MOMN, the core challenge is how to jointly optimize three non-smooth regularizers with different convexity properties.
To this end, MOMN first formulates them into an augmented Lagrangian with approximated regularizer constraints.
Then, auxiliary variables are introduced to relax the different constraints, which allows each regularizer to be solved alternately.
Finally, several updating strategies based on gradient descent are designed to obtain consistent convergence and an efficient implementation.
Consequently, MOMN is implemented with only matrix multiplications, which are well suited to GPU acceleration, and the normalized bilinear features are stable and discriminative.
Experiments on five public FGVC benchmarks demonstrate that the proposed MOMN is superior to existing normalization-based methods in terms of both accuracy and efficiency.
The code is available at https://github.com/mboboGO/MOMN.

\end{abstract}
