\begin{abstract}
Recent work has shown that deep metric learning algorithms can benefit from
weak supervision from another input modality. This additional modality can be
incorporated directly into the popular triplet-based loss function in the form
of distances. More recently, classification losses and proxy-based metric
learning have been observed to converge faster and yield better retrieval
results, without requiring complex and costly sampling strategies.
In this paper, we propose an extension to the existing adaptive margin for
classification-based deep metric learning. Our extension introduces a separate
margin for each negative proxy per sample. These margins are computed during
training from precomputed distances between the classes in the other modality.
Our results set a new state of the art on both the Amazon fashion retrieval
dataset and the public DeepFashion dataset, with both fastText- and BERT-based
embeddings for the additional textual modality. These results were achieved with
faster convergence and lower code complexity than the prior state of the art.
%The resulting model has been deployed into Amazon's
%production Automated Brand Protection systems, where it is used to find
%image near-duplicates for surfacing copyright and design patent infringements
%hidden in the Amazon Catalog.
\end{abstract}
