abstract:705241b20b9ebbe4.tex

1: \begin{abstract}

2: Recently, deep learning-based facial landmark detection has improved significantly.

3: However, the semantic ambiguity problem degrades detection performance.

4: Specifically, semantic ambiguity causes inconsistent annotations and hinders the model's convergence, leading to lower accuracy and unstable predictions.

5: To solve this problem, we propose a \textbf{S}elf-adap\textbf{T}ive \textbf{A}mbiguity \textbf{R}eduction (\textbf{STAR}) loss by exploiting the properties of semantic ambiguity.

6: We find that semantic ambiguity results in an anisotropic predicted distribution, which inspires us to use the predicted distribution to represent semantic ambiguity.

7: Based on this, we design the STAR loss that measures the anisotropism of the predicted distribution.

8: Compared with the standard regression loss, STAR loss is encouraged to be small when the predicted distribution is anisotropic and thus adaptively mitigates the impact of semantic ambiguity.

9: Moreover, we propose two kinds of eigenvalue restriction methods that avoid both abnormal changes in the distribution and premature convergence of the model.

10: Finally, comprehensive experiments demonstrate that STAR loss outperforms state-of-the-art methods on three benchmarks, \emph{i.e.,} COFW, 300W, and WFLW, with negligible computational overhead.

11: Code is available at \url{https://github.com/ZhenglinZhou/STAR}.

12: \end{abstract}

13: