\begin{abstract}
Real-world data distributions are rarely uniform; skewed and long-tailed distributions of various kinds
are commonly observed instead. This poses a problem for machine learning, where most
algorithms assume, or work best with, uniformly distributed data. The problem is exacerbated by current state-of-the-art (SOTA) deep
learning models, which require large volumes of training data. Learning from imbalanced data therefore remains a
challenging research problem, and one that must be solved as deep learning moves towards more real-world applications. Under class imbalance, SOTA accuracies on standard classification benchmarks typically fall
below 75\%, even for less challenging datasets such as CIFAR100. Nonetheless, there has been progress in this niche
area of deep learning. In this survey, we provide a taxonomy of the methods proposed for
long-tail classification, focusing on works from the last few years and presenting them under a single mathematical framework.
We also discuss standard performance metrics, convergence studies, feature distributions, and classifier analysis. Finally, we provide a quantitative comparison of different SOTA methods and conclude by discussing the remaining challenges and future research directions.
\end{abstract}