
\begin{abstract}

Over-parameterized pre-trained models pose a great challenge to fine-tuning under limited computational resources.

An intuitive solution is to prune the less informative samples from the fine-tuning dataset.

A series of training-based scoring functions has been proposed to quantify the informativeness of a data subset, but the pruning cost becomes non-negligible because of the heavy parameter updating these functions require.

For efficient pruning, one viable option is to adapt the similarity scoring function of geometry-based methods from a training-based to a training-free setting.

However, we empirically show that such adaptation distorts the original pruning and results in inferior performance on downstream tasks.

In this paper, we propose to use learning complexity (LC) as the scoring function for classification and regression tasks.

Specifically, the learning complexity is defined as the average predicted confidence of subnets with different capacities, which encapsulates data processing within a converged model.
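
One way to write this definition down for the classification case, under notation assumed here purely for illustration (the symbols $K$, $f_k$, and $p_k$ are not fixed by the text): let $f_1, \dots, f_K$ be $K$ subnets of the converged model with different capacities, and let $p_k(y \mid x)$ be the confidence that $f_k$ assigns to the label $y$ of sample $x$. The score of a sample is then the average
\[
\mathrm{LC}(x, y) \;=\; \frac{1}{K} \sum_{k=1}^{K} p_k(y \mid x),
\]
so that, under this assumed reading, a sample that most subnets already predict confidently receives a high score and is treated as easy.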

We then preserve the diverse and easy samples for fine-tuning.

Extensive experiments on vision datasets demonstrate the effectiveness and efficiency of the proposed scoring function for classification tasks.

% Besides, the learning complexity for regression demonstrates superiority with smooth and stable convergence at the instruction fine-tuning of large language models, outperforming all the pruning methods compared.

For the instruction fine-tuning of large language models, our method achieves state-of-the-art performance with stable convergence, outperforming full training while using only 10\% of the instruction dataset.

\end{abstract}
