Abstract
The great success of deep learning is largely driven by training over-parameterized models on massive datasets. To avoid excessive computation, extracting and training on only the most informative subset is drawing increasing attention. Nevertheless, it remains an open question how to select a subset such that a model trained on it generalizes on par with one trained on the full data. In this paper, we propose dynamic margin selection (DynaMS). DynaMS leverages the distance from candidate samples to the classification boundary to construct the subset, and the subset is dynamically updated during model training. We show that DynaMS converges with high probability, and for the first time show, both in theory and in practice, that dynamically updating the subset can yield better generalization. To reduce the additional computation incurred by selection, a lightweight parameter sharing proxy (PSP) is designed. PSP faithfully evaluates instances in accordance with the underlying model, which is necessary for dynamic selection. Extensive analysis and experiments demonstrate the superiority of the proposed approach over many state-of-the-art data selection counterparts on benchmark datasets.
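The abstract does not spell out implementation details, but the core idea of margin-based selection is easy to sketch. Below is a minimal, hypothetical PyTorch sketch: the distance to the classification boundary is approximated by the gap between a sample's top-two logits (a common proxy, not necessarily the exact criterion DynaMS uses), and the function names `margin_scores` and `select_subset` are illustrative, not from the paper.

```python
import torch

def margin_scores(model, loader, device="cpu"):
    """Approximate each sample's distance to the classification boundary
    as the gap between its top-two logits (smaller gap = closer to boundary).
    This is an assumed proxy, not necessarily the paper's exact margin."""
    model.eval()
    scores = []
    with torch.no_grad():
        for x, _ in loader:
            logits = model(x.to(device))
            top2 = logits.topk(2, dim=1).values  # two largest logits per sample
            scores.append(top2[:, 0] - top2[:, 1])
    return torch.cat(scores)

def select_subset(model, loader, budget):
    """Keep the `budget` samples with the smallest margins,
    i.e. those closest to the decision boundary."""
    scores = margin_scores(model, loader)
    return scores.argsort()[:budget]  # indices into the dataset
```

To realize the dynamic aspect described in the abstract, `select_subset` would be re-run periodically during training so the subset tracks the evolving model; the PSP would stand in for `model` here as a lightweight proxy sharing parameters with it, avoiding full forward passes over all candidates.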
| Original language | English |
|---|---|
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | 11th International Conference on Learning Representations, ICLR 2023 - Kigali, Rwanda |
| Duration | 1 May 2023 → 5 May 2023 |
Conference
| Conference | 11th International Conference on Learning Representations, ICLR 2023 |
|---|---|
| Country/Territory | Rwanda |
| City | Kigali |
| Period | 1/05/23 → 5/05/23 |
Bibliographical note
Publisher Copyright: © 2023 11th International Conference on Learning Representations, ICLR 2023. All rights reserved.