Abstract
This paper presents a framework of successive functional gradient optimization for training nonconvex models such as neural networks, where training is driven by mirror descent in a function space. We provide a theoretical analysis and an empirical study of the training method derived from this framework, and show that the method leads to better performance than standard training techniques.
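As background for the "mirror descent in a function space" phrasing, the sketch below writes out the generic mirror descent step in its proximal (Bregman divergence) form; the mirror map $h$, step size $\eta$, and loss $L$ are illustrative notation, not symbols taken from the paper, and the functional-gradient view applies the update to the model's output function $f$ rather than to its parameters.

```latex
% Generic mirror descent step: linearized loss plus a Bregman-divergence penalty.
% h is the mirror map, \eta the step size, L the loss; notation is illustrative,
% not an excerpt from the paper.
f_{t+1} \;=\; \arg\min_{f}\;\Big\{ \eta\,\langle \nabla L(f_t),\, f - f_t \rangle \;+\; D_h(f, f_t) \Big\},
\qquad
D_h(f, g) \;=\; h(f) - h(g) - \langle \nabla h(g),\, f - g \rangle .
```

With $h(f) = \tfrac{1}{2}\|f\|^2$ this reduces to an ordinary (functional) gradient step; other mirror maps yield different geometries for the update.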
| Original language | English |
|---|---|
| Pages | 4921-4930 |
| Publication status | Published - Jul 2020 |
| Event | Proceedings of Machine Learning Research |
| Duration | 1 Jul 2020 → 1 Jul 2020 |
Conference
| Conference | Proceedings of Machine Learning Research |
|---|---|
| Period | 1/07/20 → 1/07/20 |