Mathematical Models of Overparameterized Neural Networks

  • Cong Fang
  • Hanze Dong
  • Tong Zhang*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

Abstract

Deep learning has achieved considerable empirical success in recent years. However, while practitioners have discovered many ad hoc tricks, until recently there has been little theoretical understanding of the tricks invented in the deep learning literature. Practitioners have long known that overparameterized neural networks (NNs) are easy to learn, and the past few years have seen important theoretical developments in the analysis of overparameterized NNs. In particular, it was shown that such systems behave like convex systems under various restricted settings, such as for two-layer NNs, and when learning is restricted locally to the so-called neural tangent kernel space around specialized initializations. This article discusses some of these recent advances, which have led to a significantly better understanding of NNs. We focus on the analysis of two-layer NNs and explain the key mathematical models, along with their algorithmic implications. We then discuss challenges in understanding deep NNs and some current research directions.

Original language: English
Article number: 9326403
Pages (from-to): 683-703
Number of pages: 21
Journal: Proceedings of the IEEE
Volume: 109
Issue number: 5
Publication status: Published - May 2021

Bibliographical note

Publisher Copyright:
© 1963-2012 IEEE.

Keywords

  • Mean-field (MF) analysis
  • neural networks (NNs)
  • neural tangent kernel (NTK)
  • overparameterization
  • random features
