
Overcoming Training Challenges in Neural Network based PDE Solvers: Theory, Algorithms, and Applications

  • Chuqi CHEN

Student thesis: Doctoral thesis

Abstract

Neural network-based methods have emerged as a powerful paradigm for solving partial differential equations (PDEs), especially in contexts involving complex geometries, high-dimensional systems, or integration with empirical data. While their expressive capacity is considerable, training such models often suffers from slow convergence, instability, and limited accuracy. This thesis systematically investigates the training dynamics, algorithmic enhancements, and theoretical foundations of neural PDE solvers, with a focus on improving their efficiency, stability, and generalization.

We begin by investigating the critical role of automatic differentiation (AD) in neural network-based PDE solvers by conducting a systematic comparison with traditional finite difference (FD) schemes. To quantify training dynamics, we propose a novel metric—truncated entropy—which captures both residual decay and convergence behavior during training. Through theoretical analysis and empirical validation on random feature models and two-layer neural networks, we demonstrate that truncated entropy serves as an effective indicator of optimization efficiency. The results consistently show that AD leads to faster convergence and superior training performance compared to FD, providing a principled understanding of the advantages of AD in scientific machine learning applications.
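The AD-versus-FD contrast above can be illustrated with a toy example (illustrative only, not thesis code). A minimal forward-mode AD built from dual numbers propagates exact derivative rules, so its result matches the analytic derivative to machine precision, whereas a central finite difference with step h carries an O(h²) truncation error plus rounding cancellation:

```python
import math

class Dual:
    """Minimal forward-mode AD value: carries f(x) and f'(x) together."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __mul__(self, other):        # product rule
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def dsin(x):                         # sin lifted to dual numbers (chain rule)
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def f(x):                            # f(x) = x * sin(x)
    return x * math.sin(x)

x0 = 1.3
# AD: seed the derivative slot with 1.0 and read off f'(x0)
ad_deriv = (Dual(x0, 1.0) * dsin(Dual(x0, 1.0))).dot

# FD: central difference with step h
h = 1e-5
fd_deriv = (f(x0 + h) - f(x0 - h)) / (2 * h)

exact = math.sin(x0) + x0 * math.cos(x0)
print(abs(ad_deriv - exact))         # machine-precision agreement
print(abs(fd_deriv - exact))         # small but nonzero truncation error
```

In a neural PDE solver the same distinction applies to the derivatives entering the residual loss: AD differentiates the network exactly, while FD stencils inject discretization error into the optimization target.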

Next, we explore the difficulty of training neural PDE solvers from a spectral perspective. By analyzing the eigenvalue distribution of associated kernel matrices, we introduce the concept of effective rank as a quantitative measure of training complexity. A higher effective rank is found to correlate with faster error convergence. Leveraging this insight, we propose two initialization techniques—Partition of Unity (PoU) and Variance Scaling (VS)—which enhance the effective rank and consistently improve convergence across diverse PDE-solving frameworks, including Physics-Informed Neural Networks (PINNs), Deep Ritz, and DeepONet.
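The effective-rank idea can be sketched as follows (a standalone illustration, assuming the entropy-based definition of Roy and Vetterli; the random-feature construction and the scale values are illustrative, not taken from the thesis). A Gram matrix of nearly linear features concentrates its spectrum on a few eigenvalues and has a low effective rank; spreading the feature weights, in the spirit of variance-scaling initialization, flattens the spectrum and raises it:

```python
import numpy as np

def effective_rank(K, eps=1e-12):
    """exp of the Shannon entropy of the normalized eigenvalue spectrum
    of a PSD matrix K (Roy & Vetterli's effective rank)."""
    ev = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    p = ev / ev.sum()
    p = p[p > eps]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)[:, None]

def gram(scale):
    # Random-feature Gram matrix K = Phi Phi^T / m, phi(x) = tanh(w x + b)
    W = rng.normal(0.0, scale, (1, 200))
    b = rng.normal(0.0, scale, (1, 200))
    Phi = np.tanh(x @ W + b)
    return Phi @ Phi.T / 200

r_small = effective_rank(gram(0.5))   # near-linear features: low rank
r_large = effective_rank(gram(5.0))   # diverse features: higher rank
print(r_small, r_large)
```

The thesis's observation is that a higher effective rank of such kernel matrices correlates with faster error decay, which is what motivates the PoU and VS initializations.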

Finally, we address a critical challenge in scientific computing: solving singularly perturbed PDEs. These problems involve small parameters that induce sharp interfaces or near-singular optimization landscapes. To address this, we develop the Homotopy Dynamics framework, which gradually deforms the PDE parameters during training to improve learnability. We establish theoretical convergence guarantees for this method and demonstrate substantial gains in both accuracy and convergence speed on benchmark singular problems, positioning Homotopy Dynamics as a robust strategy for parameter-sensitive systems.
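The continuation idea behind this framework can be caricatured on a scalar problem (a minimal sketch of generic homotopy continuation, not the thesis's Homotopy Dynamics; the `solve` helper and the specific equation are invented for illustration). Solving tanh(u/ε) = 0.5 directly at small ε makes Newton's method blow up, because tanh saturates and the derivative underflows; annealing ε and warm-starting each solve stays on the solution branch:

```python
import math

def newton(g, dg, u0, tol=1e-12, max_iter=50):
    """Plain Newton iteration; returns the last iterate."""
    u = u0
    for _ in range(max_iter):
        step = g(u) / dg(u)
        u -= step
        if abs(step) < tol:
            break
    return u

# Toy singularly perturbed problem: tanh(u/eps) = 0.5, whose transition
# sharpens as eps -> 0. Exact root: u* = eps * atanh(0.5).
c = 0.5
def solve(eps, u0):
    g = lambda u: math.tanh(u / eps) - c
    dg = lambda u: (1.0 - math.tanh(u / eps) ** 2) / eps
    return newton(g, dg, u0)

# Cold-starting at u0 = 1 with eps = 0.01 fails: tanh(100) is saturated,
# dg underflows to ~0, and the Newton step explodes. Continuation instead
# halves eps and reuses the previous solution as the initial guess.
u, eps = 1.0, 1.0
while eps > 0.01:
    u = solve(eps, u)     # warm start from the previous eps level
    eps *= 0.5
u = solve(0.01, u)        # final solve at the target perturbation

exact = 0.01 * math.atanh(c)
print(u, exact)
```

The thesis applies the analogous idea to training: the PDE's perturbation parameter is deformed gradually so that each training stage starts near the solution of the previous, easier problem.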

Collectively, these contributions provide a unified understanding of the optimization landscape, practical algorithm design, and theoretical behavior of neural PDE solvers. The findings offer new insights and tools for advancing the robustness, scalability, and scientific applicability of deep learning in computational physics and engineering.

Date of Award: 2025
Original language: English
Awarding Institution:
  • The Hong Kong University of Science and Technology
Supervisor: Yang XIANG
