Some multilevel, preconditioning, and optimal control techniques in machine learning

Speaker:
Prof. Alfio Borzi
Institution:
University of Wuerzburg, Germany
Schedule:
Tuesday, March 14, 2023 - 15:00 to 16:00
Location:
Lecture Room A-137, SISSA via Bonomea 265
Abstract:
Neural networks are machine learning models for constructing universal function approximators. Training these networks requires the iterative solution of large nonlinear optimization problems, and gradient methods play a central role in this framework, which motivates the investigation of techniques for accelerating and robustifying them. Moreover, for some neural network architectures, the learning process can be interpreted as an optimal control problem, and efficient and robust solution techniques are required in this case as well.

In this talk, a two-level gradient-based learning scheme and a regularized preconditioned gradient method are discussed and applied to the training of neural networks. In the former method, the tuning of the gradient step and the coarse-level correction scheme are based on knowledge of the Hessian of the loss function. In the latter, a suitable regularization and a preconditioner based on random normal projections are analysed. In the last part of the talk, the training of Runge-Kutta neural networks is formulated as an optimal control problem in the framework of the Pontryagin maximum principle, and a sequential quadratic Hamiltonian algorithm is proposed for its fast solution.
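To give a rough flavor of the first method, the following is a minimal sketch of a two-level gradient iteration with a Galerkin-type coarse correction built from the Hessian. The restriction operator R, the step size alpha, and the assumption that the coarse Hessian can be formed and inverted directly are illustrative choices for this sketch, not details taken from the talk.

import numpy as np

def two_level_step(w, grad, hess, R, alpha=0.1):
    # One illustrative two-level iteration: a fine-level gradient
    # (smoothing) step followed by a coarse-level correction.
    # w: parameters, shape (n,); grad/hess: callables returning the
    # loss gradient (n,) and Hessian (n, n); R: restriction matrix
    # (m, n) with m << n, whose transpose serves as prolongation.

    # Fine-level smoothing step.
    w = w - alpha * grad(w)

    # Coarse-level correction: restrict the gradient, build the
    # Galerkin coarse Hessian, solve the small Newton-type system,
    # and prolong the correction back to the fine level.
    g_c = R @ grad(w)
    H_c = R @ hess(w) @ R.T          # (m, m), assumed invertible here
    e_c = np.linalg.solve(H_c, g_c)
    w = w - R.T @ e_c
    return w

In practice, R might come from coarsening the network itself (fewer neurons or parameters), and the Hessian would typically be accessed through Hessian-vector products rather than formed explicitly.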
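For the second method, a generic sketch of one regularized gradient step preconditioned through a random normal projection is given below. The subspace dimension k, the Tikhonov parameter lam, and the sketch-and-solve construction are assumptions made for illustration; they are one standard way to realize the idea, not necessarily the construction analysed in the talk.

import numpy as np

def sketched_precond_step(w, grad, hess, k=20, lam=1e-3, alpha=1.0, seed=None):
    # One illustrative step of a gradient method preconditioned in a
    # random subspace. S has i.i.d. standard normal entries (a random
    # normal projection); lam is a Tikhonov-type regularization that
    # keeps the small projected system well posed.
    rng = np.random.default_rng(seed)
    g = grad(w)
    n = w.size

    # Random normal projection onto a k-dimensional subspace.
    S = rng.standard_normal((n, k)) / np.sqrt(k)

    # Projected, regularized Hessian: a small (k, k) system.
    H_k = S.T @ (hess(w) @ S) + lam * np.eye(k)

    # Preconditioned direction: solve in the subspace, map back.
    d = S @ np.linalg.solve(H_k, S.T @ g)
    return w - alpha * d

Note that the regularization makes H_k positive definite whenever the Hessian is positive semidefinite, so the small solve is always well defined.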
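Finally, the optimal control viewpoint on training can be stated schematically as follows, with the discrete dynamics written in one-stage (explicit Euler) Runge-Kutta form; the cost functions \phi and \ell and the sign conventions below are generic assumptions, not notation taken from the talk:

\min_{u} \; J(u) = \phi(x_N) + h \sum_{k=0}^{N-1} \ell(x_k, u_k)
\quad \text{subject to} \quad
x_{k+1} = x_k + h\, f(x_k, u_k), \qquad x_0 \ \text{given},

where the states x_k are the layer activations and the controls u_k the network weights. With the Hamiltonian H(x, u, p) = \ell(x, u) + p^{\top} f(x, u), the Pontryagin principle (in this minimization convention) requires the optimal weights to minimize H pointwise, u_k^{\ast} \in \arg\min_{u} H(x_k, u, p_{k+1}), where the adjoints satisfy p_k = p_{k+1} + h\, \nabla_x H(x_k, u_k, p_{k+1}) with p_N = \nabla\phi(x_N). A sequential quadratic Hamiltonian (SQH) iteration then minimizes, layer by layer, the augmented Hamiltonian H_{\epsilon}(x, u, p) = H(x, u, p) + \epsilon\, \lVert u - u^{\mathrm{old}} \rVert^2, with \epsilon adapted so that each sweep decreases the cost J.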