Robust Convergence of Loss Landscapes through Distributional Averaging


Kiselev N.S. (MIPT, Dolgoprudny, Moscow Region, Russia)
Meshkov V.S. (MIPT, Dolgoprudny, Moscow Region, Russia)
Grabovoy A.V. (MIPT, Dolgoprudny, Moscow Region, Russia)

Abstract

Understanding how a neural network’s loss landscape evolves with dataset size is essential for identifying sufficient training data. Prior analyses of this problem have typically been local, focusing on second-order expansions around a single optimum and bounding convergence through Hessian properties. While such studies clarify convergence rates, they provide only a pointwise view of stability. In this paper, we extend the framework to a distributional paradigm. Instead of analyzing convergence at one optimum, we evaluate it in expectation over a parameter distribution. This approach captures how entire neighborhoods of the loss landscape stabilize as additional samples are added. We focus on Gaussian distributions centered at local minima and employ Monte Carlo sampling to estimate convergence in practice. Theoretically, we show that distributional convergence exhibits the same asymptotic rate as the local case, while offering a more robust picture of stability. Empirical studies on image classification tasks confirm these predictions and highlight how architectural choices such as normalization, dropout, and network depth influence convergence. Our results broaden local convergence analyses into a distributional setting, providing stronger guarantees and practical tools for characterizing dataset sufficiency.
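The Monte Carlo estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state. The quantity tracked is assumed to be the expected absolute change in the empirical loss when one sample is added, E_{w ~ N(w*, sigma^2)} |L_{k+1}(w) - L_k(w)|. A one-dimensional least-squares model stands in for a neural network. All names (per_sample_loss, mean_loss, distributional_delta) and the values of sigma and n_draws are hypothetical.

```python
import random


def per_sample_loss(w, x, y):
    """Squared error of a toy 1-D linear model y ~ w * x (stand-in for a network loss)."""
    return (w * x - y) ** 2


def mean_loss(w, data, k):
    """Empirical loss L_k(w): the average loss over the first k samples."""
    return sum(per_sample_loss(w, x, y) for x, y in data[:k]) / k


def distributional_delta(data, k, w_star, sigma, n_draws, rng):
    """Monte Carlo estimate of E_{w ~ N(w_star, sigma^2)} |L_{k+1}(w) - L_k(w)|.

    Assumed form of the distributional convergence measure; the abstract
    does not give the exact definition.
    """
    total = 0.0
    for _ in range(n_draws):
        w = rng.gauss(w_star, sigma)  # draw parameters around the minimum
        total += abs(mean_loss(w, data, k + 1) - mean_loss(w, data, k))
    return total / n_draws


rng = random.Random(0)

# Synthetic data from y = 2x + noise (hypothetical toy dataset).
data = []
for _ in range(1001):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 0.1)))

# Closed-form least-squares minimum, used as the centre of the Gaussian.
w_star = sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# Estimate the measure at several dataset sizes.
deltas = {
    k: distributional_delta(data, k, w_star, sigma=0.5, n_draws=200, rng=rng)
    for k in (10, 100, 1000)
}
for k, value in deltas.items():
    print(k, value)
```

For any fixed w, the identity L_{k+1}(w) - L_k(w) = (l_{k+1}(w) - L_k(w)) / (k + 1) holds exactly, where l_{k+1} is the loss on the newly added sample. Averaging over the Gaussian draws smooths out the dependence on a single parameter point, which is the neighborhood-level view the abstract refers to.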

Keywords

neural networks; loss landscape; convergence; Gaussian sampling; Monte Carlo estimation; dataset size.

Edition

Proceedings of the Institute for System Programming, vol. 38, issue 3, part 4, 2026, pp. 71-82

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2026-38(3)-47

For citation

Kiselev N.S., Meshkov V.S., Grabovoy A.V. Robust Convergence of Loss Landscapes through Distributional Averaging. Proceedings of the Institute for System Programming, vol. 38, issue 3, part 4, 2026, pp. 71-82. DOI: 10.15514/ISPRAS-2026-38(3)-47.
