https://doi.org/10.1140/epjs/s11734-024-01422-w
Regular Article
Complex landscape of the cost function in a simple machine learning regression task
1 National Research University Higher School of Economics, 25/12 Bol’shaya Pecherskaya Street, 603155, Nizhny Novgorod, Russia
2 A.V. Gaponov-Grekhov Institute of Applied Physics of the Russian Academy of Sciences, 46 Ulyanova Street, 603950, Nizhny Novgorod, Russia
Received: 26 September 2024
Accepted: 14 November 2024
Published online: 5 December 2024
We carry out a detailed study of the phase space of a small neural network in a paradigmatic machine learning regression problem. Despite the simplicity of the system, its phase space turns out to be extremely complex, with numerous local minima of the cost function. These minima differ in depth by several orders of magnitude, and the majority of them are significantly suboptimal. The deepest minima, which correspond to optimal solutions, have a peculiar shape with both very steep and very flat directions. These features explain why standard machine learning methods have difficulty learning optimal solutions.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
© The Author(s), under exclusive licence to EDP Sciences, Springer-Verlag GmbH Germany, part of Springer Nature 2024