
SIMPLIFYING NEURAL NETS BY DISCOVERING FLAT MINIMA

Sepp Hochreiter
Jürgen Schmidhuber
TUM

In G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7 (NIPS'7), pages 529-536. MIT Press, Cambridge, MA, 1995.

Abstract:

We present a new algorithm for finding low-complexity networks with high generalization capability. The algorithm searches for large connected regions of so-called "flat" minima of the error function: in the weight-space neighborhood of a flat minimum, the error remains approximately constant. Using an MDL-based argument, flat minima can be shown to correspond to low expected overfitting. Although our algorithm requires the computation of second-order derivatives, it has backprop's order of complexity. Experiments with feedforward and recurrent nets are described. In an application to stock market prediction, the method outperforms conventional backprop, weight decay, and "optimal brain surgeon".
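To give a concrete feel for how a flatness preference can be built into training, here is a minimal PyTorch sketch, assuming a toy regression task. It is not the Flat Minimum Search algorithm of the paper; it only illustrates one simple proxy for flatness, penalizing the squared gradient norm of the training loss via second-order automatic differentiation. The network, data, and the penalty weight lam are illustrative assumptions.

    # Illustrative sketch only: a simple flatness-style penalty (squared gradient
    # norm of the training loss w.r.t. the weights), NOT the exact Flat Minimum
    # Search algorithm of the paper. All names and hyperparameters are assumed.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Toy regression data (assumed for illustration)
    X = torch.randn(200, 5)
    y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(200, 1)

    net = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))
    opt = torch.optim.SGD(net.parameters(), lr=0.01)
    mse = nn.MSELoss()
    lam = 1e-3  # assumed weight of the flatness penalty

    for step in range(500):
        opt.zero_grad()
        loss = mse(net(X), y)
        # Gradients of the loss w.r.t. the weights, kept in the graph so that
        # their norm can itself be penalized: a crude proxy for preferring
        # weight-space regions where the error changes slowly.
        grads = torch.autograd.grad(loss, net.parameters(), create_graph=True)
        flatness_penalty = sum(g.pow(2).sum() for g in grads)
        total = loss + lam * flatness_penalty
        total.backward()
        opt.step()

Like the algorithm described in the abstract, this sketch needs second-order derivative information, but the extra cost per step remains of the same order as an ordinary backprop pass.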




