SIMPLIFYING NEURAL NETS BY
DISCOVERING FLAT MINIMA
Sepp Hochreiter
Jürgen Schmidhuber
TUM
In G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7 (NIPS 7), pages 529-536. MIT Press, Cambridge, MA, 1995.
Abstract:

We present a new algorithm for finding low-complexity networks with high generalization capability. The algorithm searches for large connected regions of so-called "flat" minima of the error function. In the weight-space environment of a "flat" minimum, the error remains approximately constant. Using an MDL-based argument, flat minima can be shown to correspond to low expected overfitting. Although our algorithm requires the computation of second-order derivatives, it has backprop's order of complexity. Experiments with feedforward and recurrent nets are described. In an application to stock market prediction, the method outperforms conventional backprop, weight decay, and "optimal brain surgeon".
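The core notion of flatness can be illustrated with a minimal sketch: around a "flat" minimum, random weight perturbations of a given size barely change the error, whereas around a sharp minimum they increase it quickly. The sketch below is only an empirical probe of this property (the `flatness_score` helper and the toy error surfaces are hypothetical), not the paper's algorithm, which instead uses second-order derivatives to search for flat regions directly.

```python
import numpy as np

def flatness_score(loss_fn, weights, radius=0.01, n_samples=100, seed=0):
    """Probe the flatness of a minimum: sample random weight perturbations
    on a sphere of the given radius and return the largest loss increase
    observed.  Small values indicate a flat region of weight space."""
    rng = np.random.default_rng(seed)
    base = loss_fn(weights)
    worst = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=weights.shape)
        d *= radius / np.linalg.norm(d)  # rescale to the chosen radius
        worst = max(worst, loss_fn(weights + d) - base)
    return worst

# Two toy 1-parameter-family "error surfaces", both minimized at w = 0,
# but with very different curvature (sharpness):
sharp = lambda w: float(100.0 * w @ w)  # narrow valley
flat = lambda w: float(0.01 * w @ w)    # wide valley

w0 = np.zeros(2)
# The sharp minimum degrades much faster under the same perturbation.
print(flatness_score(sharp, w0) > flatness_score(flat, w0))  # → True
```

Under the MDL argument in the abstract, the flat minimum is preferable: the weights need to be specified with less precision to keep the error low, so the network has lower description length and lower expected overfitting.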