THE ALGORITHM
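The algorithm is designed to find a $w$ defining a box $M_w$ with maximal
box volume $\Delta w$. This is equivalent to finding a box $M_w$
with minimal
\[ \tilde{B}(w, D_0) := -\log \frac{\Delta w}{2^L} = \sum_{i,j} -\log \Delta w_{ij} . \]
Note the relationship to MDL ($\tilde{B}$ is the number of bits required
to describe the weights).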
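In appendix A.2, we derive the following algorithm.
It minimizes
$E(w, D_0) = E_q(w, D_0) + \lambda B(w, D_0)$,
where
\[ B = \frac{1}{2} \left( -L \log \epsilon + \sum_{i,j} \log \sum_k \left( \frac{\partial o^k}{\partial w_{ij}} \right)^2 + L \log \sum_k \left( \sum_{i,j} \frac{ \left| \frac{\partial o^k}{\partial w_{ij}} \right| }{ \sqrt{ \sum_k \left( \frac{\partial o^k}{\partial w_{ij}} \right)^2 } } \right)^2 \right) . \qquad (1) \]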
Here $o^k$ is the activation of the $k$th output unit,
$\epsilon$ is a constant, and
$\lambda$ is a positive variable ensuring
either
$E_q(w, D_0) \le E_{tol}$,
or ensuring an expected decrease of
$E_q(\cdot, D_0)$ during learning
(see [] for adjusting $\lambda$).
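As a rough illustration of how the penalty term might be evaluated, the following Python sketch computes $B$ from a given matrix of first-order output derivatives. It assumes that $B$ has the form $\frac{1}{2}(-L \log \epsilon + \sum_{i} \log \sum_k g_{ki}^2 + L \log \sum_k (\sum_{i} |g_{ki}| / \sqrt{\sum_k g_{ki}^2})^2)$, where $g_{ki}$ stands for $\partial o^k / \partial w_i$ with the weights flattened into a single index $i$. The function names, the flattened indexing, and the default values of `eps` and `lam` are hypothetical choices made for this example only.

```python
import math

def flatness_penalty(jac, eps=0.1):
    """Penalty B for a Jacobian jac[k][i] = d o^k / d w_i.

    K = number of output units, L = number of (flattened) weights.
    Assumes every weight influences at least one output, so that
    no column of the Jacobian is entirely zero.
    """
    K = len(jac)
    L = len(jac[0])
    # s_i = sum_k (d o^k / d w_i)^2, one value per weight
    col_sq = [sum(jac[k][i] ** 2 for k in range(K)) for i in range(L)]
    # second term of B: sum over weights of log s_i
    term_log = sum(math.log(s) for s in col_sq)
    # third term of B: sum over outputs of (sum_i |g_ki| / sqrt(s_i))^2
    term_norm = sum(
        sum(abs(jac[k][i]) / math.sqrt(col_sq[i]) for i in range(L)) ** 2
        for k in range(K)
    )
    return 0.5 * (-L * math.log(eps) + term_log + L * math.log(term_norm))

def total_error(e_q, jac, lam=0.1, eps=0.1):
    """Combined objective E = E_q + lambda * B (lam is a hypothetical default)."""
    return e_q + lam * flatness_penalty(jac, eps)
```

Under these assumptions, scaling every derivative by a factor $c$ changes $B$ by $L \log c$, so weight vectors whose outputs react less strongly to weight perturbations receive a smaller penalty.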
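$E(w, D_0)$ is minimized by gradient descent. To minimize $B(w, D_0)$,
we compute
\[ \frac{\partial B(w, D_0)}{\partial w_{uv}} = \sum_{k,i,j} \frac{\partial B(w, D_0)}{\partial \left( \frac{\partial o^k}{\partial w_{ij}} \right)} \, \frac{\partial^2 o^k}{\partial w_{ij} \, \partial w_{uv}} \quad \mbox{for all } u, v . \qquad (2) \]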
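The chain rule behind equation (2) can be checked numerically on a toy case. The sketch below assumes a single tanh output unit $o = \tanh(\sum_i w_i x_i)$, for which the penalty is assumed to reduce to $B = \frac{1}{2}(-L \log \epsilon + \sum_i \log g_i^2 + L \log L^2)$ with $g_i = \partial o / \partial w_i$. All function names and the choice of network are hypothetical. The gradient of $B$ is formed once as a sum of products of $\partial B / \partial g_i = 1 / g_i$ with second derivatives of the output, and once by central differences for comparison.

```python
import math

def output(w, x):
    """Single tanh output unit: o = tanh(sum_i w_i * x_i)."""
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)))

def first_derivs(w, x):
    """g_i = d o / d w_i = (1 - o^2) * x_i."""
    o = output(w, x)
    return [(1.0 - o * o) * xi for xi in x]

def second_derivs(w, x):
    """h[i][u] = d^2 o / (d w_i d w_u) = -2 o (1 - o^2) x_i x_u."""
    o = output(w, x)
    c = -2.0 * o * (1.0 - o * o)
    return [[c * xi * xu for xu in x] for xi in x]

def penalty(w, x, eps=0.1):
    """Assumed single-output form of B; requires all x_i to be nonzero."""
    g = first_derivs(w, x)
    L = len(w)
    return 0.5 * (-L * math.log(eps)
                  + sum(math.log(gi * gi) for gi in g)
                  + L * math.log(L * L))

def penalty_grad_chain_rule(w, x):
    """dB/dw_u = sum_i (dB/dg_i) * (dg_i/dw_u), with dB/dg_i = 1 / g_i."""
    g = first_derivs(w, x)
    h = second_derivs(w, x)
    n = len(w)
    return [sum(h[i][u] / g[i] for i in range(n)) for u in range(n)]

def penalty_grad_numeric(w, x, step=1e-6):
    """Central-difference gradient of the penalty, used only as a check."""
    grads = []
    for u in range(len(w)):
        up = list(w)
        dn = list(w)
        up[u] += step
        dn[u] -= step
        grads.append((penalty(up, x) - penalty(dn, x)) / (2.0 * step))
    return grads
```

For example, `penalty_grad_chain_rule([0.3, -0.2], [1.0, 2.0])` and `penalty_grad_numeric([0.3, -0.2], [1.0, 2.0])` should agree up to the finite-difference error.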
It can be shown (see [4]) that by
using Pearlmutter's and Møller's efficient second-order method
[,7],
the gradient of
$B(w, D_0)$ can be computed in $O(L)$ time (see details in [4]).
Therefore, our algorithm
has the same order of complexity as standard backprop.
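The complexity claim rests on the fact that a product of the second-derivative matrix with a vector can be obtained without ever building the full matrix. The sketch below is not Pearlmutter's exact operator; it uses the simpler central-difference approximation, which needs just two gradient evaluations, purely to illustrate the cost argument. The toy function, the step size `r`, and all names are assumptions made for this example.

```python
def grad_f(w):
    """Gradient of the toy function f(w) = w0^2 * w1 + w1^3."""
    w0, w1 = w
    return [2.0 * w0 * w1, w0 * w0 + 3.0 * w1 * w1]

def hessian_vector_product(grad, w, v, r=1e-4):
    """Approximate H(w) v using two gradient evaluations.

    H(w) v ~ (grad(w + r v) - grad(w - r v)) / (2 r).
    The L x L second-derivative matrix is never stored, so the cost
    is that of two gradient passes.
    """
    plus = grad([wi + r * vi for wi, vi in zip(w, v)])
    minus = grad([wi - r * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2.0 * r) for p, m in zip(plus, minus)]
```

For the toy function, the second-derivative matrix at $w = (1, 2)$ is $[[4, 2], [2, 12]]$, so `hessian_vector_product(grad_f, [1.0, 2.0], [1.0, 0.0])` should return values close to `[4.0, 2.0]`.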
Juergen Schmidhuber
20030225