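(0) Define a variable interval $I$.

(1) Make $I$ equal to the interval constraining possible weight values.

(2) While $I \not\subseteq [w_{ij} - \Delta w_{ij},\; w_{ij} + \Delta w_{ij}]$:

Divide $I$ into 2 equally-sized disjoint intervals $I_1$ and $I_2$.

If $w_{ij} \in I_1$ then $I \leftarrow I_1$; write `1'.

If $w_{ij} \in I_2$ then $I \leftarrow I_2$; write `0'.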
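The bisection in steps (0) to (2) can be sketched for a single weight as follows. This is only an illustrative sketch under stated assumptions: the function names `encode_weight` and `decode_interval`, the parameter `delta` (standing for the tolerated half-width of the box along this weight), the `max_bits` guard, and the choice of the lower half as the first interval (the one that emits `1') are all hypothetical and not taken from the text.

```python
def encode_weight(w, delta, lo, hi, max_bits=64):
    """Bisect [lo, hi] until it lies inside [w - delta, w + delta].

    Returns the emitted bitstring and the final interval.
    Assumption: the lower half plays the role of the first interval
    (bit '1') and the upper half the second (bit '0').
    """
    if not (lo <= w <= hi):
        raise ValueError("w must lie in the initial interval")
    if delta <= 0:
        raise ValueError("delta must be positive")
    bits = []
    # Loop while the current interval is not contained in the box range.
    while not (w - delta <= lo and hi <= w + delta):
        if len(bits) >= max_bits:
            raise RuntimeError("max_bits exceeded")
        mid = (lo + hi) / 2
        if w < mid:
            hi = mid          # keep the lower half
            bits.append("1")
        else:
            lo = mid          # keep the upper half
            bits.append("0")
    return "".join(bits), (lo, hi)


def decode_interval(bits, lo, hi):
    """Replay the bitstring to recover the final interval (receiver side)."""
    for b in bits:
        mid = (lo + hi) / 2
        if b == "1":
            hi = mid
        else:
            lo = mid
    return lo, hi


bits, interval = encode_weight(0.3, 0.1, -1.0, 1.0)
print(bits, interval)
print(decode_interval(bits, -1.0, 1.0))
```

With these example values the encoder emits four bits and the decoder recovers the same interval from the bitstring alone. In this sketch the number of bits grows roughly like the negative base-2 logarithm of `delta` plus a constant that depends only on the initial interval.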

The final set corresponds to a ``bit-box'' within our box. This ``bit-box'' contains 's center and is described by a bitstring of length , where the constant is independent of the box . From ( is the center of the ``bit-box'') and the bitstring describing the ``bit-box'', the receiver can compute as follows: he selects an initialization weight vector within the ``bit-box'' and uses gradient descent to decrease until , where in the bit-box denotes the receiver's current approximation of ( is constantly updated by the receiver). This is like ``FMS without targets'' - recall that the receiver knows the inputs . Since corresponds to the weight vector with the highest degree of local flatness within the ``bit-box'', the receiver will find the correct .

is described by a Gaussian distribution with mean zero.
Hence, the description length of is
(Shannon, 1948).
, the center of the ``bit-box'',
cannot be known before training.
However, we do know the *expected* description length of
the net function, which is
( is a constant independent of ).
Let us approximate :

.

Among those that lead to
equal (the negative logarithm of
the box volume plus ),
we want to find those with minimal description length of
the function induced by $w$.
Using Lagrange multipliers (viewing the $\partial o^k / \partial w_{ij}$ as variables),
**it can be shown that this expected description length is minimal under the
condition of constant box volume
iff flatness condition 2 holds**.
To conclude: for a given box volume, we need
flatness condition 2 to minimize the expected description length of
the function induced by $w$.
