Next: THE ALGORITHM Up: SIMPLIFYING NEURAL NETS BY Previous: INTRODUCTION

# TASK / ARCHITECTURE / BOXES

Generalization task. The task is to approximate an unknown relation between a set of inputs and a set of outputs . is taken to be a function. A relation is obtained from by adding noise to the outputs. All training information is given by a finite relation . is called the training set. The th element of is denoted by an input/target pair .

Architecture. For simplicity, we will focus on a standard feedforward net (but in the experiments, we will use recurrent nets as well). The net has input units, output units, weights, and differentiable activation functions. It maps input vectors to output vectors . The weight from unit to is denoted by . The -dimensional weight vector is denoted by .

Training error. Mean squared error is used, where denotes the Euclidian norm, and denotes the cardinality of a set. To define regions in weight space with the property that each weight vector from that region has similar small error'', we introduce the tolerable error , a positive constant. Small'' error is defined as being smaller than . implies underfitting''.

Boxes. Each weight satisfying defines an acceptable minimum''. We are interested in large regions of connected acceptable minima. Such regions are called flat minima. They are associated with low expected generalization error (see [4]). To simplify the algorithm for finding large connected regions (see below), we do not consider maximal connected regions but focus on so-called boxes'' within regions: for each acceptable minimum , its box in weight space is a -dimensional hypercuboid with center . For simplicity, each edge of the box is taken to be parallel to one weight axis. Half the length of the box edge in direction of the axis corresponding to weight is denoted by , which is the maximal (positive) value such that for all , all positive can be added to or subtracted from the corresponding component of simultaneously without violating ( gives the precision of ). 's box volume is defined by .

Next: THE ALGORITHM Up: SIMPLIFYING NEURAL NETS BY Previous: INTRODUCTION
Juergen Schmidhuber 2003-02-25

Back to Financial Forecasting page