where d_k(t) is the k-th of the components of the target vector d(t), and o_k(t) is the k-th of the components of the output vector o(t).

In general, this task requires storage of input events in a short-term memory. Previous solutions to this problem have employed gradient-based dynamic recurrent nets (e.g., [Robinson and Fallside, 1987], [Pearlmutter, 1989], [Williams and Zipser, 1989]). In the next section an alternative gradient-based approach is described. For convenience, we drop the indices which stand for the various episodes.
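Concretely, the hidden state of a recurrent net serves as the short-term memory referred to above: the state at each step depends on the previous state, so an input event can influence the network long after it occurred. A minimal sketch in NumPy (the network sizes, weight matrices, and function names are illustrative assumptions, not taken from the cited papers):

```python
import numpy as np

# Minimal fully recurrent net: the hidden state h acts as a
# short-term memory that can retain traces of earlier inputs.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 5
W_in = rng.normal(scale=0.5, size=(n_hidden, n_in))      # input weights
W_rec = rng.normal(scale=0.5, size=(n_hidden, n_hidden))  # recurrent weights

def run_episode(inputs):
    """Feed a sequence of input vectors; return the hidden state trajectory."""
    h = np.zeros(n_hidden)
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)  # new state depends on the old state
        states.append(h)
    return states

# An input event at t=0 followed by zero input at t=1..4:
episode = [np.array([1.0, 0.0, 0.0])] + [np.zeros(n_in)] * 4
states = run_episode(episode)
print(states[-1])  # generally non-zero: a trace of the t=0 event persists
```

With all-zero input throughout, the state would remain at zero; the non-trivial final state here is carried forward purely through the recurrent connections.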

The gradient of the error over all episodes is equal to the sum of the gradients for each episode. Thus we only require a method for minimizing the error observed during one particular episode:

(In the practical
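The additivity of the gradient across episodes can be checked numerically. A small sketch (the linear model, squared-error loss, and episode data here are hypothetical, chosen only to illustrate that the gradient of a summed error equals the sum of the per-episode gradients):

```python
import numpy as np

# Squared error of a linear model y = w @ x over several "episodes"
# (each reduced to a single input/target pair for brevity).
rng = np.random.default_rng(1)
w = rng.normal(size=4)
episodes = [(rng.normal(size=4), rng.normal()) for _ in range(3)]

def grad_episode(w, x, d):
    # dE/dw for the per-episode error E = 0.5 * (d - w @ x)**2
    return -(d - w @ x) * x

# Gradient of the summed error, computed in one pass...
X = np.stack([x for x, _ in episodes])
D = np.array([d for _, d in episodes])
grad_total = -((D - X @ w)[:, None] * X).sum(axis=0)

# ...equals the sum of the per-episode gradients.
grad_sum = sum(grad_episode(w, x, d) for x, d in episodes)
print(np.allclose(grad_total, grad_sum))  # True
```

This is why it suffices to have a method that minimizes the error of one episode: applying it episode by episode and summing (or accumulating) the gradients recovers the gradient of the total error.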
