Title: Recurrent Neural Networks - A Focus on Architectures
Authors: Hans Georg Zimmermann, Ralph Grothmann, Christoph Tietz
Siemens AG, Corporate Technology Dpt. CT IC4,
contact: Hans_Georg.Zimmermann@siemens.com
Time-delay recurrent neural networks are a natural framework for modeling dynamical systems. To support system identification, we propose to integrate additional prior principles about the dynamics into the recurrent neural network. In general, this can be done either by improving learning algorithms or by extending architectures. For instance, unfolding in time is an elementary way of transferring an algorithm into a network architecture.
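As an illustration of unfolding in time, the following minimal sketch runs a recurrent state equation for a fixed number of steps, so that the recursion becomes a chain of identical layers with shared weights. The state equation s_{t+1} = tanh(A s_t + B u_t), y_t = C s_t, the matrix names A, B, C and the function name unfold_in_time are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np

def unfold_in_time(A, B, C, s0, inputs):
    """Unfold an assumed recurrent model over len(inputs) time steps.

    Assumed model (not taken from the abstract):
        s_{t+1} = tanh(A s_t + B u_t),   y_t = C s_t
    The same A, B, C are reused at every step (shared weights).
    """
    s = s0
    outputs = []
    for u in inputs:
        outputs.append(C @ s)           # read out the current state
        s = np.tanh(A @ s + B @ u)      # state transition with shared weights
    return np.array(outputs), s

# Hypothetical dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
state_dim, input_dim, output_dim, steps = 4, 2, 1, 5
A = 0.5 * rng.standard_normal((state_dim, state_dim))
B = rng.standard_normal((state_dim, input_dim))
C = rng.standard_normal((output_dim, state_dim))
inputs = rng.standard_normal((steps, input_dim))

outputs, final_state = unfold_in_time(A, B, C, np.zeros(state_dim), inputs)
```

Because the loop has a fixed length, the unfolded model can be treated as an ordinary feedforward network, and a standard gradient method applied to it yields gradients for the shared matrices.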
Most real-world dynamical systems are open systems, i.e. they are a superposition of an autonomous part and an externally driven part. Here one of the major difficulties is the lack of knowledge about the external drivers. As a remedy we propose recurrent error correction neural networks (ECNN). An ECNN incorporates the last measured model error as an additional input. Hence, learning can interpret the model's misfit as the consequence of an external shock or of unknown external influences. The error correction mechanism is designed as an architectural extension of a recurrent network (this differs from Kalman filtering).
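The error correction idea can be sketched as a single state transition in which the last measured model error enters as an additional input. The concrete update s_{t+1} = tanh(A s_t + B u_t + D tanh(C s_t - y_t^obs)), the matrix names and the function name ecnn_step are assumptions chosen for this illustration; the abstract itself gives no equations.

```python
import numpy as np

def ecnn_step(A, B, C, D, s, u, y_observed):
    """One step of an assumed error-correction state transition.

    Assumed form (illustrative, not stated in the abstract):
        y_t     = C s_t
        e_t     = y_t - y_observed_t
        s_{t+1} = tanh(A s_t + B u_t + D tanh(e_t))
    """
    y_model = C @ s
    error = y_model - y_observed            # last measured model error
    s_next = np.tanh(A @ s + B @ u + D @ np.tanh(error))
    return s_next, y_model, error

# Hypothetical dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
state_dim, input_dim, output_dim = 4, 2, 1
A = 0.5 * rng.standard_normal((state_dim, state_dim))
B = rng.standard_normal((state_dim, input_dim))
C = rng.standard_normal((output_dim, state_dim))
D = rng.standard_normal((state_dim, output_dim))
s = rng.standard_normal(state_dim)
u = rng.standard_normal(input_dim)

# If the observation equals the model output, the correction term vanishes.
s_exact, y_model, err_exact = ecnn_step(A, B, C, D, s, u, C @ s)
# With a misfit, the error acts like an extra external input to the state.
s_shock, _, err_shock = ecnn_step(A, B, C, D, s, u, C @ s + 1.0)
```

In this sketch a zero error reduces the step to the plain recurrent transition, while a nonzero error shifts the next state, which is how a misfit can be read as an unobserved external influence.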
A technical problem of finite unfolding in time is the initialization of the first internal recurrent state. Typically this is handled by assuming that a misspecification of the initial state vanishes in importance as the system iterates. We introduce a new technique to desensitize the network to the arbitrariness of the state initialization.
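The usual assumption, that a wrong initial state loses importance over time, can be checked numerically in a simple case. The sketch below is not the new technique announced in the abstract; it only illustrates the assumption for a hypothetical model s_{t+1} = tanh(A s_t + B u_t) whose transition matrix is rescaled to spectral norm 0.8, a condition chosen here so that the effect is guaranteed.

```python
import numpy as np

def state_gap(A, B, inputs, s_a, s_b):
    """Distance between two state trajectories that share inputs
    but start from different initial states."""
    gaps = [np.linalg.norm(s_a - s_b)]
    for u in inputs:
        s_a = np.tanh(A @ s_a + B @ u)
        s_b = np.tanh(A @ s_b + B @ u)
        gaps.append(np.linalg.norm(s_a - s_b))
    return np.array(gaps)

# Hypothetical setup, for illustration only.
rng = np.random.default_rng(1)
state_dim, input_dim, steps = 4, 2, 30
A = rng.standard_normal((state_dim, state_dim))
A *= 0.8 / np.linalg.norm(A, 2)      # rescale to spectral norm 0.8 (assumption)
B = rng.standard_normal((state_dim, input_dim))
inputs = rng.standard_normal((steps, input_dim))

gaps = state_gap(A, B, inputs, np.zeros(state_dim), rng.standard_normal(state_dim))
```

Since tanh is 1-Lipschitz, each step shrinks the gap by at least the factor 0.8 in this setting. Without such a contraction property the assumption need not hold, which is one reason the initialization deserves explicit treatment.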
Finally, we deal with the modeling of high-dimensional dynamical systems. One idea is to search for (time-)invariant manifolds containing the dynamics. The prediction of the complete system can then be simplified by concentrating on the (lower-dimensional) (time-)variant sub-structure. The complete dynamics can be reconstructed as a combination of the predicted variants and the invariants. We describe a network architecture that combines the variance-invariance separation with an ECNN.
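The separation into variants and invariants can be illustrated with a deliberately simplified linear stand-in. The sketch below replaces the network architecture by a singular value decomposition and an affine least-squares predictor; the synthetic data, the dimensions and all names are assumptions made for this example. A 10-dimensional series is generated from a 2-dimensional rotation, so the offset and the embedding subspace are invariant and only two coordinates vary.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_latent, steps = 10, 2, 60   # hypothetical sizes

# Synthetic system: 2-D rotation embedded in 10-D space plus a constant offset.
angle = 0.3
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
W, _ = np.linalg.qr(rng.standard_normal((n_obs, n_latent)))  # fixed embedding
offset = rng.standard_normal(n_obs)

z = np.empty((steps + 1, n_latent))
z[0] = [1.0, 0.0]
for t in range(steps):
    z[t + 1] = R @ z[t]
X_all = offset + z @ W.T
X, x_true_next = X_all[:-1], X_all[-1]   # hold out the last point

# Invariants: sample mean and the subspace spanned by the top singular vectors.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
V = Vt[:n_latent].T

# Variants: low-dimensional coordinates inside the invariant subspace.
Zc = (X - mean) @ V

# Affine one-step predictor fitted on the variants only.
design = np.hstack([Zc[:-1], np.ones((len(Zc) - 1, 1))])
coef, *_ = np.linalg.lstsq(design, Zc[1:], rcond=None)
z_next = np.append(Zc[-1], 1.0) @ coef

# Reconstruct the full system from predicted variants and invariants.
x_pred_next = mean + V @ z_next
prediction_error = np.linalg.norm(x_pred_next - x_true_next)
```

Only the two variant coordinates are forecast; the remaining structure is carried by quantities that do not change over time. A nonlinear manifold would need a nonlinear compressor in place of the SVD used here.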