In quite different contexts,
previous papers have shown how
one net
may learn to perform appropriate
lasting weight changes for a second net
[4, 1].
However, these previous approaches could not be called
`self-referential' -- they all
involve at least some weights
that cannot be manipulated
other than by conventional gradient descent.