Details of complex event sequences are often not predictable, but
their reduced abstract representations are. I study an embedded active
learner that can limit its predictions to almost arbitrary computable
aspects of spatio-temporal events. It constructs probabilistic algorithms
that (1) control interaction with the world, (2) map event sequences to
abstract internal representations (IRs), and (3) predict IRs from IRs computed
earlier. Its goal is to create novel algorithms generating IRs that are useful
for correct IR predictions, without wasting time on predictions already learned.
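As an illustration only, these three roles might be captured by an interface
such as the following Python sketch. The type and method names
(ExperimentAlgorithm, act, abstract, predict) are hypothetical, not from the
paper, and the probabilistic program-construction machinery is omitted.

    from typing import Sequence

    # Hypothetical placeholder types for the sketch:
    Action = int   # a primitive action in the world
    Event = int    # an observed spatio-temporal event
    IR = int       # an abstract internal representation

    class ExperimentAlgorithm:
        """One (initially random) algorithm combining the three roles."""

        def act(self, history: Sequence[Event]) -> Action:
            """(1) Control interaction with the world."""
            raise NotImplementedError

        def abstract(self, events: Sequence[Event]) -> IR:
            """(2) Map an event sequence to an abstract IR."""
            raise NotImplementedError

        def predict(self, earlier: Sequence[IR]) -> IR:
            """(3) Predict the next IR from IRs computed earlier."""
            raise NotImplementedError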
This requires an adaptive novelty measure, implemented by a co-evolutionary
scheme in which two competing modules collectively design (initially random)
algorithms representing experiments. Using
special instructions, the modules can bet on the outcome of IR predictions
computed by algorithms they have agreed upon. If their opinions differ, the
system checks who is right, punishes the loser (the surprised one), and
rewards the winner. An evolutionary or reinforcement learning algorithm forces
each module to maximize reward. This motivates each module to lure the other
into agreeing on experiments whose predictions will surprise it. Since each
module can essentially veto experiments it does not consider profitable, the
system is motivated to
focus on those computable aspects of the environment where both modules
still have confident but different opinions. Once both share the same
opinion on a particular issue (via the loser's learning process, e.g.,
the winner is simply copied onto the loser), the winner loses a source of
reward; this is an incentive to shift the focus of interest onto novel
experiments. My simulations include an example where surprise-generation
of this kind helps to speed up the collection of external reward.
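To make the betting scheme concrete, the following toy Python sketch plays the
zero-sum surprise game under heavy simplifications: experiments are indexed by
integers, opinions are single bits, the evolutionary or reinforcement learning
component is replaced by direct reward bookkeeping, and only the disputed
opinion (rather than the whole winning module) is copied onto the loser. All
names are illustrative, not the paper's implementation.

    import random

    class Module:
        """A betting module: one opinion (a predicted bit) per possible
        experiment, plus a running reward total."""
        def __init__(self, n_experiments):
            # Initially random opinions about each experiment's outcome.
            self.opinion = [random.randint(0, 1) for _ in range(n_experiments)]
            self.reward = 0

    def run_experiment(i):
        """Toy stand-in for executing an agreed-upon algorithm in the
        world: the 'true' computable aspect of experiment i."""
        return (i * i) % 2  # deterministic, but initially unknown to the modules

    def surprise_step(a, b, n_experiments):
        """One cycle of the zero-sum game: pick an experiment where the two
        modules hold different opinions, bet, and settle the bet."""
        disputed = [i for i in range(n_experiments)
                    if a.opinion[i] != b.opinion[i]]
        if not disputed:
            return None  # no disagreement left: no source of surprise reward
        i = random.choice(disputed)   # both modules agree to run experiment i
        outcome = run_experiment(i)   # the system checks who is right
        winner, loser = (a, b) if a.opinion[i] == outcome else (b, a)
        winner.reward += 1            # reward the winner ...
        loser.reward -= 1             # ... punish the surprised loser
        loser.opinion[i] = outcome    # loser's learning: adopt winner's opinion
        return i

    N = 16
    left, right = Module(N), Module(N)
    while (i := surprise_step(left, right, N)) is not None:
        print(f"settled experiment {i}; rewards: {left.reward} vs {right.reward}")

Once the loop ends, the two modules share the same opinion on every
experiment and the surprise reward dries up, mirroring the incentive
described above to move on to novel experiments.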