OOPS for Reinforcement Learning (skip?)
The OOPS predictor tries to find a better program for predicting the next event, given the past. The OOPS actor uses the OOPS predictor to search for a program that generates more expected reward than the best one found so far, while always executing the current action of the currently best program.
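The predictor/actor interplay can be illustrated with a toy sketch. This is not the OOPS search itself: the bias-optimal program search is replaced here by plain brute-force enumeration over fixed-length action sequences, and the environment (`true_reward`), the `Predictor` class, the optimistic default of 0.5, and all other names are hypothetical choices made only for illustration.

```python
from itertools import product

ACTIONS = (0, 1)
HORIZON = 3  # a "program" here is just a fixed sequence of HORIZON actions


def true_reward(step, action):
    # Hypothetical toy environment: reward 1 when the action equals step % 2.
    return 1.0 if action == step % 2 else 0.0


class Predictor:
    """Stand-in for the predictor: a reward model built from past events."""

    def __init__(self):
        self.seen = {}  # (step, action) -> observed reward

    def update(self, step, action, reward):
        self.seen[(step, action)] = reward

    def predict(self, step, action):
        # Unseen pairs get an optimistic guess (assumption) so they get tried.
        return self.seen.get((step, action), 0.5)


def expected_reward(program, predictor):
    # Predicted total reward of a program under the current model.
    return sum(predictor.predict(s, a) for s, a in enumerate(program))


def search_better(best, predictor):
    """Return the first enumerated program predicted to beat `best`, else `best`."""
    best_value = expected_reward(best, predictor)
    for candidate in product(ACTIONS, repeat=HORIZON):
        if expected_reward(candidate, predictor) > best_value:
            return candidate
    return best


def run(episodes=10):
    predictor = Predictor()
    best = (0,) * HORIZON
    total = 0.0
    for _ in range(episodes):
        for step in range(HORIZON):
            # Actor: look for a better program, then act with the current best.
            best = search_better(best, predictor)
            action = best[step]
            reward = true_reward(step, action)
            # Predictor: refine the model with the newly observed event.
            predictor.update(step, action, reward)
            total += reward
    return best, total


if __name__ == "__main__":
    print(run())
```

The point of the sketch is only the control flow: the model improves from observed events, the search consults the model to find a program with higher predicted reward than the incumbent, and at every step the action executed is the one prescribed by the currently best program.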