LS is not necessarily optimal for "incremental" learning problems where
experience with previous problems may help to reduce future search costs.
To make an incremental search method out of non-incremental LS, we
introduce a simple, heuristic, adaptive LS extension (ALS) that uses
experience with previous problems to adaptively modify LS' underlying
probability distribution. ALS essentially works as follows: whenever LS
found a program q that computed a solution for the current problem,
the probabilities of q's instructions q_1, q_2, ..., q_{l(q)}
are increased (here q_i
denotes q's
i-th instruction, and l(q) denotes q's length -- if LS did not
find a solution (q is the empty program), then l(q) is defined
to be 0). This will increase the probability of the entire program q.
The probability adjustment is controlled by a learning rate
γ (0 < γ < 1). ALS is related to the linear
reward-inaction algorithm, e.g., [Narendra, 1974; Kaelbling, 1993] -- the
main difference is: ALS uses LS to search through *program space*
as opposed to single action space. As in the previous section, the
probability distribution is determined by the matrix P, where P_ij
denotes the probability of the j-th instruction a_j occurring at
program position i. Initially, all
P_ij = 1/n_ops (n_ops being the number of possible instructions).
However, given a sequence of problems
Ω_1, Ω_2, ..., Ω_r, the P_ij may undergo changes caused by ALS:
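As a concrete illustration of the uniform initialization, the matrix P might be built like this (a minimal sketch; `max_len` and `n_ops`, the maximum program length and instruction-set size, are assumed parameters, and the function name is illustrative):

```python
# Uniform initialization of the instruction-probability matrix:
# P[i][j] = 1/n_ops for every program position i and instruction j.
# max_len and n_ops are assumed parameters (names illustrative).

def init_matrix(max_len, n_ops):
    """Return the initial matrix P, one uniform row per program position."""
    return [[1.0 / n_ops] * n_ops for _ in range(max_len)]
```

Each row of P is a probability distribution over the instruction set, so every row sums to 1.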

**ALS** (problems Ω_1, Ω_2, ..., Ω_r, variable matrix P)

for i := 1 to r do:

q := Levin search(Ω_i, P); Adapt(q, P).
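The outer ALS loop above can be sketched as follows. This is an illustrative skeleton only: `levin_search` and `adapt` are assumed to be supplied by the caller (their names are hypothetical), with `levin_search` returning a solving program as a list of instruction indices, or an empty list when no solution is found.

```python
# Sketch of the ALS outer loop: run LS on each problem in turn,
# then adapt the probability matrix P after each run.
# levin_search and adapt are caller-supplied (hypothetical names).

def als(problems, P, levin_search, adapt, gamma=0.1):
    """Apply LS to each problem and update P from each solving program."""
    for problem in problems:
        q = levin_search(problem, P)  # solving program, or [] on failure
        adapt(q, P, gamma)            # reward q's instructions in P
    return P
```

Because an empty program has length 0, a failed search simply leaves P unchanged by Adapt.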

where the procedure

**Adapt** (program q, variable matrix P)

for i := 1 to l(q), j := 1 to n_ops do:

if (q_i = a_j) then P_ij := P_ij + γ (1 - P_ij)

else P_ij := (1 - γ) P_ij
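The Adapt procedure is a linear reward-inaction style update: the entry for the instruction actually used moves toward 1, and the remaining entries in the same row shrink by the factor (1 - γ), which keeps each row a probability distribution. A minimal sketch, assuming P is stored as a list of rows and programs are lists of instruction indices (representation and names are illustrative):

```python
# Sketch of Adapt: reward the instructions of solution program q in P.
# P[i][j] is the probability of instruction j at program position i;
# gamma is the learning rate, 0 < gamma < 1.

def adapt(q, P, gamma):
    """Increase the probability of each instruction of program q.

    q: list of instruction indices (q[i] is the (i+1)-th instruction);
       the empty list represents "no solution found", i.e. l(q) = 0.
    P: list of rows, each a probability distribution over the instructions.
    """
    for i, qi in enumerate(q):           # positions 1..l(q)
        for j in range(len(P[i])):       # instructions 1..n_ops
            if j == qi:                  # used instruction: move toward 1
                P[i][j] += gamma * (1.0 - P[i][j])
            else:                        # unused: shrink by (1 - gamma)
                P[i][j] *= (1.0 - gamma)
```

Note that each updated row still sums to 1, since the mass γ(1 - P_ij) added to the matching entry equals the total mass removed from the others.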
