How Often Can We Expect to Profit from Earlier Tasks?

How likely is it that any learner can indeed profit from earlier solutions?
At first glance this seems unlikely, since it has been well known for
many decades that
most possible pairs of symbol strings (such as problem-solving programs)
do not share any algorithmic information; e.g., [33].
Why not? Most possible pairs of strings *x*, *y*
are algorithmically incompressible with respect to each other,
that is, the shortest algorithm computing *y*, given *x*, has
essentially the size of the shortest algorithm computing *y*, given nothing
(typically a bit more than the length of *y* in symbols), **which means
that *x* usually does not tell us anything about *y*.**
Papers in evolutionary computation often mention
*no free lunch theorems*
[81], which are variations of this ancient
insight of theoretical computer science.
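This incompressibility claim can be crudely illustrated with a general-purpose compressor. True algorithmic (Kolmogorov) complexity is uncomputable, so the sketch below, a hypothetical illustration, uses `zlib` compressed length as a rough upper bound: for two unrelated random strings, knowing one does not help compress the other, while for two closely related strings it helps a lot.

```python
# Hedged sketch: approximating shared algorithmic information with zlib.
# Kolmogorov complexity is uncomputable; compressed length is only a
# crude, computable upper bound, good enough to show the qualitative effect.
import os
import zlib

def c(s: bytes) -> int:
    """Compressed length: a rough proxy for algorithmic information."""
    return len(zlib.compress(s, 9))

random_x = os.urandom(1000)
random_y = os.urandom(1000)                    # unrelated to random_x
related_y = random_x[:990] + os.urandom(10)    # shares most bits with random_x

# Proxy for conditional complexity of y given x: c(x + y) - c(x).
# "Gain" measures how much knowing x shortens the description of y.
unrelated_gain = c(random_y) - (c(random_x + random_y) - c(random_x))
related_gain = c(related_y) - (c(random_x + related_y) - c(random_x))

print(unrelated_gain)  # near zero: x tells us essentially nothing about y
print(related_gain)    # large: x accounts for most of the related y
```

The same idea underlies compression-based similarity measures such as the normalized compression distance; here it only serves to make the abstract claim tangible.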

Such theorems, discouraging at first glance, nevertheless have quite
limited scope: they refer to the very special case of problems sampled
from i.i.d. uniform distributions on finite problem spaces. But of course
there are infinitely many distributions besides the uniform one.
In fact, the uniform one is not only extremely unnatural from
any computational perspective -- although most objects are random,
computing random objects is much harder than computing nonrandom ones
-- but does not even make sense as we increase data set size and let it
go to infinity: *There is no such thing as a uniform distribution
on infinitely many things,* such as the integers.

Typically, successive real world problems are **not**
sampled from uniform distributions.
Instead they tend to be closely related. In particular,
teachers usually provide sequences of more and more
complex tasks with very similar solutions, and
in optimization the next task typically
is just to outstrip the best approximate solution found so far,
given some basic setup that does not change from one task to the next.
Problem sequences that humans consider to be *interesting*
are *atypical* when compared to *arbitrary* sequences
of well-defined problems [53].
Almost the entire field of computer science
is focused on comparatively few atypical
problem sets with exploitable regularities.
For all *interesting* problems
the consideration of previous work is justified,
to the extent that *interestingness* implies relatedness
to what's already known [56].
Obviously, OOPS-like procedures are advantageous only where such
relatedness does exist. In any case, however, they will at least
not do much harm.
