How likely is it that any learner can indeed profit from earlier solutions? At first naive glance this seems unlikely, since it has been well known for many decades that most possible pairs of symbol strings (such as problem-solving programs) do not share any algorithmic information. Why not? Most possible pairs of strings x, y are algorithmically incompressible, that is, the shortest algorithm computing y, given x, has the size of the shortest algorithm computing y, given nothing (typically a bit more than the length of y in symbols), which means that x usually does not tell us anything about y. Papers in evolutionary computation often mention no free lunch theorems, which are variations of this ancient insight of theoretical computer science.
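Kolmogorov complexity is incomputable, but a standard compressor can serve as a crude upper-bound proxy for it, which makes the point above easy to illustrate. The sketch below (all names and the toy "solver" strings are invented for illustration) uses zlib to estimate how much information one string carries about another: for a random pair the estimate is near zero, while for two closely related program texts it is large.

```python
import os
import zlib

def C(s: bytes) -> int:
    """Compressed size: a crude, computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(s, 9))

def shared_info(a: bytes, b: bytes) -> int:
    """Approximate algorithmic information about b contained in a:
    C(b) minus the extra cost of encoding b once a is already encoded."""
    return C(b) - (C(a + b) - C(a))

# Two unrelated (random) strings share essentially nothing ...
rand_a, rand_b = os.urandom(1000), os.urandom(1000)

# ... while two slightly different versions of the same hypothetical
# solver share almost their entire description.
prog_a = b"def solve(x): return x * 2 + 1\n" * 25
prog_b = b"def solve(x): return x * 2 + 2\n" * 25

print(shared_info(rand_a, rand_b))  # close to zero: nothing transfers
print(shared_info(prog_a, prog_b))  # much larger: knowing one nearly gives the other
```

This is only an upper-bound estimate (zlib is far weaker than the shortest program), but it reflects the dichotomy in the text: for typical string pairs the conditional description of one given the other is about as long as its unconditional description.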
Such seemingly discouraging theorems, however, have a quite limited scope: they refer to the very special case of problems sampled from i.i.d. uniform distributions on finite problem spaces. But of course there are infinitely many distributions besides the uniform one. In fact, the uniform one is not only extremely unnatural from any computational perspective -- although most objects are random, computing random objects is much harder than computing nonrandom ones -- but does not even make sense as we increase data set size and let it go to infinity: there is no such thing as a uniform distribution on infinitely many things, such as the integers.
Typically, successive real-world problems are not sampled from uniform distributions. Instead they tend to be closely related. In particular, teachers usually provide sequences of more and more complex tasks with very similar solutions, and in optimization the next task typically is just to outstrip the best approximate solution found so far, given some basic setup that does not change from one task to the next. Problem sequences that humans consider interesting are atypical when compared to arbitrary sequences of well-defined problems. Almost the entire field of computer science is focused on comparatively few atypical problem sets with exploitable regularities. For all interesting problems the consideration of previous work is justified, to the extent that interestingness implies relatedness to what's already known. Obviously, OOPS-like procedures are advantageous only where such relatedness does exist. In any case, however, they will at least not do much harm.
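The optimization scenario above, where each new task merely asks to outstrip the best solution found so far, can be sketched with a toy hill climber; the objective, thresholds, and random seed below are all invented for illustration, not taken from the text. Reusing the previous task's solution as the starting point for the next task costs far fewer search steps in total than restarting each task from scratch.

```python
import random

def steps_to_reach(f, start, threshold, rng, max_steps=100_000):
    """Stochastic hill climbing on integers: propose x +/- 1, keep
    non-worsening moves, and count steps until f(x) >= threshold."""
    x, steps = start, 0
    while f(x) < threshold and steps < max_steps:
        cand = x + rng.choice([-1, 1])
        if f(cand) >= f(x):
            x = cand
        steps += 1
    return x, steps

f = lambda x: -abs(x - 500)  # toy objective with its peak at x = 500
rng = random.Random(0)
thresholds = [-400, -300, -200, -100, -10]  # each task outstrips the last

# With reuse: each task starts from the previous task's solution.
x, total_reuse = 0, 0
for t in thresholds:
    x, s = steps_to_reach(f, x, t, rng)
    total_reuse += s

# Without reuse: every task restarts the search from scratch.
total_scratch = sum(steps_to_reach(f, 0, t, rng)[1] for t in thresholds)

print(total_reuse, total_scratch)  # reuse needs far fewer total steps
```

Because the tasks share a fixed setup and differ only in the threshold, earlier solutions carry most of the information needed for later ones; that relatedness, not any free lunch, is what an OOPS-like incremental learner exploits.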