Another limitation of our approach has been mentioned above: it relies on differentiable (although possibly adaptive) models of the costs associated with known action sequences. The domain knowledge resides in these models, and the subgoal generation process extracts it from there. There are domains, however, where a differentiable evaluator module might be inappropriate or difficult to obtain.
Even where a differentiable model is at hand, the problem of local minima remains. Local minima did not play a major role in the simple experiments described above. Large-scale applications, however, will require some way of dealing with suboptimal solutions.
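
The local minimum problem can be illustrated with a minimal sketch. The one-dimensional cost below is a hypothetical stand-in for a differentiable evaluator and is not taken from our experiments. The names `cost`, `grad`, and `descend`, as well as the learning rate and step count, are assumptions chosen only for illustration.

```python
# Toy illustration (assumed, not from the experiments): plain gradient
# descent on a differentiable but non-convex cost ends in different
# minima depending on where it starts.

def cost(x):
    # Hypothetical evaluator output for a single scalar "subgoal" x.
    return x**4 - 3 * x**2 + x

def grad(x):
    # Analytic derivative of cost(x).
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    # Fixed-step gradient descent; lr and steps are illustrative values.
    for _ in range(steps):
        x -= lr * grad(x)
    return x

x_left = descend(-2.0)   # start to the left of the central hump
x_right = descend(2.0)   # start to the right of the central hump

print("from -2.0:", x_left, "cost", cost(x_left))
print("from +2.0:", x_right, "cost", cost(x_right))
```

Both runs follow the same gradient information, yet the run started at 2.0 settles in a minimum whose cost is higher than that of the minimum reached from -2.0. Nothing in the gradient signals that a better solution exists elsewhere.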