Outline. We first compute the derivatives of (1) explicitly. We then show how to use Pearlmutter and Møller's algorithm to speed up the computation of the second-order terms (A.3.2).
For simplicity, in what follows we focus on a single input vector . Again, (and occasionally itself) will be notationally suppressed.
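As a reminder of why such a speed-up is possible, the central identity behind Pearlmutter's and Møller's approach can be sketched as follows. The symbols are illustrative assumptions and not the notation of this appendix: $f$ is a generic twice-differentiable scalar function of a weight vector $w \in \mathbb{R}^N$, $H(w)$ is its Hessian, and $v \in \mathbb{R}^N$ is an arbitrary direction in weight space.

\[
H(w)\,v \;=\; \left.\frac{\partial}{\partial r}\,\nabla_w f(w + r\,v)\right|_{r=0}
\]

Applying this directional-derivative operator to the equations of an ordinary forward and backward pass yields the product $H(w)\,v$ exactly, at a cost of the same order as one gradient evaluation, that is $O(N)$, without ever forming the $N \times N$ matrix $H$.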