
A.3.1 EXPLICIT DERIVATIVE OF EQUATION (1)

The derivative of the right-hand side of (1) is:


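\begin{displaymath}
\frac{\partial B(w,x_p)}{\partial w_{uv}} =
\sum_{i,j} \frac{ \sum_{k} \frac{\partial o^{k}}{\partial w_{ij}} \frac{\partial^{2} o^{k}}{\partial w_{ij} \partial w_{uv}} }
{ \sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2}} +
L \frac{\sum_{k} \left( \sum_{i,j} \frac{\vert\frac{\partial o^{k}}{\partial w_{ij}}\vert}{\sqrt{\sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2}}} \right)
\sum_{i,j} \left( \frac{\mbox{sign}(\frac{\partial o^{k}}{\partial w_{ij}}) \frac{\partial^{2} o^{k}}{\partial w_{ij} \partial w_{uv}}}{\sqrt{\sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2}}} -
\frac{\vert\frac{\partial o^{k}}{\partial w_{ij}}\vert \sum_{m} \frac{\partial o^{m}}{\partial w_{ij}} \frac{\partial^{2} o^{m}}{\partial w_{ij} \partial w_{uv}}}{\left( \sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2} \right)^{\frac{3}{2}}} \right)}
{\sum_{k} \left( \sum_{i,j} \frac{\vert\frac{\partial o^{k}}{\partial w_{ij}}\vert}{\sqrt{\sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2}}} \right)^{2}} \mbox{ .}
\end{displaymath} (39)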

To compute (2), we need


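\begin{displaymath}
\frac{\partial B(w,x_p)}{\partial (\frac{\partial o^{k}}{\partial w_{ij}})} =
\frac{ \frac{\partial o^{k}}{\partial w_{ij}} }
{ \sum_{m} (\frac{\partial o^{m}}{\partial w_{ij}})^{2}} +
L \frac{\sum_{m} \left( \sum_{l,r} \left( \frac{\vert\frac{\partial o^{m}}{\partial w_{lr}}\vert}{\sqrt{\sum_{\bar m} (\frac{\partial o^{\bar m}}{\partial w_{lr}})^{2}}} \right) \right)
\left( \frac{\bar \delta_{mk} \, \mbox{sign}(\frac{\partial o^{m}}{\partial w_{ij}})}{\sqrt{\sum_{\bar m} (\frac{\partial o^{\bar m}}{\partial w_{ij}})^{2}}} -
\frac{\vert\frac{\partial o^{m}}{\partial w_{ij}}\vert \frac{\partial o^{k}}{\partial w_{ij}}}{\left( \sum_{\bar m} (\frac{\partial o^{\bar m}}{\partial w_{ij}})^{2} \right)^{\frac{3}{2}}} \right)}
{\sum_{m} \left( \sum_{l,r} \frac{\vert\frac{\partial o^{m}}{\partial w_{lr}}\vert}{\sqrt{\sum_{\bar m} (\frac{\partial o^{\bar m}}{\partial w_{lr}})^{2}}} \right)^{2}} \mbox{ ,}
\end{displaymath} (40)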

where $\bar \delta$ is the Kronecker delta. Using the nabla operator and (40), we can compress (39):
\begin{displaymath}
\nabla_{uv} B(w,x_p) =
\sum_{k} H^{k} (\nabla_{\frac{\partial o^{k}}{\partial w_{ij}}} B(w,x_p)) \mbox{ ,}
\end{displaymath} (41)
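
Written out componentwise, (41) is the chain rule applied through the output gradients. The following is a short sketch, assuming (as the terms visible in (39) suggest) that $B(w,x_p)$ depends on $w$ only through the first derivatives $\frac{\partial o^{k}}{\partial w_{ij}}$:
\begin{displaymath}
\frac{\partial B(w,x_p)}{\partial w_{uv}} =
\sum_{k} \sum_{i,j} \frac{\partial^{2} o^{k}}{\partial w_{uv} \partial w_{ij}} \,
\frac{\partial B(w,x_p)}{\partial (\frac{\partial o^{k}}{\partial w_{ij}})} \mbox{ ,}
\end{displaymath}
where the second derivatives are the entries of $H^{k}$, so each summand over $k$ is one Hessian-vector product.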

where $H^{k}$ is the Hessian of the output $o^k$. Since the sums over $l,r$ in (40) need to be computed only once (the results are reusable for all $i,j$), $\nabla_{\frac{\partial o^{k}}{\partial w_{ij}}} B(w,x_p)$ can be computed in $O(L)$ time. The product of the Hessian and a vector can also be computed in $O(L)$ time (see the next section). With a constant number of output units, the computational complexity of our algorithm is therefore $O(L)$.
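
The identity in (41) can be checked numerically on a small example. The sketch below is illustrative only and rests on several assumptions that are not part of this section: the toy two-output model `outputs`, the input `X`, the weight vector `w0`, and all function names are hypothetical, and the form of $B$ used in `B` and `dB_dg` (with the constant term dropped) is an assumed reading of equation (1). Finite differences stand in for the $O(L)$ backpropagation-style passes, so the code verifies the identity, not the complexity claim.

```python
import math

X = 0.5          # hypothetical single input pattern x_p
K, L = 2, 4      # number of outputs and of weights in the toy model

def outputs(w):
    """Hypothetical toy model: every weight influences both outputs."""
    h = math.tanh(w[0] * X + w[1])
    return [math.tanh(w[2] * h + w[3]), (w[2] - w[3]) * h]

def grad(f, w, step=1e-5):
    """Central-difference gradient of a scalar function f at w."""
    res = []
    for i in range(len(w)):
        up, dn = list(w), list(w)
        up[i] += step
        dn[i] -= step
        res.append((f(up) - f(dn)) / (2 * step))
    return res

def jac(w):
    """g[k][i] = d o^k / d w_i."""
    return [grad(lambda u, k=k: outputs(u)[k], w) for k in range(K)]

def sums(g):
    """Quantities computed once and reused: S_i, A_k and sum_k A_k^2."""
    S = [sum(g[m][i] ** 2 for m in range(K)) for i in range(L)]
    A = [sum(abs(g[k][i]) / math.sqrt(S[i]) for i in range(L)) for k in range(K)]
    return S, A, sum(a * a for a in A)

def B(w):
    """Gradient-dependent part of B(w, x_p); assumed form, constant dropped."""
    S, _, denom = sums(jac(w))
    return 0.5 * (sum(math.log(s) for s in S) + L * math.log(denom))

def dB_dg(g):
    """Derivative of the assumed B with respect to each entry g[k][i]."""
    S, A, denom = sums(g)
    v = [[0.0] * L for _ in range(K)]
    for i in range(L):
        c = sum(A[m] * abs(g[m][i]) for m in range(K))  # shared by all k
        for k in range(K):
            sgn = math.copysign(1.0, g[k][i])
            second = (A[k] * sgn / math.sqrt(S[i]) - c * g[k][i] / S[i] ** 1.5) / denom
            v[k][i] = g[k][i] / S[i] + L * second
    return v

def hessian_vector(k, w, v, step=1e-4):
    """Approximate H^k v by differencing the gradient of o^k along v."""
    norm = math.sqrt(sum(a * a for a in v))
    fk = lambda u: outputs(u)[k]
    up = [wi + step * vi / norm for wi, vi in zip(w, v)]
    dn = [wi - step * vi / norm for wi, vi in zip(w, v)]
    return [norm * (a - b) / (2 * step) for a, b in zip(grad(fk, up), grad(fk, dn))]

def grad_B_via_hessians(w):
    """Gradient of B as sum_k H^k v^k, with v^k = dB / d(grad o^k)."""
    v = dB_dg(jac(w))
    total = [0.0] * L
    for k in range(K):
        hv = hessian_vector(k, w, v[k])
        total = [t + a for t, a in zip(total, hv)]
    return total

w0 = [0.8, 0.3, 0.7, 0.4]          # hypothetical weights, all gradients nonzero
fast = grad_B_via_hessians(w0)
direct = grad(B, w0, step=1e-4)    # brute-force reference gradient of B
print(max(abs(a - b) for a, b in zip(fast, direct)))
```

The two gradients should agree up to finite-difference error. The weights are chosen so that no output gradient entry is zero, since the absolute values in the assumed $B$ are not differentiable there.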


Juergen Schmidhuber 2003-02-13

