Next: EXPERIMENTAL RESULTS (see [4]
Up: SIMPLIFYING NEURAL NETS BY
Previous: TASK / ARCHITECTURE /
THE ALGORITHM
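The algorithm is designed to find a $w$ defining a box $M_w$ with maximal box volume $\Delta w$. This is equivalent to finding a box $M_w$ with minimal $\tilde{B}(w, D_0) := -\log(\Delta w / 2^L) = \sum_{i,j} -\log \Delta w_{ij}$. Note the relationship to MDL ($\tilde{B}$ is the number of bits required to describe the weights).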
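
The equivalence between maximal box volume and minimal negative log volume can be checked numerically. The sketch below assumes the box volume is $2^L$ times the product of per-weight half-widths, with $L$ the number of weights; the function names and the example half-widths are hypothetical and chosen only for illustration. It uses the natural logarithm, so the values are in nats rather than bits, which does not change the ranking of boxes.

```python
import math

def box_volume(half_widths):
    """Volume of an axis-aligned box whose side for weight i has length 2 * half_widths[i]."""
    return math.prod(2.0 * d for d in half_widths)

def neg_log_volume(half_widths):
    """-log(volume / 2^L), written as the sum of -log(half_width) over all L weights."""
    return sum(-math.log(d) for d in half_widths)

# Two hypothetical boxes (L = 3 weights each); the values are made up.
narrow_box = [0.01, 0.02, 0.05]
flat_box = [0.10, 0.20, 0.05]

L = len(flat_box)
for name, box in [("narrow", narrow_box), ("flat", flat_box)]:
    direct = -math.log(box_volume(box) / 2 ** L)
    # Both ways of computing the quantity agree up to rounding.
    assert math.isclose(direct, neg_log_volume(box))
    print(name, box_volume(box), neg_log_volume(box))

# The larger box is the one with the smaller negative log volume.
assert box_volume(flat_box) > box_volume(narrow_box)
assert neg_log_volume(flat_box) < neg_log_volume(narrow_box)
```

Because the logarithm is monotonic, ranking candidate boxes by volume or by the sum of negative log widths selects the same box; the sum form is the one that is convenient to differentiate.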
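In appendix A.2, we derive the following algorithm. It minimizes $E(w, D_0) = E_q(w, D_0) + \lambda B(w, D_0)$, where

$$B(w, D_0) = \frac{1}{2} \left( -L \log \epsilon + \sum_{i,j} \log \sum_k \left( \frac{\partial o^k}{\partial w_{ij}} \right)^2 + L \log \sum_k \left( \sum_{i,j} \frac{\left| \frac{\partial o^k}{\partial w_{ij}} \right|}{\sqrt{\sum_{k'} \left( \frac{\partial o^{k'}}{\partial w_{ij}} \right)^2}} \right)^2 \right) \qquad (1)$$

Here $o^k$ is the activation of the $k$th output unit, $\epsilon$ is a constant, and $\lambda$ is a positive variable ensuring either $E_q(w, D_0) \le E_{tol}$, or ensuring an expected decrease of $E_q(\cdot, D_0)$ during learning (see [] for adjusting $\lambda$).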
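
To make the regularizer concrete, the sketch below evaluates it for a single input pattern of a tiny network. It is an illustration under stated assumptions, not the authors' implementation: the term is taken to have the form $\frac{1}{2}\left(-L \log \epsilon + \sum_{i} \log \sum_k (\partial o^k / \partial w_i)^2 + L \log \sum_k \left(\sum_i |\partial o^k / \partial w_i| / \sqrt{\sum_{k'} (\partial o^{k'} / \partial w_i)^2}\right)^2\right)$ with weights flattened to a single index $i$; the toy network `net`, the helper names, and all numeric values are hypothetical; and the output derivatives are approximated by central finite differences rather than by backprop.

```python
import math

def net(w, x):
    """Hypothetical toy net: 2 inputs -> 2 tanh hidden units -> 2 linear outputs.
    w is a flat list of L = 8 weights (no biases)."""
    h = [math.tanh(w[0] * x[0] + w[1] * x[1]),
         math.tanh(w[2] * x[0] + w[3] * x[1])]
    return [w[4] * h[0] + w[5] * h[1],
            w[6] * h[0] + w[7] * h[1]]

def jacobian(f, w, x, step=1e-5):
    """Central finite differences: J[k][i] approximates d o^k / d w_i."""
    K = len(f(w, x))
    J = [[0.0] * len(w) for _ in range(K)]
    for i in range(len(w)):
        up = list(w)
        up[i] += step
        dn = list(w)
        dn[i] -= step
        o_up, o_dn = f(up, x), f(dn, x)
        for k in range(K):
            J[k][i] = (o_up[k] - o_dn[k]) / (2 * step)
    return J

def flatness_penalty(J, eps):
    """Regularizer for one pattern, in the assumed form described in the lead-in.
    Requires every weight to influence at least one output (no all-zero column)."""
    K, L = len(J), len(J[0])
    # Sum over outputs of squared sensitivities, one value per weight.
    col_sq = [sum(J[k][i] ** 2 for k in range(K)) for i in range(L)]
    term1 = -L * math.log(eps)
    term2 = sum(math.log(s) for s in col_sq)
    term3 = L * math.log(sum(
        sum(abs(J[k][i]) / math.sqrt(col_sq[i]) for i in range(L)) ** 2
        for k in range(K)))
    return 0.5 * (term1 + term2 + term3)

x = [0.5, -1.0]                                   # made-up input pattern
w = [1.0, -1.0, 0.5, 2.0, 3.0, -2.0, 1.5, 2.5]    # made-up weights
J = jacobian(net, w, x)
B = flatness_penalty(J, eps=0.1)

# Shrinking every output sensitivity by a factor of 10 lowers the penalty
# by L * log(10): only the second term changes, the third is scale-invariant.
J_flatter = [[0.1 * v for v in row] for row in J]
print(B, flatness_penalty(J_flatter, eps=0.1))
```

The scaling check at the end shows why a small value corresponds to a flat region: the penalty falls as the outputs become less sensitive to weight perturbations.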
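
$E(w, D_0)$ is minimized by gradient descent. To minimize $B(w, D_0)$, we compute

$$\frac{\partial B(w, D_0)}{\partial w_{uv}} = \sum_{k,i,j} \frac{\partial B(w, D_0)}{\partial \left( \frac{\partial o^k}{\partial w_{ij}} \right)} \frac{\partial^2 o^k}{\partial w_{ij} \, \partial w_{uv}} \quad \text{for all } u, v. \qquad (2)$$
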
It can be shown that by using Pearlmutter's and Møller's efficient second order method [,7], the gradient of $B(w, D_0)$ can be computed in $O(L)$ time (see details in [4]). Therefore, our algorithm has the same order of complexity as standard backprop.
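
The linear-time claim rests on the fact that the second derivatives are only ever needed contracted with a vector, and a Hessian-vector product can be obtained at roughly the cost of a few gradient evaluations without forming the $L \times L$ Hessian. The sketch below is not Pearlmutter's exact method; it uses the simpler central-difference approximation $Hv \approx (g(w + rv) - g(w - rv)) / (2r)$, where $g$ is the gradient, on a hypothetical test function whose Hessian is known in closed form. All function names and values are illustrative assumptions.

```python
import math

def grad(w):
    """Gradient of the hypothetical test function f(w) = w0^2 * w1 + w1^3 + w0 * w2."""
    w0, w1, w2 = w
    return [2 * w0 * w1 + w2, w0 ** 2 + 3 * w1 ** 2, w0]

def hessian_vector_product(grad_fn, w, v, r=1e-5):
    """Approximate H(w) @ v from two gradient calls; no Hessian is ever stored."""
    w_plus = [wi + r * vi for wi, vi in zip(w, v)]
    w_minus = [wi - r * vi for wi, vi in zip(w, v)]
    g_plus, g_minus = grad_fn(w_plus), grad_fn(w_minus)
    return [(gp - gm) / (2 * r) for gp, gm in zip(g_plus, g_minus)]

def exact_hv(w, v):
    """Closed-form H(w) @ v for the same test function, for comparison only."""
    w0, w1, _ = w
    H = [[2 * w1, 2 * w0, 1.0],
         [2 * w0, 6 * w1, 0.0],
         [1.0, 0.0, 0.0]]
    return [sum(h * vi for h, vi in zip(row, v)) for row in H]

w = [1.0, 2.0, -0.5]   # made-up point
v = [0.3, -1.0, 2.0]   # made-up direction
approx = hessian_vector_product(grad, w, v)
exact = exact_hv(w, v)
assert all(math.isclose(a, e, abs_tol=1e-6) for a, e in zip(approx, exact))
print(approx, exact)
```

The cost of each product is two gradient evaluations, so it grows linearly with the number of weights whenever the gradient itself does, which is the property the complexity argument relies on.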
Juergen Schmidhuber
2003-02-25