With the offline variant of the approach, the training phase of the network $P$
is based on a set $F$ of training files.
Assume that the alphabet contains $k$ possible characters $z_1, z_2, \ldots, z_k$.
The (local) representation of $z_i$ is a binary $k$-dimensional vector $r(z_i)$
with exactly one non-zero component (at the $i$-th position).
The network $P$ has $nk$ input units and $k$ output units,
where $n$ is called the ``time-window size''.
We insert $n$ default characters $z_0$ at the beginning of each file.
The representation of the default character, $r(z_0)$, is the
$k$-dimensional zero-vector.
The $m$-th character of file $f$ (counting from the first default character)
is called $c^f_m$.
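The encoding just described can be illustrated with a minimal Python sketch. It assumes an alphabet of size `k` and a time-window size `n`. The helper names (`one_hot`, `encode_file`, `input_window`) and the toy alphabet are hypothetical and chosen only for illustration. Concatenating the representations of the preceding characters into a single input vector is an assumption consistent with the stated number of input units, not a detail taken from the text.

```python
from typing import List


def one_hot(index: int, k: int) -> List[int]:
    """Local representation: a binary k-dimensional vector.

    index i in 1..k sets only the i-th component to 1;
    index 0 (the default character) yields the zero-vector.
    """
    vec = [0] * k
    if index > 0:
        vec[index - 1] = 1
    return vec


def encode_file(text: str, alphabet: str, n: int) -> List[List[int]]:
    """Prepend n default characters, then encode every character of the file."""
    k = len(alphabet)
    index_of = {ch: i + 1 for i, ch in enumerate(alphabet)}  # 1-based indices
    indices = [0] * n + [index_of[ch] for ch in text]        # 0 = default character
    return [one_hot(i, k) for i in indices]


def input_window(encoded: List[List[int]], m: int, n: int) -> List[int]:
    """Assumed input for position m (1-based, m > n): the concatenated
    representations of the n characters at positions m-n, ..., m-1."""
    if m <= n or m > len(encoded):
        raise ValueError("m must satisfy n < m <= number of encoded characters")
    window = encoded[m - n - 1 : m - 1]
    return [bit for vec in window for bit in vec]


if __name__ == "__main__":
    alphabet = "ab"   # toy alphabet, k = 2 (hypothetical)
    n = 2             # time-window size
    encoded = encode_file("ab", alphabet, n)
    print(input_window(encoded, 3, n))  # [0, 0, 0, 0]: only default characters seen so far
    print(input_window(encoded, 4, n))  # [0, 0, 1, 0]: one default character, then 'a'
```

Each window has `n * k` components, one per input unit. Because the default character maps to the zero-vector, the windows at the start of a file carry no character information.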
For all training files $f \in F$ and all possible positions $m > n$,
the network $P$ receives as an input