Assume that the alphabet contains $k$ possible characters $z_1, z_2, \ldots, z_k$.
The (local) representation of $z_i$ is a binary $k$-dimensional vector $r(z_i)$ with exactly one non-zero component (at the $i$-th position).
The network $P$ has $nk$ input units and $k$ output units. Here $n$ is called the ``time-window'' size.
We insert $n$ default characters $z_0$ at the beginning of each file. The representation of the default character, $r(z_0)$, is the $k$-dimensional zero-vector.
The $m$-th character of file $f$ (starting from the first default character) is called $c^f_m$.
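For all files $f$ and all possible $m > n$, $P$ receives as an input
$$ r(c^f_{m-n}) \circ r(c^f_{m-n+1}) \circ \cdots \circ r(c^f_{m-1}), \tag{1} $$
where $\circ$ denotes the concatenation of vectors.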
[Equations (2) and (3): content not recoverable from the source.]
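The construction of the network input from the preceding characters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `one_hot` and `window_inputs` are hypothetical, characters are mapped to indices $1, \ldots, k$ in the order they appear in the given alphabet string, index 0 stands for the default character, and the list positions are 0-based whereas the text counts characters from 1.

```python
def one_hot(index, k):
    """Local representation: k-dimensional binary vector.

    index in 1..k sets exactly one component; index 0 (the default
    character) yields the zero vector.
    """
    vec = [0] * k
    if index > 0:
        vec[index - 1] = 1
    return vec


def window_inputs(text, alphabet, n):
    """Yield (input_vector, target_index) pairs for one file.

    Assumption: `alphabet` is a string whose i-th character (1-based)
    plays the role of z_i; `n` is the time-window size.
    """
    k = len(alphabet)
    index_of = {ch: i + 1 for i, ch in enumerate(alphabet)}
    # Prepend n default characters (index 0) to the file.
    padded = [0] * n + [index_of[ch] for ch in text]
    for m in range(n, len(padded)):
        window = padded[m - n:m]  # the n characters preceding position m
        # Concatenate the n one-hot vectors into one vector of length n*k.
        x = [bit for idx in window for bit in one_hot(idx, k)]
        yield x, padded[m]


pairs = list(window_inputs("ab", "ab", 2))
```

For the two-character file `"ab"` with window size 2, the first input consists only of default characters (all zeros), and every input vector has length $nk = 4$.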
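In practical applications, the output components $(P^f_m)_i$ will not always sum up to 1. To obtain outputs satisfying the properties of a proper probability distribution, we normalize by defining
$$ P^f_m(i) = \frac{(P^f_m)_i}{\sum_{j=1}^{k} (P^f_m)_j}, \tag{4} $$
where $(P^f_m)_i$ is the $i$-th component of the $k$-dimensional output vector $P^f_m$ that $P$ produces when predicting $c^f_m$.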
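The normalization step amounts to dividing each output component by the sum of all components. A minimal sketch, assuming the raw outputs are non-negative (for example, activations in the unit interval) and that at least one of them is positive; the function name `normalize_outputs` is hypothetical:

```python
def normalize_outputs(outputs):
    """Rescale raw network outputs so that they sum to 1.

    Assumption: all outputs are non-negative and their sum is positive.
    """
    if any(o < 0 for o in outputs):
        raise ValueError("outputs must be non-negative")
    total = sum(outputs)
    if total <= 0:
        raise ValueError("sum of outputs must be positive")
    return [o / total for o in outputs]


probs = normalize_outputs([1.0, 0.5, 0.5])
```

The relative sizes of the outputs are preserved; only their scale changes, so the most strongly predicted character stays the same.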