
TOMITA GRAMMARS

Many authors also use Tomita's grammars (1982) to test their algorithms; see, e.g., Bengio and Frasconi (1995), Watrous and Kuhn (1992), Pollack (1991), Miller and Giles (1993), and Manolios and Fanelli (1994). Since we already tested parity problems above, we focus here on a few "parity-free" Tomita grammars (grammars #1, #2, and #4). Most previous work made the learning problem easier by restricting sequence length. For example, Miller and Giles use a maximal training sequence length of 10 and a maximal test sequence length of 15. Miller and Giles (1993) report the number of sequences required for convergence (for various first- and second-order nets with 3 to 9 units): Tomita #1: 23,000 to 46,000; Tomita #2: 77,000 to 200,000; Tomita #4: 46,000 to 210,000. RG, however, performs better in these cases (as always, we use the experimental conditions described in section 2). The average results are: Tomita #1: 182 (with A1, $n=1$) and 288 (with A2); Tomita #2: 1,511 (with A1, $n=3$) and 17,953 (with A2); Tomita #4: 13,833 (with A1, $n=2$) and 35,610 (with A2).
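For readers unfamiliar with the three grammars, the following minimal sketch shows membership tests and an enumeration of labeled binary strings up to a given length. It assumes the definitions commonly given in the literature (#1: strings of the form 1*; #2: strings of the form (10)*; #4: strings containing no substring 000), which this section does not spell out. The function names and the enumeration helper are hypothetical illustrations and are not the experimental setup used for the results above.

```python
import re
from itertools import product

# Membership tests under the commonly cited definitions (an assumption here).

def tomita1(s: str) -> bool:
    """Tomita #1: strings consisting only of 1s (1*)."""
    return "0" not in s

def tomita2(s: str) -> bool:
    """Tomita #2: zero or more repetitions of '10' ((10)*)."""
    return re.fullmatch(r"(10)*", s) is not None

def tomita4(s: str) -> bool:
    """Tomita #4: strings with no run of three consecutive 0s."""
    return "000" not in s

def labeled_strings(accepts, max_len: int):
    """Yield (string, label) for every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            yield s, accepts(s)

if __name__ == "__main__":
    # Count accepted strings up to length 10, the training length cited above.
    for name, grammar in [("#1", tomita1), ("#2", tomita2), ("#4", tomita4)]:
        positives = sum(label for _, label in labeled_strings(grammar, 10))
        print(f"Tomita {name}: {positives} accepted strings of length <= 10")
```

Under these definitions, grammars #1 and #2 accept very few short strings, so a training set drawn from short sequences is dominated by negative examples, whereas #4 accepts a much larger share.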

It should be mentioned, however, that with our architectures and very short training sequences (in the style of Miller and Giles, 1993), gradient descent can also achieve reasonable results.


Juergen Schmidhuber 2003-02-19