Handwriting Recognition - best current results (by Juergen Schmidhuber)

Handwriting Recognition Through Deep Learning
Jürgen Schmidhuber, 2010

Automatic handwriting recognition is of academic and commercial interest. Current algorithms are already good at learning to recognize handwritten digits. Post offices use them to sort letters; banks use them to read personal checks. Some predict that in the near future billions of handheld devices such as cell phones will have handwriting recognition capabilities.

It is easier to recognize (1) isolated handwritten symbols than (2) unsegmented connected handwriting (with unknown beginnings and ends of individual letters). For both cases, our team achieved the best current performance in various international competitions, using two types of deep artificial neural networks, both with many non-linear processing stages.

(1) For isolated digits we use deep feedforward neural nets trained by an ancient algorithm: backprop. No fashionable unsupervised pre-training is necessary! But graphics cards (mini-supercomputers for video games) are used to accelerate learning by a factor of 50. This is sufficient to clearly outperform numerous previous more complex machine learning methods [6]. One of the reviewers called this a "wake-up call to the machine learning community" :-)

And our network committees yield even better results, e.g., on the MNIST data set, perhaps the most famous benchmark of machine learning: 0.31% error rate [7a] as of March 2011, 0.27% as of June 2011 [8], and finally the first human-competitive result on this iconic benchmark: 0.23% [10], through our special breed [7] of max-pooling convolutional networks (MPCNN), now widely used by research labs and companies all over the world. (As of 2011, the best result by others is still 0.39%.)
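How a committee combines its members can be sketched in a few lines. This is only an illustration, assuming that each net outputs a vector of class probabilities and that the committee simply averages them; the function name `committee_predict` and the toy numbers are hypothetical, and the combination schemes actually used are described in [7a, 8, 9].

```python
from typing import Sequence


def committee_predict(member_probs: Sequence[Sequence[float]]) -> int:
    """Average the class-probability vectors of all members, return the top class.

    Assumption: plain unweighted averaging. member_probs[m][c] is the
    probability that member m assigns to class c.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    averaged = [
        sum(probs[c] for probs in member_probs) / n_members
        for c in range(n_classes)
    ]
    # Index of the largest averaged probability (first one wins on ties).
    return max(range(n_classes), key=lambda c: averaged[c])


# Three hypothetical nets voting on a 3-class toy problem.
toy_outputs = [
    [0.6, 0.3, 0.1],  # member 1 prefers class 0
    [0.2, 0.7, 0.1],  # member 2 prefers class 1
    [0.1, 0.8, 0.1],  # member 3 prefers class 1
]
prediction = committee_predict(toy_outputs)
```

In this toy case the single dissenting member is outvoted, which is the basic reason a committee can beat its individual nets when their errors are not strongly correlated.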

(2) For connected handwriting we use our bi-directional or multi-dimensional LSTM recurrent neural networks [1-5], which learn to maximize the probabilities of label sequences, given raw training sequences. This method won several handwriting competitions at ICDAR 2009: the Arabic Connected Handwriting Competition, the Handwritten Farsi/Arabic Character Recognition Competition, and the French Connected Handwriting Competition. See also the more general neural computer vision page.
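What "probability of a label sequence" means for unsegmented input can be made concrete with a small sketch. It assumes a connectionist temporal classification (CTC) style objective, where the network emits per-time-step probabilities over the labels plus a blank symbol, and the probability of a label sequence is the sum over all alignments that collapse to it. The function name and the toy numbers are hypothetical; the text above does not spell out the objective, so treat this as one plausible reading rather than the exact training code.

```python
from typing import List, Sequence


def ctc_sequence_probability(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    blank: int = 0,
) -> float:
    """Sum the probabilities of all frame-level alignments that yield `labels`.

    probs[t][k] is the (assumed) network output probability of symbol k at
    time step t; at least one time step is required.
    """
    # Extended label sequence: a blank before, between and after the labels.
    ext: List[int] = [blank]
    for label in labels:
        ext += [label, blank]
    n_states = len(ext)

    # alpha[s]: total probability of all partial alignments ending in state s.
    alpha = [0.0] * n_states
    alpha[0] = probs[0][ext[0]]
    if n_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * n_states
        for s in range(n_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end in the last label or in the trailing blank.
    return alpha[-1] + (alpha[-2] if n_states > 1 else 0.0)


# Toy case: 2 time steps, symbols {0: blank, 1: "a"}, uniform outputs.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_single = ctc_sequence_probability(uniform, [1])
p_double = ctc_sequence_probability(uniform, [1, 1])
```

With two time steps, the sequence "a" is produced by three alignments (a-a, a-blank, blank-a), while "aa" would need a separating blank and therefore at least three time steps. Training then adjusts the network so that this quantity becomes large for the correct label sequence.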

Recognition of Unsegmented Connected Handwriting by Bi-Directional LSTM Recurrent Networks - Jürgen Schmidhuber

Surprisingly, good old on-line backprop for standard neural nets yields a very low 0.35% error rate [6] on the famous MNIST handwritten digits benchmark. All we need to achieve this best result (as of 2010) are many hidden layers, many neurons per layer, many deformed training images, and graphics cards to greatly speed up learning.
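"On-line" here means that the weights are updated after every single training example rather than after a whole batch. A minimal sketch of one such update, assuming tanh units in every layer, squared error, and no bias weights; the tiny net, the function names, and all numbers are illustrative and not the configuration of [6]:

```python
import math
from typing import List

Matrix = List[List[float]]


def forward(weights: List[Matrix], x: List[float]) -> List[List[float]]:
    """Return the activations of every layer, starting with the input itself."""
    acts = [x]
    for W in weights:
        prev = acts[-1]
        acts.append(
            [math.tanh(sum(w * a for w, a in zip(row, prev))) for row in W]
        )
    return acts


def online_backprop_step(
    weights: List[Matrix], x: List[float], target: List[float], lr: float
) -> float:
    """Update `weights` in place from ONE example; return the loss before the update."""
    acts = forward(weights, x)
    out = acts[-1]
    loss = 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))
    # Error signal at the output layer (derivative of tanh is 1 - out^2).
    delta = [(o - t) * (1.0 - o * o) for o, t in zip(out, target)]
    for layer in range(len(weights) - 1, -1, -1):
        W = weights[layer]
        prev = acts[layer]
        prev_delta: List[float] = []
        if layer > 0:
            # Propagate the error backwards BEFORE changing this layer's weights.
            prev_delta = [
                sum(W[j][i] * delta[j] for j in range(len(W)))
                * (1.0 - prev[i] ** 2)
                for i in range(len(prev))
            ]
        # Gradient step for this layer.
        for j, row in enumerate(W):
            for i in range(len(row)):
                row[i] -= lr * delta[j] * prev[i]
        delta = prev_delta
    return loss


# Toy net: 2 inputs -> 2 hidden units -> 1 output (hypothetical numbers).
toy_weights: List[Matrix] = [
    [[0.1, -0.2], [0.3, 0.4]],
    [[0.5, -0.5]],
]
example, label = [1.0, 0.5], [1.0]
loss_before = online_backprop_step(toy_weights, example, label, lr=0.1)
loss_after = online_backprop_step(toy_weights, example, label, lr=0.1)
```

The second call reports the loss after the first update, so comparing the two values shows whether a single on-line step moved the net towards the target. The same loop works for any number of hidden layers; the big nets only add more and wider weight matrices, which is where graphics cards pay off.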

In recent decades neural networks have been overshadowed by the very useful but principally less general and less powerful support vector machines as well as other more specialized machine learning methods. Do the new state-of-the-art results herald a renaissance of neural networks? Neither our fast deep nets nor our recurrent nets (also deep by nature) are limited to handwriting. Both hold great promise for many visual and other pattern recognition tasks.

Selected Publications

[10] D. C. Ciresan, U. Meier, J. Schmidhuber. Multi-column Deep Neural Networks for Image Classification. IEEE Conf. on Computer Vision and Pattern Recognition CVPR 2012. PDF. ArXiv Preprint arXiv:1202.2745v1 [cs.CV], Feb 2012.

[9] U. Meier, D. C. Ciresan, L. M. Gambardella, J. Schmidhuber. Better Digit Recognition with a Committee of Simple Neural Nets. 11th International Conference on Document Analysis and Recognition (ICDAR 2011), Beijing, China, 2011. PDF.

[8] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Convolutional Neural Network Committees For Handwritten Character Classification. 11th International Conference on Document Analysis and Recognition (ICDAR 2011), Beijing, China, 2011. PDF.

[7a] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Handwritten Digit Recognition with a Committee of Deep Neural Nets on GPUs. ArXiv Preprint arXiv:1103.4487v1 [cs.LG], 23 Mar 2011.

[7] D. C. Ciresan, U. Meier, J. Masci, L. M. Gambardella, J. Schmidhuber. Flexible, High Performance Convolutional Neural Networks for Image Classification. International Joint Conference on Artificial Intelligence (IJCAI-2011, Barcelona), 2011. ArXiv preprint, 1 Feb 2011.

[6] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Deep Big Simple Neural Nets For Handwritten Digit Recognition. Neural Computation 22(12): 3207-3220, 2010. ArXiv Preprint.

[5] A. Graves, M. Liwicki, S. Fernandez, R. Bertolami, H. Bunke, J. Schmidhuber. A Novel Connectionist System for Improved Unconstrained Handwriting Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, 2009. PDF.

[4] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. Advances in Neural Information Processing Systems 22, NIPS'22, p 545-552, Vancouver, MIT Press, 2009. PDF.

[3] A. Graves, S. Fernandez, M. Liwicki, H. Bunke, J. Schmidhuber. Unconstrained online handwriting recognition with recurrent neural networks. Advances in Neural Information Processing Systems 21, NIPS'21, p 577-584, MIT Press, Cambridge, MA, 2008. PDF.

[2] M. Liwicki, A. Graves, H. Bunke, J. Schmidhuber. A novel approach to on-line handwriting recognition based on bidirectional Long Short-Term Memory networks. 9th International Conference on Document Analysis and Recognition, 2007. PDF.

[1] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. PDF.

Isolated Digit Recognition with Big Deep Neural Nets on Fast Graphics Cards (Juergen Schmidhuber)

Check out Yann LeCun's MNIST page with a long list of broken MNIST records since 1998.

Handwriting team members at IDSIA & TUM: Dan Ciresan, Ueli Meier, Luca Maria Gambardella, Alex Graves. Part of this work was funded through the Swiss CTI project 9688.1 "Intelligent Fill In Form" in collaboration with the company Lifeware.

Update of 17 June 2011: Our team also just won the ICDAR Offline Chinese Handwriting Competition (1st & 2nd place), without speaking a word of Chinese. Additional 1st ranks achieved by our neural computer vision team are listed in the computer vision page.

Chinese Handwriting

Copyright notice (2010): Fibonacci web design by Jürgen Schmidhuber, who will be delighted if you use this web page for educational and non-commercial purposes, including articles for Wikipedia and similar sites, provided you mention the source and provide a link.


Last update February 2012