Handwriting Recognition - best current results (by Juergen Schmidhuber)

Winning Handwriting Recognition Competitions Through Deep Learning (2009: first really Deep Learners to win official contests). Jürgen Schmidhuber (2009-2013)

It is easier to recognize (1) isolated handwritten symbols than (2) unsegmented connected handwriting (with unknown beginnings and ends of individual letters). For both cases, our Deep Learning team achieved the best current performance in various international competitions, using two types of deep artificial neural networks, both with many non-linear processing stages.

(1) For isolated digits we use deep feedforward neural nets trained by an ancient algorithm: backprop à la Seppo Linnainmaa (1970) and Paul Werbos (1982). No fashionable unsupervised pre-training is necessary! But graphics cards (mini-supercomputers for video games) are used to accelerate learning by a factor of 50. This is sufficient to clearly outperform numerous previous, more complex machine learning methods [6]. One of the reviewers called this a "wake-up call to the machine learning community" :-)
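To make "plain backprop through many non-linear stages" concrete, here is a minimal sketch in pure Python. It is not the GPU code of [6]: the function names, the tanh units, the squared-error loss, the layer sizes and the learning rate are all illustrative assumptions.

```python
import math
import random

def init_net(sizes, seed=0):
    """Random weight matrices (one extra bias column each) for the given layer sizes."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(net, x):
    """Return the activations of every layer, the input included."""
    acts = [list(x)]
    for W in net:
        inp = acts[-1] + [1.0]  # constant bias input
        acts.append([math.tanh(sum(w * a for w, a in zip(row, inp))) for row in W])
    return acts

def loss(net, x, target):
    """Squared error between the net's output and the target vector."""
    out = forward(net, x)[-1]
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))

def backprop(net, x, target):
    """Gradient of the loss w.r.t. every weight, computed by the chain rule."""
    acts = forward(net, x)
    # Output error signal: (o - t) * tanh'(net input) = (o - t) * (1 - o^2)
    delta = [(o - t) * (1.0 - o * o) for o, t in zip(acts[-1], target)]
    grads = [None] * len(net)
    for layer in range(len(net) - 1, -1, -1):
        inp = acts[layer] + [1.0]
        grads[layer] = [[d * a for a in inp] for d in delta]
        if layer > 0:
            W = net[layer]
            # Send the error signal one layer back (the bias column is skipped).
            delta = [sum(W[j][i] * delta[j] for j in range(len(W)))
                     * (1.0 - acts[layer][i] ** 2)
                     for i in range(len(acts[layer]))]
    return grads

def sgd_step(net, x, target, lr=0.1):
    """One on-line gradient descent update on a single training example."""
    grads = backprop(net, x, target)
    for W, G in zip(net, grads):
        for row, grad_row in zip(W, G):
            for i in range(len(row)):
                row[i] -= lr * grad_row[i]

# Hypothetical toy net with two hidden layers and one made-up training example.
net = init_net([4, 5, 5, 3], seed=1)
sgd_step(net, [0.2, -0.4, 0.6, 0.1], [0.5, -0.5, 0.0])
```

Nothing in the sketch depends on the number of layers; making the net deeper or wider only lengthens the list passed to init_net, which is where the speed of graphics cards becomes essential.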

Our network committees yield even better results, e.g., on the MNIST data set, perhaps the most famous benchmark of machine learning: 0.31% error rate [7a] as of March 2011, 0.27% as of June 2011 [8], and finally the first human-competitive result on this iconic benchmark: around 0.2% [10], through our special breed [7] of multi-column max-pooling convolutional networks (MCMPCNN), now widely used by research labs and companies all over the world. This represented a dramatic improvement - as of 2011, the best result by others was still 0.39%.
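A committee can be sketched in a few lines, assuming (as a simplification) that each member net outputs one probability per class and that the committee simply averages them. The member outputs below are made up for illustration. The second helper shows the error-rate arithmetic: on the 10,000 MNIST test images, 27 misclassified digits correspond to 0.27%.

```python
def committee_predict(member_probs):
    """Average the per-class probabilities of several nets.

    member_probs[m][c] is member m's probability for class c.
    Returns (winning class index, averaged probabilities).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    avg = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg

def error_rate_percent(predictions, labels):
    """Percentage of predictions that differ from the true labels."""
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return 100.0 * wrong / len(labels)

# Three hypothetical members: the first one alone would pick class 0,
# but the averaged vote picks class 1.
members = [[0.50, 0.40, 0.10],
           [0.30, 0.60, 0.10],
           [0.35, 0.55, 0.10]]
best, avg = committee_predict(members)
```

Averaging helps most when the members make different mistakes, for instance because each was trained on differently preprocessed versions of the data.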

(2) For connected handwriting we use (stacks of) our bi-directional or multi-dimensional LSTM recurrent neural networks [1-5], which learn to maximize the probabilities of label sequences, given raw training sequences. Through the efforts of my former PhD student and postdoc Alex Graves, this method won several handwriting competitions at ICDAR 2009: the Arabic Connected Handwriting Competition, the Handwritten Farsi/Arabic Character Recognition Competition, and the French Connected Handwriting Competition. In fact, this was the first RNN system ever to win an official international pattern recognition competition. To our knowledge, it also was the first really Deep Learner ever (recurrent or not) to win such a contest. Compare the more general neural computer vision page.
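What "the probability of a label sequence" means when letter boundaries are unknown can be illustrated with a small sketch. It assumes the connectionist temporal classification (CTC) formulation of Graves et al., which this page does not name explicitly: the net emits, at every time step, a distribution over the labels plus a "blank", and the probability of a label sequence is the sum over all frame-wise paths that collapse to it. The function name and the toy numbers are illustrative.

```python
def ctc_label_probability(frame_probs, labels, blank=0):
    """P(labels | input), summed over all alignments (CTC forward recursion).

    frame_probs[t][k] is the net's probability of symbol k at time step t;
    symbol `blank` means "no label emitted at this step".
    """
    # Extended sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(frame_probs)

    # alpha[s]: total probability of all path prefixes ending in ext[s]
    alpha = [0.0] * S
    alpha[0] = frame_probs[0][ext[0]]
    if S > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                 # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]        # advance by one position
            # The blank in between may be skipped only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame_probs[t][ext[s]]
        alpha = new

    # Valid paths end in the last label or in the final blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)

# Two time steps, symbols {0: blank, 1: "a"}, uniform outputs.
# The paths (a,a), (blank,a) and (a,blank) all read "a": 3 * 0.25 = 0.75.
p = ctc_label_probability([[0.5, 0.5], [0.5, 0.5]], [1])
```

Training would then adjust the network weights by gradient descent so that the logarithm of this quantity increases for the correct transcription; no segmentation of the training sequences into letters is needed.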

For information on how we have built on the work of earlier pioneers since the 1960s, please visit www.deeplearning.me.

Recognition of Unsegmented Connected Handwriting by Bi-Directional LSTM Recurrent Networks - Jürgen Schmidhuber

Surprisingly, good old on-line backprop for standard neural nets yields a very low 0.35% error rate [6] on the famous MNIST handwriting benchmark (figure: example digits and plausible labels). All we need to achieve this best result (as of 2010) are many hidden layers, many neurons per layer, many deformed training images, and graphics cards to greatly speed up learning. Our later MCMPCNN [7, 10] topped this, though.
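"Deformed training images" can be sketched as follows. This is a simplified illustration, not the deformation code of [6]: it only applies a random rotation, scaling and shift with bilinear interpolation, and the function names and parameter ranges are assumptions chosen for a small grey-level image.

```python
import math
import random

def deform(image, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate, scale and shift a 2D grey-level image (a list of rows) about its centre.

    Uses bilinear interpolation; samples outside the source image count as 0.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a) / scale, math.sin(a) / scale
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: which source position lands on output pixel (x, y)?
            dx, dy = x - cx - shift[0], y - cy - shift[1]
            sx = cos_a * dx + sin_a * dy + cx
            sy = -sin_a * dx + cos_a * dy + cy
            x0, y0 = math.floor(sx), math.floor(sy)
            fx, fy = sx - x0, sy - y0
            value = 0.0
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < h and 0 <= xx < w:
                        value += wy * wx * image[yy][xx]
            out[y][x] = value
    return out

def random_deformation(image, rng):
    """Draw one random deformation; the ranges are illustrative assumptions."""
    return deform(image,
                  angle_deg=rng.uniform(-10.0, 10.0),
                  scale=rng.uniform(0.9, 1.1),
                  shift=(rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)))

# A fresh deformation per presentation means the net rarely sees
# exactly the same training image twice.
rng = random.Random(0)
digit = [[0.0] * 5 for _ in range(5)]
digit[2][2] = 1.0
augmented = random_deformation(digit, rng)
```

Generating new deformations in every epoch effectively enlarges the training set without collecting more labelled data, which helps very large nets avoid overfitting.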

Automatic handwriting recognition is of academic and commercial interest. Current algorithms already excel at learning to recognize handwritten digits. Post offices use them to sort letters; banks use them to read personal checks. Some predict that in the near future billions of handheld devices such as cell phones will have handwriting recognition capabilities.

In recent decades neural networks have been overshadowed by the very useful but principally less general and less powerful support vector machines as well as other more specialized machine learning methods. Our new state-of-the-art results herald a renaissance of neural networks. Neither our fast deep nets nor our recurrent nets (also deep by nature) are limited to handwriting. They yield best known results on many visual and other pattern recognition tasks.

Selected Publications

[11] D. C. Ciresan, J. Schmidhuber. Multi-Column Deep Neural Networks for Offline Handwritten Chinese Character Classification. Preprint arXiv:1309.0261, 1 Sep 2013.

[10] D. C. Ciresan, U. Meier, J. Schmidhuber. Multi-column Deep Neural Networks for Image Classification. IEEE Conf. on Computer Vision and Pattern Recognition CVPR 2012. PDF. ArXiv Preprint arXiv:1202.2745v1 [cs.CV], Feb 2012.

[9] U. Meier, D. C. Ciresan, L. M. Gambardella, J. Schmidhuber. Better Digit Recognition with a Committee of Simple Neural Nets. 11th International Conference on Document Analysis and Recognition (ICDAR 2011), Beijing, China, 2011. PDF.

[8] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Convolutional Neural Network Committees For Handwritten Character Classification. 11th International Conference on Document Analysis and Recognition (ICDAR 2011), Beijing, China, 2011. PDF.

[7a] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Handwritten Digit Recognition with a Committee of Deep Neural Nets on GPUs. ArXiv Preprint arXiv:1103.4487v1 [cs.LG], 23 Mar 2011.

[7] D. C. Ciresan, U. Meier, J. Masci, L. M. Gambardella, J. Schmidhuber. Flexible, High Performance Convolutional Neural Networks for Image Classification. International Joint Conference on Artificial Intelligence (IJCAI-2011, Barcelona), 2011. ArXiv preprint, 1 Feb 2011.

[6] D. C. Ciresan, U. Meier, L. M. Gambardella, J. Schmidhuber. Deep Big Simple Neural Nets For Handwritten Digit Recognition. Neural Computation 22(12): 3207-3220, 2010. ArXiv Preprint.

[5] A. Graves, M. Liwicki, S. Fernandez, R. Bertolami, H. Bunke, J. Schmidhuber. A Novel Connectionist System for Improved Unconstrained Handwriting Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, 2009. PDF.

[4] A. Graves, J. Schmidhuber. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. Advances in Neural Information Processing Systems 22, NIPS'22, p 545-552, Vancouver, MIT Press, 2009. PDF.

[3] A. Graves, S. Fernandez, M. Liwicki, H. Bunke, J. Schmidhuber. Unconstrained online handwriting recognition with recurrent neural networks. Advances in Neural Information Processing Systems 21, NIPS'21, p 577-584, MIT Press, Cambridge, MA, 2008. PDF.

[2] M. Liwicki, A. Graves, H. Bunke, J. Schmidhuber. A novel approach to on-line handwriting recognition based on bidirectional Long Short-Term Memory networks. 9th International Conference on Document Analysis and Recognition, 2007. PDF.

[1] S. Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. PDF.

Isolated Digit Recognition with Big Deep Neural Nets on Fast Graphics Cards (Juergen Schmidhuber)

Check out Yann LeCun's MNIST page with a long list of broken MNIST records since 1998.

Handwriting team (ex-)members at IDSIA & TUM: Alex Graves, Dan Ciresan, Ueli Meier. Part of this work was funded through the Swiss CTI project 9688.1 "Intelligent Fill In Form" in collaboration with the company Lifeware.

Update of 17 June 2011: Our team also just won the ICDAR Offline Chinese Handwriting Competition (1st & 2nd place), without speaking a word of Chinese. Additional 1st ranks achieved by our neural computer vision team are listed in the computer vision page.

Update of 1 Sep 2013: our Deep Learning yields the best artificial recognisers of Chinese characters from the ICDAR 2013 competition (3755 classes), approaching human performance [11].


Copyright notice (2010): Fibonacci web design by Jürgen Schmidhuber, who will be delighted if you use this web page for educational and non-commercial purposes, including articles for Wikipedia and similar sites, provided you mention the source and provide a link.

Apple & Google and many other leading companies are now building on our Deep Learning techniques. Are you an industrial company that wants to solve interesting pattern recognition problems better than your competitors? Don't hesitate to contact JS.

Last update November 2013