COMPETITION DETAILS

Links to the original datasets of competitions 1-9 and benchmarks A-D below, plus more information on the world records set by our team:

9. 22 Sept 2013: our deep and wide MC GPU-MPCNNs [8,17,18] won the MICCAI 2013 Grand Challenge on Mitosis Detection (important for cancer prognosis etc.). This was made possible through the efforts of Dan and Alessandro [20]. Do not confuse this with the earlier ICPR 2012 contest below!

Comment: When we started our work on Very Deep Learning over two decades ago, limited computing power forced us to focus on tiny toy applications to illustrate the benefits of our methods. How things have changed! It is gratifying to observe that today our techniques may actually help to improve healthcare and save lives.

D. As of 1 Sep 2013, our Deep Learning Neural Networks are the best artificial offline recognisers of Chinese characters from the ICDAR 2013 competition (3755 classes), approaching human performance [23]. This is relevant for smartphone producers who want to build phones that can translate photos of foreign texts and signs. As always in such competitions, GPU-based pure supervised gradient descent (40-year-old backprop) was applied to deep and wide multi-column networks with interleaved max-pooling and convolutional layers (multi-column GPU-MPCNNs) [8,17]. Many leading IT companies and research labs are now using this technique, too.

8. ICPR 2012 Contest on Mitosis Detection in Breast Cancer Histological Images (MITOS Aperio images). There were 129 registered companies / institutes / universities from 40 countries, and 14 results. Our team (with Alessandro & Dan) clearly won the contest (over 20% fewer errors than the second-best team). This was the first Deep Learner to win a contest on object detection in large images, the first to win a medical imaging contest, and the first to win a cancer detection contest. See ref [20], as well as the later MICCAI 2013 Grand Challenge above.

7. ISBI 2012 Segmentation of Neuronal Structures in EM Stacks Challenge. See the TrakEM2 data sets of INI. Our team won the contest on all three evaluation metrics by a large margin, with superhuman performance in terms of pixel error (March 2012) [15]. (This was the first pure image segmentation competition won by a Deep Learner; ranks 2-6 went to researchers at ETHZ, MIT, CMU, and Harvard.) This is relevant for the recent huge brain projects in Europe and the US, which try to build 3D models of real brains.

6. IJCNN 2011 on-site Traffic Sign Recognition Competition (1st rank, 2 August 2011, 0.56% error rate, the only method better than humans, who achieved 1.16% on average; 3rd place for 1.69%) [10,18]. This was the first method ever to achieve superhuman visual pattern recognition on an important benchmark (with a secret test set known only to the organisers). This is obviously relevant for self-driving cars.

5. The Chinese Handwriting Recognition Competition at ICDAR 2011 (offline). Our team won 1st and 2nd rank (CR(1): 92.18% correct; CR(10): 99.29% correct) in June 2011. This was the first pure, deep GPU-CNN to win an international pattern recognition contest - very important for all those cell phone makers who want to build smartphones that can read signs and restaurant menus in foreign languages. This attracted a lot of industry attention - it became clear that this was the way forward in computer vision.

4. INI @ Univ. Bochum's online German Traffic Sign Recognition Benchmark (the qualifying round), won through the late-night efforts of Dan & Ueli & Jonathan (1st & 2nd rank; 1.02% error rate, January 2011) [10]. More.

C. The MNIST dataset of NY University, 1998. Our team set the new record (0.35% error rate) in 2010 [6] (through plain backprop without convolution or unsupervised pre-training), tied it again in January 2011 [8], broke it again in March 2011 (0.31%) [9], and again (0.27%, ICDAR 2011) [12], and finally achieved the first human-competitive result: 0.23% [17] (the mean of many runs; many individual runs yield better results, of course, down to 0.17% [12]). This represented a dramatic improvement, since by then the MNIST record had hovered around 0.4% for almost a decade.

B. NORB object recognition dataset for stereo images, NY University, 2004. Our team set the new record on the standard set (2.53% error rate) in January 2011 [8], and achieved 2.7% on the full set [17] (best previous result by others: 5%).

A. The CIFAR-10 dataset of Univ. Toronto, 2009. Our team set the new record (19.51% error rate) on these rather challenging data in January 2011 [8], and improved this to 11.2% [17].

Three Connected Handwriting Recognition Competitions at ICDAR 2009 were won by our multi-dimensional LSTM recurrent neural networks [3,3a,4] through the efforts of Alex. This was the first RNN system ever to win an official international pattern recognition competition. To our knowledge, this was also the first Very Deep Learning system ever (recurrent or not) to win such contests:

3. ICDAR 2009 Arabic Connected Handwriting Competition of Univ. Braunschweig

2. ICDAR 2009 Handwritten Farsi/Arabic Character Recognition Competition

1. ICDAR 2009 French Connected Handwriting Competition (PDF), based on data from the RIMES campaign

Note that 1-3, C, and D are treated in more detail on the page on handwriting recognition.

Here is a 12-minute Google Tech Talk video (slides and voice only) on fast deep / recurrent nets at AGI 2011, summarizing the results as of August 2011:

People keep asking: What is the secret of your successes? There are actually two secrets:

(i) For competitions involving sequential data such as video and speech, we use our deep (stacks [2a] of) multi-dimensional [3] Long Short-Term Memory (LSTM) recurrent networks (1997) [2d,2j], trained by Connectionist Temporal Classification (CTC, 2006) [3a]. Since 2009, this is what has set records in recognising connected handwriting [4] and speech, e.g., [5].

(ii) For other competitions, we use multi-column (MC) committees [10] of GPU-MPCNNs (2011) [8], where we apply (in the style of LeCun et al 1989 & Ranzato et al 2007) efficient backpropagation (Linnainmaa 1970, Werbos 1981) to deep Neocognitron-like weight-replicating convolutional architectures (Fukushima 1979) with max-pooling (MP) layers (Weng 1992, Riesenhuber & Poggio 1999). Over two decades, LeCun's lab has invented many improvements of such CNNs. Our GPU-MPCNNs achieved the first superhuman image recognition results (2011) [18], and were the first Deep Learners to win contests in object detection (2012) and image segmentation (2012), which require fast, non-redundant MPCNN image scans [21,22]. Minimal code sketches of both approaches follow below.
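To make secret (i) concrete, here is a minimal sketch, assuming PyTorch: a deep stack of bidirectional LSTM layers whose per-frame class scores are trained with CTC, which aligns unsegmented label sequences to the input without requiring pre-segmented data. Note that the competition systems used multi-dimensional LSTM [3], while PyTorch's nn.LSTM is one-dimensional; the class name and all sizes below are illustrative, not the configurations from [2d,3a,4].

```python
# Hypothetical sketch of secret (i): a stack of bidirectional LSTM layers
# trained with Connectionist Temporal Classification (CTC). The competition
# systems used multi-dimensional LSTM; nn.LSTM is a 1-D simplification.
import torch
import torch.nn as nn

class LSTMCTCTagger(nn.Module):
    def __init__(self, n_features=32, n_hidden=128, n_layers=3, n_classes=80):
        super().__init__()
        # Deep stack of bidirectional LSTMs, as in [2a].
        self.lstm = nn.LSTM(n_features, n_hidden, num_layers=n_layers,
                            bidirectional=True)
        # Linear layer maps to n_classes labels plus one CTC blank (index 0).
        self.proj = nn.Linear(2 * n_hidden, n_classes + 1)

    def forward(self, x):            # x: (time, batch, n_features)
        h, _ = self.lstm(x)
        return self.proj(h).log_softmax(dim=-1)  # (time, batch, classes+1)

model = LSTMCTCTagger()
ctc = nn.CTCLoss(blank=0)            # CTC aligns unsegmented targets [3a]

x = torch.randn(100, 4, 32)          # 100 frames, batch of 4 sequences
targets = torch.randint(1, 81, (4, 20))          # label sequences, no blanks
input_lens = torch.full((4,), 100, dtype=torch.long)
target_lens = torch.full((4,), 20, dtype=torch.long)

loss = ctc(model(x), targets, input_lens, target_lens)
loss.backward()                      # plain gradient descent does the rest
```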
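And a minimal sketch of secret (ii), again assuming PyTorch: a single deep column alternates convolutional and max-pooling layers and is trained by plain backprop; a multi-column committee then averages the class scores of several columns [8,10]. Layer sizes are illustrative placeholders, not the competition architectures.

```python
# Hypothetical sketch of secret (ii): one GPU-MPCNN column (alternating
# convolution and max-pooling) plus a multi-column (MC) committee that
# averages the columns' outputs [8,10,17]. Sizes are illustrative only.
import torch
import torch.nn as nn

def make_column(n_classes=10):
    # One deep column: convolution / max-pooling pairs, then a classifier.
    return nn.Sequential(
        nn.Conv2d(1, 32, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=5), nn.Tanh(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(64 * 4 * 4, 200), nn.Tanh(),
        nn.Linear(200, n_classes),
    )

class MultiColumnCommittee(nn.Module):
    def __init__(self, n_columns=5, n_classes=10):
        super().__init__()
        # Each column is trained separately, typically on differently
        # preprocessed/distorted inputs; here they share one input for brevity.
        self.columns = nn.ModuleList(
            [make_column(n_classes) for _ in range(n_columns)])

    def forward(self, x):            # x: (batch, 1, 28, 28), MNIST-sized
        # The committee prediction is the average of the columns' scores.
        return torch.stack([c(x) for c in self.columns]).mean(dim=0)

committee = MultiColumnCommittee()
scores = committee(torch.randn(8, 1, 28, 28))    # (8, 10) class scores
```

The committee averaging is what the "multi-column" in MC GPU-MPCNN refers to: independently trained columns make partly independent errors, so their mean is more reliable than any single column.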
Inaugural tweet: in 2020, we will celebrate that many of the basic ideas behind the Deep Learning Revolution were published three decades ago, within fewer than 12 months, in our "Annus Mirabilis" 1990-1991. For more information on how we have built on the work of earlier pioneers since the 1960s, see www.deeplearning.me and especially this historical survey.

Deep recurrent networks can also reinforcement-learn through Compressed Network Search: see this youtube video of a talk at CogX, London, 2018 (compare the old slides from 2014 and a still more popular TEDx talk of 2017).

Check out our Brainstorm Open Source Software for Neural Networks.

Microsoft dominated the ImageNet 2015 contest through a deep feedforward LSTM without gates, a special case of our Highway Networks ([28], May 2015), the first very deep feedforward networks with hundreds of layers (a minimal sketch of a highway layer appears below).

The 2016 IEEE CIS Neural Networks Pioneer Award was awarded to JS for "pioneering contributions to deep learning and neural networks."

Our deep learning methods have transformed machine learning and Artificial Intelligence (AI), and are now available to billions of users through the world's five most valuable public companies: Apple (#1 as of March 31, 2017), Google (Alphabet, #2), Microsoft (#3), Facebook (#4), and Amazon (#5).
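As referenced above, here is a minimal sketch of a single highway layer [28], assuming PyTorch and illustrative sizes: a transform gate T(x) mixes a nonlinear transform H(x) with the unaltered input, y = H(x) * T(x) + x * (1 - T(x)); with the gate removed so both paths pass straight through, this reduces to the residual form y = H(x) + x. The class name and dimensions are hypothetical.

```python
# Hypothetical sketch of a highway layer [28]:
#   y = H(x) * T(x) + x * (1 - T(x)).
# Removing the gate so both paths are fully open recovers y = H(x) + x,
# the residual form used in the ImageNet 2015 winning networks.
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.H = nn.Linear(dim, dim)        # transform path
        self.T = nn.Linear(dim, dim)        # gate path
        # Bias the gate negative so early in training the layer mostly
        # carries x through unchanged, easing optimisation of deep stacks.
        nn.init.constant_(self.T.bias, -2.0)

    def forward(self, x):
        t = torch.sigmoid(self.T(x))
        return torch.tanh(self.H(x)) * t + x * (1.0 - t)

# A very deep stack of highway layers remains trainable by plain backprop.
stack = nn.Sequential(*[HighwayLayer(64) for _ in range(100)])
y = stack(torch.randn(16, 64))              # (16, 64) output
```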