MNIST database

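The MNIST database (Modified National Institute of Standards and Technology database^[1]) is a large database of handwritten digits that is commonly used for training various image processing systems^[2]^[3]. It was created by modifying samples from NIST's original datasets and contains 60,000 training images and 10,000 test images of handwritten digits, each 28×28 pixels in size. The database is also widely used for training and testing in the field of machine learning^[4]^[5]. When MNIST was released, a support vector machine achieved an error rate of 0.8% on it; later work using deep learning techniques improved on this result considerably.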

History

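The MNIST database was created by "re-mixing" samples from NIST's original datasets^[6]. The creators felt that, because NIST's training data was collected from American Census Bureau employees while the test data was collected from American high school students, the original split was not well suited to machine learning experiments^[7]. In addition, the black-and-white NIST images were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels^[7].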
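To illustrate why anti-aliasing turns a black-and-white image into a grayscale one, here is a minimal sketch that shrinks a bilevel image by averaging pixel blocks. Block averaging is an assumption chosen for simplicity; the source does not say which anti-aliasing method the MNIST creators used, and the function name `downsample_with_antialiasing` is hypothetical.

```python
def downsample_with_antialiasing(binary_image, factor):
    """Shrink a 0/1 image by averaging each factor x factor block.

    Block averaging is one simple anti-aliasing scheme (an assumption,
    not necessarily the method used for MNIST). Outputs lie in [0, 1].
    """
    height = len(binary_image)
    width = len(binary_image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Sum the ink pixels inside the current block.
            block_sum = sum(
                binary_image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            )
            row.append(block_sum / (factor * factor))
        out.append(row)
    return out


# A 4x4 bilevel image containing a diagonal stroke.
stroke = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
small = downsample_with_antialiasing(stroke, 2)
print(small)  # [[0.75, 0.0], [0.25, 0.75]]
```

The input holds only the values 0 and 1, while the shrunken output holds intermediate values such as 0.25 and 0.75; these are the gray levels that the normalization step introduces.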

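The MNIST database contains 60,000 training images and 10,000 test images^[8]. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of each was taken from NIST's test dataset^[9]. The original creators of the database keep a list of some of the methods tested on it^[7]. In their original paper, they used a support vector machine to obtain an error rate of 0.8%^[10]. However, the original MNIST dataset contains at least 4 incorrect labels^[11].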
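The half-and-half composition described above can be sketched as follows. This is only an illustration of the mixing idea under stated assumptions: the function name `remix`, the random selection, and the toy sample tuples are all hypothetical, and the real procedure for choosing which NIST samples went where is not described in this article. The sketch also assumes even set sizes.

```python
import random


def remix(nist_train, nist_test, train_size, test_size, seed=0):
    """Build new train/test sets, each drawn half from either NIST source.

    Random selection is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    half_train, half_test = train_size // 2, test_size // 2
    needed = half_train + half_test
    if len(nist_train) < needed or len(nist_test) < needed:
        raise ValueError("not enough source samples for the requested sizes")
    pool_a = list(nist_train)
    pool_b = list(nist_test)
    rng.shuffle(pool_a)
    rng.shuffle(pool_b)
    # Disjoint slices guarantee that no sample lands in both new sets.
    train = pool_a[:half_train] + pool_b[:half_train]
    test = pool_a[half_train:needed] + pool_b[half_train:needed]
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Toy stand-ins for the two NIST sources: (origin, index) pairs.
source_train = [("nist-train", i) for i in range(100)]
source_test = [("nist-test", i) for i in range(100)]
new_train, new_test = remix(source_train, source_test, train_size=60, test_size=10)
```

With the real sizes, the same split would put 30,000 samples from each NIST source into the training set and 5,000 from each into the test set.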

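Extended MNIST (EMNIST) is a newer dataset developed and released by NIST as the (eventual) successor to MNIST^[12]^[13]. MNIST contains only images of handwritten digits, whereas EMNIST includes all the images from NIST Special Database 19, a large database of handwritten uppercase and lowercase letters as well as digits^[14]^[15].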

Performance

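Some researchers have achieved "near-human performance" on the MNIST database using artificial neural networks^[16]. The highest error rate listed on the original website of the database is 12%, which was obtained with a simple linear classifier and no preprocessing^[10]^[7].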
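For readers unfamiliar with the terms, the following sketch shows what a linear classifier computes and how an error rate is measured. The weights, inputs, and function names are made up for illustration and have nothing to do with the classifier listed on the MNIST website; a real MNIST model would use 784 pixel inputs and 10 classes.

```python
def predict_linear(weights, biases, pixels):
    """Return the index of the class with the largest score w . x + b."""
    scores = [
        sum(w * x for w, x in zip(class_weights, pixels)) + bias
        for class_weights, bias in zip(weights, biases)
    ]
    # On a tie, max() keeps the first (lowest-index) class.
    return max(range(len(scores)), key=scores.__getitem__)


def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have equal length")
    wrong = sum(p != t for p, t in zip(predictions, labels))
    return wrong / len(labels)


# Toy problem: 2 classes, 3 "pixels" per sample (hypothetical values).
toy_weights = [[1, 0, -1], [-1, 0, 1]]
toy_biases = [0, 0]
toy_samples = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
toy_labels = [0, 1, 1]

toy_predictions = [predict_linear(toy_weights, toy_biases, s) for s in toy_samples]
toy_error = error_rate(toy_predictions, toy_labels)  # 1 of 3 samples is wrong
```

On the 10,000-image MNIST test set, an error rate of 12% corresponds to 1,200 misclassified images, and the 0.8% support vector machine result corresponds to 80.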

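In 2004, researchers achieved a best-case error rate of 0.42% on the database using LIRA, a three-layer neural classifier based on Rosenblatt's perceptron principles^[17].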

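Some researchers have tested artificial intelligence systems on versions of the MNIST database that have been subjected to random distortions. The systems in these cases are usually artificial neural networks, and the distortions used may be either affine distortions or elastic distortions^[7]. In some cases these systems can be very successful; one of them achieved an error rate of 0.39% on the database^[18].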
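As a rough illustration of an affine distortion, the sketch below warps a small image with a random rotation about its center plus a random shift. Everything here is an assumption made for the example: the function names, the parameter ranges, the nearest-neighbor sampling, and the restriction to square images. The cited systems may have used different transforms and interpolation, and elastic distortions are not covered.

```python
import math
import random


def affine_warp(image, matrix, offset, fill=0):
    """Warp a 2-D image by inverse mapping with nearest-neighbor sampling.

    Output pixel (row, col) is read from source position
    matrix * (row, col) + offset; positions outside the image get `fill`.
    """
    height, width = len(image), len(image[0])
    (a, b), (c, d) = matrix
    warped = []
    for row in range(height):
        warped_row = []
        for col in range(width):
            src_row = round(a * row + b * col + offset[0])
            src_col = round(c * row + d * col + offset[1])
            inside = 0 <= src_row < height and 0 <= src_col < width
            warped_row.append(image[src_row][src_col] if inside else fill)
        warped.append(warped_row)
    return warped


def random_affine_params(rng, size, max_angle_deg=10.0, max_shift=2.0):
    """Draw a small rotation about the center of a size x size image plus a shift."""
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    center = (size - 1) / 2
    # Choose the offset so that the image center maps to itself before shifting.
    offset = (
        center - (cos_a * center - sin_a * center) + rng.uniform(-max_shift, max_shift),
        center - (sin_a * center + cos_a * center) + rng.uniform(-max_shift, max_shift),
    )
    return ((cos_a, -sin_a), (sin_a, cos_a)), offset


bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
# Identity matrix with a column offset of 1 moves the bar one pixel left.
shifted = affine_warp(bar, ((1, 0), (0, 1)), (0, 1))
matrix, offset = random_affine_params(random.Random(0), size=3)
distorted = affine_warp(bar, matrix, offset)
```

Applying a freshly drawn transform to each training image every time it is used is one common way to turn a fixed training set into a much larger effective one.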

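In 2011, researchers using a similar neural network system reported an error rate of 0.27%, improving on the previous best result^[19]. In 2013, an approach based on regularizing neural networks with DropConnect was claimed to achieve an error rate of 0.21%^[20]. In 2016, the best performance of a single convolutional neural network on MNIST was an error rate of 0.25%^[21]. As of August 2018, the best performance of a single convolutional neural network trained on the MNIST training data without data augmentation was still an error rate of 0.25%^[21]^[22]. In addition, the Parallel Computing Center in Khmelnytskyi, Ukraine, obtained an error rate of 0.21% on the MNIST database with an ensemble of only 5 convolutional neural networks^[23]^[24].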
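One simple way to combine the outputs of several networks into an ensemble prediction is a per-image vote, sketched below. This article does not say how the five networks in the cited ensemble were combined, so the voting rule, the tie-breaking rule, and the function name `majority_vote` are all assumptions, and the predictions are made-up values.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by plurality vote for each sample.

    Ties go to the tied label predicted by the earliest model in the list
    (an arbitrary rule chosen for this sketch).
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        best = max(counts.values())
        winner = next(label for label in votes if counts[label] == best)
        combined.append(winner)
    return combined


# Hypothetical digit predictions from 5 models on 4 test images.
model_outputs = [
    [3, 8, 1, 7],
    [3, 3, 1, 9],
    [5, 8, 1, 7],
    [3, 8, 7, 9],
    [3, 8, 1, 4],
]
ensemble = majority_vote(model_outputs)
print(ensemble)  # [3, 8, 1, 7]
```

The vote can correct an individual model's mistake whenever most of the other models get that image right, which is why an ensemble may reach a lower error rate than any single member.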

See also

List of datasets for machine learning research
Caltech 101
LabelMe
Optical character recognition

References

[1] THE MNIST DATABASE of handwritten digits. Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond.
[2] Support vector machines speed pattern recognition. Vision Systems Design. Retrieved 2013-08-17.
[3] Gangaputra, Sachin. Handwritten digit database. Retrieved 2013-08-17.
[4] Qiao, Yu. THE MNIST DATABASE of handwritten digits. 2007. Retrieved 2013-08-18. Archived from the original on 2018-02-11.
[5] Platt, John C. Using analytic QP and sparseness to speed training of support vector machines (PDF). Advances in Neural Information Processing Systems. 1999: 557–563. Retrieved 2013-08-18. Archived from the original (PDF) on 2016-03-04.
[6] Grother, Patrick J. NIST Special Database 19 - Handprinted Forms and Characters Database (PDF). National Institute of Standards and Technology.
[7] LeCun, Yann; Cortes, Corinna; Burges, Christopher J.C. The MNIST Handwritten Digit Database. Yann LeCun's website, yann.lecun.com. Retrieved 2020-04-30.
[8] Kussul, Ernst; Baidyk, Tatiana. Improved method of handwritten digit recognition tested on MNIST database. Image and Vision Computing. 2004, 22 (12): 971–981. doi:10.1016/j.imavis.2004.03.008.
[9] Zhang, Bin; Srihari, Sargur N. Fast k-Nearest Neighbor Classification Using Cluster-Based Trees (PDF). IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004, 26 (4): 525–528. Retrieved 2020-04-20. PMID 15382657. doi:10.1109/TPAMI.2004.1265868. Archived from the original (PDF) on 2021-07-25.
[10] LeCun, Yann; Bottou, Léon; Bengio, Yoshua; Haffner, Patrick. Gradient-Based Learning Applied to Document Recognition (PDF). Proceedings of the IEEE. 1998, 86 (11): 2278–2324. Retrieved 2013-08-18. doi:10.1109/5.726791.
[11] Muller, Nicolas M.; Markert, Karla. Identifying Mislabeled Instances in Classification Datasets. 2019 International Joint Conference on Neural Networks (IJCNN). IEEE. July 2019: 1–8. ISBN 978-1-7281-1985-4. arXiv:1912.05283. doi:10.1109/IJCNN.2019.8851920.
[12] NIST. The EMNIST Dataset. NIST. 2017-04-04. Retrieved 2022-04-11.
[13] NIST. NIST Special Database 19. NIST. 2010-08-27. Retrieved 2022-04-11.
[14] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373 [cs.CV].
[15] Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: an extension of MNIST to handwritten letters. 2017. arXiv:1702.05373v1 [cs.CV].
[16] Cireșan, Dan; Meier, Ueli; Schmidhuber, Jürgen. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012: 3642–3649. CiteSeerX 10.1.1.300.3283. ISBN 978-1-4673-1228-8. S2CID 2161592. arXiv:1202.2745. doi:10.1109/CVPR.2012.6248110.
[17] Kussul, Ernst; Baidyk, Tatiana. Improved method of handwritten digit recognition tested on MNIST database (PDF). Image and Vision Computing. 2004, 22 (12): 971–981. Retrieved 2013-09-20. doi:10.1016/j.imavis.2004.03.008. Archived from the original (PDF) on 2013-09-21.
[18] Ranzato, Marc'Aurelio; Poultney, Christopher; Chopra, Sumit; LeCun, Yann. Efficient Learning of Sparse Representations with an Energy-Based Model (PDF). Advances in Neural Information Processing Systems. 2006, 19: 1137–1144. Retrieved 2013-09-20.
[19] Cireșan, Dan Claudiu; Meier, Ueli; Gambardella, Luca Maria; Schmidhuber, Jürgen. Convolutional neural network committees for handwritten character classification (PDF). 2011 International Conference on Document Analysis and Recognition (ICDAR). 2011: 1135–1139. Retrieved 2013-09-20. CiteSeerX 10.1.1.465.2138. ISBN 978-1-4577-1350-7. S2CID 10122297. doi:10.1109/ICDAR.2011.229. Archived from the original (PDF) on 2016-02-22.
[20] Wan, Li; Zeiler, Matthew; Zhang, Sixin; LeCun, Yann; Fergus, Rob. Regularization of Neural Network using DropConnect. International Conference on Machine Learning (ICML). 2013.
[21] SimpleNet. Lets Keep it simple, Using simple architectures to outperform deeper and more complex architectures. 2016. Retrieved 2020-12-03. arXiv:1608.06037.
[22] SimpNet. Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet. Github. 2018. Retrieved 2020-12-03. arXiv:1802.06205.
[23] Romanuke, Vadim. Parallel Computing Center (Khmelnytskyi, Ukraine) represents an ensemble of 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate. Retrieved 2016-11-24.
[24] Romanuke, Vadim. Training data expansion and boosting of convolutional neural networks for reducing the MNIST dataset error rate. Research Bulletin of NTUU "Kyiv Polytechnic Institute". 2016, 6 (6): 29–34. doi:10.20535/1810-0546.2016.6.84115.

Further reading

Ciresan, Dan; Meier, Ueli; Schmidhuber, Jürgen. Multi-column deep neural networks for image classification (PDF). 2012 IEEE Conference on Computer Vision and Pattern Recognition. New York, NY: Institute of Electrical and Electronics Engineers. June 2012: 3642–3649 [2013-12-09]. CiteSeerX 10.1.1.300.3283  . ISBN 9781467312264. OCLC 812295155. S2CID 2161592. arXiv:1202.2745  . doi:10.1109/CVPR.2012.6248110.
