A Review of the MNIST Dataset and Its Variations
MNIST, short for Modified National Institute of Standards and Technology, is a dataset consisting of images showing handwritten digits from 0 to 9 (both inclusive).
You can find the link to the official dataset here – MNIST
You can find the link to the dataset on Kaggle here – Kaggle MNIST
Type of Data: Image (.png and .jpg are available)
Number of Training Images: 60,000
Number of Testing Images: 10,000
Size of Image: 28 x 28 pixels
Image Color: Grayscale
The above is a sample of the handwritten digits from the MNIST dataset available on Wikipedia.
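To make the properties listed above concrete, here is a minimal sketch of reading MNIST images from raw bytes. It assumes the IDX layout used by the official distribution (a 16-byte header of four big-endian integers: magic number 2051, image count, rows, columns, followed by one unsigned byte per pixel) and that the file has already been decompressed. The function name `parse_idx_images` and the synthetic two-image file are made up for illustration; the PNG/JPG and Kaggle versions are laid out differently.

```python
import struct

def parse_idx_images(data: bytes) -> list:
    """Decode IDX image data into a list of images (each a list of pixel rows)."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 0x00000803:  # 2051 = unsigned-byte data with 3 dimensions
        raise ValueError(f"unexpected magic number: {magic:#x}")
    pixels = data[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("file length does not match the header")
    images = []
    for i in range(count):
        start = i * rows * cols
        image = [
            list(pixels[start + r * cols : start + (r + 1) * cols])
            for r in range(rows)
        ]
        images.append(image)
    return images

# Synthetic stand-in for a real file: two 28 x 28 images,
# one filled with 0s and one filled with 255s.
header = struct.pack(">IIII", 0x00000803, 2, 28, 28)
fake_file = header + bytes([0]) * 784 + bytes([255]) * 784

images = parse_idx_images(fake_file)
print(len(images), len(images[0]), len(images[0][0]))  # 2 28 28
```

Each pixel is a single grayscale intensity between 0 and 255, so one 28 x 28 image is just 784 numbers.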
MNIST is one of the best-known datasets in the field of Machine Learning and Image Processing. It is a standard benchmark and has inspired the compilation of several similar image datasets, which I’ll discuss in the coming sections.
As the name suggests, this dataset was created by modifying the original datasets from NIST. The need for modification arose because the NIST training and testing data came from different sources: the training data was collected from United States Census Bureau employees, while the testing data was collected from American high school students. A model trained on one group of writers and tested on the other would not give a fair picture of how well it had learned.
To fix this, the two sources were mixed. The 60,000 MNIST training images combine 30,000 images from the NIST training set with 30,000 images from the NIST testing set. Similarly, the 10,000 MNIST testing images combine 5,000 images from each of the two NIST sets.
The MNIST dataset is used to train a model that takes an array of pixel data as input and outputs the corresponding digit. Any multi-class classification algorithm can be used, but since the data consists of images, a CNN is the natural choice. A CNN, or Convolutional Neural Network, is a deep learning architecture that works especially well on images. This article of ours is a good read if you want to dive a little deeper: CNNs
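As a rough illustration of "pixels in, digit out", here is a sketch of the forward pass of a very small CNN: one 3 x 3 convolution, a ReLU, 2 x 2 max pooling and a dense layer with softmax over the 10 digits. Everything in it is an assumption made for the example: the weights are random rather than trained, the input is a random stand-in for a scaled MNIST image, and the function names are hypothetical. It uses only the standard library so the data flow is easy to follow; in practice you would build and train this with a deep learning framework.

```python
import math
import random

def conv2d(image, kernel):
    """'Valid' 2D convolution (no padding), computed as cross-correlation."""
    k = len(kernel)
    out = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out)
        ]
        for r in range(out)
    ]

def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2 x 2 window."""
    n = len(fmap) // 2
    return [
        [max(fmap[2 * r][2 * c], fmap[2 * r][2 * c + 1],
             fmap[2 * r + 1][2 * c], fmap[2 * r + 1][2 * c + 1])
         for c in range(n)]
        for r in range(n)
    ]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(image, kernel, weights, biases):
    """Return (predicted digit, class probabilities) for one 28 x 28 image."""
    features = max_pool2x2(relu(conv2d(image, kernel)))  # 28x28 -> 26x26 -> 13x13
    flat = [v for row in features for v in row]           # 169 values
    scores = [sum(w * x for w, x in zip(ws, flat)) + b
              for ws, b in zip(weights, biases)]
    probs = softmax(scores)
    return probs.index(max(probs)), probs

rng = random.Random(0)
image = [[rng.random() for _ in range(28)] for _ in range(28)]   # stand-in image
kernel = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(169)] for _ in range(10)]
biases = [0.0] * 10

digit, probs = predict(image, kernel, weights, biases)
print("predicted digit:", digit)
```

With untrained weights the prediction is meaningless. Training means adjusting `kernel`, `weights` and `biases` until the highest probability lands on the correct digit for most of the 60,000 training images.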
There are many variations of the MNIST dataset, and several of them were created as replacements for the original. Some machine learning researchers have reservations about MNIST because it is too easy: even very simple models, such as linear classifiers, reach around 90% accuracy on it, which makes it a poor indicator of how complex real-life data is. In attempts to overcome this limitation, or simply to contribute to the field of Machine Learning, many variants have popped up. Here are a few of them.
- Fashion-MNIST – As the name indicates, this dataset is about “fashion”. Its images have the same properties as MNIST (28 x 28 pixels, grayscale, 10 classes), with one significant difference: they show items of clothing and accessories instead of digits. Some of the 10 categories are Trouser, Coat, Sandal, Bag, Ankle Boot and Pullover.
Here’s the link to the dataset on Kaggle – Fashion MNIST.
- Chinese MNIST – This dataset contains images of 15 Chinese characters used to write numbers. Unlike MNIST, it has only 15,000 images in total: 100 volunteers each wrote the 15 characters 10 times (100 x 15 x 10 = 15,000). The images in the Kaggle version are 64 x 64 pixels; the original handwritten sheets were scanned at a resolution of 300 x 300.
The link to the original dataset can be found here – Chinese MNIST. The link to the Kaggle version of the dataset can be found here – Kaggle Chinese MNIST.
- Sign Language MNIST – This dataset contains 24 categories of images, each showing the American Sign Language hand gesture for one letter of the English alphabet. J and Z are excluded because signing them requires motion, which a still image cannot capture. There are 27,455 training images and 7,172 testing images, each 28 x 28 pixels in size.
The Kaggle link to the dataset can be found here – Kaggle Sign Language MNIST.
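For two of the variants above, the numeric labels need a lookup before they mean anything to a human. The sketch below shows one way to decode them. It assumes the label conventions given in the dataset descriptions: Fashion-MNIST uses labels 0 to 9 in the order listed, and Sign Language MNIST uses labels 0 to 25 for A to Z, with 9 (J) and 25 (Z) never appearing. The helper `decode_label` and its dataset keys are hypothetical names; check the conventions against the copy of the data you download.

```python
import string

# Fashion-MNIST: label i maps to the i-th class name (assumed order).
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Sign Language MNIST: labels follow A=0 ... Z=25, but J and Z are
# skipped because their gestures involve motion.
SIGN_LANGUAGE_CLASSES = {
    index: letter
    for index, letter in enumerate(string.ascii_uppercase)
    if letter not in ("J", "Z")
}

def decode_label(dataset: str, label: int) -> str:
    """Translate a numeric label into a readable class name."""
    if dataset == "fashion":
        return FASHION_MNIST_CLASSES[label]
    if dataset == "sign":
        return SIGN_LANGUAGE_CLASSES[label]
    raise ValueError(f"unknown dataset: {dataset}")

print(decode_label("fashion", 9))   # Ankle boot
print(decode_label("sign", 24))     # Y
print(len(SIGN_LANGUAGE_CLASSES))   # 24
```

Because of the gap at J, a Sign Language MNIST classifier either needs 25 output units with one unused or a remapping of the labels to a contiguous 0 to 23 range.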