sification using the Fashion MNIST (M = 10) dataset ³. This dataset has N = 60000
training data points and 10000 test examples, each of which is a grayscale image
with F = 28 × 28 = 784 features (pixels). The dataset is loaded and pre-processed using the
following code.
Fig: 1
Fig: 2
Fig: 3
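As a rough illustration of the loading and pre-processing step described above, the following is a minimal sketch rather than the original listing. It assumes that the data are obtained through the Keras `fashion_mnist` loader, that pixel values are scaled to [0, 1], and that labels are one-hot encoded into M = 10 columns. The function names `preprocess` and `load_fashion_mnist` are hypothetical and chosen only for this example.

```python
import numpy as np

NUM_CLASSES = 10        # M in the text
NUM_FEATURES = 28 * 28  # F = 784 in the text


def preprocess(images, labels, num_classes=NUM_CLASSES):
    """Flatten images to F-dimensional vectors and one-hot encode labels.

    Assumption: scaling to [0, 1] and one-hot labels are illustrative
    choices; the original pre-processing may differ.
    """
    images = np.asarray(images)
    labels = np.asarray(labels)
    # Flatten each 28 x 28 image into a row of F = 784 features, scaled to [0, 1].
    x = images.reshape(len(images), -1).astype(np.float32) / 255.0
    # Row i of the identity matrix is the one-hot vector for class i.
    y = np.eye(num_classes, dtype=np.float32)[labels]
    return x, y


def load_fashion_mnist():
    """Load and pre-process the train and test splits (hypothetical helper)."""
    # Assumption: TensorFlow is installed; the data are downloaded on first use.
    from tensorflow.keras.datasets import fashion_mnist

    (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
    return preprocess(x_train, y_train), preprocess(x_test, y_test)
```

Under these assumptions, `load_fashion_mnist()` should return a training feature matrix of shape (N, F) = (60000, 784) and a test feature matrix of shape (10000, 784), each paired with a label matrix that has M = 10 columns.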