handwritten Arabic letters. There are two files: the training data and the test data. The
code is available, and we need to optimize it so that under box number 6, where the
model is cross-validated, the accuracy is in the high 80s or low 90s. We should tune
the hyperparameters and improve the pipeline as needed. Anything from scikit-learn
may be used, but nothing else.
With the code as it is, the model accuracy is 79%.
The goal is to modify the code so that the model accuracy reaches the high 80s or
low 90s.
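As a rough illustration of how the cross-validated accuracy could be measured, here is a minimal sketch. It is not the existing code from the notebook. The bundled scikit-learn digits data is only a stand-in (an assumption) for the Arabic letter files, and the scaler plus SVC pipeline and its parameter values are hypothetical choices.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): small 8x8 digit images shipped with scikit-learn.
# In the real task, X and y would come from the training file instead.
X, y = load_digits(return_X_y=True)

# Hypothetical pipeline: scale the pixel values, then fit an RBF-kernel SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))

# Stratified folds keep the class proportions the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")

mean_accuracy = scores.mean()
print(f"CV accuracy: {mean_accuracy:.3f} +/- {scores.std():.3f}")
```

Putting the scaler inside the pipeline means it is refit on each training fold, so the held-out fold does not leak into preprocessing.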
Box 3 of the code contains the hyperparameters that need to be tuned and the
pipeline that might need to be modified. A voting model can be used to get higher
accuracy.
We need to improve the model accuracy over the existing code.
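To show what a voting model could look like using only scikit-learn, here is a minimal sketch. The three base models and their settings are assumptions chosen for illustration, not the contents of box 3, and the bundled digits data again stands in for the Arabic letter files.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): replace with the real training file.
X, y = load_digits(return_X_y=True)

# Hypothetical ensemble: three different model families voting together.
# Soft voting averages predicted class probabilities, so SVC needs probability=True.
voter = VotingClassifier(
    estimators=[
        ("svc", make_pipeline(StandardScaler(),
                              SVC(C=10, gamma="scale", probability=True, random_state=0))),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))),
    ],
    voting="soft",
)

vote_scores = cross_val_score(voter, X, y, cv=3, scoring="accuracy")
print(f"Voting CV accuracy: {vote_scores.mean():.3f}")
```

A voting model tends to help most when the base models make different kinds of mistakes, so it is worth comparing its score against each base model on its own.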
Info about the dataset: The dataset is composed of 16,800 characters written by 60
participants aged between 19 and 40, 90% of whom are right-handed. Each participant
wrote each character (from 'alef' to 'yeh') ten times on two forms. The forms were
scanned at a resolution of 300 dpi. The dataset is partitioned into two sets: a
training set (13,440 characters, 480 images per class) and a test set (3,360
characters, 120 images per class). The writers of the training set and the test set
are exclusive. The order in which writers were assigned to the test set was
randomized so that the test-set writers did not all come from a single institution
(to ensure variability of the test set).
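The numbers above can be cross-checked with a little arithmetic. This sketch assumes 28 classes (the 28 letters of the Arabic alphabet from 'alef' to 'yeh'), which the description implies but does not state outright.

```python
n_writers = 60
n_classes = 28          # assumption: 28 letters, 'alef' to 'yeh'
reps_per_writer = 10    # each participant wrote each character ten times

total = n_writers * n_classes * reps_per_writer
train_total = n_classes * 480   # 480 training images per class
test_total = n_classes * 120    # 120 test images per class

# Writers are exclusive to one set, so images per class / repetitions = writers.
train_writers = 480 // reps_per_writer
test_writers = 120 // reps_per_writer

print(total, train_total, test_total)   # 16800 13440 3360
print(train_writers, test_writers)      # 48 12
```

Under that assumption the figures are consistent: 48 writers contribute to the training set and 12 to the test set, an 80/20 split by writer.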
The code: This is a machine learning model in Python using scikit-learn to classify
handwritten Arabic letters. There are two files: the training data and the test data.
The code is available, and we need to optimize it so that under box number 6, where
the model is cross-validated, the accuracy is in the high 80s or low 90s. We should
tune the hyperparameters and improve the pipeline as needed. Anything from
scikit-learn may be used, but nothing else.
A voting model can be used to improve accuracy.
Goal: build an image classifier for handwritten Arabic characters using
scikit-learn. The model accuracy has to be in the high 80s, like 89%, or the low 90s,
like 92%.
This is all about tuning the hyperparameters and the model pipeline.
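One way to tune the hyperparameters and the pipeline together is a grid search over the pipeline steps. The sketch below is only an illustration: the steps (scaling, PCA, SVC), the grid values and the stand-in digits data are all assumptions, not the setup in the existing code.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): replace with the real training file.
X, y = load_digits(return_X_y=True)

# Hypothetical pipeline: scale, reduce dimensionality, then classify.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(random_state=0)),
    ("clf", SVC()),
])

# Step parameters are addressed as "<step name>__<parameter>".
param_grid = {
    "pca__n_components": [20, 40],
    "clf__C": [1, 10],
    "clf__gamma": ["scale", 0.01],
}

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
```

For larger grids, RandomizedSearchCV from the same module samples a fixed number of combinations and is usually cheaper than trying every one.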