review categorizer. Your classifier must be trainable to classify
reviews into one of two classes: positive and negative.
A description of the data set can be found in the readme file. Note
that we use only the test set because the full data set is huge. This
test set contains 400k data points.
a. The data set can be found on Canvas
b. Use the TfidfVectorizer from the scikit-learn library in Python to
vectorize the data set
c. Use GaussianNB for the classifier
d. Calculate the accuracy of the model. Partition the given data set
into a training set and a test set, train on the former, and measure
accuracy on the latter.
e. Input a sample text and determine the class of the text provided