build a text classifier on the imdb sentiment classification dataset y
Build a text classifier on the IMDB sentiment classification dataset, you can use any classification method, but you must training your model on the first 40000 instances and testing your model on the last 10000 instances. The IMDB dataset will be uploaded on the moodle page for you to download.
Your code should include:
1: Read the file, incorporate the instances into the training set and testing set.
2: Pre-processing the text, you can choose whether you need stemming, removing stop words, removing non-alphabetical words. (Not all classification models need this step, it is OK if you think your model can perform better without this step, and you can give some justification in the report.)
3: Analysing the feature of the training set, report the linguistic features of the training dataset.
4: Build a text classification model, train your model on the training set and test your model on the test set.
5: Summarize the performance of your model (You can gain additional marks if you have some graph visualization).
6: (Optional) You can speculate how you can improve your works based on your proposed model.
*The amount will be in form of wallet points that you can redeem to pay upto 10% of the price for any assignment. **Use of solution provided by us for unfair practice like cheating will result in action from our end which may include permanent termination of the defaulter’s account.Disclaimer:The website contains certain images which are not owned by the company/ website. Such images are used for indicative purposes only and is a third-party content. All credits go to its rightful owner including its copyright owner. It is also clarified that the use of any photograph on the website including the use of any photograph of any educational institute/ university is not intended to suggest any association, relationship, or sponsorship whatsoever between the company and the said educational institute/ university. Any such use is for representative purposes only and all intellectual property rights belong to the respective owners.