
SUSS
SINGAPORE UNIVERSITY OF SOCIAL SCIENCES

AIB551
End-of-Course Assessment - January Semester 2024
Natural Language Processing

INSTRUCTIONS TO STUDENTS:
1. This End-of-Course Assessment paper comprises 9 pages (including the cover page).
2. You are to include the following particulars in your submission: Course Code, Title of the ECA, SUSS PI No., Your Name, and Submission Date.
3. Late submission will be subjected to the marks deduction scheme. Please refer to the Student Handbook for details.

AIB551 Copyright © 2024 Singapore University of Social Sciences (SUSS) ECA - January Semester 2024 Page 1 of 9

ECA Submission Guidelines

Please follow the submission instructions stated below:

A - What Must Be Submitted
You are required to submit the following TWO (2) items for marking and grading:
• A Report
• A .Zip File that contains the dataset used for the report and code file used for data analysis
Please verify your submissions after you have submitted the above TWO (2) items.

B - Submission Deadline
• The TWO (2) items of Report and .Zip File are to be submitted by 12 noon on the submission deadline.
• You are allowed multiple submissions till the cut-off date for each of the TWO (2) items.
• Late submission of any of the TWO (2) items will be subjected to mark-deduction scheme by the University. Please refer to Section 5.2 Para 2.4 of the Student Handbook.

C - How the TWO (2) Items Should Be Submitted
• The Report: submit online to Canvas via TurnItIn (for plagiarism detection) under the ECA submission link
• The .Zip File that contains the dataset and code file:
  ○ Zip the dataset and code file
  ○ Submit the .Zip file online to Canvas via the -ECA Zip File submission link

D - Additional guidelines on file formatting are given as follows:
1. Report
• Please ensure that your Microsoft Word document is generated by Microsoft Word 2016 or higher.
• The report must be saved in .docx format.
2. .Zip File
• The dataset must be saved in .csv or .json format.
• The dataset must be included in the .zip file.
• The code file must be saved in the required format.
• You are to include the following particulars in your submission: Course Code, Title of the ECA, SUSS PI No., Your Name, and Submission Date.

E - Please be Aware of the Following:
Submission in hardcopy or any other means not given in the above guidelines will not be accepted. You do not need to submit any other forms or cover sheets (e.g. form ET3) with your ECA.

You are reminded that electronic transmission is not immediate. The network traffic may be particularly heavy on the date of submission deadline and connections to the system cannot be guaranteed. Hence, you are advised to submit your work early. Canvas will allow you to submit your work late but your work will be subjected to the mark-deduction scheme. You should therefore not jeopardise your course result by submitting your ECA at the last minute.

It is your responsibility to check and ensure that your files are successfully submitted to Canvas.

F - Plagiarism and Collusion
Plagiarism and collusion are forms of cheating and are not acceptable in any form in a student's work, including this ECA. Plagiarism and collusion are taking work done by others or work done together with others respectively and passing it off as your own. You can avoid plagiarism by giving appropriate references when you use other people's ideas, words or pictures (including diagrams). Refer to the APA Manual if you need reminding about quoting and referencing. You can avoid collusion by ensuring that your submission is based on your own individual effort.

The electronic submission of your ECA will be screened by plagiarism detection software. For more information about plagiarism and collusion, you should refer to the Student Handbook (Section 5.2.1.3).
You are reminded that SUSS takes a tough stance against plagiarism or collusion. Serious cases will normally result in the student being referred to SUSS's Student Disciplinary Group. For other cases, significant mark penalties or expulsion from the course will be imposed.

G - Use of Generative AI Tools (Allowed)
The use of generative AI tools is allowed for this assignment.
• You are expected to provide proper attribution if you use generative AI tools while completing the assignment, including appropriate and discipline-specific citation, a table detailing the name of the AI tool used, the approach to using the tool (e.g. what prompts were used), the full output provided by the tool, and which part of the output was adapted for the assignment;
• To take note of section 3, paragraph 3.2 and section 5.2, paragraph 24.1 (Viva Voce) of the Student Handbook; The University has the right to exercise the viva voce option to determine the authorship of a student's submission should there be reasonable grounds to suspect that the submission may not be fully the student's own work.
• For more details on academic integrity and guidance on responsible use of generative AI tools in assignments, please refer to the TLC website for more details; The University will continue to review the use of generative AI tools based on feedback and in light of developments in AI and related technologies.

(Full marks: 100)

Section A (100 marks)
Answer all questions in this section.

Question 1
The assessment is to assess your ability to design NLP solutions based on the following case.

Amazon is one of the world's largest e-commerce companies, with a vast product range. It receives hundreds of thousands of product reviews daily from its users worldwide.
These reviews contain valuable information about the product and customer sentiments towards it. However, with the volume of incoming data, it's nearly impossible for Amazon to manually analyse all the reviews to extract actionable insights. This is where NLP comes in. Your task is to create an NLP solution to help Amazon structure and analyse the review data.

Here is an Amazon review dataset collected in the range of May 1996 - Oct 2018, including reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). You are encouraged to use the smaller per-category dense subsets, which have been reduced to extract the k-core, such that each of the remaining users and items have k reviews each.

Dataset link: https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/

Question 1a
Based on your task, which is to create an NLP solution to help Amazon structure and analyse the review data, determine a scenario/problem that you are going to analyse and design NLP solution for the identified scenario/problem. (word limit: 500)

Question 1b (20 marks)
Design and implement relevant NLP analysis flow to address the identified questions, which includes but is not limited to:
(i) Dataset preparation;
(ii) Vocabulary building;
(iii) Sentiment analysis;
(iv) Named Entity Recognition;
(v) Topic modeling
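
As a minimal sketch of what steps (i) and (ii) of the analysis flow might look like, the Python below reads review records in JSON-lines form, keeps those with non-empty text, and builds a frequency-filtered vocabulary. The field names reviewText and overall, the helper names, the min_count threshold, and the sample records are assumptions made for illustration and are not part of the assessment; the schema of the chosen per-category file should be checked before relying on them.

```python
import json
import re
from collections import Counter

# Assumption: a simple lowercase word pattern is adequate for a first pass.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def parse_reviews(lines):
    """Parse JSON-lines records, keeping only reviews with non-empty text."""
    reviews = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed field names: "reviewText" (body) and "overall" (star rating).
        text = (record.get("reviewText") or "").strip()
        if text:
            reviews.append({"text": text, "rating": record.get("overall")})
    return reviews


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def build_vocabulary(reviews, min_count=2):
    """Map each token seen at least min_count times to an integer id."""
    counts = Counter()
    for review in reviews:
        counts.update(tokenize(review["text"]))
    # Most frequent tokens first; ties broken alphabetically for stable ids.
    kept = sorted(
        (token for token, count in counts.items() if count >= min_count),
        key=lambda token: (-counts[token], token),
    )
    return {token: index for index, token in enumerate(kept)}


# Hypothetical sample records standing in for lines of a dataset file.
sample_lines = [
    '{"overall": 5.0, "reviewText": "Great phone, great battery."}',
    '{"overall": 2.0, "reviewText": "Battery died after a week."}',
    '{"overall": 4.0, "reviewText": ""}',
    '{"overall": 3.0}',
]

reviews = parse_reviews(sample_lines)
vocab = build_vocabulary(reviews, min_count=2)
print(len(reviews), vocab)  # prints: 2 {'battery': 0, 'great': 1}
```

For a real file, the same functions could be fed an open file handle in place of sample_lines, since iterating over a text file yields one line at a time. Steps (iii) to (v) would typically build on the cleaned text and vocabulary produced here.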