Singapore University of Social Sciences (SUSS)
AIB551
End-of-Course Assessment - January Semester 2024
Natural Language Processing
INSTRUCTIONS TO STUDENTS:
1. This End-of-Course Assessment paper comprises 9 pages (including the cover page).
2. You are to include the following particulars in your submission: Course Code, Title
of the ECA, SUSS PI No., Your Name, and Submission Date.
3. Late submission will be subject to the mark-deduction scheme. Please refer to the
Student Handbook for details.
AIB551 Copyright © 2024 Singapore University of Social Sciences (SUSS)
ECA - January Semester 2024
ECA Submission Guidelines
Please follow the submission instructions stated below:
A- What Must Be Submitted
You are required to submit the following TWO (2) items for marking and grading:
• A Report
• A .Zip File that contains the dataset used for the report and the code file used for
data analysis
Please verify your submissions after you have submitted the above TWO (2) items.
B-Submission Deadline
• The TWO (2) items of Report and .Zip File are to be submitted by 12 noon on the
submission deadline.
• You are allowed multiple submissions until the cut-off date for each of the TWO (2)
items.
• Late submission of any of the TWO (2) items will be subject to the mark-deduction
scheme of the University. Please refer to Section 5.2 Para 2.4 of the Student
Handbook.
C-How the TWO (2) Items Should Be Submitted
• The Report: submit online to Canvas via TurnItIn (for plagiarism detection) under
the ECA submission link
• The .Zip File that contains the dataset and code file:
○ Zip the dataset and code file
○ Submit the .Zip file online to Canvas via the ECA Zip File submission link
D-Additional guidelines on file formatting are given as follows:
1. Report
• Please ensure that your Microsoft Word document is generated by Microsoft Word 2016
or higher.
• The report must be saved in .docx format.
2. .Zip File
• The dataset must be saved in .csv or .json format.
• The dataset must be included in the .zip file.
• The code file must be saved in the required format.
• You are to include the following particulars in your submission: Course Code, Title
of the ECA, SUSS PI No., Your Name, and Submission Date.
E-Please be Aware of the Following:
Submission in hardcopy or any other means not given in the above guidelines will not
be accepted. You do not need to submit any other forms or cover sheets (e.g. form ET3)
with your ECA.
You are reminded that electronic transmission is not immediate. The network traffic
may be particularly heavy on the date of submission deadline and connections to the
system cannot be guaranteed. Hence, you are advised to submit your work early.
Canvas will allow you to submit your work late but your work will be subjected to the
mark-deduction scheme. You should therefore not jeopardise your course result by
submitting your ECA at the last minute.
It is your responsibility to check and ensure that your files are successfully submitted
to Canvas.
F-Plagiarism and Collusion
Plagiarism and collusion are forms of cheating and are not acceptable in any form in
a student's work, including this ECA. Plagiarism and collusion are taking work done
by others or work done together with others respectively and passing it off as your own.
You can avoid plagiarism by giving appropriate references when you use other people's
ideas, words or pictures (including diagrams). Refer to the APA Manual if you need
reminding about quoting and referencing. You can avoid collusion by ensuring that
your submission is based on your own individual effort.
The electronic submission of your ECA will be screened by plagiarism detection
software. For more information about plagiarism and collusion, you should refer to the
Student Handbook (Section 5.2.1.3). You are reminded that SUSS takes a tough stance
against plagiarism or collusion. Serious cases will normally result in the student being
referred to SUSS's Student Disciplinary Group. For other cases, significant mark
penalties or expulsion from the course will be imposed.
G-Use of Generative AI Tools (Allowed)
The use of generative AI tools is allowed for this assignment.
• You are expected to provide proper attribution if you use generative AI tools while
completing the assignment, including appropriate and discipline-specific citation,
a table detailing the name of the AI tool used, the approach to using the tool (e.g.
what prompts were used), the full output provided by the tool, and which part of the
output was adapted for the assignment;
• To take note of section 3, paragraph 3.2 and section 5.2, paragraph 24.1 (Viva
Voce) of the Student Handbook;
• The University has the right to exercise the viva voce option to determine the
authorship of a student's submission should there be reasonable grounds to suspect
that the submission may not be fully the student's own work.
• For more details on academic integrity and guidance on responsible use of
generative AI tools in assignments, please refer to the TLC website;
• The University will continue to review the use of generative AI tools based on
feedback and in light of developments in AI and related technologies.
(Full marks: 100)
Section A (100 marks)
Answer all questions in this section.
Question 1
This question assesses your ability to design NLP solutions based on the following
case.
Amazon is one of the world's largest e-commerce companies, with a vast product range.
It receives hundreds of thousands of product reviews daily from its users worldwide.
These reviews contain valuable information about the product and customer sentiments
towards it.
However, with the volume of incoming data, it's nearly impossible for Amazon to
manually analyse all the reviews to extract actionable insights. This is where NLP
comes in. Your task is to create an NLP solution to help Amazon structure and analyse
the review data.
Here is an Amazon review dataset collected in the range of May 1996 - Oct 2018,
including reviews (ratings, text, helpfulness votes), product metadata (descriptions,
category information, price, brand, and image features), and links (also viewed/also
bought graphs).
You are encouraged to use the smaller per-category dense subsets, which have been
reduced to their k-core, such that each of the remaining users and items has at least
k reviews.
Dataset link: https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/
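As an illustration of the k-core reduction described above, the following is a minimal sketch in Python rather than the procedure used to build the published subsets. It assumes the review files are gzipped JSON-lines files and that each record carries `reviewerID` (user) and `asin` (item) fields; the function names `read_reviews` and `k_core` are hypothetical.

```python
import gzip
import json
from collections import Counter


def read_reviews(path):
    """Yield one review dict per line from a gzipped JSON-lines file.

    Assumption: the file stores one JSON object per line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def k_core(reviews, k, user_key="reviewerID", item_key="asin"):
    """Repeatedly drop reviews whose user or item has fewer than k reviews.

    Removing one review can push another user or item below k,
    so the filter is repeated until nothing more is removed.
    """
    reviews = list(reviews)
    while True:
        user_counts = Counter(r[user_key] for r in reviews)
        item_counts = Counter(r[item_key] for r in reviews)
        kept = [
            r for r in reviews
            if user_counts[r[user_key]] >= k and item_counts[r[item_key]] >= k
        ]
        if len(kept) == len(reviews):
            return kept
        reviews = kept
```

A call such as `k_core(read_reviews(path), k=5)` would reduce a raw category file to its 5-core, where `path` points to a locally downloaded file. The whole file is held in memory, so this approach suits the smaller categories only.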
Question 1a
Based on your task, which is to create an NLP solution to help Amazon structure and
analyse the review data, determine a scenario/problem that you are going to analyse
and design NLP solution for the identified scenario/problem. (word limit: 500)
(20 marks)
Question 1b
Design and implement relevant NLP analysis flow to address the identified questions,
which includes but is not limited to:
(i) Dataset preparation;
(ii) Vocabulary building;
(iii) Sentiment analysis;
(iv) Named Entity Recognition;
(v) Topic modeling
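The first two steps of such a flow can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not a model answer: it assumes each review is a dict with `reviewText` and `overall` (star rating) fields, the three sample reviews are made up, and the rating thresholds used to derive coarse sentiment labels are an arbitrary choice. All function names are hypothetical. Named entity recognition and topic modeling typically rely on an external NLP library and are not shown.

```python
import re
from collections import Counter

TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def rating_to_label(overall):
    """Map a star rating to a coarse sentiment label (assumed thresholds)."""
    if overall >= 4:
        return "positive"
    if overall <= 2:
        return "negative"
    return "neutral"


def build_vocabulary(token_lists, min_count=2):
    """Return a token -> index map for tokens seen at least min_count times."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    vocab = {"<pad>": 0, "<unk>": 1}
    # Most frequent tokens first; ties are broken alphabetically.
    for token, count in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        if count >= min_count:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace each token with its index, using <unk> for unseen tokens."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokens]


# Made-up sample records, for illustration only.
reviews = [
    {"reviewText": "Great battery, great price.", "overall": 5.0},
    {"reviewText": "Battery died after a week.", "overall": 1.0},
    {"reviewText": "Average battery life.", "overall": 3.0},
]
token_lists = [tokenize(r["reviewText"]) for r in reviews]
labels = [rating_to_label(r["overall"]) for r in reviews]
vocab = build_vocabulary(token_lists, min_count=2)
encoded = [encode(tokens, vocab) for tokens in token_lists]
```

Labels derived from ratings give a weak supervision signal for a sentiment classifier; the `min_count` cut-off keeps rare tokens out of the vocabulary and maps them to `<unk>` instead.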