choice, store them in a database, and perform data analytics using Python.
1) First, identify a website as your data source, then identify the target data fields your team plans to collect. Aim to collect as much data as possible, even if you do not initially expect to use some fields for analysis, because retroactively collecting missing data later can be time-consuming.
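As a sketch of what collecting target fields looks like, the snippet below parses author, rating, and review-text fields out of a small sample of page markup using only the standard library. The markup, CSS class names, and field names here are hypothetical placeholders; a real crawler would fetch live pages (e.g. with urllib.request or the requests library) and use your own site's selectors.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; a real crawler would fetch pages from
# the target website instead of using a hard-coded string.
SAMPLE_HTML = """
<div class="review"><span class="author">Alice</span>
<span class="rating">5</span><p class="text">Great product</p></div>
<div class="review"><span class="author">Bob</span>
<span class="rating">3</span><p class="text">Okay overall</p></div>
"""

class ReviewParser(HTMLParser):
    """Collects (author, rating, text) records from the sample markup."""
    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None
        self._current = {}

    def handle_starttag(self, tag, attrs):
        # Remember which target field (if any) the next text belongs to.
        cls = dict(attrs).get("class")
        if cls in ("author", "rating", "text"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            self._field = None
            if len(self._current) == 3:  # one complete record gathered
                self.records.append(self._current)
                self._current = {}

parser = ReviewParser()
parser.feed(SAMPLE_HTML)
print(parser.records)
```

Each parsed record is a plain dictionary, which makes it straightforward to hand off to the database-insertion step below.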
2) Set up a database to store your data.
a) You can use any database, such as sqlite3, MS SQL Server, MySQL, or MongoDB. Note that an Excel spreadsheet or a csv file is not a database.
b) Based on the data fields you identified from the website, design and create one or more tables to host the dataset you will collect.
c) After finalizing your database tables, develop the web crawler so that it inserts data directly into your database (instead of, for example, downloading the data to a csv file first and then importing the csv file into the database).
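The steps above can be sketched with sqlite3, which ships with Python. The table name, columns, and sample record below are hypothetical; mirror whatever fields your team actually scrapes, and point the connection at a file rather than memory for a real project.

```python
import sqlite3

# Hypothetical schema matching a scraped "reviews" dataset.
conn = sqlite3.connect(":memory:")  # use a file path for a real project
conn.execute("""
    CREATE TABLE IF NOT EXISTS reviews (
        id     INTEGER PRIMARY KEY AUTOINCREMENT,
        author TEXT NOT NULL,
        rating INTEGER,
        text   TEXT
    )
""")

def insert_review(author, rating, text):
    # The crawler calls this right after parsing each record,
    # so no intermediate csv file is ever written.
    conn.execute(
        "INSERT INTO reviews (author, rating, text) VALUES (?, ?, ?)",
        (author, rating, text),
    )
    conn.commit()

insert_review("Alice", 5, "Great product")
count = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
print(count)
```

Using parameterized `?` placeholders (rather than string formatting) keeps the inserts safe regardless of what characters appear in the scraped text.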
3) Based on the collected data in the database, perform some analyses to obtain insights. Choose at least two of the following types of analyses (the list is not exhaustive):
a) Descriptive analysis
b) Visualization
c) Regression
d) Sentiment analysis
e) Other text mining analysis
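As a minimal sketch of option (a), descriptive analysis, the snippet below pulls a numeric column out of the database and summarizes it with the standard-library statistics module. The reviews table and its sample rows are hypothetical stand-ins for your own data; for option (b), the same list of values could be passed straight to matplotlib (e.g. plt.hist).

```python
import sqlite3
from statistics import mean, median

# Hypothetical table and sample rows; in your project these would
# already exist, filled in by the crawler from step 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (author TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("Alice", 5), ("Bob", 3), ("Cara", 4), ("Dan", 5)],
)

# Pull the numeric column and compute simple descriptive statistics.
ratings = [row[0] for row in conn.execute("SELECT rating FROM reviews")]
print(f"n={len(ratings)} mean={mean(ratings):.2f} median={median(ratings)}")
```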
4) Present your work in a video presentation.