0.4 Question 3 [4.5 marks]

Q3 (i) [0.5 marks] Add two columns to the df_transactions dataframe:

1. A column named TX_DATE, which should have the date of the transaction without any time information.

2. A column named TX_WEEK, which should have the week number in which the transaction occurred (e.g., the first week of January is week 1, the second week of January is week 2, and so forth). A week is defined as starting on a Monday and ending on a Sunday.

[ ]:"""Add the columns discussed above with appropriate values here"""

# BEGIN - YOUR CODE GOES HERE

pass

# END - YOUR CODE GOES HERE

[]:"""Do not remove this cell. """
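
As an illustration of Q3 (i), here is a minimal sketch on toy data. It assumes the raw timestamp lives in a column called TX_DATETIME; that name and the sample rows are hypothetical, since the assignment text does not show the dataframe's schema. It uses the ISO 8601 week number, whose weeks run Monday to Sunday. Note that under ISO rules the first days of January can fall in week 52 or 53 of the previous year, so whether this matches the intended numbering exactly is an assumption.

```python
import pandas as pd

# Hypothetical toy data; the TX_DATETIME column name is an assumption.
df_transactions = pd.DataFrame({
    "CUSTOMER_ID": [1, 1, 2],
    "TX_DATETIME": pd.to_datetime([
        "2023-01-02 09:15:00",  # Monday
        "2023-01-08 23:59:00",  # Sunday of the same week
        "2023-01-09 00:01:00",  # Monday of the following week
    ]),
})

# Date only, with the time-of-day component dropped.
df_transactions["TX_DATE"] = df_transactions["TX_DATETIME"].dt.date

# ISO 8601 week number (weeks start on Monday and end on Sunday).
df_transactions["TX_WEEK"] = (
    df_transactions["TX_DATETIME"].dt.isocalendar().week.astype(int)
)

print(df_transactions)
```

If later steps need date arithmetic, `.dt.normalize()` is an alternative that keeps a datetime dtype with the time set to midnight.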

Q3 (ii) [1.5 marks] This question asks you to create columns that will store the Frequency of transactions for customers. In particular, we are interested in the number of transactions a customer did on the previous day and in the previous week (where a week is defined as starting on a Monday and ending on a Sunday). The columns should be added to the df_transactions dataframe as per:

1. CUSTOMER_TOTAL_1D: The number of transactions for this customer on the previous day.

2. CUSTOMER_TOTAL_1W: The number of transactions for this customer in the previous week.

Note: The df_transactions dataframe should not have any columns that are not required as per this assignment.

[ ]:"""Populate the variables shown above with appropriate values here"""

# BEGIN - YOUR CODE GOES HERE

pass

# END - YOUR CODE GOES HERE

[]:"""Do not remove this cell. """
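
One possible way to approach Q3 (ii), sketched on toy data: count transactions per customer and period, move each count forward by one period, and merge it back. The TX_DATETIME column name, the sample rows, and the helper `previous_period_count` are all hypothetical. "Previous week" is read here as the Monday-to-Sunday calendar week before the one containing the transaction, which is one interpretation of the question. The day and week keys are kept in local variables so that no extra columns are added to the dataframe.

```python
import pandas as pd

# Hypothetical toy data; the TX_DATETIME column name is an assumption.
df_transactions = pd.DataFrame({
    "CUSTOMER_ID": [1, 1, 1, 1, 2],
    "TX_DATETIME": pd.to_datetime([
        "2023-01-02 09:00", "2023-01-02 17:30", "2023-01-03 08:00",
        "2023-01-10 12:00", "2023-01-03 10:00",
    ]),
})

# Calendar day of each transaction, and the Monday that starts its week.
day = df_transactions["TX_DATETIME"].dt.normalize()
week_start = day - pd.to_timedelta(day.dt.weekday, unit="D")


def previous_period_count(period, step):
    """Count each customer's transactions in the period just before `period`."""
    keys = pd.DataFrame({
        "CUSTOMER_ID": df_transactions["CUSTOMER_ID"],
        "PERIOD": period,
    })
    counts = (
        keys.groupby(["CUSTOMER_ID", "PERIOD"]).size().rename("N").reset_index()
    )
    # The count for period p is the "previous period" value for period p + step.
    counts["PERIOD"] = counts["PERIOD"] + step
    merged = keys.merge(counts, on=["CUSTOMER_ID", "PERIOD"], how="left")
    # No transactions in the previous period means a count of zero.
    return merged["N"].fillna(0).astype(int).to_numpy()


df_transactions["CUSTOMER_TOTAL_1D"] = previous_period_count(day, pd.Timedelta(days=1))
df_transactions["CUSTOMER_TOTAL_1W"] = previous_period_count(week_start, pd.Timedelta(days=7))

print(df_transactions)
```

In this toy data, customer 1 has two transactions on 2 January, so the 3 January row gets a previous-day count of 2; the 10 January row sees all three transactions from the week before.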

Q3 (iii) [1.5 marks] This question asks you to create columns that will store the expected Monetary value of transactions for customers. In particular, we are interested in the median value of transactions a customer did on the previous day and in the previous week (where a week is defined as starting on a Monday and ending on a Sunday). The columns should be added to the df_transactions dataframe as per:

1. SPENT_1D: The median dollar value of transactions for this customer on the previous day.

2. SPENT_1W: The median dollar value of transactions for this customer in the previous week.

Note: The df_transactions dataframe should not have any columns that are not required as per this assignment.

[ ]:"""Populate the variables shown above with appropriate values here"""

# BEGIN - YOUR CODE GOES HERE

pass

# END - YOUR CODE GOES HERE

[]:"""Do not remove this cell. """
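
For Q3 (iii), a similar sketch can compute the median instead of a count. The column names TX_DATETIME and TX_AMOUNT, the sample rows, and the helper `previous_period_median` are hypothetical. Customers with no transactions in the previous period are left as NaN here; the assignment does not say how to fill them, so that choice is an assumption.

```python
import pandas as pd

# Hypothetical toy data; TX_DATETIME and TX_AMOUNT are assumed column names.
df_transactions = pd.DataFrame({
    "CUSTOMER_ID": [1, 1, 1, 1, 2],
    "TX_DATETIME": pd.to_datetime([
        "2023-01-02 09:00", "2023-01-02 17:30", "2023-01-03 08:00",
        "2023-01-10 12:00", "2023-01-03 10:00",
    ]),
    "TX_AMOUNT": [10.0, 30.0, 50.0, 5.0, 20.0],
})

# Calendar day of each transaction, and the Monday that starts its week.
day = df_transactions["TX_DATETIME"].dt.normalize()
week_start = day - pd.to_timedelta(day.dt.weekday, unit="D")


def previous_period_median(period, step):
    """Median amount each customer spent in the period just before `period`."""
    keys = pd.DataFrame({
        "CUSTOMER_ID": df_transactions["CUSTOMER_ID"],
        "PERIOD": period,
        "TX_AMOUNT": df_transactions["TX_AMOUNT"],
    })
    medians = (
        keys.groupby(["CUSTOMER_ID", "PERIOD"])["TX_AMOUNT"]
        .median()
        .rename("MEDIAN")
        .reset_index()
    )
    # The median for period p is the "previous period" value for period p + step.
    medians["PERIOD"] = medians["PERIOD"] + step
    merged = keys.merge(medians, on=["CUSTOMER_ID", "PERIOD"], how="left")
    # Rows with no transactions in the previous period stay NaN (an assumption).
    return merged["MEDIAN"].to_numpy()


df_transactions["SPENT_1D"] = previous_period_median(day, pd.Timedelta(days=1))
df_transactions["SPENT_1W"] = previous_period_median(week_start, pd.Timedelta(days=7))

print(df_transactions)
```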

Q3 (iv) [0.5 marks] Generate a scatter plot with amount (in dollars) on the y-axis and customer id on the x-axis. The scatter plot should have two markers: one for the median amount a customer spent on the previous day, and a second for the value of fraudulent transactions for that customer. You should label the plot appropriately.

[]:"""Populate the variables shown above with appropriate values here"""

# BEGIN - YOUR CODE GOES HERE

pass

# END - YOUR CODE GOES HERE
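
A minimal matplotlib sketch of the plot described in Q3 (iv) follows. The toy rows and the column names TX_AMOUNT and TX_FRAUD (with 1 marking a fraudulent transaction) are hypothetical; SPENT_1D is assumed to have been filled in as in Q3 (iii).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data; TX_AMOUNT and TX_FRAUD (1 = fraudulent) are assumed names.
df_transactions = pd.DataFrame({
    "CUSTOMER_ID": [1, 1, 2, 3, 3],
    "TX_AMOUNT": [25.0, 400.0, 15.0, 60.0, 950.0],
    "TX_FRAUD": [0, 1, 0, 0, 1],
    "SPENT_1D": [20.0, 22.5, None, 55.0, 58.0],
})

# Rows that have a previous-day median, and rows flagged as fraudulent.
with_median = df_transactions.dropna(subset=["SPENT_1D"])
fraud = df_transactions[df_transactions["TX_FRAUD"] == 1]

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(with_median["CUSTOMER_ID"], with_median["SPENT_1D"],
           marker="o", alpha=0.6, label="Median spent on previous day")
ax.scatter(fraud["CUSTOMER_ID"], fraud["TX_AMOUNT"],
           marker="x", color="red", label="Fraudulent transaction amount")
ax.set_xlabel("Customer ID")
ax.set_ylabel("Amount (dollars)")
ax.set_title("Previous-day median spend vs. fraudulent transaction amounts")
ax.legend()
```

In a notebook the figure renders at the end of the cell; in a script, a call to `plt.show()` would be needed. Using distinct marker shapes as well as colours keeps the two series distinguishable where points overlap.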

Q3 (v) [0.5 marks] Will including the amount a customer spent on the previous day help in improving the logistic regression model from the previous question? You do not need to run the logistic regression model. You should answer the question in the context of the scatter plot from the previous question. [Word limit < 150 words]

Note: Write your justification in the Markdown cell below.

0.4.1 WRITE YOUR ANSWER(S) HERE IN THIS CELL

You can use Markdown syntax here.

