Question

1. Download the data file csv_result-churn.csv. Consider the customer churn example described in the file MegaTelCo.pdf with the following data structure:

Data instances: 20000
Features: id, COLLEGE, INCOME, OVERAGE, LEFTOVER, HOUSE, HANDSET_PRICE, OVER_15MINS_CALLS_PER_MONTH, AVERAGE_CALL_DURATION, REPORTED_SATISFACTION, REPORTED_USAGE_LEVEL, CONSIDERING_CHANGE_OF_PLAN (total: 12 features)
Target: LEAVE

With this data configuration, do the following:

1.a (3 points) Use the software to compute a decision tree model for classification with the following specifications. Training sample: 75% of the data, randomly selected as the training/estimation sample. Parameters: at least two instances in leaves; do not split subsets smaller than 5; maximum depth 6. Splitting: stop splitting when the majority reaches 95%. Use the remaining 25% of the sample to compute predictions and appraise them with a confusion matrix.

1.b (3 points) Your employer would like to target customers who are actively considering leaving. Which customers, according to the decision tree model applied to the prediction sample, will leave with probability one? Note that since the sample is selected at random, the answer will differ depending on the random draws. Note also that we are selecting only a small set of customers here for illustrative purposes; in practice such efforts would involve larger selections. Hint: save the output of the decision tree predictions into an Excel file to facilitate the analysis of your results, then cut and paste your findings into the answers document (the PDF file you submit).

1.c (3 points) For comparison, using the same specifications, compute the confusion matrix for the full sample.

1.d (3 points) Compute the decision tree for both the aforementioned training sample and the full sample with the following tuning parameters: at least two instances in leaves; do not split subsets smaller than 5; maximum depth 50. Splitting: stop splitting when the majority reaches 99%. Appraise the results again with a confusion matrix.

1.e (3 points) Explain why your answers in 1.a and 1.c yielded comparable confusion matrices, while in 1.d there is a dramatic improvement in the full sample and a deterioration in the prediction sample.
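The workflow asked for above can be sketched in code. The snippet below uses scikit-learn as a stand-in for the point-and-click software named in the assignment, and synthetic data in place of csv_result-churn.csv (the file itself, its categorical encodings, and the LEAVE labels are assumptions here). The "stop splitting when the majority reaches 95%/99%" rule has no direct scikit-learn equivalent, so only the leaf-size, subset-size, and depth parameters are reproduced:

```python
# Sketch of the exercise workflow; scikit-learn stands in for the course
# software, and synthetic data stands in for csv_result-churn.csv
# (20 000 instances, as stated in the problem).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(20000, 11))                          # 11 predictive features (id excluded)
y = (X[:, 0] + rng.normal(size=20000) > 0).astype(int)    # noisy stand-in for LEAVE

# 1.a: 75% random training/estimation sample, 25% held out as the prediction sample
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.75, random_state=0)

# Shallow tree: at least 2 instances in leaves, no splits of subsets < 5, depth 6
shallow = DecisionTreeClassifier(min_samples_leaf=2, min_samples_split=5,
                                 max_depth=6, random_state=0).fit(X_tr, y_tr)
cm_pred = confusion_matrix(y_te, shallow.predict(X_te))   # 1.a: prediction sample
cm_full = confusion_matrix(y, shallow.predict(X))         # 1.c: full sample

# 1.d: the depth-50 tree largely memorises its training rows, so it looks
# dramatically better on the full sample (which contains those rows) and
# worse on the held-out prediction sample -- the overfitting contrast of 1.e.
deep = DecisionTreeClassifier(min_samples_leaf=2, min_samples_split=5,
                              max_depth=50, random_state=0).fit(X_tr, y_tr)
train_acc_deep = deep.score(X_tr, y_tr)
test_acc_deep = deep.score(X_te, y_te)
print(cm_pred)
print(cm_full)
print(train_acc_deep, test_acc_deep)
```

On real data, the predicted class probabilities needed for 1.b would come from `predict_proba` on the prediction sample, exported to Excel with pandas' `to_excel`; customers in pure leaves of the LEAVE class are the ones predicted to leave with probability one.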

Fig: 1

Fig: 2