IS Department
IS 463 Introduction to Data-Mining
Homework 1 - Second Semester 1445H

Student Name:
ID#:

This homework covers the following outcomes:

Outcome   Student Outcome Description                                   Weight
(b)       An ability to analyze a problem, and identify and define      10 pts
          the computing requirements appropriate to its solution

Total Weight for this assignment: 5pts

Course Learning Outcomes to be covered in this homework:

Course Outcome   Description
1                Understand Association analysis

IS463 Course Outcome   SO B   SO D   Questions   Weight
1                      X             Ex1, Ex2    5pts

Exercise 1: (5 pts)

Consider the transactional dataset below, collected from a supermarket:

Transaction   Items
T101          Bread Milk Eggs Cheese Apple Yogurt
T102          Bread Diapers Apple Yogurt Cereal
T103          Milk Diapers Soda Eggs Cheese Yogurt
T104          Bread Milk Diapers Soda Cheese Yogurt
T105          Bread Milk Diapers Eggs Apple Yogurt
T106          Bread Milk Cheese Yogurt Cereal
T107          Milk Diapers Soda Cheese Yogurt Cereal
T108          Milk Diapers Eggs Yogurt Cereal Juice
T109          Bread Milk Yogurt Cereal Juice
T110          Bread Eggs Cheese Yogurt Cereal Juice
T111          Milk Diapers Cheese Yogurt Cereal Juice
T112          Bread Milk Yogurt Cereal Juice
T113          Bread Milk Diapers Yogurt Cereal Juice
T114          Bread Diapers Cheese Yogurt Cereal Juice
T115          Milk Diapers Yogurt Cereal Juice
T116          Bread Diapers Yogurt Cereal Juice Apple
T117          Bread Milk Yogurt Cereal Juice
T118          Bread Milk Diapers Soda Yogurt Cereal
T119          Bread Milk Diapers Yogurt Cereal Juice
T120          Bread Milk Apple Yogurt Cereal Juice
T121          Bread Cereal Milk Diapers Cheese Yogurt
T122          Milk Juice Cheese Diapers Yogurt Cereal
T123          Milk Cereal Diapers Soda Cheese Yogurt
T124          Milk Juice Bread Diapers Yogurt Cereal

1) Trace (by hand) the Apriori algorithm to find all the frequent itemsets in the above dataset, using a minimum support threshold of 25%. For each step k, show:
   a. The generated candidate itemsets Ck+1
   b. The pruned candidate itemsets Ck+1, if any
   c. The remaining Ck+1 that survive the pruning, with their support counts
   d. The frequent itemsets Lk+1

2) Generate all the possible association rules from the frequent itemsets found above that satisfy a confidence threshold of 100%.

Exercise 2: (5 pts)

Consider the transactional dataset below:

Transaction   Items
T1            C F B G
T2            B G A L
T3            B L K E
T4            F B G A E
T5            G A D K
T6, T7, T8    F B G A C G A D F J E
T9            B G D
T10           F G A E
T11           B G K J E
T12           F B A E
T13           F B G D
T14           C F B G
T15           B K J H
T16           C A D K
T17           K E
T18           F G K E
T19           G A E
T20           F B G A K

3) Trace (by hand) the FP-tree algorithm to find all the frequent itemsets in the above dataset, using a minimum support threshold of 20%. Show your steps:
   a. The ordered list
   b. Ordered frequent items
   c. Draw the tree step by step
   d. Conditional Pattern Base
   e. Conditional FP-tree

4) Generate all the possible association rules from the frequent itemsets found above that satisfy a confidence threshold of 75%.
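
A hand trace can be cross-checked with a few lines of code. The following is a minimal Python sketch of the level-wise Apriori procedure (candidate generation, subset-based pruning, support counting) followed by rule generation. It is not part of the assignment: the names apriori, rules and toy are illustrative, and the four-transaction toy dataset is hypothetical, chosen so that every step can be verified by eye. Since the set of frequent itemsets does not depend on the mining algorithm, the same check applies to an FP-tree trace.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def frequent_among(candidates):
        # Count each candidate's support, keep those meeting the threshold.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        return {c: s for c, s in counts.items() if s / n >= min_support}

    # L1: frequent single items.
    level = frequent_among({frozenset([i]) for b in baskets for i in b})
    frequent = dict(level)
    k = 1
    while level:
        prev = set(level)
        # Join step: unions of two frequent k-itemsets that have size k+1.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune step: drop candidates that have an infrequent k-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k))
        }
        level = frequent_among(candidates)  # this is L(k+1)
        frequent.update(level)
        k += 1
    return frequent


def rules(frequent, min_conf):
    """Return (lhs, rhs, confidence) for every rule meeting min_conf."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent itemset is frequent, so the
                # lookup frequent[lhs] always succeeds.
                conf = support / frequent[lhs]
                if conf >= min_conf:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Hypothetical toy data, not taken from the exercises.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
frequent = apriori(toy, min_support=0.5)
strong = rules(frequent, min_conf=1.0)
```

On the toy data, {a, b, c} is generated by the join step but pruned because {b, c} is not frequent, and the only rule with 100% confidence is c -> a. To check an exercise, pass its transactions as a list of sets together with the stated support and confidence thresholds.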