Question

(a) In the context of binary classification, define the optimal separating hyperplane (also known as the maximal margin hyperplane) and the maximal margin classifier. [4 marks]

(b) Describe Linear Discriminant Analysis (LDA), giving an explicit formula and making sure to cover both the case of one attribute (usually expressed as p = 1) and of more than one attribute (usually expressed as p > 1). There is no need to give formulas for parameter estimates. [6 marks]

(c) Why are LDA prediction boundaries linear? When answering this question, you may assume that the number of attributes is greater than 1 (p > 1). [2 marks]

(d) You are given the following training set with two attributes and a binary label: points (-1, 0) and (0, -1) belong to class 1, and point (0, 0) belongs to class 2. Do the following tasks:

  i. Draw the optimal separating hyperplane (a line, in this context) for this training set and write an equation for it. [4 marks]
  ii. Use this optimal separating hyperplane to classify point (-0.5, 0.5).

(e) Consider the following training set with one attribute and a binary label: points -2 and -1 belong to class 1, and points 1 and 3 belong to class 2. Answer the following questions about this data set, showing all details of your calculations (if any).

  i. What is the optimal separating hyperplane (a point, in this context) for the maximal margin classifier? [2 marks]
  ii. What is the prediction made by the maximal margin classifier for point 0.1? [2 marks]
  iii. What is the LDA prediction for point 0.1?

(f) What is the key difference between the assumptions of linear discriminant analysis and those of quadratic discriminant analysis in the case of more than one attribute? [4 marks]
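The numerical parts (d) and (e) can be checked with a short script. This is only a sketch under my own reading of the data; the helper names below are hypothetical, not from the paper, and it assumes equal class priors and the standard pooled-variance LDA discriminant for p = 1.

```python
from math import log

def classify_margin_2d(x):
    # Part (d): the support vectors are (-1, 0) and (0, -1) (class 1) and
    # (0, 0) (class 2); the maximal margin line x1 + x2 = -1/2 is
    # equidistant (1 / (2 * sqrt(2))) from all three.
    return 1 if x[0] + x[1] < -0.5 else 2

def classify_margin_1d(x):
    # Part (e) i-ii: the closest cross-class points are -1 (class 1) and
    # 1 (class 2), so the separating "hyperplane" is their midpoint, 0.
    return 1 if x < 0 else 2

def lda_predict_1d(x, class1=(-2.0, -1.0), class2=(1.0, 3.0)):
    # Part (e) iii: one-attribute LDA.  Discriminant
    # delta_k(x) = x * mu_k / s2 - mu_k**2 / (2 * s2) + log(pi_k),
    # where s2 is the pooled within-class variance estimate.
    n1, n2 = len(class1), len(class2)
    mu1 = sum(class1) / n1          # -1.5
    mu2 = sum(class2) / n2          #  2.0
    s2 = (sum((v - mu1) ** 2 for v in class1)
          + sum((v - mu2) ** 2 for v in class2)) / (n1 + n2 - 2)  # 1.25
    d1 = x * mu1 / s2 - mu1 ** 2 / (2 * s2) + log(n1 / (n1 + n2))
    d2 = x * mu2 / s2 - mu2 ** 2 / (2 * s2) + log(n2 / (n1 + n2))
    return 1 if d1 > d2 else 2

print(classify_margin_2d((-0.5, 0.5)))  # 2: the point lies on the class-2 side
print(classify_margin_1d(0.1))          # 2: 0.1 is to the right of the midpoint 0
print(lda_predict_1d(0.1))              # 1: the class-1 mean is nearer to 0.1
```

Note the contrast the question is driving at: the maximal margin classifier puts 0.1 in class 2, while LDA, which uses the class means and pooled variance rather than only the boundary points, puts it in class 1.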
