Question

Question 6 (This relies on the file iris.csv that can be found in LumiNUS. Use that file to avoid problems associated with different versions.)

p-norms are used to measure the distance between multi-dimensional data points and the origin. For an n-dimensional data point x = (x_1, x_2, ..., x_n), the p-norm is given by:

$$\|x\|_p = \left( \sum_{k=1}^{n} |x_k|^p \right)^{1/p}$$

See: https://en.wikipedia.org/wiki/Lp_space#The_p-norm_in_finite_dimensions

Here we will, for each type of flower (setosa, versicolor, and virginica), measure the distance of each data point from the mean of each of the factors (Sepal Length, Sepal Width, Petal Length, and Petal Width) in the 1-norm, 2-norm, and 3-norm. So each data point is in 4 dimensions, and the difference between each data point and the mean is also in 4 dimensions.

The number of data points where the (component-wise) difference of the data point from the mean for its flower type has a p-norm less than or equal to 1.5 is:

Type of Flower \ p     1      2      3
setosa                 ___    ___    ___
versicolor             ___    ___    ___
virginica              ___    ___    ___

FYI: Depending on which items were selected for this assignment in this semester, there may or may not be another question that tells you to do the exact same thing with a different threshold. So create your visual accordingly.

Note: The 1-norm is the Manhattan distance, which is quite relevant in transportation operations in cities. The 2-norm is the usual straight line distance. In analytics work, it is common to generalise well known metrics.
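The counting step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `p_norm` and `count_within` are hypothetical, and `toy_rows` is made-up data, not values from iris.csv. Each row is assumed to be a pair of a flower-type label and a list of numeric measurements.

```python
from collections import defaultdict


def p_norm(vector, p):
    """Return the p-norm (sum of |x_k|^p)^(1/p) of a sequence of numbers."""
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)


def count_within(rows, p, threshold=1.5):
    """Count, per label, the points whose difference from the label's
    component-wise mean has a p-norm <= threshold.

    rows: iterable of (label, [measurements]) pairs (an assumed layout).
    """
    # Group the measurement vectors by flower type.
    groups = defaultdict(list)
    for label, values in rows:
        groups[label].append(values)

    counts = {}
    for label, points in groups.items():
        n_dims = len(points[0])
        # Component-wise mean of this group only.
        mean = [sum(pt[k] for pt in points) / len(points) for k in range(n_dims)]
        counts[label] = sum(
            1
            for pt in points
            if p_norm([a - m for a, m in zip(pt, mean)], p) <= threshold
        )
    return counts


# Made-up toy rows for illustration; NOT taken from iris.csv.
toy_rows = [
    ("a", [1.0, 1.0, 1.0, 1.0]),
    ("a", [3.0, 1.0, 1.0, 1.0]),
    ("b", [0.0, 0.0, 0.0, 0.0]),
    ("b", [1.0, 1.0, 1.0, 1.0]),
]

for p in (1, 2, 3):
    print(p, count_within(toy_rows, p))
```

For the actual question, the rows would be built from iris.csv instead (for example with `csv.DictReader`), using whatever column names that file really has. Running the function for p = 1, 2, and 3 would then give the three columns of the table, and making the threshold a parameter makes it easy to repeat the exercise with a different cutoff.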

Fig: 1