measurements performed on breast biopsy samples. The large number of variables and the strong
correlation between some of them make it difficult to use these values for diagnosis. The purpose of
this assignment is to identify the three most significant, mutually independent variables derived from these
values, then determine if different pathological classes segregate to different regions in the three-
dimensional space.
1. Generate a matrix P where each column represents a principal component.
2. Generate a plot of cumulative percentage variance carried by the principal components.
Label x-axis as "Number of Principal Components" and y-axis as "Percentage Total Variance".
3. Write code to determine the number of principal components needed to account for 85% of
the total variance. Print the line "Need at least XX principal components to account for
85% of the total variance" where XX is generated by the code.
4. Generate a 3D plot with its axes formed by the first three principal components. Label the
axes as P(1), P(2), and P(3) respectively. Mark each patient with a filled circle, using different
colors to represent different pathological classes (car, fad, mas,...). Use red circles for car.
Add a legend indicating which symbol is for which class.
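A minimal sketch of steps 1 to 3 follows, assuming the measurements are stored in a numeric matrix X with one row per patient and one column per variable. The matrix X below is random stand-in data, not the biopsy data, and the names Z, lambda, cumVar and nNeeded are illustrative only. The sketch uses eig on the covariance of the standardized data so that it runs in base MATLAB; whether to standardize the variables first is an assumption, not something the assignment states.

```matlab
% Hypothetical stand-in data: rows are patients, columns are measurements.
% Replace X with the real measurement matrix from the assignment data.
rng(0);                                  % reproducible random numbers
X = randn(100, 9) * randn(9, 9);         % 100 samples, 9 correlated variables

% Standardize each variable (zero mean, unit variance) so that no single
% measurement dominates because of its units. This step is an assumption.
Z = (X - mean(X, 1)) ./ std(X, 0, 1);

% Eigen-decomposition of the covariance matrix of the standardized data.
[V, D] = eig(cov(Z));
[lambda, order] = sort(diag(D), 'descend');  % variances, largest first
P = V(:, order);                             % each column is a principal component

% Step 2: cumulative percentage of the total variance.
cumVar = 100 * cumsum(lambda) / sum(lambda);
figure;
plot(1:numel(cumVar), cumVar, '-o');
xlabel('Number of Principal Components');
ylabel('Percentage Total Variance');

% Step 3: smallest number of components whose cumulative variance reaches 85%.
nNeeded = find(cumVar >= 85, 1);
fprintf('Need at least %d principal components to account for 85%% of the total variance\n', nNeeded);
```

The coordinates of each patient in the principal-component space can then be obtained as Z * P(:, 1:3), which gives the three values per patient needed for the 3D plot in step 4.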
Notes: To generate a standalone graph outside the Live Editor, add "set(gcf, 'Visible','on')" after
the plotting command. To allow rotation of the 3D plot under mouse control, add "rotate3d on"
after the plotting command.
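The 3D plot of step 4, together with the two commands from the notes, might be sketched as below. The matrix T (projections of each patient onto the first three principal components) and the cell array classes (one label per patient) are random stand-in data, and only the three class names spelled out in the text are used. Apart from red for car, the color choices are arbitrary.

```matlab
% Hypothetical stand-in data: T holds each patient's coordinates on the
% first three principal components; classes holds one label per patient.
rng(1);
T = randn(60, 3);
names = {'car', 'fad', 'mas'};           % only the classes named in the text
colors = {'r', 'g', 'b'};                % red for car; the others are arbitrary
idx = randi(numel(names), 60, 1);
classes = names(idx);

figure;
hold on;
for k = 1:numel(names)
    mask = strcmp(classes, names{k});    % patients belonging to class k
    scatter3(T(mask, 1), T(mask, 2), T(mask, 3), 36, colors{k}, 'filled');
end
hold off;
xlabel('P(1)');
ylabel('P(2)');
zlabel('P(3)');
legend(names);
view(3);                                 % hold on starts in 2D, so switch to a 3D view

set(gcf, 'Visible', 'on');               % standalone figure outside the Live Editor
rotate3d on;                             % rotate the plot with the mouse
```

Plotting one class per scatter3 call gives each class its own legend entry, which is why the loop is used instead of a single call with a color vector.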
Fig: 1