Task 1: Dimensionality reduction using PCA
Introduction
Give a general description of the importance of PCA and how it helps to reveal
the information hidden in a large dataset, or how it can be used as a modelling tool to
predict an experimental outcome. Include any other information that you learned or understood
about this method and list its applications in the field of chemical
engineering (you can list up to 6 different applications).
Working principle: Write the working principle and the mathematics behind PCA.
Note: Explain the working principles of PCA. Clearly explain the mathematics involved.
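As a reference point for the mathematics, the standard covariance-based formulation of PCA can be summarised as below. The symbols (X for the data matrix, n samples, p variables, C for the covariance matrix, V for the eigenvector matrix, T for the scores) are notation assumed here for illustration and are not prescribed by this brief.

```latex
\begin{aligned}
&\text{Data: } X \in \mathbb{R}^{n \times p} \quad (n \text{ samples},\ p \text{ variables}) \\
&\text{Centring: } \tilde{X} = X - \mathbf{1}_n \bar{x}^{\top}, \qquad \bar{x} = \tfrac{1}{n} X^{\top} \mathbf{1}_n \\
&\text{Covariance: } C = \tfrac{1}{n-1}\, \tilde{X}^{\top} \tilde{X} \in \mathbb{R}^{p \times p} \\
&\text{Eigen-decomposition: } C v_k = \lambda_k v_k, \qquad \lambda_1 \ge \dots \ge \lambda_p \ge 0 \\
&\text{Scores: } T = \tilde{X} V, \qquad V = [\, v_1 \ \cdots \ v_p \,] \in \mathbb{R}^{p \times p} \\
&\text{Variance explained by the first } m \text{ components: } \frac{\sum_{k=1}^{m} \lambda_k}{\sum_{k=1}^{p} \lambda_k}
\end{aligned}
```

Because C is a symmetric p-by-p matrix, it has p real eigenvalues, and its p eigenvectors (each of length p) can be chosen to be mutually orthogonal.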
Present the results obtained from MATLAB. Clearly label the components involved, add
suitable legends (or clearly state which symbol corresponds to which class of crystal), and
clearly label the x-axis and y-axis. You can show a 2D scatter plot that contains two
components or a 3D scatter plot that contains more than two components, depending
on the outcome. Explain what you observe in the PCA plot
(how many classes of crystals can be identified in the PCA space? Think and answer).
Explain the dimensions of the eigenvectors and the
corresponding eigenvalues. Are the dimensions what you expected?
If yes, explain why, based on the rules of matrices. Based on the results obtained,
explain how many components are required to represent more than 95% of the variance in the data.
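The checks requested above (eigenvector and eigenvalue dimensions, number of components needed for more than 95% of the variance, and a labelled score plot) could be sketched in MATLAB roughly as follows. This is a minimal illustration, not the required solution: the data are randomly generated placeholders, and the names X, classLabels, classMeans and the choice of 4 variables and 3 classes are assumptions; substitute the dataset supplied for the task. Only base MATLAB functions are used here. The built-in `pca` function from the Statistics and Machine Learning Toolbox returns equivalent quantities (coefficients, scores, eigenvalues and percentage explained).

```matlab
% --- Placeholder data (hypothetical): 3 classes, 4 variables ---------------
rng(1);                                   % make the random data reproducible
nPerClass  = 50;
classMeans = [0 0 0 0; 3 3 0 0; 0 3 3 0]; % assumed class centres
X = [];
classLabels = [];
for c = 1:3
    X = [X; randn(nPerClass, 4) + repmat(classMeans(c, :), nPerClass, 1)];
    classLabels = [classLabels; c * ones(nPerClass, 1)];
end

% --- PCA via eigen-decomposition of the covariance matrix ------------------
Xc = X - repmat(mean(X, 1), size(X, 1), 1);   % centre each variable
C  = cov(Xc);                                 % p-by-p covariance matrix
[V, D] = eig(C);                              % columns of V are eigenvectors
[lambda, order] = sort(diag(D), 'descend');   % largest eigenvalue first
V = V(:, order);
scores = Xc * V;                              % data projected onto the PCs

% --- Variance explained and number of components for > 95 % ----------------
explained    = 100 * lambda / sum(lambda);
cumExplained = cumsum(explained);
nComp        = find(cumExplained > 95, 1);

fprintf('Eigenvector matrix: %d-by-%d, number of eigenvalues: %d\n', ...
        size(V, 1), size(V, 2), numel(lambda));
fprintf('Components needed for > 95%% of the variance: %d\n', nComp);

% --- Labelled 2D score plot, one marker per class --------------------------
markers = {'o', 's', '^'};
figure; hold on;
for c = 1:3
    idx = classLabels == c;
    plot(scores(idx, 1), scores(idx, 2), markers{c}, ...
         'DisplayName', sprintf('Class %d', c));
end
hold off;
xlabel(sprintf('PC1 (%.1f%% of variance)', explained(1)));
ylabel(sprintf('PC2 (%.1f%% of variance)', explained(2)));
legend('show', 'Location', 'best');
title('PCA score plot (placeholder data)');
```

With p variables the covariance matrix is p-by-p, so the sketch yields a p-by-p eigenvector matrix and p eigenvalues, which is the dimension check the brief asks you to justify.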
Finally, create a pair plot using MATLAB's built-in function 'gplotmatrix' or any other
plot (e.g., histograms; note that you can create up to four histograms as an alternative to a pair
plot) to analyse the data given to you. Based on such plots, can you extract any useful
information from the data, and how does it compare with the trend that you observe in the
PCA plot? If yes, give your interpretation of the data.
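A pair plot of this kind could be produced along the following lines. This is only a sketch under stated assumptions: the data are random placeholders, and the names X, classLabels and varNames (as well as the choice of 4 variables and 3 classes) are hypothetical. `gplotmatrix` requires the Statistics and Machine Learning Toolbox; the histogram alternative uses base MATLAB only.

```matlab
% --- Placeholder data (hypothetical): 3 classes, 4 variables ---------------
rng(2);                                   % make the random data reproducible
nPerClass  = 40;
classMeans = [0 0 0 0; 3 3 0 0; 0 3 3 0]; % assumed class centres
varNames   = {'x1', 'x2', 'x3', 'x4'};    % assumed variable names
X = [];
classLabels = [];
for c = 1:3
    X = [X; randn(nPerClass, 4) + repmat(classMeans(c, :), nPerClass, 1)];
    classLabels = [classLabels; c * ones(nPerClass, 1)];
end

% --- Pair plot grouped by class, with a legend and variable names ----------
figure;
gplotmatrix(X, [], classLabels, [], [], [], 'on', [], varNames);

% --- Alternative: one histogram per variable (up to four) ------------------
figure;
for j = 1:4
    subplot(2, 2, j);
    histogram(X(:, j));
    xlabel(varNames{j});
    ylabel('Count');
end
```

Variables whose panels separate the classes clearly in the pair plot are the ones you would expect to contribute strongly to the leading principal components, which gives a direct way to compare the two views.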
Quantify your results where possible, and provide a critical analysis and discussion of the
results.
Fig: 1
Fig: 2