Last time, we saw the practical difficulty of analysing data with four or more measured variables. This demands a means of reducing the number of variables to two, so that the data can be shown on a 2-D plot while still conveying the message we want: that similar candidates cluster together.
In other words, one must perform the necessary mathematical manipulations to convert the original parameters into a different set of variables (the principal components), select the top two of those components, and plot them. All this happens without losing much of the information embedded in the original data.
PCA is a technique for compressing data from a large set of measurements into a smaller number of independent (i.e., uncorrelated) variables that capture the core of the original data. Note that the principal components themselves are linear combinations of the original variables.
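To make this concrete, here is a minimal sketch in Python, assuming NumPy is available. It uses one common route to PCA (an eigen-decomposition of the covariance matrix); the function name `pca_top_two` and the data are made up for illustration and are not taken from any particular dataset or library.

```python
import numpy as np

def pca_top_two(data):
    """Project rows of `data` (samples x variables) onto the first two principal components."""
    X = np.asarray(data, dtype=float)
    centered = X - X.mean(axis=0)            # subtract the mean of each variable
    cov = np.cov(centered, rowvar=False)     # variables x variables covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :2]              # each column holds the weights of one linear combination
    scores = centered @ components           # samples x 2 coordinates for the 2-D plot
    explained = eigvals[:2] / eigvals.sum()  # share of total variance kept by each component
    return scores, components, explained

# Hypothetical data: 50 samples of 4 correlated variables built from 2 hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = np.array([[1.0, 0.5, 0.2, 0.0],
                   [0.0, 0.3, 0.8, 1.0]])
data = latent @ mixing + 0.05 * rng.normal(size=(50, 4))

scores, components, explained = pca_top_two(data)
```

Each column of `components` lists the weights applied to the four original variables, which is what "linear combination" means here. The two columns of `scores` are uncorrelated with each other, and `explained` indicates how much of the original variance survives the reduction to two dimensions.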
The first principal component, which becomes the X-axis, defines the direction of maximum variation in the data.
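The idea of "direction of maximum variation" can be checked by hand in the simplest case. The sketch below assumes only two variables, where the angle of the first principal axis has a closed form, and compares the variance along that axis with the variance along every other direction in one-degree steps. The points and the helper names are invented for illustration.

```python
import math

def projected_variance(points, angle):
    """Sample variance of 2-D points projected onto the unit vector at `angle` (radians)."""
    ux, uy = math.cos(angle), math.sin(angle)
    proj = [x * ux + y * uy for x, y in points]
    mean = sum(proj) / len(proj)
    return sum((p - mean) ** 2 for p in proj) / (len(proj) - 1)

def first_pc_angle(points):
    """Angle of the first principal axis, from the closed-form 2x2 covariance solution."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

# Made-up points lying roughly along the line y = x
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

pc1 = first_pc_angle(points)
best = projected_variance(points, pc1)

# Scan directions from 0 to 179 degrees; none should exceed the variance along pc1
scan_max = max(projected_variance(points, math.radians(d)) for d in range(180))
```

For these points the first principal axis sits close to 45 degrees, as one would expect from data scattered along y = x, and no scanned direction gives a larger projected variance than `best`.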