This formula-free summary gives a short overview of how PCA (principal component analysis) works for dimension reduction, that is, for replacing a large set of n features (also called variables) with k features, where k is much smaller than n. The k features built with PCA are the best set of k features in the sense that they minimize the variance of the residual noise when fitting the data to a linear model. Note that PCA does not pick k of the original features: it transforms the initial features into new ones that are linear combinations of the original features.
Steps for PCA
The PCA algorithm proceeds as follows:
The proportion of the variance that each eigenvector represents can be calculated by dividing the eigenvalue corresponding to that eigenvector by the sum of all eigenvalues.
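As an illustration, here is a minimal sketch of PCA computed with the covariance method in Python with NumPy, including the eigenvalue ratio described above. The function name `pca_covariance`, the synthetic data, and the choice k = 2 are assumptions made for this example only; they do not come from the article.

```python
import numpy as np

def pca_covariance(X, k):
    """Sketch of PCA via the covariance method.

    X: array with one row per observation and one column per feature.
    k: number of new features (principal components) to keep.
    """
    # Center each feature so the covariance is computed around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (n x n, symmetric).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Reorder so the largest eigenvalue (most variance) comes first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Proportion of variance per eigenvector: eigenvalue / sum of all eigenvalues.
    explained_ratio = eigenvalues / eigenvalues.sum()
    # New features: linear combinations of the original (centered) features.
    scores = X_centered @ eigenvectors[:, :k]
    return scores, eigenvalues, explained_ratio

# Synthetic data (assumption): 5 features driven by 2 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

scores, eigenvalues, explained_ratio = pca_covariance(X, k=2)
print("Proportion of variance per component:", explained_ratio)
print("Proportion kept by the first 2 components:", explained_ratio[:2].sum())
```

In practice, k is often chosen by looking at the cumulative sum of these proportions and keeping enough components to reach a target share of the variance.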
Caveats
If the original features are highly correlated, the solution can be very unstable. Also, the new features are linear combinations of the original features and thus may be hard to interpret. The data does not need to be multinormal, unless you use this technique for predictive modeling with normal models to compute confidence intervals.
For implementation details, see the Wikipedia article on principal component analysis. It is a very long article, but you can focus on the section entitled Computing PCA using the covariance method.
Comment
PCA has many applications in the finance industry; I plan to write a paper in this area. This paper used PCA to detect and mitigate the risk of highly correlated feature variables (which is common in finance): https://ssrn.com/abstract=2967184
Well, PCA is not "based on" rotation matrices (it's from 1900 after all) but "an example of" a rotation matrix.
PCA became clear to me when I realized that it is based on the computer graphics tool called a rotation matrix.
https://en.wikipedia.org/wiki/Rotation_matrix
A rotation matrix rotates a shape in 2-space or 3-space, keeping the same area (volume). Take a 100x3 matrix, consider the rows as points in 3-space, and the columns as dimensions x, y, and z. PCA creates a rotation matrix which gives the largest distance in (x,y) between points in (x,y,z). This gives the most dramatic visualization when you plot in 2D using (x,y) and dropping (z).
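To make the rotation picture concrete, here is a small sketch in Python with NumPy. The random 100x3 point cloud and all variable names are assumptions for illustration. It builds the matrix of eigenvectors of the covariance matrix, checks that it behaves like a rotation (orthogonal, distance-preserving), and keeps the first two rotated coordinates for a 2D plot. Strictly speaking, the eigenvector matrix may come out as a reflection, so the sketch flips one axis when needed to get a proper rotation.

```python
import numpy as np

# 100 points in 3-space (assumption: random cloud stretched unevenly along each axis).
rng = np.random.default_rng(1)
points = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.3])
# Mix the axes so the main spread is not aligned with x, y, or z.
points = points @ np.array([[0.6, 0.8, 0.0],
                            [-0.8, 0.6, 0.0],
                            [0.0, 0.0, 1.0]]) @ np.array([[1.0, 0.0, 0.0],
                                                          [0.0, 0.6, 0.8],
                                                          [0.0, -0.8, 0.6]])

centered = points - points.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Columns sorted by decreasing eigenvalue: the candidate rotation matrix.
R = eigenvectors[:, np.argsort(eigenvalues)[::-1]]
# An orthogonal matrix with determinant -1 is a reflection; flip one axis to fix that.
if np.linalg.det(R) < 0:
    R[:, -1] *= -1

rotated = centered @ R

# R is orthogonal, so distances between points are unchanged by the rotation.
print("Orthogonal:", np.allclose(R.T @ R, np.eye(3)))
d_before = np.linalg.norm(centered[0] - centered[1])
d_after = np.linalg.norm(rotated[0] - rotated[1])
print("Distance preserved:", np.isclose(d_before, d_after))

# After rotation, the variance is concentrated in the first axes, in decreasing order.
variances = rotated.var(axis=0, ddof=1)
print("Variance along new axes:", variances)

# Keep the new (x, y) and drop the new z for a 2D plot.
xy = rotated[:, :2]
```

The variance along the new x and y axes is the largest that any pair of orthogonal directions can capture, which is why plotting `xy` spreads the points out as much as possible.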
© 2020 Data Science Central ®