How to quantify variance explained from PCA?

SeanC on 4 Oct 2017
Answered: Ayush Aniket on 23 Jan 2025 at 4:41
Hi,
I want to quantify the amount of variance explained by PCA. However, I want to define the PCs using one half of my data, and test it using the other half as follows:
[COEFF,SCORE,latent,tsquare] = princomp(InputMatrix_TrainingData); % InputMatrix is an 8 x 78 matrix
RecomputedScores = InputMatrix_TestData*COEFF; % Output
This works fine for recomputing the scores from the other half of the data, but how do I recompute the amount of variance explained for that half?
Thanks

Answers (1)

Ayush Aniket on 23 Jan 2025 at 4:41
To quantify the variance explained by PCA and apply the principal components derived from your training data to your test data, you can follow these steps:
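1. Calculate the principal components using your training dataset. This gives you the coefficients, the percentage of variance explained in the training data, and the column means (mu) that pca subtracts before computing the components.
% Perform PCA on the training data
[COEFF, SCORE, latent, tsquare, explained, mu] = pca(InputMatrix_TrainingData);
2. Use the coefficients obtained from the training data to project your test data onto the principal component space. Because pca centers the training data by default, subtract the same training means from the test data before projecting.
% Recompute scores for the test data using the training means and coefficients
% (implicit expansion needs R2016b or later; use bsxfun(@minus, ...) on older releases)
RecomputedScores = (InputMatrix_TestData - mu) * COEFF;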
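As a quick sanity check on the projection step, the sketch below uses made-up random data (the names Xtrain, scoreCheck, and maxAbsDiff are placeholders, not from the question) to confirm that subtracting mu and multiplying by the coefficients reproduces the scores pca returns for the training set. It assumes the default pca options, where centering is on.

```matlab
% Synthetic stand-in data (hypothetical): 40 observations, 5 variables
rng(0);
Xtrain = randn(40, 5);

% pca centers the data and returns the column means it removed as mu
[coeff, score, ~, ~, ~, mu] = pca(Xtrain);

% Projecting the centered training data should reproduce score
scoreCheck = (Xtrain - mu) * coeff;
maxAbsDiff = max(abs(scoreCheck(:) - score(:)));
fprintf('Largest difference from the pca scores: %g\n', maxAbsDiff);
```

If the difference is at round-off level, the same two operations applied to the test data give scores in the same coordinate system as the training scores.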
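3. To assess how much of the test data's variance the training-derived components capture, compare the variance of the projected test data with the total variance of the original test data. Do this per component, or cumulatively for the first k components: if COEFF holds as many components as there are variables, the projection is only a rotation, so the sum over all components is always 100%.
% Calculate the variance explained in the test data
% 1. Total variance of the original test data (summed over variables)
totalVarianceTestData = sum(var(InputMatrix_TestData));
% 2. Variance of the test data along each training-derived component
componentVarianceTestData = var(RecomputedScores);
% 3. Percentage of test variance explained by each component
percentageExplainedTestData = (componentVarianceTestData / totalVarianceTestData) * 100;
% 4. Cumulative percentage explained by the first k components (k-th entry)
cumulativeExplainedTestData = cumsum(percentageExplainedTestData);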
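To see the whole train/test procedure end to end, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the data are generated from two hidden factors plus noise, the split is simply the first and second half of the rows, and the variable names are invented. It compares the cumulative percentage of variance explained on the training half with the same quantity on the held-out half.

```matlab
rng(1);
% Hypothetical data: 200 observations of 6 variables driven by 2 hidden factors
hiddenFactors = randn(200, 2);
mixing = randn(2, 6);
X = hiddenFactors * mixing + 0.3 * randn(200, 6);

% Split into training and test halves
Xtrain = X(1:100, :);
Xtest  = X(101:end, :);

% Fit PCA on the training half only
[coeff, ~, ~, ~, explainedTrain, mu] = pca(Xtrain);

% Project the test half using the training means and coefficients
testScores = (Xtest - mu) * coeff;

% Percentage of test variance along each training-derived component
totalVarTest = sum(var(Xtest));
explainedTest = 100 * var(testScores) / totalVarTest;
cumExplainedTest = cumsum(explainedTest);

fprintf('Cumulative %% explained (train): %s\n', mat2str(cumsum(explainedTrain)', 4));
fprintf('Cumulative %% explained (test):  %s\n', mat2str(cumExplainedTest, 4));
```

The last entry of the test curve equals 100 here because all six components are kept, which is why the informative numbers are the entries for small k. Unlike the training values, the per-component test percentages are not guaranteed to be in decreasing order.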
Note: I have used the pca function because princomp is obsolete and pca replaces it. princomp does not return explained, so if you are still using it you can compute the percentage of variance explained by each component from latent as follows:
% Calculate the total variance
totalVariance = sum(latent);
% Compute the percentage of variance explained by each principal component
explained = (latent / totalVariance) * 100;
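A short check of that formula, again on made-up random data with placeholder names: the percentages derived from latent should match the explained output of pca to within round-off.

```matlab
rng(2);
% Hypothetical data: 30 observations, 4 variables
X = randn(30, 4);

% latent holds the component variances; explained holds their percentages
[~, ~, latent, ~, explained] = pca(X);

% Same quantity computed by hand from latent
explainedFromLatent = 100 * latent / sum(latent);
maxAbsDiff = max(abs(explainedFromLatent - explained));
fprintf('Largest difference: %g\n', maxAbsDiff);
```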
