Matlab principal component analysis (eigenvalue order)

Posted on 2024-10-17 02:13:05

I want to use Matlab's "princomp" function, but it returns the eigenvalues in a sorted array, so I can't tell which eigenvalue corresponds to which column.
For Matlab,

m = [1,2,3;4,5,6;7,8,9];
[pc,score,latent] = princomp(m);

is the same as

m = [2,1,3;5,4,6;8,7,9];
[pc,score,latent] = princomp(m);

That is, swapping the first two columns does not change anything. The result (eigenvalues) in latent will be (27,0,0) in both cases.
The information about which eigenvalue corresponds to which original (input) column is lost.
Is there a way to tell Matlab not to sort the eigenvalues?

枫林﹌晚霞¤ 2024-10-24 02:13:05

With PCA, each principal component returned is a linear combination of the original columns/dimensions. Perhaps an example will clear up any misunderstanding.

Let's consider the Fisher-Iris dataset, comprising 150 instances and 4 dimensions, and apply PCA to it. To make things easier to understand, I first zero-center the data before calling the PCA function:

load fisheriris
X = bsxfun(@minus, meas, mean(meas));    %# so that mean(X) is the zero vector

[PC score latent] = princomp(X);

Let's look at the first returned principal component, i.e. the 1st column of the PC matrix, PC(:,1). It is expressed as a linear combination of the original dimensions:

PC1 = 0.36139*dim1 - 0.084523*dim2 + 0.85667*dim3 + 0.35829*dim4

Therefore, to express the same data in the new coordinate system formed by the principal components, the new first dimension should be a linear combination of the original ones according to the above formula.

We can compute this simply as X*PC, which is exactly what is returned in the second output of PRINCOMP (score). To confirm this, try:

>> all(all( abs(X*PC - score) < 1e-10 ))
    1
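As a rough, toolbox-free sketch of the same idea, the first new coordinate can be built by hand from the weights and compared with the matrix product. The small matrix below is made-up toy data (not the iris set), and the names byHand, byMatrix and lambda are only illustrative; the components are obtained from eig(cov(X)) rather than PRINCOMP.

```matlab
% Hypothetical toy data: 5 observations (rows) of 3 variables (columns)
X = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
X = bsxfun(@minus, X, mean(X));            % center each column

[V, D] = eig(cov(X));
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);                           % columns are the principal components

pc1 = V(:, 1);                             % weights of the first component
byHand   = pc1(1)*X(:,1) + pc1(2)*X(:,2) + pc1(3)*X(:,3);
byMatrix = X * pc1;                        % first column of X*V

disp(max(abs(byHand - byMatrix)))          % difference is at rounding level
disp([var(byHand) lambda(1)])              % variance along PC1 equals the top eigenvalue
```

The second line of output illustrates why the eigenvalues are tied to components rather than to input columns: each one is the variance of the data along a direction that mixes all the original columns.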

Finally, the importance of each principal component can be determined by how much of the variance of the data it explains. This is returned by the third output of PRINCOMP (latent).
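To make "how much variance it explains" concrete, here is a minimal sketch that turns the eigenvalues into percentages. It uses made-up toy data and computes the eigenvalues with eig(cov(X)) instead of PRINCOMP; the names explained and cumExplained are illustrative only.

```matlab
% Hypothetical toy data: 5 observations (rows) of 3 variables (columns)
X = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
X = bsxfun(@minus, X, mean(X));          % center each column

% Eigenvalues of the covariance matrix are the per-component variances
latent = sort(eig(cov(X)), 'descend');

explained    = 100 * latent / sum(latent);   % percent of total variance per component
cumExplained = cumsum(explained);            % running total across components

disp([latent explained cumExplained])
```

The eigenvalues sum to the total variance of the data (the trace of the covariance matrix), which is why dividing by their sum gives a share per component.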

We can compute the PCA of the data ourselves without using PRINCOMP:

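[V, E] = eig( cov(X) );                  %# columns of V: eigenvectors, diag(E): eigenvalues
[E, order] = sort(diag(E), 'descend');   %# largest variance first, as PRINCOMP does
V = V(:,order);                          %# reorder the eigenvectors to match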

The eigenvectors of the covariance matrix (the columns of V) are the principal components (same as PC above, although the signs can be inverted), and the corresponding eigenvalues E represent the amount of variance explained (same as latent). Note that it is customary to sort the principal components by their eigenvalues. As before, to express the data in the new coordinates, we simply compute X*V (which should be the same as score above, if you make sure to match the signs).
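Coming back to the column-swap example in the question, the sketch below suggests where the column information actually lives. It uses made-up toy data with distinct eigenvalues (the matrix in the question has two zero eigenvalues, so its eigenvectors are not unique), and the variable names are illustrative. Swapping two input columns leaves the eigenvalues unchanged but swaps the corresponding rows of the eigenvector matrix.

```matlab
% Hypothetical toy data: 5 observations (rows) of 3 variables (columns)
A = [2 0 1; 4 1 3; 6 1 2; 8 3 5; 10 2 4];
B = A(:, [2 1 3]);                       % same data with columns 1 and 2 swapped

[Va, Da] = eig(cov(A));
[Vb, Db] = eig(cov(B));
[la, ia] = sort(diag(Da), 'descend');  Va = Va(:, ia);
[lb, ib] = sort(diag(Db), 'descend');  Vb = Vb(:, ib);

% Eigenvalues describe the data as a whole, so they do not change
disp(max(abs(la - lb)))

% The weights carry the column information: rows 1 and 2 are swapped
% (abs() ignores the arbitrary sign of each eigenvector)
disp(max(max(abs(abs(Va([2 1 3], :)) - abs(Vb)))))
```

Both displayed values should be at rounding level. So rather than trying to unsort the eigenvalues, one can read row i of V (or PC) to see how strongly input column i contributes to each component.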
