R: "'princomp' can only be used with more units than variables"

Posted 2024-11-01 09:09:39

I am using R (R Commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. I get the following error when I run k-means clustering and try to plot the result:
"'princomp' can only be used with more units than variables"

I then created a test document with 10 rows and 10 columns, which plots fine, but when I add an extra column I get the error again.
Why is this? I need to be able to plot my clusters. When I view my data set after running kmeans on it, I can see the extra results column showing which cluster each row belongs to.

Is there anything I am doing wrong? Can I get rid of this error and plot my larger sample?
Please help, I have been wrecking my head over this for a week now.
Thanks guys.

Comments (3)

宫墨修音 2024-11-08 09:09:39

The problem is that you have more variables than sample points and the principal component analysis that is being done is failing.

In the help file for princomp it explains (read ?princomp):

 ‘princomp’ only handles so-called R-mode PCA, that is feature
 extraction of variables.  If a data matrix is supplied (possibly
 via a formula) it is required that there are at least as many
 units as variables.  For Q-mode PCA use ‘prcomp’.
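
To make the difference concrete, here is a minimal sketch. The matrix `x` is made-up random data with the same shape as in the question (200 rows, 800 columns), not the asker's real data, so treat it as an illustration only.

```r
set.seed(1)

# Hypothetical stand-in for the data in the question:
# 200 units (rows) and 800 variables (columns).
x <- matrix(rnorm(200 * 800), nrow = 200, ncol = 800)

# princomp() stops with an error because there are fewer units than variables.
# tryCatch() captures the error message instead of aborting the script.
res <- tryCatch(princomp(x), error = function(e) conditionMessage(e))
print(res)

# prcomp() works from a singular value decomposition of the centered data
# and accepts the same matrix.
pca <- prcomp(x)
dim(pca$x)  # rows = units, columns = principal components
```

The component scores end up in `pca$x`, which is what you would plot.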

Principal component analysis is underspecified if you have fewer samples than dimensions.
Every data point will be its own principal component. For PCA to work, the number of instances should be significantly larger than the number of dimensions.

Simply put, you can look at the problem like this:
If you have n dimensions, you can encode up to n+1 instances using vectors that are either all 0 or have exactly one 1. And this is optimal, so PCA will do this! But it is not very helpful.
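
You can see the rank limit numerically with a small sketch. It uses random data shaped like the test case in the question (10 rows, then one extra column for 11), and it uses `prcomp` because `princomp` would refuse this input. The names `n`, `p` and `n_nonzero` are just for illustration.

```r
set.seed(1)

n <- 10  # samples (rows)
p <- 11  # dimensions (columns), one more than the number of samples
x <- matrix(rnorm(n * p), nrow = n)

pca <- prcomp(x)

# After centering, n points span at most n - 1 directions, so the last
# standard deviation is zero up to rounding error.
round(pca$sdev, 6)

# Count components that carry any variance (tolerance chosen arbitrarily).
n_nonzero <- sum(pca$sdev > 1e-8)
n_nonzero
```

With fewer samples than dimensions, the number of informative components is capped by the sample count, not by the number of variables.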

甜心小果奶 2024-11-08 09:09:39

You can use prcomp instead of princomp.
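
If you are willing to type commands rather than use the R Commander menus, a rough sketch of plotting k-means clusters with `prcomp` might look like this. The data are random placeholders, and the choice of 3 clusters is an assumption made only for the example.

```r
set.seed(1)

# Hypothetical data: 200 rows, 800 columns (replace with your own matrix).
x <- matrix(rnorm(200 * 800), nrow = 200)

# Cluster the rows; 3 centers is an arbitrary choice for this sketch.
km <- kmeans(x, centers = 3, iter.max = 50)

# Project the rows onto principal components with prcomp().
pca <- prcomp(x)

# Plot the first two components, colored by cluster membership.
plot(pca$x[, 1:2], col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "k-means clusters on the first two principal components")
```

Each point is one row of the data, and the color shows which cluster `kmeans` assigned it to.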
