Cluster co-occurrence pie chart for about 10 factors in R
I've got a two-column dataset with about 30000 clusters and 10 factors like this:
cluster-1 Factor1
cluster-1 Factor2
...
cluster-2 Factor2
cluster-2 Factor3
...
I would like to represent the co-occurrence of factors across the set of clusters, something like "Factor1+Factor3+Factor5 in 1234 clusters", and so on for the other combinations. I thought I could do something like a pie chart, but with 10 factors I take it there could be too many combinations.
What would be a good way of representing this?
Comments (1)
There is one good programming question in here that should be addressed:
How do I count the number of co-occurrences of factors in the different clusters?
First simulate some data:
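A minimal sketch of such a simulation, not the answer's original code: it builds a two-column data frame in the layout shown in the question. The names `dat`, `n_clusters`, and `sizes`, the seed, and the choice of 1 to 4 factors per cluster are all assumptions made for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

n_clusters <- 1000  # the question has about 30000; fewer here for speed
n_factors  <- 10

# Assumption: each cluster carries between 1 and 4 distinct factors
sizes <- sample(1:4, n_clusters, replace = TRUE)

dat <- data.frame(
  # Repeat each cluster name once per factor it carries
  cluster = rep(paste0("cluster-", seq_len(n_clusters)), times = sizes),
  # Draw k distinct factor indices for each cluster (no repeats within a cluster)
  factor  = unlist(lapply(sizes, function(k) {
    paste0("Factor", sort(sample(n_factors, k)))
  })),
  stringsAsFactors = FALSE
)

head(dat)
```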
Then here is the code that could be used to tabulate the number of times each combination of factors occurs in the clusters:
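One possible way to do the tabulation, sketched with base R's `tapply` and `table` on a small hand-made data frame. The data and the names `dat`, `combo`, and `combo_counts` are hypothetical; this is not necessarily the approach the answer originally used.

```r
# Small hand-made example in the same two-column layout as the question
dat <- data.frame(
  cluster = c("cluster-1", "cluster-1", "cluster-2", "cluster-2",
              "cluster-3", "cluster-3", "cluster-4"),
  factor  = c("Factor1", "Factor2", "Factor2", "Factor3",
              "Factor2", "Factor1", "Factor3"),
  stringsAsFactors = FALSE
)

# Collapse each cluster's factors into one label such as "Factor1+Factor2".
# Sorting first makes the label independent of the row order in the data.
combo <- tapply(dat$factor, dat$cluster,
                function(f) paste(sort(unique(f)), collapse = "+"))

# Count how many clusters share each combination
combo_counts <- table(combo)
combo_counts
```

Only combinations that actually occur get a row, so the table stays far smaller than the 1023 non-empty subsets that 10 factors could produce in theory.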
This can be represented as a simple pie chart, but simple counts like this are often displayed most efficiently as a sorted table. For more on this, see the work of Edward Tufte.
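Both presentations can be sketched in a few lines of base R. The counts below are made-up numbers used purely for illustration, and the names `combo_counts`, `sorted_counts`, and `result` are hypothetical.

```r
# Hypothetical counts of clusters per factor combination (made-up numbers)
combo_counts <- c("Factor1+Factor2" = 120, "Factor2+Factor3" = 45,
                  "Factor3" = 300, "Factor1+Factor3+Factor5" = 12)

# Sorted table: most frequent combinations first, with each one's share
sorted_counts <- sort(combo_counts, decreasing = TRUE)
result <- data.frame(
  combination = names(sorted_counts),
  clusters    = as.vector(sorted_counts),
  share       = round(as.vector(sorted_counts) / sum(sorted_counts), 3)
)
print(result, row.names = FALSE)

# The same counts as a pie chart, for comparison
pie(sorted_counts, main = "Clusters per factor combination")
```

With many combinations the pie slices become hard to compare, while the sorted table still reads from top to bottom.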