Counting co-occurrences in a data frame using ids

Posted on 2025-01-14 20:04:32

I realize there are a lot of similar questions, but they all tackle a slightly different problem, and I have been stuck for a while.

I have a data frame containing all unique combinations of two variables:

df <- data.frame(id = c('c1','c2','c3','c2','c3','c1','c3'),
                 groupid = c('g1','g1','g1','g2','g2','g3','g3'))

And I need the following output:

   c1 c2 c3
c1  3  1  2
c2  1  3  2
c3  2  2  3

In other words, I need to count how often each pair of customer ids occurs in the same group.

This seems like a basic question, but I can't figure it out. I tried:

1. making a cross join to find all possible combinations of (cid1, groupid, cid2),
2. looping through all of them and retrieving the unique groups that match cid1 and the unique groups that match cid2,
3. taking the length of the intersection,

but this would take forever to run, so I am looking for an efficient and preferably clean solution (using tidyr/dplyr).
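For reference, the pairwise-intersection approach described above can be sketched in base R roughly as follows. This is only an illustration of the idea, not the asker's actual code; the names groups_by_id and pair_counts are made up for this sketch. It makes one intersect() call per ordered pair of ids, which is why it scales poorly as the number of ids grows.

```r
df <- data.frame(id = c('c1','c2','c3','c2','c3','c1','c3'),
                 groupid = c('g1','g1','g1','g2','g2','g3','g3'))

# Groups each id belongs to, as a named list of character vectors
# (groups_by_id is a hypothetical helper name).
groups_by_id <- split(as.character(df$groupid), df$id)
ids <- names(groups_by_id)

# For every ordered pair of ids, count the groups they have in common.
pair_counts <- sapply(ids, function(a) {
  sapply(ids, function(b) {
    length(intersect(groups_by_id[[a]], groups_by_id[[b]]))
  })
})

pair_counts
```

With this definition, each diagonal entry is simply the number of groups that id appears in.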

新雨望断虹 2025-01-21 20:04:32

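You can get this with crossprod() applied to the frequency table of the two columns. table(df[2:1]) builds a groupid-by-id count table, and crossprod() multiplies the transpose of that table by the table itself, which adds up, for each pair of ids, the groups they share:

crossprod(table(df[2:1]))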
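To see why this works, it may help to look at the intermediate table. The sketch below rebuilds the example data from the question and splits the one-liner into two steps; the variable names m and co are only illustrative.

```r
df <- data.frame(id = c('c1','c2','c3','c2','c3','c1','c3'),
                 groupid = c('g1','g1','g1','g2','g2','g3','g3'))

# Rows are groups, columns are ids. Because every (id, groupid)
# combination is unique, each cell is 1 if the id is in the group, else 0.
m <- table(df[2:1])
m

# crossprod(m) is equivalent to t(m) %*% m: entry [i, j] sums, over all
# groups, the product of the indicator columns for ids i and j.
co <- crossprod(m)
co
```

The off-diagonal entries are the pairwise co-occurrence counts. The diagonal holds the number of groups each id appears in (2, 2 and 3 for this data), so it will not match the diagonal of the table shown in the question; if a different diagonal is wanted, it can be overwritten afterwards with diag(co) <- value.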

About the author: 兮子
