R: counting unique clients across overlapping categories
I have been struggling to count the number of clients who did or did not convert in each of the 3 single-category scenarios (App, Desktop, Web) and in the overlapping scenarios (App & Web, App & Desktop, Web & Desktop, App & Web & Desktop). Here is the sample dataset I am working on.
I could get the single-category counts with the aggregate and group_by functions in R, but I can't figure out how to handle the overlapping categories.
Thanks so much to anyone who can help me with this!
df <- data.frame(
  ClientID   = c("1", "1", "1", "2", "2", "3", "3", "3", "3", "4"),
  device     = c("App", "Web", "App", "Web", "Web", "App", "Desktop", "App", "App", "Web"),
  conversion = c("0", "0", "0", "0", "1", "1", "0", "1", "0", "1")
)
Below is the desired outcome:
Scenario               With Conversion    Without Conversion
App
Web
Desktop
App & Web
App & Desktop
Web & Desktop
App & Desktop & Web
Answers (2)
You can prepare a list of the device combinations (all 7 possibilities) with combn(), and then use setequal() and unique() in a helper function f. The helper takes a list of devices and indicates which of the device combinations the unique list of devices matches; because it compares with unique() and setequal(), it can match at most one. Then group by conversion and ClientID, apply the function, unnest, count by conversion and Scenario, and pivot to your desired wide format.
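The steps described above can be sketched roughly as follows. This is not the answerer's original code. It is a minimal sketch that assumes the dplyr and tidyr packages are installed, that the three devices listed in `devices` are the only ones that occur, and that the output column names (`conversion_0`, `conversion_1`) are acceptable placeholders. Because the helper here returns a single label per client, no explicit unnest step is needed.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  ClientID   = c("1", "1", "1", "2", "2", "3", "3", "3", "3", "4"),
  device     = c("App", "Web", "App", "Web", "Web", "App", "Desktop", "App", "App", "Web"),
  conversion = c("0", "0", "0", "0", "1", "1", "0", "1", "0", "1")
)

# Assumption: these are the only devices that can appear
devices <- c("App", "Web", "Desktop")

# All 7 non-empty device combinations, built with combn()
combos <- unlist(
  lapply(seq_along(devices), function(k) combn(devices, k, simplify = FALSE)),
  recursive = FALSE
)
names(combos) <- vapply(combos, paste, character(1), collapse = " & ")

# Helper: name of the one combination equal to the set of devices used
f <- function(d) {
  hit <- vapply(combos, function(cmb) setequal(cmb, unique(d)), logical(1))
  names(combos)[hit]
}

result <- df %>%
  group_by(conversion, ClientID) %>%
  summarise(Scenario = f(device), .groups = "drop") %>%  # one scenario per client
  count(conversion, Scenario) %>%                        # clients per scenario
  pivot_wider(names_from = conversion, values_from = n,
              names_prefix = "conversion_", values_fill = 0)

print(result)
```

Only scenarios that actually occur in the data show up as rows. If all seven are wanted, something like tidyr's complete() could be used to fill in the missing ones with zeros.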
I thought I would add my answer in addition to the one that's already here.
Updated based on your comment
This doesn't look all that neat, but it does do what you're expecting. Although the other answer looks far better.
I started by collecting the count of unique customers by conversion by device.
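As an illustration of this first step (not the answer's own code), here is a minimal base R sketch. It assumes that "unique customers" means each client is counted once per device and conversion marker, however many rows they have; the names `uniq` and `single` are made up for the example.

```r
df <- data.frame(
  ClientID   = c("1", "1", "1", "2", "2", "3", "3", "3", "3", "4"),
  device     = c("App", "Web", "App", "Web", "Web", "App", "Desktop", "App", "App", "Web"),
  conversion = c("0", "0", "0", "0", "1", "1", "0", "1", "0", "1")
)

# Drop repeated client/device/conversion rows so each client counts once
uniq <- unique(df[, c("ClientID", "device", "conversion")])

# Rows are devices, columns are the conversion markers "0" and "1"
single <- table(uniq$device, uniq$conversion)
print(single)
```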
Then I wanted the combinations. There are only three here, so I could probably write the combinations faster than coding them. However, I've provided a more dynamic approach.
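One way to generate the combinations dynamically, sketched here under the assumption that the overlapping scenarios are all subsets of two or more devices, is to loop combn() over the subset sizes. The names `combos` and `labels` are hypothetical; with real data, `devices` could be taken from `unique(df$device)`.

```r
# Assumption: the devices are known up front
devices <- c("App", "Web", "Desktop")

# combn(devices, k) enumerates every k-device subset; k runs from 2 to all devices
combos <- unlist(
  lapply(2:length(devices), function(k) combn(devices, k, simplify = FALSE)),
  recursive = FALSE
)

# Human-readable scenario labels such as "App & Web"
labels <- vapply(combos, paste, character(1), collapse = " & ")
print(labels)
```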
Now I'll use the data frame df to calculate the counts for the combinations determined in dbls. Last, but not least, I combined the two frames.
I don't know which conversion (0 or 1) is with or without, so I just left the markers.
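The answer's code is not shown here, so the following is only a base R sketch of what the last two steps might look like. It makes a loud assumption: a client counts toward a combination when, under the same conversion marker, they used every device in that combination (they may also have used others). That is an inclusive reading and differs from the exact-match reading in the other answer. The helper `count_combo` and the other names are hypothetical.

```r
df <- data.frame(
  ClientID   = c("1", "1", "1", "2", "2", "3", "3", "3", "3", "4"),
  device     = c("App", "Web", "App", "Web", "Web", "App", "Desktop", "App", "App", "Web"),
  conversion = c("0", "0", "0", "0", "1", "1", "0", "1", "0", "1")
)

# One row per distinct client/device/conversion, so repeat rows count once
uniq <- unique(df)
uniq$conversion <- factor(uniq$conversion, levels = c("0", "1"))

# Frame 1: unique clients per single device (rows) and conversion marker (columns)
single <- table(uniq$device, uniq$conversion)

# Overlapping combinations of two or more devices
devices <- unique(df$device)
combos <- unlist(
  lapply(2:length(devices), function(k) combn(devices, k, simplify = FALSE)),
  recursive = FALSE
)

# Assumption: a client counts when every device in the combination was used
count_combo <- function(cmb) {
  rows <- uniq[uniq$device %in% cmb, ]
  per_client <- table(rows$ClientID, rows$conversion)  # devices used per client
  colSums(per_client == length(cmb))
}

# Frame 2: one row per combination
overlap <- t(sapply(combos, count_combo))
rownames(overlap) <- vapply(combos, paste, character(1), collapse = " & ")

# Combine the two frames; columns keep the raw markers "0" and "1"
result <- rbind(single, overlap)
print(result)
```

The columns are left as "0" and "1" for the same reason the answer gives: the data does not say which marker means "with conversion".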