Calculate the percentage of observations whose value switches

Posted 2025-02-13 15:22:03 | 529 characters | 0 views | 0 comments

I have a dataset with two columns. One column indicates the group, and each group has exactly two rows. The second column holds the category. I would like to calculate the percentage of groups whose two rows do not share the same category. For example, in rows 1 and 2 the category differs, while in rows 3 and 4 it is the same. For the data provided, the result should be 66.66%, because the category changes in four of the six groups and stays the same in the other two.

This is my data:

structure(list(Group = c("A", "A", "B", "B", "C", "C", "D", "D", 
"E", "E", "F", "F"), Category = c(1L, 2L, 3L, 3L, 5L, 6L, 7L, 
7L, 7L, 6L, 5L, 4L)), class = "data.frame", row.names = c(NA, 
-12L))

I have tried the following so far:

Data <- Data %>%
  group_by(Group) %>%
  count(n())

But I don't know how to write the last line to get the desired percentage. Could someone help me here?

Comments (3)

中性美 2025-02-20 15:22:03

A base solution with tapply():

mean(with(df, tapply(Category, Group, \(x) length(unique(x)))) > 1)

# [1] 0.6666667

With dplyr, you could use n_distinct() to count the number of unique values.

library(dplyr)

df %>%
  group_by(Group) %>%
  summarise(N = n_distinct(Category)) %>%
  summarise(Percent = mean(N > 1))

# # A tibble: 1 × 1
#   Percent
#     <dbl>
# 1   0.667
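
To run the above end to end, the question's data has to be stored under the name `df` (the question itself calls it `Data`). The following is a minimal sketch of the same idea split into named steps, assuming a change is defined as more than one distinct category within a group; the names `n_per_group`, `changed`, and `pct_changed` are only illustrative.

```r
# Sample data from the question, stored under the name `df` used above
df <- data.frame(
  Group = rep(c("A", "B", "C", "D", "E", "F"), each = 2),
  Category = c(1L, 2L, 3L, 3L, 5L, 6L, 7L, 7L, 7L, 6L, 5L, 4L)
)

# Number of distinct categories per group (one named entry per group)
n_per_group <- tapply(df$Category, df$Group, function(x) length(unique(x)))

# TRUE for groups in which the category changes
changed <- n_per_group > 1

# Share of groups with a change, expressed as a percentage
pct_changed <- 100 * mean(changed)
round(pct_changed, 2)
# [1] 66.67
```

Taking the mean of a logical vector gives the proportion of `TRUE` values, which is why no explicit counting is needed.
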
清风无影 2025-02-20 15:22:03

To show it for both classes, you can use the following code:

library(dplyr)
Data %>%
  group_by(Group) %>%
  mutate(unique = as.numeric(n_distinct(Category) == 1)) %>%
  ungroup() %>%
  summarise(Percent = prop.table(table(unique)))

Output:

# A tibble: 2 × 1
  Percent  
  <table>  
1 0.6666667
2 0.3333333
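
Two caveats may be worth keeping in mind. The `mutate()` step tabulates rows rather than groups, so the shares are right here only because every group has exactly two rows. In addition, returning more than one row from `summarise()` is deprecated as of dplyr 1.1.0. One possible variant, sketched under the assumption that dplyr is installed, collapses to one row per group first and labels both classes; the column names `same`, `n`, and `Percent` are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Sample data from the question
Data <- data.frame(
  Group = rep(c("A", "B", "C", "D", "E", "F"), each = 2),
  Category = c(1L, 2L, 3L, 3L, 5L, 6L, 7L, 7L, 7L, 6L, 5L, 4L)
)

result <- Data %>%
  group_by(Group) %>%
  # One row per group: TRUE if both rows share the same category
  summarise(same = n_distinct(Category) == 1, .groups = "drop") %>%
  # Count groups in each class, then convert counts to proportions
  count(same) %>%
  mutate(Percent = n / sum(n))

result
```

Because the proportions are computed over groups, this version should stay correct if some groups had more than two rows.
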
憧憬巴黎街头的黎明 2025-02-20 15:22:03

Using base R

counts <- table(df)
prop.table(table(rowSums(counts != 0)))

-output

        1         2 
0.3333333 0.6666667 
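
As a rough explanation of why this works: `table(df)` cross-tabulates the two columns, giving one row per group and one column per category. `counts != 0` marks which categories occur in each group, so `rowSums()` yields the number of distinct categories per group. A small sketch with the question's data makes the intermediate step visible; the names `n_distinct_per_group` and `shares` are only for illustration.

```r
# Sample data from the question, stored as `df`
df <- data.frame(
  Group = rep(c("A", "B", "C", "D", "E", "F"), each = 2),
  Category = c(1L, 2L, 3L, 3L, 5L, 6L, 7L, 7L, 7L, 6L, 5L, 4L)
)

# Contingency table: rows are groups, columns are categories
counts <- table(df)

# Number of distinct categories observed in each group
n_distinct_per_group <- rowSums(counts != 0)
n_distinct_per_group
# A B C D E F 
# 2 1 2 1 2 2 

# Proportion of groups with 1 versus 2 distinct categories
shares <- prop.table(table(n_distinct_per_group))
```

The entry named `2` in `shares` is the proportion of groups whose category changes.
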