R: Calculate the percentage of observations in a column that are below a certain value for panel data

Posted on 2025-02-01 13:08:37

I have panel data and I would like to get the percentage of observations in a column (Size) that are below 1 million.

My data is the following:

structure(list(Product = c("A", "A", "A", "A", "A", "A", "B", 
"B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), Date = c("02.05.2018", 
"04.05.2018", "05.05.2018", "06.05.2018", "07.05.2018", "08.05.2018", 
"02.05.2018", "04.05.2018", "05.05.2018", "06.05.2018", "07.05.2018", 
"08.05.2018", "02.05.2018", "04.05.2018", "05.05.2018", "06.05.2018", 
"07.05.2018", "08.05.2018"), Size = c(100023423, 1920, 2434324342, 
2342353566, 345345345, 432, 1.35135e+11, 312332, 23434, 4622436246, 
3252243, 234525, 57457457, 56848648, 36363546, 36535636, 2345, 
2.52646e+11)), class = "data.frame", row.names = c(NA, -18L))

So, for instance, for Product A it would be 33.33%, since 2 out of 6 observations are below one million.

I have tried the following in R:

df <- df %>%
  group_by(Product) %>%
  dplyr:: summarise(CountDate = n(), SmallSize = count(Size<1000000))

However, I get an error saying "no applicable method for 'count' applied to an object of class "logical"", even though the column Size is of type double.

After the code above I would then calculate SmallSize/CountDate to get the percentage.

What do I need to adjust to not get the error message?

Comments (1)

檐上三寸雪 2025-02-08 13:08:37

Instead of count, which requires a data.frame/tibble, use sum on a logical vector to get the count: TRUE values are counted as 1 and FALSE as 0.

library(dplyr)
df %>%
  group_by(Product) %>%
  dplyr::summarise(CountDate = n(),
     SmallSize = sum(Size < 1000000, na.rm = TRUE), .groups = "drop") %>%
  dplyr::mutate(Percent = SmallSize / CountDate)
# A tibble: 3 × 4
  Product CountDate SmallSize Percent
  <chr>       <int>     <int>   <dbl>
1 A               6         2   0.333
2 B               6         3   0.5  
3 C               6         1   0.167
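
To see why sum works here, the following base R sketch applies the same idea to the six Size values of Product A, copied from the question. The variable names size_a, below, small_size and count_date are only illustrative and do not appear in the original code.

```r
# Sizes for Product A, copied from the question's data
size_a <- c(100023423, 1920, 2434324342, 2342353566, 345345345, 432)

# Comparing a numeric vector with a threshold yields a logical vector
below <- size_a < 1000000

# sum() coerces TRUE to 1 and FALSE to 0, so it counts the TRUE values
small_size <- sum(below)
count_date <- length(size_a)

small_size               # 2
small_size / count_date  # 0.3333333
```

This matches the 2 out of 6 (33.33%) stated in the question for Product A.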

Also, we don't need to create both columns. The proportion can be calculated directly with mean, since the mean of a logical vector is the share of TRUE values:

df %>%
    group_by(Product) %>%
    dplyr::summarise(Percent = mean(Size < 1000000, na.rm = TRUE))
# A tibble: 3 × 2
  Product Percent
  <chr>     <dbl>
1 A         0.333
2 B         0.5  
3 C         0.167
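
If dplyr is not available, or if the result should be on a 0-100 percent scale as in the question, the same calculation could be sketched in base R with tapply. This is only one possible alternative, not part of the original answer; the data frame is rebuilt from the question's Product and Size columns, and pct_small is an illustrative name.

```r
# Product and Size columns rebuilt from the question's data
df <- data.frame(
  Product = rep(c("A", "B", "C"), each = 6),
  Size = c(100023423, 1920, 2434324342, 2342353566, 345345345, 432,
           1.35135e+11, 312332, 23434, 4622436246, 3252243, 234525,
           57457457, 56848648, 36363546, 36535636, 2345, 2.52646e+11)
)

# Mean of the logical vector per product gives the share below the
# threshold; multiplying by 100 converts it to a percentage
pct_small <- tapply(df$Size < 1000000, df$Product, mean) * 100

round(pct_small, 2)
#     A     B     C
# 33.33 50.00 16.67
```

If Size can contain missing values, na.rm = TRUE can be passed through tapply to mean, as in the dplyr version above.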