R: Calculate the percentage of observations in a column that are below a certain value for panel data
I have panel data and I would like to get the percentage of observations in a column (Size) that are below 1 million.
My data is the following:
structure(list(Product = c("A", "A", "A", "A", "A", "A", "B",
"B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C"), Date = c("02.05.2018",
"04.05.2018", "05.05.2018", "06.05.2018", "07.05.2018", "08.05.2018",
"02.05.2018", "04.05.2018", "05.05.2018", "06.05.2018", "07.05.2018",
"08.05.2018", "02.05.2018", "04.05.2018", "05.05.2018", "06.05.2018",
"07.05.2018", "08.05.2018"), Size = c(100023423, 1920, 2434324342,
2342353566, 345345345, 432, 1.35135e+11, 312332, 23434, 4622436246,
3252243, 234525, 57457457, 56848648, 36363546, 36535636, 2345,
2.52646e+11)), class = "data.frame", row.names = c(NA, -18L))
So, for instance, for Product A it would be 33.33%, since two out of six observations are below one million.
I have tried the following in R
df <- df %>%
group_by(Product) %>%
dplyr:: summarise(CountDate = n(), SmallSize = count(Size<1000000))
However, I get the following error, even though the column Size is of type double: no applicable method for 'count' applied to an object of class "logical".
After the code above I would then calculate SmallSize/CountDate to get the percentage.
What do I need to adjust to not get the error message?
Instead of count, which requires a data.frame/tibble, use sum on a logical vector to get the count: TRUE values are counted as 1 and FALSE values as 0. Also, we don't need to create both columns. The proportion can be calculated directly with mean.
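The approach described in this answer can be sketched as follows. This is a minimal example, assuming the dplyr package is installed and the data frame is named df; the column name PctSmall is introduced here for illustration and does not appear in the question. Only the Product and Size columns are rebuilt, since Date is not needed for the calculation.

```r
library(dplyr)

# Rebuild the relevant columns of the question's data (Date omitted)
df <- data.frame(
  Product = rep(c("A", "B", "C"), each = 6),
  Size = c(100023423, 1920, 2434324342, 2342353566, 345345345, 432,
           1.35135e+11, 312332, 23434, 4622436246, 3252243, 234525,
           57457457, 56848648, 36363546, 36535636, 2345, 2.52646e+11)
)

result <- df %>%
  group_by(Product) %>%
  summarise(
    CountDate = n(),                    # observations per product
    SmallSize = sum(Size < 1e6),        # TRUE counts as 1, FALSE as 0
    PctSmall  = 100 * mean(Size < 1e6)  # share below one million, in percent (hypothetical column name)
  )

print(result)
```

If only the percentage is needed, the CountDate and SmallSize columns can be dropped, because mean on the logical vector already gives SmallSize / CountDate.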