Looking for a function to conditionally find the mean of the top n values (not rows!) and return a number, not a dataframe
I have a large data frame:
percentage_activity
# A tibble: 4,437 x 3
# Groups: DATETIME [87]
DATETIME ID COUNT
<dttm> <chr> <int>
1 2020-06-07 00:00:00 Bagheera NA
2 2020-06-07 00:00:00 Bagheera2 0
3 2020-06-07 00:00:00 Baloo img 0
4 2020-06-07 00:00:00 Banna NA
5 2020-06-07 00:00:00 Blair 158
6 2020-06-07 00:00:00 Carol NA
In it, I would like to calculate the mean of the top 5 COUNT values for a specific ID and then, in a for loop, express every COUNT value as a percentage of that mean, so that the top-5 mean corresponds to 100% for that ID.
To do that, I would prefer to get the mean not as a data frame covering all individuals but as a single number for the desired ID, which I can then use as a variable inside the for loop.
I'm trying to adapt a loop that worked on the same data when it was organized with a separate column for each ID. After melting the data into a single ID column, it needs adjustments:
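# readline() returns a character string, so convert it before using it as a count
max_activity <- as.integer(readline(prompt = "enter a number: "))
for (i in 2:length(percentage_activity)) {
  # mean of the max_activity largest values in column i (sort() drops NAs)
  top_mean <- mean(sort(percentage_activity[[i]], decreasing = TRUE)[1:max_activity])
  percentage_activity[[i]] <- as.numeric(percentage_activity[[i]] * 100 / top_mean)
}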
I also tried this, but I'm not sure how to proceed from here:
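for (i in unique(percentage_activity$ID)) {
  individual <- percentage_activity$ID == i
  # use $COUNT so mean() gets a numeric vector; [rows, "COUNT"] on a tibble returns a tibble
  id_mean <- mean(percentage_activity$COUNT[individual], na.rm = TRUE)
}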
Maybe this will help:
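The code that originally followed this line is not preserved here, so what follows is only a minimal base R sketch of one possible approach, not the original answer. The helper name `top_n_mean`, the `PERCENT` column, and the small sample data are all assumptions made for illustration. The helper returns a single number for one ID, and `ave()` then applies the same scaling within each ID so that no explicit loop is needed.

```r
# Hypothetical helper (not from the original post): mean of the n largest
# non-missing values of a numeric vector, returned as a single number.
top_n_mean <- function(x, n = 5) {
  top <- head(sort(x, decreasing = TRUE), n)  # sort() drops NA by default
  if (length(top) == 0) return(NA_real_)      # this ID has no non-missing values
  mean(top)
}

# Made-up sample in the same long layout (DATETIME omitted for brevity).
percentage_activity <- data.frame(
  ID    = c("Blair", "Blair", "Blair", "Carol", "Carol", "Carol"),
  COUNT = c(158, 100, 42, NA, 10, 30)
)

# A single number for one ID (n = 2 only because the sample is tiny).
blair_mean <- top_n_mean(
  percentage_activity$COUNT[percentage_activity$ID == "Blair"],
  n = 2
)

# Every COUNT as a percentage of its own ID's top-n mean.
percentage_activity$PERCENT <- ave(
  percentage_activity$COUNT,
  percentage_activity$ID,
  FUN = function(x) x * 100 / top_n_mean(x, n = 2)
)
```

If a for loop is preferred, the same helper can be called once per value of `unique(percentage_activity$ID)` and the result used to rescale the matching rows. With the real data, `n` would be 5 or the value read into `max_activity`.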