Total daily counts for different IDs in R

Posted 2025-01-24 21:22:38 · 1288 characters · 2 views · 0 comments

I want to get the count of cases per day, including zeros. This is my data frame example:

set.seed(1453); ID = sample(1:4, 10, TRUE)
date = c('2016-01-01', '2016-01-05', '2016-01-07',  '2016-01-12',  '2016-01-16',  '2016-01-20',
         '2016-01-20',  '2016-01-25',  '2016-01-26',  '2016-01-31')
df = data.frame(ID, date = as.Date(date))

So I know that there was one case for ID 1 on 2016-01-01, then one case for ID 1 on 2016-01-20. I want to get a data frame covering 2016-01-01 to 2016-01-31 with 1 on those two days and 0 otherwise, and I would like the same for each ID. This example shows one event per ID, but my actual data frame has up to 15 cases per day per ID.

I have used:

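M <- function(timeStamps) {
  # Note: this reads the global df$date; the timeStamps argument is unused
  Dates <- as.Date(strftime(df$date, "%Y-%m-%d"))
  # Every calendar day between the first and last observed date
  allDates <- seq(from = min(Dates), to = max(Dates), by = "day")
  # Number of cases on each day (0 where none occurred)
  Admission <- sapply(allDates, FUN = function(X) sum(Dates == X))
  data.frame(day = allDates, Admission = Admission)
}
MM <- M(df$date)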

But MM will only give me the result I want if I create a data frame for each ID.

I have done the same exercise using this example, but I get monthly aggregate results here. Ideally, I would be able to aggregate a similar data frame per day, considering 0 events per ID.

傾旎 2025-01-31 21:22:38

With the current function, we can split the 'date' column by 'ID', apply the function to each piece, and rbind the resulting list into a single data.frame with ID as an extra column:

lst1 <- lapply(split(df$date, df$ID), M)
out <- do.call(rbind, Map(cbind, ID = names(lst1), lst1))
row.names(out) <- NULL

-output

> str(out)
'data.frame':   124 obs. of  3 variables:
 $ ID       : chr  "1" "1" "1" "1" ...
 $ day      : Date, format: "2016-01-01" "2016-01-02" "2016-01-03" "2016-01-04" ...
 $ Admission: int  1 0 0 0 1 0 1 0 0 0 ...
> head(out)
  ID        day Admission
1  1 2016-01-01         1
2  1 2016-01-02         0
3  1 2016-01-03         0
4  1 2016-01-04         0
5  1 2016-01-05         1
6  1 2016-01-06         0
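
One caveat about the output above: M as posted reads df$date directly and never uses its timeStamps argument, so each ID appears to receive the counts for the whole data frame (hence the 1s on 2016-01-05 and 2016-01-07 under ID 1). A minimal base R sketch of per-ID daily counts over one shared date range follows. It relies on table() keeping zero counts for factor levels that never occur. The names all_days, counts and out_base are illustrative and not from the original post.

```r
# Rebuild the example data from the question
set.seed(1453)
ID <- sample(1:4, 10, TRUE)
date <- as.Date(c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12',
                  '2016-01-16', '2016-01-20', '2016-01-20', '2016-01-25',
                  '2016-01-26', '2016-01-31'))
df <- data.frame(ID, date)

# One shared calendar for every ID (hypothetical helper name)
all_days <- seq(min(df$date), max(df$date), by = "day")

# Cross-tabulate ID by day; days with no cases stay in the table as 0
counts <- table(
  ID  = factor(df$ID),
  day = factor(as.character(df$date), levels = as.character(all_days))
)

# Reshape to long form: one row per ID/day combination
out_base <- as.data.frame(counts, responseName = "Admission",
                          stringsAsFactors = FALSE)
out_base$ID  <- as.integer(out_base$ID)
out_base$day <- as.Date(out_base$day)
out_base <- out_base[order(out_base$ID, out_base$day), ]
row.names(out_base) <- NULL
```

Because the date range is computed once from the full data, every ID gets the same set of days, which matches the 2016-01-01 to 2016-01-31 frame the question asks for.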

Or using tidyverse, do a group by operation

library(dplyr)
library(tidyr)
df %>%
  group_by(ID) %>% 
  summarise(out = M(date), .groups = 'drop') %>%
  unpack(out)

-output

# A tibble: 124 × 3
      ID day        Admission
   <int> <date>         <int>
 1     1 2016-01-01         1
 2     1 2016-01-02         0
 3     1 2016-01-03         0
 4     1 2016-01-04         0
 5     1 2016-01-05         1
 6     1 2016-01-06         0
 7     1 2016-01-07         1
 8     1 2016-01-08         0
 9     1 2016-01-09         0
10     1 2016-01-10         0
# … with 114 more rows
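
An alternative sketch that stays in the tidyverse without calling M: dplyr::count() tallies the cases per ID and day, and tidyr::complete() then adds the ID/day combinations that never occurred, filled with 0. The column name Admission mirrors the output above, and out_tidy is an illustrative name. This is one possible approach under those assumptions, not the answerer's method.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
set.seed(1453)
ID <- sample(1:4, 10, TRUE)
date <- as.Date(c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12',
                  '2016-01-16', '2016-01-20', '2016-01-20', '2016-01-25',
                  '2016-01-26', '2016-01-31'))
df <- data.frame(ID, date)

out_tidy <- df %>%
  # Cases per ID and day, for the days that actually occur
  count(ID, day = date, name = "Admission") %>%
  # Expand to every ID on every day of the overall range; missing counts become 0
  complete(ID,
           day = seq(min(day), max(day), by = "day"),
           fill = list(Admission = 0L))
```

Here the date sequence is built from the ungrouped data, so all IDs share the same calendar rather than each ID spanning only its own first and last date.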
