summarize() does not group the data by group when used with group_by()

Posted on 2025-01-20 22:43:41

I have a large dataset of COVID-19 cases, with the number of cases for each date. The data is in the dat data frame. I am trying to summarize it by the district ID variable (IdLandkreis) and the date variable (Meldedatum). For some reason the output in the new data frame is just 1 row with the total cases for the entire period; it is not grouped by the ID and date variables. I don't know why that is. I am adding a screenshot of the dataset to show what it looks like. Can someone help?

Sample of the data. There are more than 100,000 observations in total for 44 districts; I am including just a sample with 2 different districts and dates.

library(dplyr)

# Sample data: two districts, two reporting dates each
dat <- data.frame(Landkreis = c("Sk Stuttgart", "Sk Stuttgart", "Lk Freiburg", "Lk Freiburg"),
                  AnzahlFall = c(1, 1, 1, 1), AnzahlTodesfall = c(0, 1, 2, 1),
                  Meldedatum = c("09-03-2020", "18-03-2020", "09-03-2020", "20-03-2020"),
                  IdLandkreis = c(8111, 8111, 8116, 8116))

# Sum cases and deaths per district ID and reporting date
datAggMelde <- dat %>% group_by(IdLandkreis, Meldedatum) %>%
  summarize(sumCount = sum(AnzahlFall, na.rm = TRUE),
            sumDeath = sum(AnzahlTodesfall, na.rm = TRUE),
            Landkreis = first(Landkreis))
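
A single-row result from a grouped summarize() is a symptom commonly seen when another attached package masks dplyr's function; for example, plyr's summarize() ignores dplyr grouping. The post does not say which packages are loaded, so this is only a guess at the cause. The sketch below calls the dplyr functions by explicit namespace so that masking cannot change the result. It assumes the case column is spelled AnzahlFall (as in the summarize() call) and a dplyr version of 1.0.0 or later for the .groups argument.

```r
library(dplyr)

# Sample data from the question; the column name AnzahlFall is an assumption
# chosen to match the name used inside summarize()
dat <- data.frame(
  Landkreis       = c("Sk Stuttgart", "Sk Stuttgart", "Lk Freiburg", "Lk Freiburg"),
  AnzahlFall      = c(1, 1, 1, 1),
  AnzahlTodesfall = c(0, 1, 2, 1),
  Meldedatum      = c("09-03-2020", "18-03-2020", "09-03-2020", "20-03-2020"),
  IdLandkreis     = c(8111, 8111, 8116, 8116)
)

# Explicit dplyr:: prefixes guard against masking by another attached package
datAggMelde <- dat %>%
  dplyr::group_by(IdLandkreis, Meldedatum) %>%
  dplyr::summarise(
    sumCount  = sum(AnzahlFall, na.rm = TRUE),
    sumDeath  = sum(AnzahlTodesfall, na.rm = TRUE),
    Landkreis = dplyr::first(Landkreis),
    .groups   = "drop"  # return an ungrouped result
  )

# One row per (IdLandkreis, Meldedatum) combination present in the data
print(datAggMelde)
```

If this version returns one row per district and date while the unprefixed call returns a single row, the difference points to masking. Running environmentName(environment(summarize)) in the same session shows which package's summarize() is actually being called.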
