计算观察和考虑条件

发布于 2025-02-03 14:51:10 字数 1753 浏览 3 评论 0原文

我有一个这样的数据库:

id <- c(rep(1,3), rep(2, 3), rep(3, 3))
condition <- c(0, 0, 1, 0, 0, 1, 1, 1, 0)
time_point1 <- c(1, 1, NA)
time_point2 <- c(NA, 1, NA)
time_point3 <- c(NA, NA, NA)
time_point4 <- c(1, NA, NA, 1, NA, NA, NA, NA, 1)

data <- data.frame(id, condition, time_point1, time_point2, time_point3, time_point4)
data

  id condition time_point1 time_point2 time_point3 time_point4
1  1         0           1          NA          NA           1
2  1         0           1           1          NA          NA
3  1         1          NA          NA          NA          NA
4  2         0           1          NA          NA           1
5  2         0           1           1          NA          NA
6  2         1          NA          NA          NA          NA
7  3         1           1          NA          NA          NA
8  3         1           1           1          NA          NA
9  3         0          NA          NA          NA           1

我想制作一个表格,其中有多少个条件== 1(n_x),每个时间点(n_t)中有多少个。如果我也没有一个0。我尝试了这一点:

data %>% 
  pivot_longer(cols = contains("time_point")) %>% 
  filter (!is.na(value)) %>% 
  group_by(name) %>% 
  mutate(n_t = n_distinct(id)) %>% 
  ungroup() %>% 
  filter(condition == 1) %>%
  group_by(name) %>%
  summarise(n_x = n_distinct(id), n_t = first(n_t))

获得此问题:

  name          n_x   n_t
  <chr>       <int> <int>
1 time_point1     1     3
2 time_point2     1     3

所需的结果:我想要这种类型的表格,以情况和没有状态来考虑案件:

         name n_x n_t
1 time_point1   2   6
2 time_point2   1   3
3 time_point3   0   0
4 time_point4   0   3

谢谢!

I have a database like this:

id <- c(rep(1,3), rep(2, 3), rep(3, 3))
condition <- c(0, 0, 1, 0, 0, 1, 1, 1, 0)
time_point1 <- c(1, 1, NA)
time_point2 <- c(NA, 1, NA)
time_point3 <- c(NA, NA, NA)
time_point4 <- c(1, NA, NA, 1, NA, NA, NA, NA, 1)

data <- data.frame(id, condition, time_point1, time_point2, time_point3, time_point4)
data

  id condition time_point1 time_point2 time_point3 time_point4
1  1         0           1          NA          NA           1
2  1         0           1           1          NA          NA
3  1         1          NA          NA          NA          NA
4  2         0           1          NA          NA           1
5  2         0           1           1          NA          NA
6  2         1          NA          NA          NA          NA
7  3         1           1          NA          NA          NA
8  3         1           1           1          NA          NA
9  3         0          NA          NA          NA           1

I want to make a table with how many have the condition == 1 (n_x) and also how many are in each time point (n_t). In case there is none also I want a 0. I tried this:

data %>% 
  pivot_longer(cols = contains("time_point")) %>% 
  filter (!is.na(value)) %>% 
  group_by(name) %>% 
  mutate(n_t = n_distinct(id)) %>% 
  ungroup() %>% 
  filter(condition == 1) %>%
  group_by(name) %>%
  summarise(n_x = n_distinct(id), n_t = first(n_t))

Obtaining this:

  name          n_x   n_t
  <chr>       <int> <int>
1 time_point1     1     3
2 time_point2     1     3

Desired Outcome: I want this type of table that considers the cases with condition and without it:

         name n_x n_t
1 time_point1   2   6
2 time_point2   1   3
3 time_point3   0   0
4 time_point4   0   3

Thank you!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

灯角 2025-02-10 14:51:10

您可以pivot_longer()能够group_by() time_points,然后总结一下添加值。对于条件,仅列values!= Na的总和值。

data %>% 
  pivot_longer(cols=c(3:6),names_to = 'point', values_to='values') %>%
  group_by(point) %>% 
  summarise(n_x = sum(condition[!is.na(values)]), n_t = sum(values, na.rm = TRUE))

输出:

# A tibble: 4 x 3
  point         n_x   n_t
  <chr>       <dbl> <dbl>
1 time_point1     2     6
2 time_point2     1     3
3 time_point3     0     0
4 time_point4     0     3

You can pivot_longer() to be able to group_by() time_points and then summarise just adding up the values. For conditions only sum values where the column values != NA.

data %>% 
  pivot_longer(cols=c(3:6),names_to = 'point', values_to='values') %>%
  group_by(point) %>% 
  summarise(n_x = sum(condition[!is.na(values)]), n_t = sum(values, na.rm = TRUE))

Output:

# A tibble: 4 x 3
  point         n_x   n_t
  <chr>       <dbl> <dbl>
1 time_point1     2     6
2 time_point2     1     3
3 time_point3     0     0
4 time_point4     0     3
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文