Counting observations while taking a condition into account

Posted on 2025-02-03 14:51:10

I have a data frame like this:

id <- c(rep(1,3), rep(2, 3), rep(3, 3))
condition <- c(0, 0, 1, 0, 0, 1, 1, 1, 0)
# These length-3 vectors are recycled by data.frame() to fill all 9 rows
time_point1 <- c(1, 1, NA)
time_point2 <- c(NA, 1, NA)
time_point3 <- c(NA, NA, NA)
time_point4 <- c(1, NA, NA, 1, NA, NA, NA, NA, 1)

data <- data.frame(id, condition, time_point1, time_point2, time_point3, time_point4)
data

  id condition time_point1 time_point2 time_point3 time_point4
1  1         0           1          NA          NA           1
2  1         0           1           1          NA          NA
3  1         1          NA          NA          NA          NA
4  2         0           1          NA          NA           1
5  2         0           1           1          NA          NA
6  2         1          NA          NA          NA          NA
7  3         1           1          NA          NA          NA
8  3         1           1           1          NA          NA
9  3         0          NA          NA          NA           1

I want to make a table showing, for each time point, how many non-missing observations have condition == 1 (n_x) and how many non-missing observations there are in total (n_t). If a time point has none, I want a 0 rather than a missing row. I tried this:

library(dplyr)
library(tidyr)

data %>%
  pivot_longer(cols = contains("time_point")) %>%
  filter(!is.na(value)) %>%
  group_by(name) %>%
  mutate(n_t = n_distinct(id)) %>%   # distinct ids per time point
  ungroup() %>%
  filter(condition == 1) %>%
  group_by(name) %>%
  summarise(n_x = n_distinct(id), n_t = first(n_t))

This gives:

  name          n_x   n_t
  <chr>       <int> <int>
1 time_point1     1     3
2 time_point2     1     3

Desired outcome: I want a table like this, which counts the cases both with and without the condition:

         name n_x n_t
1 time_point1   2   6
2 time_point2   1   3
3 time_point3   0   0
4 time_point4   0   3

Thank you!

灯角 2025-02-10 14:51:10

You can use pivot_longer() so that you can group_by() the time points, and then summarise by simply adding up the values. For n_x, sum condition only over the rows where values is not NA.

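library(dplyr)
library(tidyr)

data %>%
  pivot_longer(cols = 3:6, names_to = "point", values_to = "values") %>%
  group_by(point) %>%
  # n_x: sum of condition over rows with a non-missing value; n_t: sum of the values
  summarise(n_x = sum(condition[!is.na(values)]), n_t = sum(values, na.rm = TRUE))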

Output:

# A tibble: 4 x 3
  point         n_x   n_t
  <chr>       <dbl> <dbl>
1 time_point1     2     6
2 time_point2     1     3
3 time_point3     0     0
4 time_point4     0     3
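
The sums above work because every non-missing value in the example is 1 and condition is coded 0/1. In the question's attempt, time_point3 and time_point4 disappear because filter(condition == 1) removes all of their rows before summarise() runs. As a minimal sketch, assuming the goal is to count non-missing rows regardless of what values they hold, the counts can be written explicitly. The column selection with starts_with() and the name result are my own choices, not part of the original answer:

```r
library(dplyr)
library(tidyr)

# Same example data as in the question (length-3 vectors are recycled to 9 rows)
data <- data.frame(
  id = rep(1:3, each = 3),
  condition = c(0, 0, 1, 0, 0, 1, 1, 1, 0),
  time_point1 = c(1, 1, NA),
  time_point2 = c(NA, 1, NA),
  time_point3 = c(NA, NA, NA),
  time_point4 = c(1, NA, NA, 1, NA, NA, NA, NA, 1)
)

result <- data %>%
  pivot_longer(cols = starts_with("time_point"),
               names_to = "point", values_to = "values") %>%
  group_by(point) %>%
  summarise(
    n_x = sum(condition == 1 & !is.na(values)),  # non-missing rows with the condition
    n_t = sum(!is.na(values))                    # all non-missing rows
  )

result
```

On the example data this should give the same four rows as the output above. No rows are filtered out before summarise(), so time points with nothing to count still appear with 0.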
