How can I use R to replace the NAs in a column with the first non-missing value, without dropping cases that contain only missing values?

Posted 2025-01-16 07:59:59 · 728 words · 4 views · 0 comments

I have a long data frame with many NAs, and I want to condense it so that, within each group, every column is filled with its first non-missing value. If a group has only NAs in a column, that column should stay NA for the group rather than the group being dropped. Until I updated R, the code below worked, but now it deletes a group's row if one of its columns is all NA.

Here's a sample dataset:

library(dplyr)

test <- tibble(name = c("J", "C", "J", "C"),
               test_1 = c(1:2, NA, NA),
               test_2 = c(NA, NA, 3:4),
               make_up_test = c(NA, 1, NA, NA))

And here's what used to work. It now drops any group that has only NAs in one column (J gets dropped because he only has NAs for make_up_test):


test %>%
  group_by(name) %>%
  summarise_all(~first(na.omit(.)))

This is what I'm hoping to get:

solution <- tibble(name = c("J", "C"),
                test_1 = c(1:2),
                test_2 = c(3:4),
                make_up_test = c(NA, 1))

Comments (2)

但可醉心 2025-01-23 07:59:59

We remove the NAs with na.omit and take the first element with first. Indexing the result with [1] then returns NA when there are no non-NA elements, because the first element of an empty vector is NA.

library(dplyr)
test %>% 
  group_by(name) %>% 
  summarise(across(everything(), ~ first(na.omit(.x))[1]))

-output

# A tibble: 2 × 4
  name  test_1 test_2 make_up_test
  <chr>  <int>  <int>        <dbl>
1 C          2      4            1
2 J          1      3           NA
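
To illustrate why the [1] matters, here is a minimal sketch (not part of the answer above). It assumes only base R indexing rules and dplyr; the helper name first_non_na is hypothetical and is introduced only for this example. It relies on the fact that taking element 1 of a length-0 vector gives NA, so each group always returns exactly one value per column.

```r
library(dplyr)

test <- tibble(name = c("J", "C", "J", "C"),
               test_1 = c(1:2, NA, NA),
               test_2 = c(NA, NA, 3:4),
               make_up_test = c(NA, 1, NA, NA))

# na.omit() on an all-NA vector leaves a length-0 vector;
# indexing it with [1] produces NA rather than an empty result.
empty <- na.omit(c(NA, NA))
length(empty)
is.na(empty[1])

# Hypothetical helper (an assumption, not from the answer):
# first non-missing value of x, or NA if every value is missing.
first_non_na <- function(x) x[!is.na(x)][1]

result <- test %>%
  group_by(name) %>%
  summarise(across(everything(), first_non_na))

result
```

Because the helper always returns a length-1 value, J keeps its row even though make_up_test is NA for every J observation.
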
树深时见影 2025-01-23 07:59:59

Here is an approach with pivoting:

library(tidyr)
library(dplyr)

test %>% 
  pivot_longer(-name, names_to = "names") %>%  
  drop_na() %>% 
  pivot_wider(names_from = names, values_from = value) %>% 
  relocate(test_2, .after = test_1)

-output

  name  test_1 test_2 make_up_test
  <chr>  <dbl>  <dbl>        <dbl>
1 J          1      3           NA
2 C          2      4            1
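
One thing to check before relying on the pivoting approach: drop_na() removes every long-format row whose value is NA, so a group (or a column) with no recorded values at all may disappear after pivot_wider. The sketch below is a hedged comparison on made-up data; the tibble test_k and the name "K" are hypothetical and do not come from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: "K" has no recorded values, and test_2 is entirely NA.
test_k <- tibble(name = c("J", "K"),
                 test_1 = c(1, NA),
                 test_2 = c(NA, NA))

# Pivoting approach: only rows with a non-missing value survive drop_na().
pivoted <- test_k %>%
  pivot_longer(-name, names_to = "names") %>%
  drop_na() %>%
  pivot_wider(names_from = names, values_from = value)

# Summarise approach: one row per group, NA where nothing was recorded.
summarised <- test_k %>%
  group_by(name) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]))

pivoted
summarised
```

On this data the pivoted result keeps only J and test_1, while the summarised result keeps both groups and both columns. If every group and every column has at least one non-missing value, as in the question's sample, this difference does not show up.
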