How to replace specific NAs in a column with a specific string

Posted on 2025-01-12 08:00:26


This could be very simple, but I could not figure it out.

df<-structure(list(Besti = c("Friend", "myfriend", "yourbest", "allbest"
), Friend = c("Friend", NA, "Friend", "Toofriend"), Val1 = c(0L, 
0L, 0L, 0L), Val2 = c(0L, 0L, 0L, 0L), Val3 = c(0L, 1L, 0L, 0L
), Val4 = c(0L, 0L, 0L, 0L), Val5 = c(0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-4L))

My data looks like this. I want to know how to replace an NA with a string when the value one row above and the value one row below are the same string.

I can see that there is one NA:

sum(is.na(df$Friend))

If the value one row above is Friend and the value one row below is also Friend, I want to replace the NA with Friend,

so the output looks like this:

df_out<-structure(list(Besti = c("Friend", "myfriend", "yourbest", "allbest"
), Friend = c("Friend", "Friend", "Friend", "Toofriend"), Val1 = c(0L, 
0L, 0L, 0L), Val2 = c(0L, 0L, 0L, 0L), Val3 = c(0L, 1L, 0L, 0L
), Val4 = c(0L, 0L, 0L, 0L), Val5 = c(0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-4L))

Now imagine I have 100 or more NAs in no particular order: sometimes the value before or after an NA is itself NA, and the surrounding value may be Friend or any other string.

If I simply wanted to replace every NA with Friend, I could do this with tidyr::replace_na():

df$Friend <- df$Friend %>% replace_na('Friend')


白云不回头 2025-01-19 08:00:26

library(dplyr)
df |>
  mutate(
    upper = lag(Friend),
    lower = lead(Friend),
    replacement = ifelse(upper == lower, upper, NA),
    Friend = coalesce(Friend, replacement)
  )
#>      Besti    Friend Val1 Val2 Val3 Val4 Val5  upper     lower replacement
#> 1   Friend    Friend    0    0    0    0    0   <NA>      <NA>        <NA>
#> 2 myfriend    Friend    0    0    1    0    0 Friend    Friend      Friend
#> 3 yourbest    Friend    0    0    0    0    0   <NA> Toofriend        <NA>
#> 4  allbest Toofriend    0    0    0    0    0 Friend      <NA>        <NA>

dplyr::lag() and dplyr::lead() shift the vector Friend down and up by one position, respectively.
We can then test whether the two shifted vectors hold the same value; where they do, we use that
value as the replacement. dplyr::coalesce() then fills each NA in
Friend with the replacement value at the same position.
This can be simplified to:

df |>
  mutate(
    # compare the previous row with the next row (lead(), not tail())
    replacement = ifelse(lag(Friend) == lead(Friend), lag(Friend), NA),
    Friend = coalesce(Friend, replacement)
  )
#>      Besti    Friend Val1 Val2 Val3 Val4 Val5 replacement
#> 1   Friend    Friend    0    0    0    0    0        <NA>
#> 2 myfriend    Friend    0    0    1    0    0      Friend
#> 3 yourbest    Friend    0    0    0    0    0        <NA>
#> 4  allbest Toofriend    0    0    0    0    0        <NA>
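
To see what each helper contributes on its own, here is a minimal sketch on a bare vector. The names `x`, `same` and `filled` are made up for illustration and are not part of the original answer; `x` simply mirrors the Friend column.

```r
library(dplyr)

# Hypothetical vector mirroring df$Friend from the question
x <- c("Friend", NA, "Friend", "Toofriend")

lag(x)   # each element replaced by the one before it (first becomes NA)
lead(x)  # each element replaced by the one after it (last becomes NA)

# TRUE where both neighbours exist and agree; any comparison with NA gives NA
same <- lag(x) == lead(x)

# coalesce() keeps the non-NA entries of x and fills the rest from the second vector
filled <- coalesce(x, ifelse(same, lag(x), NA_character_))
filled
```

Because a comparison involving NA yields NA rather than FALSE, positions with a missing neighbour simply get no replacement and stay as they were.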


ゃ懵逼小萝莉 2025-01-19 08:00:26

Here's another approach. I added two columns to the data frame holding the values of Friend that come after and before each observation:

library(dplyr)

df$after <- lead(df$Friend)
df$before <- lag(df$Friend)

df

Output:

     Besti    Friend Val1 Val2 Val3 Val4 Val5     after before
1   Friend    Friend    0    0    0    0    0      <NA>   <NA>
2 myfriend      <NA>    0    0    1    0    0    Friend Friend
3 yourbest    Friend    0    0    0    0    0 Toofriend   <NA>
4  allbest Toofriend    0    0    0    0    0      <NA> Friend

Now we can derive a new version of the Friend variable with ifelse():

df$Friend <- ifelse(
  is.na(df$Friend) & 
  df$after == "Friend" & 
  df$before == "Friend", "Friend", df$Friend
)

df[, -c(8,9)]

Output:

     Besti    Friend Val1 Val2 Val3 Val4 Val5
1   Friend    Friend    0    0    0    0    0
2 myfriend    Friend    0    0    1    0    0
3 yourbest    Friend    0    0    0    0    0
4  allbest Toofriend    0    0    0    0    0
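
The ifelse() call above hardcodes "Friend", while the question also asks about "whatever string". A possible generalisation in base R is sketched below; the helper name `fill_between_equal` and the test vector are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: fill an NA only when the values directly above and
# below it are both non-missing and identical, whatever that string is.
fill_between_equal <- function(x) {
  before <- c(NA, head(x, -1))  # value one position earlier, like dplyr::lag(x)
  after  <- c(tail(x, -1), NA)  # value one position later, like dplyr::lead(x)
  fill   <- is.na(x) & !is.na(before) & !is.na(after) & before == after
  x[fill] <- before[fill]
  x
}

# Made-up test vector, not taken from the original post
x <- c("A", NA, "A", NA, "B", NA, NA, "B")
result <- fill_between_equal(x)
result
```

Only the NA between the two "A" values is filled here; an NA between different strings, or next to another NA, is left alone. On the data from the question this would be used as `df$Friend <- fill_between_equal(df$Friend)`, with no helper columns to drop afterwards.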
