Replace missing data in multiple columns with different specific values

Posted 2025-01-23 01:07:53

I have a huge data frame with several missing values that I need to replace as follows:

Cycle A  Cycle B  Cycle C  .....
na       na       na
na       na       na
na       na       na
-1       na       0
-1       -2       0
na       -2       na
na       na       na
na       na       1
0        -1       1
0        -1       na
na       na       na
na       na       na
na       0        2
1        0        2
1        na       na
na       na       na

For each column, I need to replace the NAs with the next number that appears, to get something like this:

Cycle A  Cycle B  Cycle C  .....
-1       -2       0
-1       -2       0
-1       -2       0
-1       -2       0
-1       -2       0
0        -2       1
0        -1       1
0        -1       1
0        -1       1
0        -1       2
1        0        2
1        0        2
1        0        2
1        0        2
1        1        3
2        1        3

Any idea how to do that?
Thank you.

Comments (2)

固执像三岁 2025-01-30 01:07:53

Assuming you want to perform the replacement in all columns whose names start with "Cycle":

First, fill() with .direction = "up" replaces each NA with the next non-NA value below it. Then mutate() replaces any NA still left at the bottom of a column with the last non-NA value + 1.

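library(tidyverse)

df %>% 
  # fill each NA with the next non-NA value below it
  fill(starts_with("Cycle"), .direction = "up") %>% 
  # NAs left at the bottom of a column get the last non-NA value + 1
  mutate(across(starts_with("Cycle"), ~ replace_na(.x, last(.x[!is.na(.x)]) + 1)))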

   CycleA CycleB CycleC
1      -1     -2      0
2      -1     -2      0
3      -1     -2      0
4      -1     -2      0
5      -1     -2      0
6       0     -2      1
7       0     -1      1
8       0     -1      1
9       0     -1      1
10      0     -1      2
11      1      0      2
12      1      0      2
13      1      0      2
14      1      0      2
15      1      1      3
16      2      1      3
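
For readers who prefer not to load the tidyverse, the same two steps (fill upward, then set trailing NAs to the last non-NA value + 1) can be sketched in base R. This is only an illustration, not part of the answer above: the helper name fill_up_then_increment is made up here, and it assumes the column is already numeric with real NA values rather than "na" strings.

```r
# Hypothetical helper (name invented for this sketch): base R version of
# "fill upward, then set trailing NAs to last non-NA value + 1".
fill_up_then_increment <- function(x) {
  ok <- !is.na(x)
  if (!any(ok)) return(x)              # nothing to fill from
  idx <- which(ok)                     # positions of the non-NA values
  # For each position, count the non-NA values strictly before it; the next
  # non-NA value at or after that position is then idx[count + 1].
  # Past the last non-NA value this index is out of range and yields NA.
  nxt <- idx[cumsum(c(0, head(ok, -1))) + 1]
  out <- x[nxt]
  # Trailing NAs have no "next" value: use the last non-NA value + 1.
  out[is.na(out)] <- x[max(idx)] + 1
  out
}

# CycleA from the question, with "na" already converted to NA
cycle_a <- c(NA, NA, NA, -1, -1, NA, NA, NA, 0, 0, NA, NA, NA, 1, 1, NA)
filled <- fill_up_then_increment(cycle_a)
print(filled)
```

It could be applied to several columns with something like df[cols] <- lapply(df[cols], fill_up_then_increment), where cols is a character vector of column names.
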
知足的幸福 2025-01-30 01:07:53

First, convert the "na" strings to NA and use type.convert to get numeric values.

dat <- replace(dat, dat== 'na', NA) |> type.convert(as.is=TRUE)
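
As a small illustration of what this conversion step does, here is the same call on a made-up two-column data frame (the name tiny and its values are invented for the example). It is written as a nested call instead of with |>, so it should also run on R versions older than 4.1.

```r
# Made-up example data: character columns that use the string "na" for missing
tiny <- data.frame(
  CycleA = c("na", "-1", "0"),
  CycleB = c("na", "na", "-2"),
  stringsAsFactors = FALSE
)

# Replace the "na" strings with real NA, then let type.convert() pick
# a suitable type for each column (integer here)
tiny <- type.convert(replace(tiny, tiny == "na", NA), as.is = TRUE)

str(tiny)
```

After the conversion both columns are integer vectors with proper NA values, so functions such as is.na() and min(x, na.rm = TRUE) behave as expected.
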

Next, I might be wrong, but are you looking for this underlying structure?

f <- \(x) {
  rp <- cumsum(c(0, diff(!is.na(x))) > 0) + min(x, na.rm=TRUE)
  nas <- is.na(x)
  x[nas] <- rp[nas]
  x
}

cols <- c("CycleA", "CycleB", "CycleC")  ## select columns

dat[cols] <- lapply(dat[cols], f)
dat
#    CycleA CycleB CycleC
# 1      -1     -2      0
# 2      -1     -2      0
# 3      -1     -2      0
# 4      -1     -2      0
# 5      -1     -2      0
# 6       0     -2      1
# 7       0     -1      1
# 8       0     -1      1
# 9       0     -1      1
# 10      0     -1      2
# 11      1      0      2
# 12      1      0      2
# 13      1      0      2
# 14      1      0      2
# 15      1      1      3
# 16      2      1      3
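
To see how f arrives at these values, the intermediate steps can be traced on a short made-up vector. The variable names present, run_starts and filled are introduced only for this sketch. Note that the approach reconstructs the values from a counter rather than copying the next value, so it appears to rely on the pattern in the sample data: each column starts with NA and each successive block of non-NA values is exactly 1 higher than the previous one.

```r
x <- c(NA, -1, NA, 0, NA)

present <- !is.na(x)                    # FALSE TRUE FALSE TRUE FALSE
# TRUE wherever a block of non-NA values begins right after an NA
run_starts <- c(0, diff(present)) > 0   # FALSE TRUE FALSE TRUE FALSE
# Running count of blocks started so far, shifted by the column minimum
rp <- cumsum(run_starts) + min(x, na.rm = TRUE)

# Only the NA positions are overwritten with the reconstructed values
filled <- x
filled[!present] <- rp[!present]

print(rp)
print(filled)
```

If a column's values do not increase in steps of 1, or if it begins with a non-NA value, rp no longer lines up with the "next value", so a fill-based approach may be the safer choice in that case.
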

Data:

dat <- structure(list(CycleA = c("na", "na", "na", "-1", "-1", "na", 
"na", "na", "0", "0", "na", "na", "na", "1", "1", "na"), CycleB = c("na", 
"na", "na", "na", "-2", "-2", "na", "na", "-1", "-1", "na", "na", 
"0", "0", "na", "na"), CycleC = c("na", "na", "na", "0", "0", 
"na", "na", "1", "1", "na", "na", "na", "2", "2", "na", "na")), class = "data.frame", row.names = c(NA, 
-16L))