Adjusting for previous values within groups in R

Posted on 2025-01-11 18:48:45

I'm trying to write code that, within each group of name, creates a flag variable based on the value of the poped column in the closest earlier record of the following data.frame:

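 library(tidyverse)
  name <- c("AAA","AAA","AAA","AAA","AAA","AAA","AAA")
  poped <- c(NA,1,NA,NA,1,NA,NA)
  order <- c(1:7)
  tag <- c("X","Y","X","X","Y","X","X")
  # Combine the vectors into the data.frame printed below
  df <- data.frame(name, order, tag, poped)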

>   df
  name order tag poped
1  AAA     1   X    NA
2  AAA     2   Y     1
3  AAA     3   X    NA
4  AAA     4   X    NA
5  AAA     5   Y     1
6  AAA     6   X    NA
7  AAA     7   X    NA

I want to use mutate to add two new variables, CHECK and POS.

CHECK will take on the values

    1 = if the closest row above with tag == Y has poped == 1
    0 = if the closest row above with tag == Y has poped == 0
    2 = if the current row has tag == Y
    NA = otherwise

POS will take on the row number of the closest row above where tag is Y and poped is 1, and NA otherwise.

My desired output is:

>   df
  name order tag poped CHECK POS                                                            why
1  AAA     1   X    NA    NA  NA                                      There is no previous data
2  AAA     2   Y     1    NA  NA                                                current tag = Y
3  AAA     3   X    NA     1   2 the closest value above where tag=Y is in row 2 and poped is 1
4  AAA     4   X    NA     1   2 the closest value above where tag=Y is in row 2 and poped is 1
5  AAA     5   Y     1    NA  NA                                                current tag = Y
6  AAA     6   X    NA     1   5 the closest value above where tag=Y is in row 5 and poped is 1
7  AAA     7   X    NA     1   5 the closest value above where tag=Y is in row 5 and poped is 1

How can I create a solution, ideally using Tidyverse?

贱人配狗天长地久 2025-01-18 18:48:46
df %>%
  # Keep tag, poped and the row position only on rows where tag == "Y"
  mutate(ctag = if_else(tag == "Y", tag, NA_character_),
         cpop = if_else(tag == "Y", poped, NA_real_),
         maxr = if_else(tag == "Y" & poped == 1, order, NA_integer_)) %>%
  # Carry those values down to the rows that follow each "Y" row
  fill(ctag, cpop, maxr) %>%
  mutate(
    CHECK = case_when(
      tag == "Y" ~ 2,
      lag(ctag) == "Y" & lag(cpop) == 1 ~ 1,
      lag(ctag) == "Y" & lag(cpop) == 0 ~ 0,
      TRUE ~ NA_real_),
    POS = if_else(tag == "Y", NA_integer_, maxr)
  ) %>%
  # Drop the helper columns
  select(!ctag:maxr)

Output:

  name  order tag   poped CHECK   POS
  <chr> <int> <chr> <dbl> <dbl> <int>
1 AAA       1 X        NA    NA    NA
2 AAA       2 Y         1     2    NA
3 AAA       3 X        NA     1     2
4 AAA       4 X        NA     1     2
5 AAA       5 Y         1     2    NA
6 AAA       6 X        NA     1     5
7 AAA       7 X        NA     1     5
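
The question asks for the result within each name group, while the example has only one group. Below is a minimal sketch of the same pipeline with group_by(name) added, so that fill() and lag() do not carry values from one name into the next. The data set df2 and its "BBB" group are made up for illustration and are not part of the question.

```r
library(tidyverse)

# Hypothetical data: "AAA" repeats the question's example, "BBB" is invented
df2 <- tibble(
  name  = rep(c("AAA", "BBB"), times = c(7, 3)),
  order = c(1:7, 1:3),
  tag   = c("X", "Y", "X", "X", "Y", "X", "X", "X", "Y", "X"),
  poped = c(NA, 1, NA, NA, 1, NA, NA, NA, 0, NA)
)

res <- df2 %>%
  group_by(name) %>%  # fill() and lag() now stop at group boundaries
  mutate(ctag = if_else(tag == "Y", tag, NA_character_),
         cpop = if_else(tag == "Y", poped, NA_real_),
         maxr = if_else(tag == "Y" & poped == 1, order, NA_integer_)) %>%
  fill(ctag, cpop, maxr) %>%
  mutate(
    CHECK = case_when(
      tag == "Y" ~ 2,
      lag(ctag) == "Y" & lag(cpop) == 1 ~ 1,
      lag(ctag) == "Y" & lag(cpop) == 0 ~ 0,
      TRUE ~ NA_real_),
    POS = if_else(tag == "Y", NA_integer_, maxr)
  ) %>%
  ungroup() %>%
  select(!ctag:maxr)

print(res)
```

With grouping, the first "BBB" row gets NA for both CHECK and POS instead of inheriting the values from the last "Y" row of "AAA", and the last "BBB" row gets CHECK = 0 because its closest "Y" row has poped = 0. If the "Y" rows themselves should be NA, as in the desired-output table, rather than 2, as in the written rule, the first case_when branch can be changed to tag == "Y" ~ NA_real_.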