How to filter data.table rows in R by a condition on a column of type list()

Posted on 2025-02-12 20:04:12

I have a data.table which looks something like this (showing just few columns out of many) -

Id          Period   Product
1000797366  2018-Q1  UG10000-WISD
1000797366  2018-Q1  NX11100, UG10000-WISD, UG12210
1000797366  2018-Q1  UG10000-WISD, UG12210
1000797366  2018-Q1  UG10000-WISD, UG12210
1000797366  2018-Q1  UG12210
1000797366  2018-Q1  NX11100
1000797366  2018-Q1  NX11100

Here the column "Product" is of type list(), and I have to keep it that way for later use.

But I am facing a problem while filtering the rows based on a condition on Product column.

What I want is to keep all rows where Product contains any value from the vector c("UG12210","UG10000-WISD"), along with other filters such as Period in c("2018-Q1").

So my output should look something like this -

Id          Period   Product
1000797366  2018-Q1  UG10000-WISD
1000797366  2018-Q1  NX11100, UG10000-WISD, UG12210
1000797366  2018-Q1  UG10000-WISD, UG12210
1000797366  2018-Q1  UG10000-WISD, UG12210
1000797366  2018-Q1  UG12210

But somehow this is not happening. I tried the following conditions, but neither worked.

data_test[Period %in% c("2018-Q1") & is.element("UG12210",Product),]

data_test[Period %in% c("2018-Q1") & Product %in% c("UG12210"),]

Any leads on how this can be achieved would be a great help. Thanks!

Below is the dput() output for the data.table:

# dput() output; the .internal.selfref pointer line is dropped because it cannot be parsed
data_test <- structure(
  list(
    Id = c("1000797366", "1000797366", "1000797366", "1000797366", "1000797366", "1000797366", "1000797366"),
    Period = c("2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1"),
    Product = list("UG10000-WISD", c("NX11100", "UG10000-WISD", "UG12210"), c("UG10000-WISD", "UG12210"),
      c("UG10000-WISD", "UG12210"), "UG12210", "NX11100", "NX11100")
  ),
  row.names = c(NA, -7L),
  class = c("data.table", "data.frame")
)
# Restore data.table's internal self-reference after rebuilding from dput()
data.table::setDT(data_test)


Comments (2)

回心转意 2025-02-19 20:04:12

You can use the sapply function to check, for each row, whether any of the values in vals is in Product:

vals = c("UG12210","UG10000-WISD")

dt[Period %chin% "2018-Q1" & sapply(Product, function(v) any(vals %chin% v))]

#            Id  Period                      Product
# 1: 1000797366 2018-Q1                 UG10000-WISD
# 2: 1000797366 2018-Q1 NX11100,UG10000-WISD,UG12210
# 3: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 4: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 5: 1000797366 2018-Q1                      UG12210
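
A type-stable variant of the same idea, as a minimal sketch: vapply() always returns a logical vector with one element per row, whereas sapply() returns list() when its input is empty (for example on a zero-row table). The table is rebuilt by hand here, and the names dt, keep and res are assumptions made for illustration.

```r
library(data.table)

# Same data as in the question, rebuilt by hand (assumed name: dt)
dt <- data.table(
  Id = rep("1000797366", 7),
  Period = rep("2018-Q1", 7),
  Product = list("UG10000-WISD", c("NX11100", "UG10000-WISD", "UG12210"),
                 c("UG10000-WISD", "UG12210"), c("UG10000-WISD", "UG12210"),
                 "UG12210", "NX11100", "NX11100")
)
vals <- c("UG12210", "UG10000-WISD")

# One TRUE/FALSE per row: does this row's Product vector contain any of vals?
keep <- vapply(dt$Product, function(v) any(vals %chin% v), logical(1))

# Combine the per-row mask with the ordinary Period filter
res <- dt[Period %chin% "2018-Q1" & keep]
print(res)
```

Computing the mask first also makes it easy to inspect which rows matched before subsetting.
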
又爬满兰若 2025-02-19 20:04:12

We can loop over the list with lapply, check whether any value from the vector ('v1') is present in each list element, and subset on the result:

library(data.table)
v1 <- c("UG12210","UG10000-WISD")
dt1[Period %chin% c("2018-Q1") & 
      unlist(lapply(Product, function(x) any(v1 %chin% x)))]

-output

          Id  Period                      Product
       <char>  <char>                       <list>
1: 1000797366 2018-Q1                 UG10000-WISD
2: 1000797366 2018-Q1 NX11100,UG10000-WISD,UG12210
3: 1000797366 2018-Q1         UG10000-WISD,UG12210
4: 1000797366 2018-Q1         UG10000-WISD,UG12210
5: 1000797366 2018-Q1                      UG12210
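
For context, a small base-R sketch of why the two attempts in the question probably did not give the expected rows. It uses only the Product list from the question; the names first_try, second_try and per_row are made up for illustration.

```r
# The Product column from the question, as a plain list (no data.table needed)
Product <- list("UG10000-WISD", c("NX11100", "UG10000-WISD", "UG12210"),
                c("UG10000-WISD", "UG12210"), c("UG10000-WISD", "UG12210"),
                "UG12210", "NX11100", "NX11100")

# is.element() asks one question about the whole column, so it returns a
# single logical value that is then recycled across every row
first_try <- is.element("UG12210", Product)
length(first_try)

# %in% compares each list element as a whole, so a row holding several
# products cannot match a single product code
second_try <- Product %in% c("UG12210")
which(second_try)

# Testing inside each element gives one logical per row, as in the answers
per_row <- vapply(Product,
                  function(p) any(c("UG12210", "UG10000-WISD") %in% p),
                  logical(1))
which(per_row)
```

Both answers work because they apply the membership test inside each list element rather than to the list column as a whole.
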