How to filter data.table rows in R by a condition on a column of type list()

Posted 2025-02-12 20:04:12

I have a data.table that looks something like this (showing just a few columns out of many):

Id	Period	Product
1000797366	2018-Q1	UG10000-WISD
1000797366	2018-Q1	NX11100, UG10000-WISD, UG12210
1000797366	2018-Q1	UG10000-WISD, UG12210
1000797366	2018-Q1	UG10000-WISD, UG12210
1000797366	2018-Q1	UG12210
1000797366	2018-Q1	NX11100
1000797366	2018-Q1	NX11100

Here the column "Product" is of type list(), because I have to keep it that way for later use.

But I am facing a problem when filtering the rows based on a condition on the Product column.

What I want is to keep all rows where Product contains any value from the vector c("UG12210","UG10000-WISD"), combined with other filters such as Period in c("2018-Q1").

So my output should look something like this:

Id	Period	Product
1000797366	2018-Q1	UG10000-WISD
1000797366	2018-Q1	NX11100, UG10000-WISD, UG12210
1000797366	2018-Q1	UG10000-WISD, UG12210
1000797366	2018-Q1	UG10000-WISD, UG12210
1000797366	2018-Q1	UG12210

But somehow this is not happening. I tried the following conditions, but neither worked.

data_test[Period %in% c("2018-Q1") & is.element("UG12210",Product),]

data_test[Period %in% c("2018-Q1") & Product %in% c("UG12210"),]

Any leads on how this can be achieved would be a great help. Thanks!

Below is the data for the data.table, produced with dput():

structure(
  list(
    Id = c("1000797366", "1000797366", "1000797366", "1000797366", "1000797366", "1000797366", "1000797366"),
    Period = c("2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1", "2018-Q1"),
    Product = list("UG10000-WISD", c("NX11100", "UG10000-WISD", "UG12210"), c("UG10000-WISD", "UG12210"),
      c("UG10000-WISD", "UG12210"), "UG12210", "NX11100", "NX11100")
  ),
  row.names = c(NA,-7L),
  class = c("data.table", "data.frame")
  # .internal.selfref = <pointer: 0x562f66275020> omitted: dput() prints it, but it is not parseable R
)

回心转意 2025-02-19 20:04:12

You can use the sapply function to check, for each row, whether any of the values in vals is in Product:

vals = c("UG12210","UG10000-WISD")

dt[Period %chin% "2018-Q1" & sapply(Product, function(v) any(vals %chin% v))]

#            Id  Period                      Product
# 1: 1000797366 2018-Q1                 UG10000-WISD
# 2: 1000797366 2018-Q1 NX11100,UG10000-WISD,UG12210
# 3: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 4: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 5: 1000797366 2018-Q1                      UG12210
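
A variant of the same idea, offered as a minimal sketch rather than the answerer's method: vapply can replace sapply so that the mask is guaranteed to be a logical vector with one entry per row. The helper name has_any_product is hypothetical, and the table is rebuilt by hand from the question's dput() output.

```r
library(data.table)

# Rebuild the sample data from the question (Product is a list column).
dt <- data.table(
  Id      = rep("1000797366", 7),
  Period  = rep("2018-Q1", 7),
  Product = list(
    "UG10000-WISD",
    c("NX11100", "UG10000-WISD", "UG12210"),
    c("UG10000-WISD", "UG12210"),
    c("UG10000-WISD", "UG12210"),
    "UG12210", "NX11100", "NX11100"
  )
)

vals <- c("UG12210", "UG10000-WISD")

# Hypothetical helper: TRUE for each list element that contains
# at least one of the wanted values. vapply() enforces that every
# element yields exactly one logical.
has_any_product <- function(products, wanted) {
  vapply(products, function(p) any(wanted %in% p), logical(1))
}

# Combine the list-column mask with the ordinary Period filter.
res <- dt[Period %chin% "2018-Q1" & has_any_product(Product, vals)]
nrow(res)
# 5
```

Unlike sapply, vapply raises an error if any element produces something other than a single logical, which makes a malformed Product entry easier to notice.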

又爬满兰若 2025-02-19 20:04:12

We can loop over the list with lapply, check whether any values from the vector ('v1') occur in each list element, and subset accordingly:

library(data.table)
v1 <- c("UG12210","UG10000-WISD")
dt1[Period %chin% c("2018-Q1") & 
      unlist(lapply(Product, function(x) any(v1 %chin% x)))]

Output:

          Id  Period                      Product
       <char>  <char>                       <list>
1: 1000797366 2018-Q1                 UG10000-WISD
2: 1000797366 2018-Q1 NX11100,UG10000-WISD,UG12210
3: 1000797366 2018-Q1         UG10000-WISD,UG12210
4: 1000797366 2018-Q1         UG10000-WISD,UG12210
5: 1000797366 2018-Q1                      UG12210
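
For context on why the two conditions in the question did not work, here is a small base-R sketch. It assumes the behavior documented for match(), where a list is converted to a character vector before comparison, so a multi-value element becomes a single deparsed string. The variable names attempt1, attempt2 and per_row are illustrative only.

```r
# The Product list column from the question, as a plain list.
Product <- list(
  "UG10000-WISD",
  c("NX11100", "UG10000-WISD", "UG12210"),
  c("UG10000-WISD", "UG12210"),
  c("UG10000-WISD", "UG12210"),
  "UG12210", "NX11100", "NX11100"
)
vals <- c("UG12210", "UG10000-WISD")

# Attempt 1: asks whether "UG12210" appears anywhere in the whole column.
# The result is a single value, which is then recycled across all rows.
attempt1 <- is.element("UG12210", Product)
attempt1
# TRUE

# Attempt 2: each list element is compared as one whole string, so only
# the element that is exactly "UG12210" matches.
attempt2 <- Product %in% "UG12210"
attempt2
# FALSE FALSE FALSE FALSE  TRUE FALSE FALSE

# Per-element test: look inside each list element separately.
per_row <- vapply(Product, function(p) any(vals %in% p), logical(1))
per_row
# TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
```

This is why a per-element loop (sapply, lapply or vapply) is needed: the membership test has to run inside each list element rather than on the column as a whole.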
