How do I get the rows that meet specific conditions in R?
Say that I have a data frame df. I want to get the id values that meet two conditions at the same time:
- The id's code values should contain a capital I, regardless of the number that follows it, for example I11, I31, ...
- The id's code values should contain the specific code E12.
In the example below, the filtered ids should be id = 1 and id = 2, because both of them contain an I code and E12.
Rows with the same id in the example belong to the same group.
df <- structure(list(id = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 4, 4, 4, 4, 4, 4), diag = c("main", "other", "main",
"other", "main", "other", "main", "other", "main", "other", "main",
"other", "main", "other", "main", "other", "main", "other", "main",
"other", "main", "other"), code = c("I11", "E12", "I11", "Q34",
"I31", "C33", "E12", "I34", "E12", "I45", "E12", "Z11", "E13",
"Z12", "E14", "Z13", "I25", "E1", "I25", "E2", "I25", "E3")), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -22L), groups = structure(list(
id = c(1, 2, 3, 4), .rows = structure(list(1:6, 7:10, 11:16,
17:22), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), .drop = TRUE))
> df
# A tibble: 22 × 3
# Groups:   id [4]
      id diag  code
   <dbl> <chr> <chr>
 1     1 main  I11
 2     1 other E12
 3     1 main  I11
 4     1 other Q34
 5     1 main  I31
 6     1 other C33
 7     2 main  E12
 8     2 other I34
 9     2 main  E12
10     2 other I45
# … with 12 more rows
Your question suggests [1] that you're mostly interested in the id values that fulfil both of your conditions, in which case you don't really need to work on a data frame, only on vectors. Create two vectors of df$id values, one for each condition, then extract their shared elements (this also removes duplicates, so the output is 1 2).
This requires nothing more than base R, and I suspect it is much more efficient than any table-based approach (especially if grouping or pivots are involved).
[1] If you're not, you can still use the resulting vector of ids to index the table row-wise. This returns all the rows of df with an id of 1 or 2 (including, however, rows that have a matching id but a different code, e.g. "Q34").
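The code for this answer did not survive on this page, so the following is only a minimal base R sketch of the vector approach it describes. The use of intersect(), the pattern "^I[0-9]+$", and the variable names ids_with_i, ids_with_e12 and matching_ids are assumptions, and the data frame is a trimmed copy of the question's data without the diag column and the grouping.

```r
# Trimmed copy of the question's data (diag column and grouping omitted)
df <- data.frame(
  id = rep(c(1, 2, 3, 4), times = c(6, 4, 6, 6)),
  code = c("I11", "E12", "I11", "Q34", "I31", "C33",
           "E12", "I34", "E12", "I45",
           "E12", "Z11", "E13", "Z12", "E14", "Z13",
           "I25", "E1", "I25", "E2", "I25", "E3")
)

# ids that have at least one code made of a capital I followed by digits
ids_with_i <- df$id[grepl("^I[0-9]+$", df$code)]

# ids that have at least one code exactly equal to "E12"
ids_with_e12 <- df$id[df$code == "E12"]

# intersect() keeps only the ids present in both vectors and drops duplicates
matching_ids <- intersect(ids_with_i, ids_with_e12)
matching_ids

# Footnote case: all rows of df whose id is one of the matching ids
matching_rows <- df[df$id %in% matching_ids, ]
matching_rows
```

Here matching_ids holds the ids 1 and 2, and matching_rows holds every row of those two groups, including rows such as "Q34" that satisfy neither condition themselves.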
To clarify, do you want all records that are Ixx OR "E12"? Your "at the same time" threw me off a little.
If that is what you mean, you can get your results using library(tidyverse): filter for records where the code column contains I, or where code equals E12.
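The original snippet for this answer is missing here, so this is only a sketch of the row-wise OR filter it describes. It assumes dplyr (part of the tidyverse) is installed, uses grepl() for the "contains I" test, and works on a trimmed copy of the question's data; the name either is hypothetical.

```r
library(dplyr)

# Trimmed copy of the question's data (diag column and grouping omitted)
df <- data.frame(
  id = rep(c(1, 2, 3, 4), times = c(6, 4, 6, 6)),
  code = c("I11", "E12", "I11", "Q34", "I31", "C33",
           "E12", "I34", "E12", "I45",
           "E12", "Z11", "E13", "Z12", "E14", "Z13",
           "I25", "E1", "I25", "E2", "I25", "E3")
)

# Row-wise OR: keep a row if its code contains a capital I or is exactly "E12"
either <- df %>%
  filter(grepl("I", code, fixed = TRUE) | code == "E12")

either
unique(either$id)
```

Because the test is applied to each row on its own, the result also contains rows from id 3 (which has E12 but no I code) and id 4 (which has I codes but no E12), so this reading returns all four ids rather than only 1 and 2.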
You could do this with a filter(...) call. Or, if you just want the groups, add distinct(id) after the filter(...).
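The filter(...) call and its output are not shown on this page. One way to write such a grouped filter, assuming dplyr is installed and that each condition is tested once per id with any(), might look like the sketch below; the pattern "^I[0-9]+$" and the names kept_rows and kept_ids are hypothetical, and the data is a trimmed copy of the question's.

```r
library(dplyr)

# Trimmed copy of the question's data (diag column omitted)
df <- data.frame(
  id = rep(c(1, 2, 3, 4), times = c(6, 4, 6, 6)),
  code = c("I11", "E12", "I11", "Q34", "I31", "C33",
           "E12", "I34", "E12", "I45",
           "E12", "Z11", "E13", "Z12", "E14", "Z13",
           "I25", "E1", "I25", "E2", "I25", "E3")
)

# Keep whole groups: an id survives only if, somewhere among its rows,
# there is an I-code AND there is a code equal to "E12"
kept_rows <- df %>%
  group_by(id) %>%
  filter(any(grepl("^I[0-9]+$", code)) & any(code == "E12"))

# Only the group keys, one row per remaining id
kept_ids <- kept_rows %>%
  distinct(id)

kept_rows
kept_ids
```

With any() inside a grouped filter(), the two conditions are evaluated per id rather than per row, which is what makes "at the same time" apply to the group as a whole.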