在Dbplyr管道中使用弦匹配像GREPL
Dbplyr非常方便,因为它将dplyr代码转换为SQL。这确实很好,除非没有。例如,我试图通过将字符串与列中的值部分匹配来子集。除了Postgres外,DBPlyr似乎尚未实现。我是否缺少一些可以完成以下操作的{stringr}函数:
library(dplyr, warn.conflicts = FALSE)
library(DBI)
data("flights", package = "nycflights13")
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "flights", flights)
## works in dplyr
flights %>%
filter(grepl("N", tailnum))
#> # A tibble: 334,264 × 19
#> year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
#> <int> <int> <int> <int> <int> <dbl> <int> <int>
#> 1 2013 1 1 517 515 2 830 819
#> 2 2013 1 1 533 529 4 850 830
#> 3 2013 1 1 542 540 2 923 850
#> 4 2013 1 1 544 545 -1 1004 1022
#> 5 2013 1 1 554 600 -6 812 837
#> 6 2013 1 1 554 558 -4 740 728
#> 7 2013 1 1 555 600 -5 913 854
#> 8 2013 1 1 557 600 -3 709 723
#> 9 2013 1 1 557 600 -3 838 846
#> 10 2013 1 1 558 600 -2 753 745
#> # … with 334,254 more rows, and 11 more variables: arr_delay <dbl>,
#> # carrier <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
#> # air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
## no function translation to grepl
tbl(con, "flights") %>%
filter(grepl("N", tailnum)) %>%
collect()
#> Error: no such function: grepl
## Also not implemented for stringr
library(stringr)
tbl(con, "flights") %>%
filter(str_detect(tailnum, "N")) %>%
collect()
#> Error: str_detect() is not available in this SQL variant
dbDisconnect(con)
dbplyr is very handy as it convert dplyr code into SQL. This works really well except when it doesn't. For example i am trying to subset rows by partially matching a string against values in a column. With exception of postgres, it appears as though this isn't yet implemented in dbplyr. Am I missing some {stringr} function that would accomplish the below:
library(dplyr, warn.conflicts = FALSE)
library(DBI)
data("flights", package = "nycflights13")
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "flights", flights)
## works in dplyr
flights %>%
filter(grepl("N", tailnum))
#> # A tibble: 334,264 × 19
#> year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time
#> <int> <int> <int> <int> <int> <dbl> <int> <int>
#> 1 2013 1 1 517 515 2 830 819
#> 2 2013 1 1 533 529 4 850 830
#> 3 2013 1 1 542 540 2 923 850
#> 4 2013 1 1 544 545 -1 1004 1022
#> 5 2013 1 1 554 600 -6 812 837
#> 6 2013 1 1 554 558 -4 740 728
#> 7 2013 1 1 555 600 -5 913 854
#> 8 2013 1 1 557 600 -3 709 723
#> 9 2013 1 1 557 600 -3 838 846
#> 10 2013 1 1 558 600 -2 753 745
#> # … with 334,254 more rows, and 11 more variables: arr_delay <dbl>,
#> # carrier <chr>, flight <int>, tailnum <chr>, origin <chr>, dest <chr>,
#> # air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>, time_hour <dttm>
## no function translation to grepl
tbl(con, "flights") %>%
filter(grepl("N", tailnum)) %>%
collect()
#> Error: no such function: grepl
## Also not implemented for stringr
library(stringr)
tbl(con, "flights") %>%
filter(str_detect(tailnum, "N")) %>%
collect()
#> Error: str_detect() is not available in this SQL variant
dbDisconnect(con)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我们可能会使用
%喜欢%
- 输出
We may use
%like%
-output