我很难理解这三个软件包的行为: rsqlite
, dbi
和 dbplyr
。
我正在使用 mtcars
dataframe,并且正在使用 tibble
软件包加载它,将CAR名称添加为行之一。
library(tibble)
mtcars <- tibble::as_tibble(mtcars, rownames = 'car')
我还创建了自己的表格,其中包含有关所有这些汽车的建造位置的信息:
car <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive", "Hornet Sportabout", "Valiant", "Duster 360", "Merc 240D", "Merc 230", "Merc 280", "Merc 280C", "Merc 450SE", "Merc 450SL", "Merc 450SLC", "Cadillac Fleetwood", "Lincoln Continental", "Chrysler Imperial", "Fiat 128", "Honda Civic", "Toyota Corolla", "Toyota Corona", "Dodge Challenger", "AMC Javelin", "Camaro Z28", "Pontiac Firebird", "Fiat X1-9", "Porsche 914-2", "Lotus Europa", "Ford Pantera L", "Ferrari Dino", "Maserati Bora", "Volvo 142E")
origin <- c("Japan", "Japan", "Japan", "United States", "United States", "United States", "United States", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "United States", "United States", "United States", "Italy", "Japan", "Japan", "Japan", "United States", "United States", "United States", "United States", "Italy", "Germany", "British", "United States", "Italy", "Italy", "Sweden")
cars_origin_tbl <- tibble(car, origin)
通过创建这两个表,我可以使用R练习加入R。这是两个不同方法的示例,以编写 semi_join
:
mtcars %>% dplyr::semi_join(cars_origin_tbl, by = 'car') -> mtcars_semi_join
# another method using value matching
mtcars_semi_join <- mtcars %>%
filter(car %in% cars_origin_tbl$car)
但是我真的很想使用SQL语法更愿意加入加入。我有一项即将到来的工作,将涉及更多SQL,并且我真的想在开始之前学习。对我来说,最好的学习方法是做这些事情。在网上观看示例而没有自己做,这并不为我留下。
因此,在这里,我试图使用这些R软件包创建SQL版本的内容。
library(dbplyr)
library(RSQLite)
library(DBI)
来自安德鲁·库奇(Andrew Couch)的YouTube视频非常有帮助。 这是链接
sql_mtcars <- mtcars
con <- RSQLite::dbConnect(SQLite(), ":memory:")
dplyr::copy_to(con, sql_mtcars)
sql_mtcars_db <- tbl(con, "sql_mtcars")
sql_mtcars_db %>%
dplyr::select(car, mpg, wt) %>%
dplyr::show_query()
。 > show_query 事情很有趣。它显示 dplyr
代码的SQL版本。
另一个帖子在此回答,并向我展示了如何使用 dblisttables列出的表格
。
dbListTables(con)
此 dbgetquery
将运行代码并获取预期的输出。
DBI::dbGetQuery(con, '
SELECT sql_mtcars.mpg, sql_mtcars.wt
FROM sql_mtcars
')
这是我困惑的地方。我正在尝试将 CARS_ORIGIN_TBL
放入 rsqlite
中,然后练习加入两个不同的表。
cars_origin_tbl <- cars_origin_tbl
con <- dbConnect(SQLite(), ":memory:")
copy_to(con, cars_origin_tbl)
cars_origin_tbl_db <- tbl(con, "cars_origin_tbl")
dbListTables(con)
在我运行 dblisttables()
之后,就像我的第一个 sql_mtcars
被遗忘并消失了。
现在此 dbgetquery
正常工作,因为我只是在调用较新的表。
DBI::dbGetQuery(con, '
SELECT cars_origin_tbl.car
FROM cars_origin_tbl
')
但是,此其他 dbgetquery
不起作用,因为我指的是两个表,我认为这两个表都将在 rsqlite
中。我收到一条消息,说没有 sql_mtcars
。
DBI::dbGetQuery(con, '
SELECT sql_mtcars.mpg, sql_mtcars.wt
FROM sql_mtcars
LEFT JOIN cars_origin_tbl
ON sql_mtcars.car = cars_origin_tbl.car;
')
Error: no such table: sql_mtcars
我基本上是在尝试将两个桌子纳入连接,以便我可以练习加入它们,但是我不确定如何将两个桌子放在那里而不会消失。
con <- RSQLite::dbConnect(SQLite(), ":memory:")
con2 <- RSQLite::dbConnect(SQLite(), ":memory:")
dplyr::copy_to(c(con, con2), cars_origin_tbl)
Error in UseMethod("copy_to") :
no applicable method for 'copy_to' applied to an object of class "list"
I'm having a hard time understanding the behavior of these three packages: RSQLite
, DBI
, and dbplyr
.
I'm using the mtcars
dataframe and I'm loading it with the tibble
package to add the car name as one of the rows.
library(tibble)
mtcars <- tibble::as_tibble(mtcars, rownames = 'car')
I also created my own table that has information about where all these cars were built:
car <- c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive", "Hornet Sportabout", "Valiant", "Duster 360", "Merc 240D", "Merc 230", "Merc 280", "Merc 280C", "Merc 450SE", "Merc 450SL", "Merc 450SLC", "Cadillac Fleetwood", "Lincoln Continental", "Chrysler Imperial", "Fiat 128", "Honda Civic", "Toyota Corolla", "Toyota Corona", "Dodge Challenger", "AMC Javelin", "Camaro Z28", "Pontiac Firebird", "Fiat X1-9", "Porsche 914-2", "Lotus Europa", "Ford Pantera L", "Ferrari Dino", "Maserati Bora", "Volvo 142E")
origin <- c("Japan", "Japan", "Japan", "United States", "United States", "United States", "United States", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "United States", "United States", "United States", "Italy", "Japan", "Japan", "Japan", "United States", "United States", "United States", "United States", "Italy", "Germany", "British", "United States", "Italy", "Italy", "Sweden")
cars_origin_tbl <- tibble(car, origin)
By creating these two tables I can practice joins using R. Here is an example of two different ways to write a semi_join
:
mtcars %>% dplyr::semi_join(cars_origin_tbl, by = 'car') -> mtcars_semi_join
# another method using value matching
mtcars_semi_join <- mtcars %>%
filter(car %in% cars_origin_tbl$car)
But I really want to feel more comfortable making joins using SQL syntax. I have an upcoming job that will involve more SQL and I really want to study before I start. For me, the best way to study is by doing this stuff. Watching example online without doing it myself just doesn't stick for me.
So here I am trying to create a SQL version of stuff using these R packages.
library(dbplyr)
library(RSQLite)
library(DBI)
This YouTube video from Andrew Couch has been very helpful. Here is the link.
sql_mtcars <- mtcars
con <- RSQLite::dbConnect(SQLite(), ":memory:")
dplyr::copy_to(con, sql_mtcars)
sql_mtcars_db <- tbl(con, "sql_mtcars")
sql_mtcars_db %>%
dplyr::select(car, mpg, wt) %>%
dplyr::show_query()
This show_query
thing is interesting. It shows the SQL version of the dplyr
code.
data:image/s3,"s3://crabby-images/134ac/134acd0d3293f8b717b7504dbf281747ff0ff60e" alt="enter image description here"
Another SO post here was answered and showed me how to find the tables listed using dbListTables
.
dbListTables(con)
data:image/s3,"s3://crabby-images/e46f0/e46f07155552042a476df035885d3ffdde4c6962" alt="enter image description here"
This dbGetQuery
will run the code and get the expected output.
DBI::dbGetQuery(con, '
SELECT sql_mtcars.mpg, sql_mtcars.wt
FROM sql_mtcars
')
data:image/s3,"s3://crabby-images/cca95/cca959764bbc6cc8dd7c50444507dfde076fae01" alt="enter image description here"
Here is where I'm confused. I'm trying to put the cars_origin_tbl
into RSQLite
and then practice joining two different tables.
cars_origin_tbl <- cars_origin_tbl
con <- dbConnect(SQLite(), ":memory:")
copy_to(con, cars_origin_tbl)
cars_origin_tbl_db <- tbl(con, "cars_origin_tbl")
dbListTables(con)
After I run dbListTables()
it's like my first sql_mtcars
was forgotten and disappeared.
data:image/s3,"s3://crabby-images/9633a/9633a64fc4f85fd2d8393dc680c4f0c5093ecb49" alt="enter image description here"
Now this dbGetQuery
works fine because I'm just calling the newer table.
DBI::dbGetQuery(con, '
SELECT cars_origin_tbl.car
FROM cars_origin_tbl
')
But this other dbGetQuery
doesn't work because I'm referring to both tables, which I thought would both be in RSQLite
at this point. I get a message saying there is no sql_mtcars
.
DBI::dbGetQuery(con, '
SELECT sql_mtcars.mpg, sql_mtcars.wt
FROM sql_mtcars
LEFT JOIN cars_origin_tbl
ON sql_mtcars.car = cars_origin_tbl.car;
')
Error: no such table: sql_mtcars
I'm basically trying to get two tables into the connection so I can practice joining them, but I'm not sure how to get both tables in there without one disappearing.
con <- RSQLite::dbConnect(SQLite(), ":memory:")
con2 <- RSQLite::dbConnect(SQLite(), ":memory:")
dplyr::copy_to(c(con, con2), cars_origin_tbl)
Error in UseMethod("copy_to") :
no applicable method for 'copy_to' applied to an object of class "list"
发布评论
评论(1)
解决方案是为两个表都使用相同的连接。以下工作:
您现在可以半加入这两个表:
生成以下SQL:
作为附带说明,
sql_mtcars
和sql_mtcars_db
是不同的r对象。第一个是本地R内存中的一个表。第二个是存储在数据库中的远程表的本地访问点。The solution is to use the same connection for both tables. The following works:
You can now semi-join these two tables:
Producing the following SQL:
As a side note,
sql_mtcars
andsql_mtcars_db
are different R objects. The first is a table in local R memory. The second is a local access point to a remote table stored in a database.