Add a dynamic value into RMySQL getQuery
Is it possible to pass a value into the query in dbGetQuery from the RMySQL package?
For example, if I have a set of values in a character vector:
df <- c('a','b','c')
And I want to loop through the values to pull out a specific value from a database for each.
library(RMySQL)
res <- dbGetQuery(con, "SELECT max(ID) FROM table WHERE columna='df[2]'")
When I try to add the reference to the value, I get an error. I am wondering whether it is possible to add a value from an R object to the query.
One option is to manipulate the SQL string within the loop. At the moment you have a string literal: the 'df[2]' is not interpreted by R as anything other than characters. There are going to be some ambiguities in my answer, because df in your question is patently not a data frame (it is a character vector!). Building the query string from the value will do what you want. First, create a numeric vector to store the output, with one element per value in df.
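The code that accompanied this step did not survive on this page, so here is a minimal sketch of what the paragraph describes, not the answerer's original code. It contrasts the string literal with a query built using paste0, then allocates the output vector. The names literal and pasted are made up for illustration; table and columna are the placeholders from the question.

```r
df <- c("a", "b", "c")

# Inside the quotes, df[2] is just the five characters d, f, [, 2, ]
literal <- "SELECT max(ID) FROM table WHERE columna='df[2]'"

# Pasting the pieces together makes R evaluate df[2] before building the string
pasted <- paste0("SELECT max(ID) FROM table WHERE columna='", df[2], "'")

# Numeric vector to hold one result per element of df,
# named after those elements so it can be indexed by value as well as by position
out <- numeric(length(df))
names(out) <- df
```

Printing pasted shows the value of df[2] inside the quotes, while literal still contains the text df[2].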
Now we can loop over the elements of df to execute your query three times. We can set the loop up in two ways: i) with i as a number which we use to reference the elements of df and out, or ii) with i as each element of df in turn (i.e. 'a', then 'b', ...). In both versions the loop body pastes the current value into the SQL string and stores the query result in out.
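The two loops themselves are missing from this page. The sketch below reconstructs the idea under some assumptions: make_sql, loop_by_index, loop_by_value and query_one are hypothetical names, and the database call is passed in as a function so the loops can be tried without a live connection.

```r
df <- c("a", "b", "c")
out <- numeric(length(df))
names(out) <- df

# Hypothetical helper: build the SQL text for a single value
make_sql <- function(value) {
  paste0("SELECT max(ID) FROM table WHERE columna='", value, "'")
}

# Version i): i is a position, used to index both values and out
loop_by_index <- function(values, out, query_one) {
  for (i in seq_along(values)) {
    out[i] <- query_one(make_sql(values[i]))
  }
  out
}

# Version ii): i is the value itself, so out must be named after the values
loop_by_value <- function(values, out, query_one) {
  for (i in values) {
    out[i] <- query_one(make_sql(i))
  }
  out
}
```

With an open connection con, query_one could be something like function(sql) dbGetQuery(con, sql)[[1]], which pulls the single max(ID) value out of the one-row data frame that dbGetQuery returns. This has not been run against a MySQL server here.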
Which you use will depend on personal taste. The second (ii) version requires you to set names on the output vector out that are the same as the data inside df.
Having said all that, assuming your actual SQL query is similar to the one you posted, can't you do this in a single SQL statement, using the GROUP BY clause to group the data before computing max(ID)? Doing simple things in the database like this will likely be much quicker. Unfortunately, I don't have a MySQL instance around to play with and my SQL-fu is currently weak, so I can't give an example of this.
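For what it is worth, the single-statement idea might look roughly like the sketch below. It is an assumption about what the answerer had in mind, it has not been tested against a MySQL server, and the names in_list, sql, max_id and fetch_max_ids are made up for illustration. Note that table is the question's placeholder; TABLE is a reserved word in MySQL, so a real table name has to go there.

```r
df <- c("a", "b", "c")

# Quote each value and join them into the body of an IN (...) list
in_list <- paste0("'", df, "'", collapse = ", ")

# One query that returns max(ID) for every requested value of columna
sql <- paste0(
  "SELECT columna, max(ID) AS max_id FROM table ",
  "WHERE columna IN (", in_list, ") ",
  "GROUP BY columna"
)

# Hypothetical wrapper: run the query once and return a named vector
# (needs an open DBI connection, so it is only defined here, not called)
fetch_max_ids <- function(con, sql) {
  res <- DBI::dbGetQuery(con, sql)
  setNames(res$max_id, res$columna)
}
```

This sends one query instead of three, and the grouping happens in the database rather than in an R loop.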
You could also use the sprintf command to solve the issue (it's what I use when building Shiny apps).
df <- c('a','b','c')
# sprintf substitutes the value of df[2] for %s before the query is sent
res <- dbGetQuery(con, sprintf("SELECT max(ID) FROM table WHERE columna='%s'", df[2]))
Something along those lines should work.
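Because sprintf is vectorised over its arguments, the same template can produce one query per element of df in a single call. The following is a small sketch of that, with fetch_all as a hypothetical wrapper that is defined but not run here, since it needs an open connection.

```r
df <- c("a", "b", "c")

# One query string per element of df
queries <- sprintf("SELECT max(ID) FROM table WHERE columna='%s'", df)

# Hypothetical wrapper: run each query and collect the results by value.
# Caution: a value containing a single quote would break the SQL string.
fetch_all <- function(con, queries, values) {
  res <- vapply(
    queries,
    function(sql) as.numeric(DBI::dbGetQuery(con, sql)[[1]]),
    numeric(1),
    USE.NAMES = FALSE
  )
  setNames(res, values)
}
```

With a connection con, fetch_all(con, queries, df) would return a numeric vector named after the elements of df, assuming each query returns exactly one row.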