Rails3:使用 .where() 等哈希替换执行 SQL

发布于 2024-11-06 21:57:01 字数 797 浏览 2 评论 0原文

使用这样的简单模型,

class Model < ActiveRecord::Base
   # ...
end

我们可以执行这样的查询,

Model.where(["name = :name and updated_at >= :D", \
  { :D => (Date.today - 1.day).to_datetime, :name => "O'Connor" }])

其中哈希中的值将被替换到最终的 SQL 语句中,并根据底层数据库引擎进行适当的转义。

我想知道 SQL 执行的类似功能,例如:(

ActiveRecord::Base.connection.execute( \
    ["update models set name = :name, hired_at = :D where id = :id;"], \
    { :id => 73465, :D => DateTime.now, :name => "O'My God" }] \
  ) # THIS CODE IS A FANTASY. NOT WORKING.

请不要通过加载模型对象、修改然后保存来解决该示例!该示例只是我想拥有/了解的功能的说明。集中精力主题!)

最初的问题是我想将大量(数千行)数据插入数据库。我想使用 ActiveRecord 框架的 SQL 抽象的一些功能,但我不想使用基于 ActiveRecord::Base 的模型对象,因为它们太慢了! (对于我当前的问题,每秒 8 个查询。)

With a simple model like that

class Model < ActiveRecord::Base
   # ...
end

we can do queries like that

Model.where(["name = :name and updated_at >= :D", \
  { :D => (Date.today - 1.day).to_datetime, :name => "O'Connor" }])

Where the values in the hash will be substituted into the final SQL statement with proper escaping depending on the underlying database engine.

I would like to know a similar feature for SQL execution like:

ActiveRecord::Base.connection.execute( \
    ["update models set name = :name, hired_at = :D where id = :id;"], \
    { :id => 73465, :D => DateTime.now, :name => "O'My God" }] \
  ) # THIS CODE IS A FANTASY. NOT WORKING.

(Please do not solve the example with loading a Model object, modifying and then saving! The example is only an illustration for the feature I would like to have / know. Concentrate on the subject!)

The original problem is that I want to insert large amount (many thousand lines) of data into the database. I want to use some features of the SQL abstraction of the ActiveRecord framework but I don't want to use model objects based on ActiveRecord::Base because they are damn slow! (8 queries per second for my current problem.)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

柒夜笙歌凉 2024-11-13 21:57:01
query = ActiveRecord::Base.connection.raw_connection.prepare("INSERT INTO users (name) VALUES(:name)")
query.execute(:name => 'test_name')
query.close
query = ActiveRecord::Base.connection.raw_connection.prepare("INSERT INTO users (name) VALUES(:name)")
query.execute(:name => 'test_name')
query.close
倾城月光淡如水﹏ 2024-11-13 21:57:01

使用批量插入的具体代码示例扩展@peufeu解决方案:

users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
    users_places << "(?,?,?,?)"
    users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end

bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
    sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
    ActiveRecord::Base.connection.execute(sql)
rescue
    "something went wrong with the bulk insert sql query"
end

这是 参考ActiveRecord::Base中的sanitize_sql_array方法,它通过转义字符串中的单引号来生成正确的查询字符串。例如,punch_line“不要让他们让你失望”将变成“不要让他们让你失望”。

Extending the @peufeu solution with concrete code example for bulk insert:

users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
    users_places << "(?,?,?,?)"
    users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end

bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
    sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
    ActiveRecord::Base.connection.execute(sql)
rescue
    "something went wrong with the bulk insert sql query"
end

Here is the reference to sanitize_sql_array method in ActiveRecord::Base, it generates the proper query string by escaping the single quotes in the strings. For example the punch_line "Don't let them get you down" will become "Don\'t let them get you down".

暗藏城府 2024-11-13 21:57:01

Yes you could do raw SQL, but checkout the ar-extensions gem that helps with batch inserts:

https://github.com/zdennis/ar-extensions

Here's a post on it, and various other techniques:

http://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/

千笙结 2024-11-13 21:57:01

对于 INSERT,使用长 VALUES 子句(如 Simon 的链接所示)对它们进行批处理是最快的方法(除非您想生成文本文件并使用 MySQL 的 LOAD DATA INFILE 将其加载到数据库中)。但是您必须非常小心转义文本值(示例中没有这样做)。

我问“你使用什么数据库”,因为它对于大规模更新确实很重要。

例如,您可以在 postgres 上执行此操作(我相信 SQL Server 将“columnX”更改为“colX”):

UPDATE foo 
JOIN (VALUES (1,2),(3,4),... long list) v ON (foo.id=v.column1)
SET foo.bar = v.column2

并且您可以使用单个语句非常快地更新行负载。

如果您不需要 Ruby 对数据执行一些特定于 Ruby 的魔法,将数据从一个数据库传输到另一个数据库的最快方法是将其导出为文本文件(CSV 或制表符分隔),然后将其加载到另一个数据库上DB(MySQL 上的 LOAD DATA INFILE),可能在临时表中,并使用 SQL 进行批量处理。

编辑:这是我在Python中执行此操作的方法:

sql = [ "INSERT INTO foo (column list) VALUES " ]
values = []

for tuple in tuple_list:
     append "(?,?,?,?)" to sql
     extend values list with tuple

然后将sql连接到字符串中,您将得到“INSERT INTO foo(列列表)VALUES(?,?,?,?),(?,?,?,?),(? ,?,?,?)" 和 "(?,?,?,?)" 重复次数与要插入的行数相同。

然后“values”包含 (a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3) 的列表,其中 an,bn,cn,dn 是您要为行插入的元组名词每一个都对应于 sql 字符串中的一个占位符。

然后将其传递给通常的“使用参数执行查询”函数,该函数将照常处理引用和转义。

For INSERTs, batching them using a long VALUES clause (as shown by Simon's link) is the fastest way (unless you want to generate a text file and load it in your database with MySQL's LOAD DATA INFILE). But you have to be very careful about escaping your text values (which is not done in the example).

I was asking "what database are you using" because it does matter for mass UPDATEs.

For instance, you can do this on postgres (and I believe SQL Server changing "columnX" to "colX" ):

UPDATE foo 
JOIN (VALUES (1,2),(3,4),... long list) v ON (foo.id=v.column1)
SET foo.bar = v.column2

And you can update a load of rows using a single statement, very fast.

If you don't need Ruby to perform some Ruby-specific magic on your data, the fastest way to transfer data from one DB to a different one is to export as a text file (CSV or tab separated), load it on the other DB (LOAD DATA INFILE on MySQL), perhaps in a temporary table, and bulk process using SQL.

EDIT : Here's how I do this in Python :

sql = [ "INSERT INTO foo (column list) VALUES " ]
values = []

for tuple in tuple_list:
     append "(?,?,?,?)" to sql
     extend values list with tuple

Then join sql into a string, you get "INSERT INTO foo (column list) VALUES (?,?,?,?),(?,?,?,?),(?,?,?,?)" with the "(?,?,?,?)" repeated as many times as you have lines to insert.

Then "values" contains a list of (a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3) with an,bn,cn,dn being the tuples you want to insert for line n. Each one corresponds to a placeholder in the sql string.

Then pass this to the usual "execute query with parameters" function which will handle quoting and escaping as usual.

白衬杉格子梦 2024-11-13 21:57:01

最近,我在使用 mysql2 gem 将 100K+ 条记录插入到 Rails 4 应用程序的 MySQL 数据库中时遇到了类似的问题。数据包含在插入之前必须进行清理的字符。

我最终采用的解决方案是 https://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/

以下是上述链接中的相关代码块:

TIMES = 10000
inserts = []
TIMES.times do
  inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`) VALUES #{inserts.join(", ")}"

我所做的修改是对需要它的值使用公共方法 ActiveRecord::Base.sanitize() 。

inserts = []
created = Time.now.strftime "%Y-%m-%d %H:%M:%S"
params[:audits].each do |audit|
  inserts.push "(#{audit.user_id), #{created}," + ActiveRecord::Base.sanitize(audit.comment) + ", #{audit.status})"
end
sql = "INSERT INTO user_audits (`user_id`, `created_at`, `comment`, `status`) VALUES #{inserts.join(", ")}"

I encountered a similar issue recently when tying to insert 100K+ records into a MySQL database for a Rails 4 app using mysql2 gem. The data included characters that had to be sanitized prior to insert.

The solution I ended going with was a slightly modified version of Option 3 described at https://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/

Here's the relevant code block from the above link:

TIMES = 10000
inserts = []
TIMES.times do
  inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`) VALUES #{inserts.join(", ")}"

The modification I made was using the public method ActiveRecord::Base.sanitize() on values that required it.

inserts = []
created = Time.now.strftime "%Y-%m-%d %H:%M:%S"
params[:audits].each do |audit|
  inserts.push "(#{audit.user_id), #{created}," + ActiveRecord::Base.sanitize(audit.comment) + ", #{audit.status})"
end
sql = "INSERT INTO user_audits (`user_id`, `created_at`, `comment`, `status`) VALUES #{inserts.join(", ")}"
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文