PostgreSQL 中使用游标更新的性能
我想知道使用以下方法更新表的性能差异:
UPDATE films SET kind = 'Dramatic' WHERE CURRENT OF c_films;
或者像这样:
UPDATE films SET kind = 'Dramatic' WHERE unique_indexed_int_column = 3000;
是否有人测试过这个或知道使用游标的更新如何工作,以便他们可以对此发表评论?
编辑:我现在对此进行了基准测试,发现最新示例的速度实际上快了三分之一左右。我将每个查询运行了 100000 次并计算了差异。我使用 psycopg2 使用服务器端游标与 Postgres 进行通信。我将进一步调查,看看是否能发现情况并非总是如此。
I would like to know the performance difference in updating a table using the following method:
UPDATE films SET kind = 'Dramatic' WHERE CURRENT OF c_films;
or like this:
UPDATE films SET kind = 'Dramatic' WHERE unique_indexed_int_column = 3000;
Has anyone tested this or know how updates using cursors work so they can comment on this?
EDIT: I have now benchmarked this and found that it is infact around a third faster to do the latest example. I ran each of the queries 100000 times and timed the difference. I used psycopg2 using server side cursors to communicate with Postgres. I will investigate further to see whether I can find that this is not always the case.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我对 PostgreSQL 不熟悉,所以只能给你一个笼统的答案。
首先,如果
indexed_int_column
不唯一,第二条语句将更新多行,而第一个语句只会更新当前光标c_films
下的行。所以这些说法并不完全相同。假设唯一性并且游标 c_films 位于
indexed_int_column = 3000
的一行,那么一旦游标位于某一行下方,更新应该会非常快,因为游标保存了直接访问物理数据的信息。该行的位置。然而,第二条语句必须首先获取索引,查找其中的值 3000,然后才知道要更新的行的物理位置。也就是说,必须在某一点对游标执行此查找操作(如果我们没有迭代整个表)。因此,一般来说,只有当您必须先读取数据,然后想要更新刚刚读取的同一行时,使用游标才是值得的。I'm not familiar with PostgreSQL, so I can only give you a generic answer.
First off, if
indexed_int_column
is not unique, the second statement will update multiple rows, whereas the first one will only update the row currently under the cursorc_films
. So the statements are not identical.Assuming uniqueness and the cursor c_films being at the one row where
indexed_int_column = 3000
, then updating should be very quick once the cursor is positioned under a certain row, as the cursor holds the information to directly access the physical location of this row. The second statement however, will have to fetch the index first, lookup the value 3000 in it and only then it will know the physical location of the row to update. That said, this lookup operation had to be done for the cursor at one point to (if we did not iterate over the whole table that is). So in general it only pays off to use the cursor, when you have to read the data first anyway, and then want to update the same row you just read.