update语句中的冗余数据
Hibernate 生成 UPDATE
语句,其中包括所有列,无论我是否更改该列中的值,例如:
tx.begin();
Item i = em.find(Item.class, 12345);
i.setA("a-value");
tx.commit();
发出此 UPDATE
语句:
update Item set A = $1, B = $2, C = $3, D = $4 where id = $5
所以列 B
、C
、D
已更新,而我没有更改它们。
假设,项目经常更新,并且所有列都已建立索引。 将 Hibernate 部分优化为这样有意义吗?
tx.begin();
em.createQuery("update Item i set i.a = :a where i.id = :id")
.setParameter("a", "a-value")
.setParameter("id", 12345)
.executeUpdate();
tx.commit();
最让我困惑的是“未优化”和“优化”查询版本的 EXPLAIN
计划是相同的!
Hibernate generates UPDATE
statements, which include all columns, regardless of whether I'm changing the value in that columns, eg:
tx.begin();
Item i = em.find(Item.class, 12345);
i.setA("a-value");
tx.commit();
issues this UPDATE
statement:
update Item set A = $1, B = $2, C = $3, D = $4 where id = $5
So columns B
, C
, D
are updated, while I didn't change them.
Say, Items are updated frequently and all columns are indexed.
Does it make sense to optimize the Hibernate part to something like this?
tx.begin();
em.createQuery("update Item i set i.a = :a where i.id = :id")
.setParameter("a", "a-value")
.setParameter("id", 12345)
.executeUpdate();
tx.commit();
What confuses me most is that the EXPLAIN
plans of the 'unoptimized' and the 'optimized' query version are identical!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
由于PostgreSQL MVCC,
UPDATE
实际上是就像DELETE
加上INSERT
。除了烤值的显着例外 - 请参阅:(仅堆元组的细微差别 -
DELETE
+ INSERT 启动一个新的 HOT 链 - 但这与当前的情况无关。)准确地说,“已删除”行对于提交删除后开始的任何事务都是不可见的,并且稍后吸尘。因此,在数据库方面,包括索引操作,这两个语句实际上没有区别。 (例外情况适用,请继续阅读。)它会增加一点网络流量(取决于您的数据)并且需要一些解析。
在 @araqnid 的输入之后,我进一步研究了 HOT 更新并进行了一些测试。就热更新而言,实际上不更改值的列更新没有任何区别。我的答案成立。请参阅下面的详细信息。
这也适用于 toasted 属性,因为除非值实际更改,否则这些属性也不会被触及。
但是,如果您使用每列触发器(9.0 版中引入),这可能会产生不良副作用!
我引用触发器手册:
大胆强调我的。
抽象层是为了方便。它们对于不懂 SQL 的开发人员或应用程序需要在不同 RDBMS 之间移植时非常有用。不利的一面是,它们可能会降低性能并引入额外的故障点。我尽可能避开他们。
热门(仅堆元组)更新
仅堆元组是通过 引入的Postgres 8.3,在 8.3.4 和 8.4 .9.
Postgres 8.3 发行说明:
强调我的。 “无更改”包括列被更新为其已保存的相同值的情况。我实际测试过,因为我不确定。
最终,广泛的 源代码中的README.HOT证实了这一点。
烘烤的列也不会妨碍热门更新。热更新的元组仅链接到关系的 toast fork 中相同的、未更改的元组。热更新甚至可以使用目标列表中的烘烤值(实际更改或未更改)。显然,如果 toast 值发生更改,则需要写入 toast 关系分支。我也测试了所有这些。
别相信我的话,你自己看看。 Postgres 提供了一些检查统计信息的函数。在包含和不包含所有列的情况下运行
UPDATE
并检查是否有任何差异。或者使用 pgAdmin。选择您的表格并检查主窗口中的“统计”选项卡。
请注意,只有当主关系分支的同一页面上有新元组版本的空间时,才可能进行热更新。强制该条件的一种简单方法是使用仅包含几行的小表进行测试。页面大小通常为 8k,因此页面上必须有可用空间。
Due to PostgreSQL MVCC, an
UPDATE
is effectively much like aDELETE
plus anINSERT
. With the notable exception of toasted values - see:(And minor differences for heap only tuples -
DELETE
+INSERT
starts a new HOT chain - but that has no bearing on the case at hand.)To be precise, the "deleted" row is just invisible to any transaction starting after the delete has been committed, and vacuumed later. Therefore, on the database side, including index manipulation, there is in effect no difference between the two statements. (Exceptions apply, keep reading.) It increases network traffic a bit (depending on your data) and needs a bit of parsing.
I studied HOT updates some more after @araqnid's input and ran some tests. Updates on columns that don't actually change the value make no difference whatsoever as far as HOT updates are concerned. My answer holds. See details below.
This also applies to toasted attributes, since those are also not touched unless the values actually change.
However, if you use per-column triggers (introduced with pg 9.0), this may have undesired side effects!
I quote the manual on triggers:
Bold emphasis mine.
Abstraction layers are for convenience. They are useful for SQL-illiterate developers or if the application needs to be portable between different RDBMS. On the downside, they can butcher performance and introduce additional points of failure. I avoid them wherever possible.
HOT (Heap-only tuple) updates
Heap-Only Tuples were introduced with Postgres 8.3, with important improvements in 8.3.4 and 8.4.9.
The release notes for Postgres 8.3:
Emphasis mine. And "no changes" includes cases where columns are updated with the same value as they already hold. I actually tested, as I wasn't sure.
Ultimately, the extensive README.HOT in the source code confirms it.
Toasted columns also don't stand in the way of HOT updates. The HOT-updated tuple just links to the same, unchanged tuple(s) in the toast fork of the relation. HOT updates even work with toasted values in the target list (actually changed or not). If toasted values are changed, it entails writes to the toast relation fork, obviously. I tested all of that, too.
Don't take my word for it, see for yourself. Postgres provides a couple of functions to check statistics. Run your
UPDATE
with and without all columns and check if it makes any difference.Or use pgAdmin. Select your table and inspect the "Statistics" tab in the main window.
Be aware that HOT updates are only possible when there is room for the new tuple version on the same page of the main relation fork. One simple way to force that condition is to test with a small table that holds only a few rows. Page size is typically 8k, so there must be free space on the page.
您可以使用 hibernate 注释@Entity:
这将仅更新更改的字段。
You can use hibernate annotation @Entity:
This will update only the changed fields.