update语句中的冗余数据

发布于 2024-12-10 19:06:29 字数 703 浏览 2 评论 0原文

Hibernate 生成 UPDATE 语句,其中包括所有列,无论我是否更改该列中的值,例如:

tx.begin();
Item i = em.find(Item.class, 12345);
i.setA("a-value");
tx.commit();

发出此 UPDATE 语句:

update Item set A = $1, B = $2, C = $3, D = $4 where id = $5

所以列 BCD已更新,而我没有更改它们。

假设,项目经常更新,并且所有列都已建立索引。 将 Hibernate 部分优化为这样有意义吗?

tx.begin();
em.createQuery("update Item i set i.a = :a where i.id = :id")
    .setParameter("a", "a-value")
    .setParameter("id", 12345)
    .executeUpdate();
tx.commit();

最让我困惑的是“未优化”和“优化”查询版本的 EXPLAIN 计划是相同的!

Hibernate generates UPDATE statements, which include all columns, regardless of whether I'm changing the value in that columns, eg:

tx.begin();
Item i = em.find(Item.class, 12345);
i.setA("a-value");
tx.commit();

issues this UPDATE statement:

update Item set A = $1, B = $2, C = $3, D = $4 where id = $5

So columns B, C, D are updated, while I didn't change them.

Say, Items are updated frequently and all columns are indexed.
Does it make sense to optimize the Hibernate part to something like this?

tx.begin();
em.createQuery("update Item i set i.a = :a where i.id = :id")
    .setParameter("a", "a-value")
    .setParameter("id", 12345)
    .executeUpdate();
tx.commit();

What confuses me most is that the EXPLAIN plans of the 'unoptimized' and the 'optimized' query version are identical!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

北城孤痞 2024-12-17 19:06:29

由于PostgreSQL MVCCUPDATE实际上是就像DELETE加上INSERT。除了烤值的显着例外 - 请参阅:

(仅堆元组的细微差别 - DELETE + INSERT 启动一个新的 HOT 链 - 但这与当前的情况无关。)

准确地说,“已删除”行对于提交删除后开始的任何事务都是不可见的,并且稍后吸尘。因此,在数据库方面,包括索引操作,这两个语句实际上没有区别。 (例外情况适用,请继续阅读。)它会增加一点网络流量(取决于您的数据)并且需要一些解析。

在 @araqnid 的输入之后,我进一步研究了 HOT 更新并进行了一些测试。就热更新而言,实际上不更改值的列更新没有任何区别。我的答案成立。请参阅下面的详细信息。

这也适用于 toasted 属性,因为除非值实际更改,否则这些属性也不会被触及。

但是,如果您使用每列触发器(9.0 版中引入),这可能会产生不良副作用!

我引用触发器手册

...诸如UPDATE ... SET x = x ...之类的命令将触发触发器
x即使列的值没有更改

大胆强调我的。

抽象层是为了方便。它们对于不懂 SQL 的开发人员或应用程序需要在不同 RDBMS 之间移植时非常有用。不利的一面是,它们可能会降低性能并引入额外的故障点。我尽可能避开他们。

热门(仅堆元组)更新

仅堆元组是通过 引入的Postgres 8.3,在 8.3.48.4 .9.
Postgres 8.3 发行说明:

UPDATEDELETE 会留下死元组,就像失败的 INSERT 一样。
以前只有 VACUUM 可以回收死元组占用的空间。和
HOT死元组空间可以在发生时自动回收
INSERTUPDATE 如果未对索引列进行任何更改。这
可以实现更一致的性能。此外,HOT 避免添加
重复的索引条目。

强调我的。 “无更改”包括列被更新为其已保存的相同值的情况。我实际测试过,因为我不确定。

最终,广泛的 源代码中的README.HOT证实了这一点。

烘烤的列也不会妨碍热门更新。热更新的元组仅链接到关系的 toast fork 中相同的、未更改的元组。热更新甚至可以使用目标列表中的烘烤值(实际更改或未更改)。显然,如果 toast 值发生更改,则需要写入 toast 关系分支。我也测试了所有这些。

别相信我的话,你自己看看。 Postgres 提供了一些检查统计信息的函数。在包含和不包含所有列的情况下运行 UPDATE 并检查是否有任何差异。

-- Number of rows HOT-updated in table:
SELECT pg_stat_get_tuples_hot_updated('table_name'::regclass::oid)

-- Number of rows HOT-updated in table, in the current transaction:
SELECT pg_stat_get_xact_tuples_hot_updated('table_name'::regclass::oid)

或者使用 pgAdmin。选择您的表格并检查主窗口中的“统计”选项卡。

请注意,只有当主关系分支的同一页面上有新元组版本的空间时,才可能进行热更新。强制该条件的一种简单方法是使用仅包含几行的小表进行测试。页面大小通常为 8k,因此页面上必须有可用空间。

Due to PostgreSQL MVCC, an UPDATE is effectively much like a DELETE plus an INSERT. With the notable exception of toasted values - see:

(And minor differences for heap only tuples - DELETE + INSERT starts a new HOT chain - but that has no bearing on the case at hand.)

To be precise, the "deleted" row is just invisible to any transaction starting after the delete has been committed, and vacuumed later. Therefore, on the database side, including index manipulation, there is in effect no difference between the two statements. (Exceptions apply, keep reading.) It increases network traffic a bit (depending on your data) and needs a bit of parsing.

I studied HOT updates some more after @araqnid's input and ran some tests. Updates on columns that don't actually change the value make no difference whatsoever as far as HOT updates are concerned. My answer holds. See details below.

This also applies to toasted attributes, since those are also not touched unless the values actually change.

However, if you use per-column triggers (introduced with pg 9.0), this may have undesired side effects!

I quote the manual on triggers:

... a command such as UPDATE ... SET x = x ... will fire a trigger on
column x, even though the column's value did not change.

Bold emphasis mine.

Abstraction layers are for convenience. They are useful for SQL-illiterate developers or if the application needs to be portable between different RDBMS. On the downside, they can butcher performance and introduce additional points of failure. I avoid them wherever possible.

HOT (Heap-only tuple) updates

Heap-Only Tuples were introduced with Postgres 8.3, with important improvements in 8.3.4 and 8.4.9.
The release notes for Postgres 8.3:

UPDATEs and DELETEs leave dead tuples behind, as do failed INSERTs.
Previously only VACUUM could reclaim space taken by dead tuples. With
HOT dead tuple space can be automatically reclaimed at the time of
INSERT or UPDATE if no changes are made to indexed columns. This
allows for more consistent performance. Also, HOT avoids adding
duplicate index entries.

Emphasis mine. And "no changes" includes cases where columns are updated with the same value as they already hold. I actually tested, as I wasn't sure.

Ultimately, the extensive README.HOT in the source code confirms it.

Toasted columns also don't stand in the way of HOT updates. The HOT-updated tuple just links to the same, unchanged tuple(s) in the toast fork of the relation. HOT updates even work with toasted values in the target list (actually changed or not). If toasted values are changed, it entails writes to the toast relation fork, obviously. I tested all of that, too.

Don't take my word for it, see for yourself. Postgres provides a couple of functions to check statistics. Run your UPDATE with and without all columns and check if it makes any difference.

-- Number of rows HOT-updated in table:
SELECT pg_stat_get_tuples_hot_updated('table_name'::regclass::oid)

-- Number of rows HOT-updated in table, in the current transaction:
SELECT pg_stat_get_xact_tuples_hot_updated('table_name'::regclass::oid)

Or use pgAdmin. Select your table and inspect the "Statistics" tab in the main window.

Be aware that HOT updates are only possible when there is room for the new tuple version on the same page of the main relation fork. One simple way to force that condition is to test with a small table that holds only a few rows. Page size is typically 8k, so there must be free space on the page.

淡淡绿茶香 2024-12-17 19:06:29

您可以使用 hibernate 注释@Entity:

@org.hibernate.annotations.Entity(dynamicUpdate = true)
public class Item

这将仅更新更改的字段。

You can use hibernate annotation @Entity:

@org.hibernate.annotations.Entity(dynamicUpdate = true)
public class Item

This will update only the changed fields.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文