How can I avoid database storage fragmentation caused by frequent updates?

Posted 2024-10-21 14:21:52

When I have the following table:

CREATE TABLE test
(
  "id" integer NOT NULL,
  "myval" text NOT NULL,
  CONSTRAINT "test-id-pkey" PRIMARY KEY ("id")
);

When doing a lot of queries like the following:

UPDATE "test" SET "myval" = "myval" || 'foobar' WHERE "id" = 12345;

Then the myval value of that row will get larger and larger over time.
What will PostgreSQL do? Where will it get the space from?

Can I prevent PostgreSQL from needing more than one seek to read a particular myval value?

Will PostgreSQL do this automatically?

I know that normally I should try to normalize the data much more. But I need to read the value with one seek. Myval will grow by about 20 bytes with each update (each update appends data). Some rows will get 1-2 updates, some 1000 updates.
Normally I would just insert a new row instead of updating an existing one. But then selecting gets slower and slower.
So I came up with the idea of denormalizing.


Comments (2)

戴着白色围巾的女孩 2024-10-28 14:21:53

This is related to another question about TEXT in PostgreSQL, or at least the answer is similar. PostgreSQL stores large column values away from the main table storage:

Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values.

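So you can expect a large TEXT (or BYTEA or large VARCHAR) value to be stored away from the main table. PostgreSQL calls this mechanism TOAST; it applies once a row grows past roughly 2 kB, while shorter values stay inline in the main table. For a value that has been moved out of line, something like SELECT id, myval FROM test WHERE id = 12345 will take two seeks to pull both columns off the disk (and more seeks to resolve their locations).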
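To check whether a particular myval is actually stored out of line, a couple of catalog queries can help. This is only a minimal sketch, assuming the test table from the question and a row with id 12345. pg_column_size, octet_length, pg_relation_size and pg_class.reltoastrelid are standard PostgreSQL; the column aliases are illustrative names.

```sql
-- Compare the logical length of the value with the bytes it occupies on disk
-- (the stored size can be smaller if the value was compressed).
SELECT id,
       octet_length(myval)   AS raw_bytes,
       pg_column_size(myval) AS stored_bytes
FROM test
WHERE id = 12345;

-- Size of the main heap versus the table's TOAST relation.
-- toast_bytes only grows once values have been moved out of line.
SELECT c.reltoastrelid::regclass                    AS toast_table,
       pg_relation_size(c.oid::regclass)            AS heap_bytes,
       pg_relation_size(c.reltoastrelid::regclass)  AS toast_bytes
FROM pg_class c
WHERE c.oid = 'test'::regclass;
```

If toast_bytes stays at zero while the rows keep growing, the values are still being kept inline and a single-row read does not have to visit the TOAST relation.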

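Every UPDATE writes a new version of the row and leaves the old one behind as a dead tuple until VACUUM reclaims it. If your UPDATEs really are causing your SELECTs to slow down, then perhaps you need to review your vacuuming strategy.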
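A rough way to review the vacuuming situation is to look at the statistics PostgreSQL already collects for the table. The sketch below assumes the test table from the question; the 0.02 value is a hypothetical starting point for experiments, not a recommendation.

```sql
-- Dead row versions left behind by UPDATEs, and when vacuum last ran.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'test';

-- Hypothetical per-table tuning: trigger autovacuum once roughly 2% of
-- the rows have changed (the server-wide default scale factor is 0.2).
ALTER TABLE test SET (autovacuum_vacuum_scale_factor = 0.02);
```

If n_dead_tup stays high relative to n_live_tup and last_autovacuum is old or empty, vacuum is probably not keeping up with the update rate.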

变身佩奇 2024-10-28 14:21:52

Change the FILLFACTOR of the table to leave free space in each page for future updates. With that free space available, the updates can also be HOT (heap-only tuple) updates, because the text column is not indexed. HOT updates are faster and lower the autovacuum overhead, because they allow dead row versions to be cleaned up within a single page (a "microvacuum"). The documentation for the CREATE TABLE statement has some information about FILLFACTOR.

ALTER TABLE test SET (fillfactor = 70);
-- rebuild the table so existing pages are repacked with the new fillfactor
-- (note: VACUUM FULL holds an exclusive lock on the table while it runs):
VACUUM FULL ANALYZE test;
-- start testing

The value 70 is not necessarily the right setting; it depends on your situation. Maybe you're fine with 90, or it could be 40 or something else.
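One way to compare fillfactor settings during testing is to watch how many updates end up as HOT updates. This is a sketch under the assumption that the table is the test table from the question; hot_pct is an illustrative alias, and the counters are cumulative since statistics were last reset.

```sql
-- Confirm the storage parameter is set on the table.
SELECT relname, reloptions
FROM pg_class
WHERE oid = 'test'::regclass;

-- Share of updates that were HOT updates.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE relname = 'test';
```

A higher hot_pct after lowering the fillfactor suggests that updated rows are finding room in their original page; if it barely moves, the rows may be growing faster than the reserved space can absorb.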
