How to reduce the number of WAL files generated in PostgreSQL
A huge pile of WAL files is generated in our master-standby replication setup.
The WAL files are archived on one of the standby nodes, and every 2 hours we use tar
to compress the archived WALs on that standby. Even so, they take up a huge amount of storage.
When it comes to keeping 30 or 90 days of backups, it becomes a serious storage issue, and it also ends up taking more time to download and replay the WALs during restoration.
I have used the options below.
wal_level=replica
wal_compression=on
archive_mode = always
And the parameters below are commented out / not set.
archive_timeout
checkpoint_timeout
Is there any other way we can reduce the number of WALs generated, or an easier way to manage them? pg_waldump
is showing that around 70-90% of the data is full-page images.
Also, can I put the above parameters into effect by changing them on the standby node?
Is the standby archiving the same WALs as sent by the master, or is it regenerating them based on the standby's configuration?
-- Update: modified to the values below
name | setting | unit
--------------------+---------+------
archive_timeout | 0 | s
checkpoint_timeout | 3600 | s
checkpoint_warning | 3600 | s
max_wal_size | 4000 | MB
min_wal_size | 2000 | MB
shared_buffers | 458752 | 8kB
wal_buffers | 4096 | 8kB
wal_compression | on |
wal_level | replica |
Still seeing 3-4 WAL files generated every minute.
I am making these changes on the hot standby node (the one the backup is taken from). Should I change them on the master instead? Do the master's settings have any effect on the standby's WAL generation?
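For reference, a minimal sketch, assuming psql access on each node, to check whether a node is currently a standby and how the relevant settings take effect there:
-- Returns true on a standby that is replaying WAL, false on the master.
SELECT pg_is_in_recovery() AS is_standby;
-- "context" shows how a change is applied (sighup = reload is enough, postmaster = restart needed).
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('checkpoint_timeout', 'max_wal_size', 'wal_compression');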
Example pg_waldump showing FPI size=87%
pg_waldump --stats 0000000100000498000000B2
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
XLOG 1 ( 0.00) 114 ( 0.01) 0 ( 0.00) 114 ( 0.00)
Transaction 3070 ( 10.35) 104380 ( 4.86) 0 ( 0.00) 104380 ( 0.63)
Storage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Database 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Tablespace 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
MultiXact 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
RelMap 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Standby 2 ( 0.01) 100 ( 0.00) 0 ( 0.00) 100 ( 0.00)
Heap2 590 ( 1.99) 33863 ( 1.58) 46192 ( 0.32) 80055 ( 0.48)
Heap 6679 ( 22.51) 578232 ( 26.92) 4482508 ( 30.92) 5060740 ( 30.41)
Btree 19330 ( 65.14) 1430918 ( 66.62) 9967524 ( 68.76) 11398442 ( 68.48)
Hash 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Sequence 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
SPGist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
BRIN 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CommitTs 4 ( 0.01) 120 ( 0.01) 0 ( 0.00) 120 ( 0.00)
ReplicationOrigin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Generic 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
LogicalMessage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
-------- -------- -------- --------
Total 29676 2147727 [12.90%] 14496224 [87.10%] 16643951 [100%]
After using log_checkpoints=on
2022-06-15 07:08:57 UTC [11] LOG: checkpoint starting: time
2022-06-15 07:29:57 UTC [11] LOG: checkpoint complete: wrote 67010 buffers (14.6%); 0 WAL file(s) added, 12 removed, 56 recycled; write=1259.767 s, sync=0.010 s, total=1259.961 s; sync files=253, longest=0.003 s, average=0.001 s; distance=1125728 kB, estimate=2176006 kB
2022-06-15 07:38:57 UTC [11] LOG: checkpoint starting: time
2022-06-15 07:59:57 UTC [11] LOG: checkpoint complete: wrote 61886 buffers (13.5%); 0 WAL file(s) added, 20 removed, 10 recycled; write=1259.740 s, sync=0.005 s, total=1259.878 s; sync files=185, longest=0.002 s, average=0.001 s; distance=491822 kB, estimate=2007588 kB
Comments (3)
This may be counter-productive. This type of compression has to compress each WAL record in isolation, without the larger context, so it is not very effective. Worse, when you then recompress whole WAL files offline, where the compressor does have access to the larger context, the first round of attempted compression interferes with the better-situated compression attempt.
For example, if I take the WAL from 1,000,000 pgbench transactions, they occupy 889192448 raw bytes without wal_compression, and 637534208 with it.
But then after passing them through 'xz' (a very slow but very thorough compressor), the first set takes 129393020 bytes but the 2nd one takes 155769400. So turning on compression too soon cost me 20% more space.
You could use
pg_waldump --stat ...
on some WAL files to see what is actually in them. If it is mostly FPI, then you could try spacing the checkpoints further apart to reduce the FPI frequency. But if you don't have much FPI to start with, that would be ineffective. If you can isolate what is causing so much WAL, maybe you can do something about it. For example, if you do a lot of degenerate updates, where a column is set to the same value it already had, adding a WHERE clause to suppress those cases could spare you a lot of WAL generation.
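As a hedged illustration of that last point (the table and column names accounts / status are hypothetical):
-- Every matched row is rewritten and WAL-logged, even when status already holds the value:
UPDATE accounts SET status = 'active';
-- Skipping rows that already hold the target value avoids those writes and their WAL:
UPDATE accounts SET status = 'active' WHERE status IS DISTINCT FROM 'active';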
The WAL being generated is a reflection of your primary machine's activity. Increasing checkpoint_timeout will help reduce your overall machine activity, making it easier to process the WAL logs.
Standby archiving processes the logs as sent by the primary; they are binary-identical. Is it a cold standby, or are you processing the logs on the standby as they are sent?
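A rough way to see how often checkpoints are actually firing, as a sketch (on PostgreSQL 17 and later these counters moved to pg_stat_checkpointer):
-- checkpoints_timed vs checkpoints_req since stats_reset; frequent timed checkpoints mean more full-page images.
SELECT checkpoints_timed, checkpoints_req, stats_reset FROM pg_stat_bgwriter;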
Since a high percentage of your WAL consists of full page images, you can reduce the amount of WAL considerably by having checkpoints less often. A full page image is written to WAL whenever a page becomes dirty for the first time after a checkpoint. The price you have to pay is a longer crash recovery time.
To reduce the rate of checkpoints, change these parameters:
checkpoint_timeout (default 5 minutes): set it to something high, like 1 hour
max_wal_size (default 1GB): set it higher than the amount of WAL that is written within one hour, so that it matches the checkpoint_timeout setting
These settings have to be made on the primary server, where WAL is generated, not on the standby. Best practice is to use the same settings on both servers.
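A minimal sketch of applying this on the primary (the max_wal_size figure is illustrative; size it from your observed hourly WAL volume):
ALTER SYSTEM SET checkpoint_timeout = '1h';
ALTER SYSTEM SET max_wal_size = '20GB';
SELECT pg_reload_conf();  -- both parameters take effect on a reload, no restart needed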