How to reduce the number of WAL files generated in PostgreSQL
A huge pile of WAL files is generated in our master-standby replication setup.
The WAL files are archived on one of the standby nodes, and every 2 hours we use tar
to compress the archived WALs on that standby. Even so, they take up a huge amount of storage.
When it comes to keeping 30 or 90 days of backups, it becomes a serious storage issue, and it also ends up taking more time to download and replay the WALs during restoration.
I have used the options below.
wal_level=replica
wal_compression=on
archive_mode = always
And the parameters below are commented out / not set.
archive_timeout
checkpoint_timeout
Is there any other way we can reduce the number of WALs generated, or an easier way to manage them? pg_waldump
is showing that around 70-90% of the data is full-page images.
Also, can I put the above parameters into effect by changing them on the standby node?
Is the standby archiving the same WALs as sent by the master, or is it regenerating them based on the standby's configuration?
-- Update: modified to the values below
name | setting | unit
--------------------+---------+------
archive_timeout | 0 | s
checkpoint_timeout | 3600 | s
checkpoint_warning | 3600 | s
max_wal_size | 4000 | MB
min_wal_size | 2000 | MB
shared_buffers | 458752 | 8kB
wal_buffers | 4096 | 8kB
wal_compression | on |
wal_level | replica |
Still seeing 3-4 WAL files generated every minute.
I am making these changes on the hot standby node (the one the backup is taken from). Should I change them on the master instead? Do the master's settings have any effect on the standby's WAL generation?
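For reference, a minimal sketch, assuming psql access on each node, to check whether a node is currently a standby and how the relevant settings take effect there:
-- Returns true on a standby that is replaying WAL, false on the master.
SELECT pg_is_in_recovery() AS is_standby;
-- "context" shows how a change is applied (sighup = reload is enough, postmaster = restart needed).
SELECT name, setting, unit, context
FROM pg_settings
WHERE name IN ('checkpoint_timeout', 'max_wal_size', 'wal_compression');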
Example pg_waldump showing FPI size=87%
pg_waldump --stats 0000000100000498000000B2
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
XLOG 1 ( 0.00) 114 ( 0.01) 0 ( 0.00) 114 ( 0.00)
Transaction 3070 ( 10.35) 104380 ( 4.86) 0 ( 0.00) 104380 ( 0.63)
Storage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Database 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Tablespace 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
MultiXact 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
RelMap 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Standby 2 ( 0.01) 100 ( 0.00) 0 ( 0.00) 100 ( 0.00)
Heap2 590 ( 1.99) 33863 ( 1.58) 46192 ( 0.32) 80055 ( 0.48)
Heap 6679 ( 22.51) 578232 ( 26.92) 4482508 ( 30.92) 5060740 ( 30.41)
Btree 19330 ( 65.14) 1430918 ( 66.62) 9967524 ( 68.76) 11398442 ( 68.48)
Hash 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Sequence 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
SPGist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
BRIN 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CommitTs 4 ( 0.01) 120 ( 0.01) 0 ( 0.00) 120 ( 0.00)
ReplicationOrigin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Generic 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
LogicalMessage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
-------- -------- -------- --------
Total 29676 2147727 [12.90%] 14496224 [87.10%] 16643951 [100%]
After using log_checkpoints=on
2022-06-15 07:08:57 UTC [11] LOG: checkpoint starting: time
2022-06-15 07:29:57 UTC [11] LOG: checkpoint complete: wrote 67010 buffers (14.6%); 0 WAL file(s) added, 12 removed, 56 recycled; write=1259.767 s, sync=0.010 s, total=1259.961 s; sync files=253, longest=0.003 s, average=0.001 s; distance=1125728 kB, estimate=2176006 kB
2022-06-15 07:38:57 UTC [11] LOG: checkpoint starting: time
2022-06-15 07:59:57 UTC [11] LOG: checkpoint complete: wrote 61886 buffers (13.5%); 0 WAL file(s) added, 20 removed, 10 recycled; write=1259.740 s, sync=0.005 s, total=1259.878 s; sync files=185, longest=0.002 s, average=0.001 s; distance=491822 kB, estimate=2007588 kB
Comments (3)
This may be counter-productive. This type of compression has to compress each WAL record in isolation, without the larger context, so it is not very effective. Worse, when you then recompress whole WAL files offline, where the compressor does have access to the larger context, the first round of attempted compression interferes with the better-situated compression attempt.
For example, if I take the WAL from 1,000,000 pgbench transactions, they occupy 889192448 raw bytes without wal_compression, and 637534208 with it.
But then after passing them through 'xz' (a very slow but very thorough compressor), the first set takes 129393020 bytes but the 2nd one takes 155769400. So turning on compression too soon cost me 20% more space.
You could use
pg_waldump --stat ...
on some WAL files to see what is actually in them. If it is mostly FPI, then you could try spacing the checkpoints further apart to reduce the FPI frequency. But if you don't have much FPI to start with, that would be ineffective. If you can isolate what is causing so much WAL, maybe you can do something about it. For example, if you do a lot of degenerate updates, where a column is set to the same value it already had, adding a WHERE clause to suppress those cases could spare you a lot of WAL generation.
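As a hedged illustration of that last point (the table and column names accounts / status are hypothetical):
-- Every matched row is rewritten and WAL-logged, even when status already holds the value:
UPDATE accounts SET status = 'active';
-- Skipping rows that already hold the target value avoids those writes and their WAL:
UPDATE accounts SET status = 'active' WHERE status IS DISTINCT FROM 'active';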
The WAL being generated is a reflection of your primary machine's activity. Increasing checkpoint_timeout will help reduce your overall machine activity, making it easier to process the WAL logs.
Standby archiving processes the logs as sent by the primary; they are binary-identical. Is it a cold standby, or are you processing the logs on the standby as they are sent?
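A rough way to see how often checkpoints are actually firing, as a sketch (on PostgreSQL 17 and later these counters moved to pg_stat_checkpointer):
-- checkpoints_timed vs checkpoints_req since stats_reset; frequent timed checkpoints mean more full-page images.
SELECT checkpoints_timed, checkpoints_req, stats_reset FROM pg_stat_bgwriter;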
Since a high percentage of your WAL consists of full page images, you can reduce the amount of WAL considerably by having checkpoints less often. A full page image is written to WAL whenever a page becomes dirty for the first time after a checkpoint. The price you have to pay is a longer crash recovery time.
To reduce the rate of checkpoints, change these parameters:
checkpoint_timeout (default 5 minutes): set it to something high, like 1 hour
max_wal_size (default 1GB): set it higher than the amount of WAL that is written within one hour, so that it matches the checkpoint_timeout setting
These settings have to be made on the primary server, where WAL is generated, not on the standby. Best practice is to use the same settings on both servers.
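A minimal sketch of applying this on the primary (the max_wal_size figure is illustrative; size it from your observed hourly WAL volume):
ALTER SYSTEM SET checkpoint_timeout = '1h';
ALTER SYSTEM SET max_wal_size = '20GB';
SELECT pg_reload_conf();  -- both parameters take effect on a reload, no restart needed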