Why is pg_restore so slow and why is PostgreSQL using almost no CPU?

Published 2025-01-13 02:36:48

I just used pg_restore with a small 30 MB dump and it took five minutes on average! On my colleagues' computers it is ultra fast, a dozen seconds or so. The difference between the two is CPU usage: for the others, the database uses quite a lot of CPU (60-70%) during the restore operation, while on my machine it stays at only a few percent (0-3%), as if it were not active at all.

The exact command was: pg_restore -h 127.0.0.1 --username XXX --dbname test --no-comments test_dump.sql

The originating command to produce this dump was: pg_dump --dbname=XXX --user=XXX --no-owner --no-privileges --verbose --format=custom --file=/sql/test_dump.sql

Look at the screenshot taken in the middle of the restore operation:

Screenshot during the restore

Here is the corresponding vmstat 1 result running the command:

Screenshot of vmstat

I've searched the web for a solution for a few hours, but this under-use of the CPU remains quite mysterious. Any ideas would be appreciated.

For the stack, I am on Ubuntu 20.04, and postgres version 13.6 is running in a Docker container. My hardware is decent, neither bad nor great.

EDIT: This very same command used to work on my machine with the same ordinary HDD, but now it is terribly slow. From my point of view, the only difference between me and the others (for whom it is blazing fast) was really the CPU usage (even if they have SSDs, that shouldn't be the limiting factor at all, especially with a 30 MB dump).

EDIT 2: For those who proposed that the problem was I/O-boundness and maybe a slow disk, I tried, without much conviction, running my command on an SSD partition I had just created, and nothing changed.
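One quick way to test the commit-flush theory directly (a hypothetical diagnostic, not from the original thread; it assumes the PostgreSQL server binaries are installed and the path is only an example) is pg_test_fsync, which reports how many synchronous flushes per second the disk can sustain:

```shell
# pg_test_fsync ships with the PostgreSQL server packages.
# Point it at a file on the same disk as the data directory
# (path below is illustrative); -s sets seconds per test.
pg_test_fsync -f /var/lib/postgresql/data/fsync.test -s 5
```

On a disk whose write cache is flushed on every commit, this can report only tens of operations per second, which would match a restore that crawls while the CPU stays nearly idle.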


Comments (2)

若水微香 2025-01-20 02:36:48

The vmstat output shows that you are I/O bound. Get faster storage, and performance will improve.
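The behavior this answer describes can be reproduced outside PostgreSQL with plain GNU dd (a rough sketch; the file paths and sizes are arbitrary): oflag=dsync forces a flush after every block, mimicking what fsync=on does at each commit:

```shell
# Buffered writes: the kernel caches them, so this is fast on any disk.
dd if=/dev/zero of=/tmp/buffered.img bs=8k count=500 2>&1 | tail -n 1

# Flushed writes: oflag=dsync waits for each 8 kB block to reach the disk,
# the same pattern as one fsync per committed transaction.
dd if=/dev/zero of=/tmp/synced.img bs=8k count=500 oflag=dsync 2>&1 | tail -n 1

rm -f /tmp/buffered.img /tmp/synced.img
```

If the second command is orders of magnitude slower on your HDD than on your colleagues' SSDs, the restore is bound by commit flushes, not by the CPU, which is exactly what a near-zero CPU reading during pg_restore suggests.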

毁梦 2025-01-20 02:36:48

PostgreSQL, by default, is tuned for data durability. Usually, transactions are flushed to disk at each and every commit, forcing a write-through of any disk write cache, so a restore can appear very I/O bound.

When restoring a database from a dump file, it may make sense to lower these durability settings, especially if the restore is done while your application is offline, and particularly in non-production environments.

I temporarily run postgres with these options: -c fsync=off -c synchronous_commit=off -c full_page_writes=off -c checkpoint_flush_after=256 -c autovacuum=off -c max_wal_senders=0

Refer to the PostgreSQL documentation on non-durable settings for more information.

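Since the question runs Postgres in Docker, the options above can be passed straight to the container (a sketch; the container name, password, and port mapping are assumptions, not from the original thread — the official postgres image forwards extra arguments to the server process):

```shell
# Start a throwaway Postgres 13.6 container with durability relaxed.
# WARNING: fsync=off risks data loss on crash; restore targets only.
docker run --rm --name pg-restore-test \
  -e POSTGRES_PASSWORD=secret \
  -p 127.0.0.1:5432:5432 \
  postgres:13.6 \
  -c fsync=off -c synchronous_commit=off -c full_page_writes=off \
  -c checkpoint_flush_after=256 -c autovacuum=off -c max_wal_senders=0
```

An alternative that keeps durability intact is `pg_restore --single-transaction`, which wraps the entire restore in one transaction and so pays the commit flush only once; note that it cannot be combined with parallel `-j` jobs.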