How do I change a default value on a large Postgres table without it taking a very long time?
On a Postgres-backed Rails application running on Heroku, I am trying to add a boolean column with the default set to false and null set to false (NOT NULL).
The migration takes hours upon hours to run (there are actually about 15 or so such fields that need to be added). Memory usage goes through the roof (5-6 GB or more). I think this is because Postgres is wrapping the migration in a transaction and trying to load the whole table into memory so it can roll back on failure.
Without the field default or the NOT NULL, the column is added in just a moment.
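In other words, the difference is between these two statements (table and column names here are made up for illustration). Before PostgreSQL 11, the second form rewrote every row of the table, which is why it is so much slower; on 11 and later, both are catalog-only changes:

    -- Fast: catalog-only, no table rewrite.
    ALTER TABLE users ADD COLUMN enabled boolean;

    -- Slow on older Postgres: the non-null DEFAULT forces a full rewrite.
    ALTER TABLE users ADD COLUMN enabled boolean NOT NULL DEFAULT false;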
As an interim solution, I am adding the new column without any restrictions, updating all of the records to the false default value in batches of 1000 records, and then finally altering the column to add the default and NOT NULL requirements.
Now, the migration takes about 3-4 hours to run and uses only about 100 MB of RAM.
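Sketched in SQL, that interim approach looks roughly like this (same hypothetical names as above; in the actual migration the batch loop is driven from Ruby):

    -- 1. Add the bare column: instant, no table rewrite.
    ALTER TABLE users ADD COLUMN enabled boolean;

    -- 2. Backfill in batches of 1000, repeating until no rows are updated,
    --    so each transaction stays short and memory use stays flat.
    UPDATE users
       SET enabled = false
     WHERE id IN (SELECT id FROM users WHERE enabled IS NULL LIMIT 1000);

    -- 3. Once every row is populated, add the constraints. SET NOT NULL
    --    still scans the table to validate, but it does not rewrite it.
    ALTER TABLE users ALTER COLUMN enabled SET DEFAULT false;
    ALTER TABLE users ALTER COLUMN enabled SET NOT NULL;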
Is there any better solution out there?
2 Answers
From your timing, I expect that you perform your 'upgrade' via RoR, which possibly means 'row at a time' processing. I just tested on a 300K-row table with a primary key and 3 foreign keys, and adding a boolean costs about 5 seconds (too short for me to benchmark exactly). The query:
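Presumably a single ALTER TABLE along these lines (the table and column names are made up; at 300K rows, even the table-rewriting form finishes in seconds):

    ALTER TABLE big_table
      ADD COLUMN flag boolean NOT NULL DEFAULT false;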
Which suggests that your framework produces sub-optimal code. To say the least.
Check out large-hadron-migrator. It claims to be an implementation of a scheme for doing large-scale single-table migration. It's a lot more complicated than just a migration because it uses a couple of background tables and a set of triggers to do the migration either entirely online or with minimal downtime.
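For a feel of the moving parts, here is a rough sketch of the shadow-table-plus-triggers pattern in SQL. This is not LHM's actual generated code; the names are hypothetical, and a real tool also batches the backfill and handles races between the backfill, the triggers, and the final rename far more carefully:

    -- 1. Create a shadow table with the new schema.
    CREATE TABLE users_new (LIKE users INCLUDING ALL);
    ALTER TABLE users_new
      ADD COLUMN enabled boolean NOT NULL DEFAULT false;

    -- 2. Mirror live writes on the old table into the shadow table.
    CREATE FUNCTION mirror_users() RETURNS trigger AS $$
    BEGIN
      IF TG_OP = 'INSERT' THEN
        INSERT INTO users_new SELECT NEW.*;   -- enabled takes its default
      ELSIF TG_OP = 'UPDATE' THEN
        DELETE FROM users_new WHERE id = OLD.id;
        INSERT INTO users_new SELECT NEW.*;
      ELSE                                    -- DELETE
        DELETE FROM users_new WHERE id = OLD.id;
      END IF;
      RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER users_mirror
      AFTER INSERT OR UPDATE OR DELETE ON users
      FOR EACH ROW EXECUTE PROCEDURE mirror_users();

    -- 3. Backfill rows that have not been mirrored yet, then swap the
    --    tables in one short transaction.
    INSERT INTO users_new
      SELECT u.* FROM users u
      WHERE NOT EXISTS (SELECT 1 FROM users_new n WHERE n.id = u.id);

    BEGIN;
    ALTER TABLE users RENAME TO users_old;
    ALTER TABLE users_new RENAME TO users;
    COMMIT;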