大量 postgres 删除后的可用空间
我有一个 900 万行的表。我发现其中很大一部分(大约 90%)可以被释放。清理后需要采取哪些措施?真空、重新索引等
I have a 9 million row table. I figured out that a large amount of it (around 90%) can be freed up. What actions are needed after the cleanup? Vacuum, reindex etc.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您想释放文件系统上的空间,
VACUUM FULL
或CLUSTER
可以帮助您。您可能还想在这些之后运行ANALYZE
,以确保规划器具有最新的统计信息,但这不是特别需要的。重要的是要注意使用
VACUUM FULL
放置ACCESS EXCLUSIVE
对表进行锁定(阻止任何操作、写入和读取),因此您可能希望在此期间使应用程序脱机。在 PostgreSQL 8.2 及更早版本中,
VACUUM FULL
可能是您最好的选择。在 PostgreSQL 8.3 和 8.4 中,
CLUSTER
命令得到了显着改进,因此不建议使用 VACUUM FULL - 它很慢并且会使索引膨胀。 `CLUSTER 将从头开始重新创建索引,并且不会出现膨胀。根据我的经验,它通常也快得多。 CLUSTER 还将使用索引对整个物理表进行排序,因此您必须选择一个索引。如果您不知道哪个,主键就可以正常工作。在 PostgreSQL 9.0 中,VACUUM FULL 被更改为像 CLUSTER 一样工作,因此两者都很好。
很难做出预测,但在具有商用硬件且经过适当调整的服务器上,900 万行应该不会超过 20 分钟。
If you want to free up space on the file system, either
VACUUM FULL
orCLUSTER
can help you. You may also want to runANALYZE
after these, to make sure the planner has up-to-date statistics but this is not specifically required.It is important to note using
VACUUM FULL
places anACCESS EXCLUSIVE
lock on your table(s) (blocking any operation, writes & reads), so you probably want to take your application offline for the duration.In PostgreSQL 8.2 and earlier,
VACUUM FULL
is probably your best bet.In PostgreSQL 8.3 and 8.4, the
CLUSTER
command was significantly improved, so VACUUM FULL is not recommended -- it's slow and it will bloat your indexes. `CLUSTER will re-create indexes from scratch and without the bloat. In my experience, it's usually much faster too. CLUSTER will also sort the whole physical table using an index, so you must pick an index. If you don't know which, the primary key will work fine.In PostgreSQL 9.0,
VACUUM FULL
was changed to work likeCLUSTER
, so both are good.It's hard to make predictions, but on a properly tuned server with commodity hardware, 9 million rows shouldn't take longer than 20 minutes.
您肯定想运行 VACUUM,以释放该空间以供将来插入。如果您想实际回收磁盘上的空间,使其可供操作系统使用,则需要运行 VACUUM FULL。请记住,VACUUM 可以并发运行,但 VACUUM FULL 需要表上的独占锁。
您还需要 REINDEX,因为即使在 VACUUM 运行之后索引仍将保持臃肿状态。如果可能的话,一种更快的方法是删除索引并从头开始重新创建它。
您还需要进行 ANALYZE(分析),您可以将其与 VACUUM(真空)结合起来。
有关详细信息,请参阅文档。
You definitely want to run a VACUUM, to free up that space for future inserts. If you want to actually reclaim that space on disk, making it available to the OS, you'll need to run VACUUM FULL. Keep in mind that VACUUM can run concurrently, but VACUUM FULL requires an exclusive lock on the table.
You will also want to REINDEX, since the indexes will remain bloated even after the VACUUM runs. If possible, a much faster way to do this is to drop the index and create it again from scratch.
You'll also want to ANALYZE, which you can just combine with the VACUUM.
See the documentation for more info.
你好
创建一个包含 10% 所需记录的临时表不是更优化吗?然后删除原始表并将临时表重命名为原始表...
Hi
Don't it be more optimal to create a temporary table with 10% of needed records. Then drop original table and rename temporary to original ...
我对 Postgres 世界比较陌生,但我知道建议使用 VACUUM ANALYZE。我认为还有一个子选项可以释放空间。我发现在进行批量插入或删除时重新索引也很有用。是的,我一直在处理行数相似的表,速度的提高非常明显(UBuntu,Core 2 Quad)
I'm relatively new to the world of Postgres, but I understand VACUUM ANALYZE is recommended. I think there's also a sub-option which just frees up space. I found reindex useful as well when doing batch inserts or deletes. Yes I've been working with tables with a similar number of rows, and the speed increase is very noticeable (UBuntu, Core 2 Quad)