PostgreSQL gapless sequences
I'm moving from MySQL to Postgres, and I noticed that when you delete rows in MySQL, the unique ids for those rows are re-used when you make new ones. With Postgres, if you create rows and then delete them, the unique ids are not used again.
Is there a reason for this behaviour in Postgres? Can I make it act more like MySQL in this case?
Comments (1)
Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.
PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.
Synthetic keys (IDs) are meaningless anyway. Their order is not significant; their only property of significance is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.
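The rollback behaviour described above can be seen in a short session. This is only a minimal sketch, assuming a hypothetical table named seq_demo created fresh, so that its sequence starts at 1:

```sql
-- Hypothetical demo table; "serial" creates and attaches a sequence.
CREATE TABLE seq_demo (
    id   serial PRIMARY KEY,
    note text
);

BEGIN;
-- Draws the value 1 from the sequence.
INSERT INTO seq_demo (note) VALUES ('rolled back');
-- The row disappears, but the sequence is not rewound.
ROLLBACK;

-- Draws the next value, 2; the value 1 is never handed out again.
INSERT INTO seq_demo (note) VALUES ('committed');

SELECT id, note FROM seq_demo;
```

The gap at id 1 is harmless as long as the ID is used only as a unique key.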
If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".
The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.
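A minimal sketch of such a counter table and function might look like the following. The table name thetable_id_counter, the column name last_id, and the parameter names are assumptions made for this example; only the function name get_next_id and the variable next_value appear in the surrounding text.

```sql
-- Counter table: a single row holding the last ID handed out.
-- (Table and column names are hypothetical.)
CREATE TABLE thetable_id_counter (last_id integer NOT NULL);
INSERT INTO thetable_id_counter VALUES (0);

-- Generalized: the caller names the counter table and the counter column.
CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text)
RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    -- The UPDATE takes a row lock on the counter row that is held until the
    -- calling transaction commits or rolls back. If it rolls back, the
    -- increment is undone too, so no gap is left behind.
    EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I',
                   countertable, countercolumn, countercolumn, countercolumn)
       INTO next_value;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
```

Because the increment is an ordinary transactional UPDATE rather than a sequence call, it obeys rollback, which is what makes the IDs gapless and also what makes callers wait for one another.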
Usage:
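For instance, assuming get_next_id takes the counter table and the counter column name as arguments, and that a counter table thetable_id_counter with a last_id column already exists (all of these, and the table dummy, are assumptions for illustration), a call could look like this:

```sql
-- Hypothetical target table that receives the gapless IDs.
CREATE TABLE dummy (
    id   integer PRIMARY KEY,
    blah integer
);

-- Fetch the next ID from the counter table as part of the INSERT itself,
-- so the ID is allocated in the same transaction that uses it.
INSERT INTO dummy (id, blah)
VALUES (get_next_id('thetable_id_counter', 'last_id'), 42);
```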
Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the first transaction commits or rolls back. This is unavoidable for gapless IDs and is by design.
If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.
This function does not re-use deleted IDs; it just avoids introducing gaps.
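As a rough illustration of the multiple-counter variant described above, here is a minimal sketch. The names id_counters, counter_name, get_next_id_named, and p_counter are hypothetical, chosen only for this example.

```sql
-- One row per counter, so each counter is locked independently.
CREATE TABLE id_counters (
    counter_name text PRIMARY KEY,
    last_id      integer NOT NULL
);
INSERT INTO id_counters (counter_name, last_id)
VALUES ('invoices', 0), ('orders', 0);

CREATE OR REPLACE FUNCTION get_next_id_named(p_counter text)
RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    -- The WHERE clause restricts the row lock to the requested counter.
    UPDATE id_counters
       SET last_id = last_id + 1
     WHERE counter_name = p_counter
    RETURNING last_id INTO next_value;

    -- FOUND is false when the UPDATE matched no row.
    IF NOT FOUND THEN
        RAISE EXCEPTION 'no counter named %', p_counter;
    END IF;

    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
```

With this layout, a transaction allocating an 'invoices' ID does not block one allocating an 'orders' ID, since they lock different rows.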
To re-use IDs I advise ... not re-using IDs.
If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table, and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids FOR UPDATE LIMIT 1 and, if a row is found, DELETEs that row. IF NOT FOUND, it gets a new ID from the generator table as normal. Here's an untested extension of the prior function to support re-use:
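A minimal, untested sketch of such an extension follows. The names free_ids, free_id, and next_value come from the description above; thetable, thetable_id_counter, track_free_ids, and get_next_id_reuse are hypothetical. It uses fixed table names rather than a generalized signature to keep the example short, and it assumes PostgreSQL 11 or later (older versions spell the trigger clause EXECUTE PROCEDURE).

```sql
CREATE TABLE thetable (id integer PRIMARY KEY, blah integer);
CREATE TABLE thetable_id_counter (last_id integer NOT NULL);
INSERT INTO thetable_id_counter VALUES (0);
-- Free list: IDs that were deleted and may be handed out again.
CREATE TABLE free_ids (free_id integer PRIMARY KEY);

-- Maintain the free list; an UPDATE is treated as DELETE followed by INSERT.
CREATE OR REPLACE FUNCTION track_free_ids() RETURNS trigger AS $$
BEGIN
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        INSERT INTO free_ids (free_id) VALUES (OLD.id)
        ON CONFLICT DO NOTHING;
    END IF;
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        DELETE FROM free_ids WHERE free_id = NEW.id;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER thetable_track_free_ids
AFTER INSERT OR UPDATE OR DELETE ON thetable
FOR EACH ROW EXECUTE FUNCTION track_free_ids();

CREATE OR REPLACE FUNCTION get_next_id_reuse() RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    -- Try to claim the lowest free ID, locking its row.
    SELECT free_id INTO next_value
      FROM free_ids
     ORDER BY free_id
     LIMIT 1
       FOR UPDATE;

    IF FOUND THEN
        DELETE FROM free_ids WHERE free_id = next_value;
    ELSE
        -- No free ID available: fall back to the counter table.
        UPDATE thetable_id_counter
           SET last_id = last_id + 1
        RETURNING last_id INTO next_value;
    END IF;

    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
```

A caller would then write something like INSERT INTO thetable (id, blah) VALUES (get_next_id_reuse(), 42). Under concurrent use, a transaction that waits on a free-list row another transaction has just claimed may find no row and fall through to the counter, so treat this as a starting point rather than a finished design.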