Strategies for handling non-auto-incrementing IDs

Posted 2024-11-17 16:47:52

Someone recently suggested that I remove an auto-incrementing table used solely for storing IDs. I have not gone with this yet; I'm just exploring whether it is in fact a better solution than what I currently have. This would leave me with a table like this:

create table tag_translations (
  tag_id int not null,
  language_id int not null,
  tag_name varchar(255),
  primary key (tag_id, language_id)
);

I'm going to have duplicates for tag_id, storing translations of the tag in other languages.

When adding new tags, I need to forgo using auto increment on the tag_id, and instead assign new IDs manually. Unless it's just a translation of an existing tag, the ID needs to be unique for the new batch of translated inserts.

Can someone explain to me, in plain English, how this is typically done? I've thought about this, but it doesn't seem to be any cleaner than my previous approach, if I have the thinking right. Here's what I'm assuming the process is:

Select tag_id from tag_translations
Pick the highest number in the result set + 1
Make a new query (for insertion)
Define some additional strategy for ensuring that ids for new tag_id records are never duplicated when tags get created at more or less the same microsecond
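
To make the race in the last step concrete, here is a rough sketch of the first three steps in MySQL. The user variable @candidate_id, the language_id value 1 and the tag name are placeholders, not part of the schema above.

```sql
-- Steps 1-2: read the current highest tag_id and add 1.
set @candidate_id = (select coalesce(max(tag_id), 0) + 1 from tag_translations);

-- Step 3: insert the new tag under that id (language_id 1 is a placeholder).
insert into tag_translations (tag_id, language_id, tag_name)
values (@candidate_id, 1, 'example');

-- Two connections that run the first statement at about the same time can
-- both compute the same @candidate_id. With primary key (tag_id, language_id),
-- the second insert fails only if it uses the same language_id; with a
-- different language_id both inserts succeed and two unrelated tags share an id.
```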

If this is the process, I think I'm better off sticking with my existing schema of having an additional table to auto increment ids. I still have to do an additional query to first check for a unique id (I'm trading a single join down the road for an insert today). If the headache of keeping my IDs unique when they need to be unique is what I think it will be, I may want to abandon this approach and stick with what I've got. Is my thinking sound?

初雪 2024-11-24 16:47:52

For generating new tag ids, there is a better option than select max(tag_id) + 1. You can mimic sequences/generators in MySQL with a single field and the use of last_insert_id()'s ability to take an argument. The following creates the sequence:

create table tag_id_seq (id int not null default 0);
insert into tag_id_seq values (0);

After creating the sequence, you could get the next id from it with 2 statements like this:

update tag_id_seq set id = last_insert_id(id + 1);
select last_insert_id();

last_insert_id() is connection-specific, so the first statement basically serves to capture a value for only the connection that executed it, as well as update the sequence. The second statement just retrieves the value. If 2 different connections did the update statement very close to each other, they would both still have different ids tucked away in last_insert_id().
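
As a sketch of how those two statements could fit the tag_translations table from the question: the user variable @new_tag_id, the language_id values 1 and 2, and the tag names are placeholders chosen for illustration.

```sql
-- Allocate one new tag id; the value is remembered per connection.
update tag_id_seq set id = last_insert_id(id + 1);
set @new_tag_id = last_insert_id();

-- Insert every translation of the new tag under that single id.
-- language_id values 1 and 2 and the tag names are placeholders.
insert into tag_translations (tag_id, language_id, tag_name) values
    (@new_tag_id, 1, 'database'),
    (@new_tag_id, 2, 'Datenbank');
```

Copying the value into a user variable right away is a precaution: a later insert into any table with an auto_increment column on the same connection would overwrite what last_insert_id() returns.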

You can wrap this up in a function into which you pass merely the sequence name, then call it whenever you are really creating a new tag. This would work much better than your tags table that has the single auto_increment column.
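
One possible shape for such a wrapper is sketched below. It is written as a stored procedure with an OUT parameter rather than a function, because some servers with binary logging enabled restrict creating stored functions that modify data. The names next_seq_id, p_seqname and p_id are hypothetical, and the sketch assumes a sequences table like the one shown further down in this answer.

```sql
-- Assumed table: one row per named sequence (same shape as the sequences
-- table shown later in this answer).
create table if not exists sequences (
    seqname varchar(50) primary key,
    id int not null default 0
);
insert ignore into sequences (seqname, id) values ('tag_id', 0);

-- "delimiter" is a mysql command-line client directive, not server SQL.
delimiter //
create procedure next_seq_id(in p_seqname varchar(50), out p_id int)
begin
    update sequences set id = last_insert_id(id + 1) where seqname = p_seqname;
    -- No matching row means last_insert_id() was not refreshed, so fail loudly.
    if row_count() = 0 then
        signal sqlstate '45000' set message_text = 'unknown sequence name';
    end if;
    set p_id = last_insert_id();
end//
delimiter ;

-- Usage: fetch the next id for the 'tag_id' sequence into a user variable.
call next_seq_id('tag_id', @new_tag_id);
```

If tag_translations already contains rows, the 'tag_id' row would need to be seeded from the current maximum instead of 0 before the first call.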

Instead of starting at 0 with the next id being 1, you can bring it up even with currently used ids:

update tag_id_seq set id = (select coalesce(max(tag_id),0) from tag_translations);

There are also variations on the update statement:

Some like to have the id field represent what the next id should be, rather than the last id that was given out. In that case, you would start id at 1 instead of 0 for a new sequence, and use set id = last_insert_id(id) + 1 (with the addition outside the call).
Also, some circumstances call for 'reserving' several new ids rather than 1 at a time. In that case you add however many you need. Using the variation above, say the sequence is at 11, meaning the last id retrieved was 10 and 11 is the next id. If you need 7 new ids, you would use set id = last_insert_id(id) + 7. select last_insert_id() then returns 11, so you would use ids 11 to 17 (inclusive), and the sequence is updated to 18, the next id that will be retrieved.
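
The block reservation described above could be sketched like this, assuming the "next id" variation in which tag_id_seq.id holds the next id to hand out. The user variables @first_id and @last_id are illustrative names.

```sql
-- Assumes the "next id" variation: tag_id_seq.id holds the next id to hand out.
-- last_insert_id(id) remembers the old value; the row itself advances by 7.
update tag_id_seq set id = last_insert_id(id) + 7;

set @first_id = last_insert_id();    -- first reserved id
set @last_id  = @first_id + 7 - 1;   -- last reserved id (inclusive)

-- With the sequence at 11 beforehand, this connection now owns ids 11 to 17
-- and the sequence row holds 18.
```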

Sequences have advantages in many circumstances, and these are a few:

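An auto_increment column inside a compound key comes with restrictions (InnoDB requires it to be the first column of an index), and here tag_id has to repeat across translations, so you still need another way to generate ids.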
The important thing about this style of doing sequences in MySQL is the field. This means you can have the field containing the current/next sequence value almost anywhere.
For example, instead of having dozens of tables, 1 for each sequence, you can create a sequences table with 1 column for the sequence name and 1 column for the current/next value field. Dozens of rows in 1 table is much tidier:
```
create table sequences (
  seqname varchar(50) primary key,
  id int not null default 0);
```
(And if the table is InnoDB, row-level locking is used rather than table-locking.)
auto_increment in InnoDB may not behave in a way you like. Before MySQL 8.0 the counter is kept only in memory, and on startup InnoDB does the equivalent of select max(id)+1 to reset it. That can have the effect of rewinding the counter and re-using ids that had been used previously, for example after the highest rows were deleted.
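
Returning to the shared sequences table above, a sequence for the question's tag ids might be registered and used as sketched here. The sequence name 'tag_id' is an arbitrary label chosen for this example.

```sql
-- Register a sequence named 'tag_id', seeded with the highest id already
-- in use (0 if tag_translations is empty).
insert into sequences (seqname, id)
select 'tag_id', coalesce(max(tag_id), 0) from tag_translations;

-- Draw the next id from that one row; other sequences are untouched.
update sequences set id = last_insert_id(id + 1) where seqname = 'tag_id';
select last_insert_id();
```

The where clause matters: without it every sequence in the table would be advanced by the same statement.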