Some kind of "different auto-increment indexes" per primary key value

Posted on 2024-12-06 09:33:53

I have a table with an id column (primary key with auto-increment), a uid column (a key referring to a user's id, for example), and some other columns that don't matter for this question.

I want what we might call separate auto-increment sequences on id, one for each uid value.

So, if I add an entry with uid 10, the id field for this entry will be 1, because there were no previous entries with a value of 10 in uid. If I then add a new entry with uid 4, its id will be 3, because there were already two entries with uid 4.

...A very obvious explanation, but I am trying to be as explanatory and clear as I can to demonstrate the idea... clearly.

  1. What SQL engine can provide such functionality natively? (not Microsoft- or Oracle-based)
  2. If there is none, how could I best replicate it? Triggers perhaps?
  3. Does this functionality have a more suitable name?
  4. In case you know of a non-SQL database engine providing such functionality, name it anyway; I am curious.

Thanks.

Comments (4)

寄居者 2024-12-13 09:33:53

MySQL's MyISAM engine can do this. See the "Using AUTO_INCREMENT" section of the MySQL manual:

For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.

The docs go on after that paragraph, showing an example.

The InnoDB engine in MySQL does not support this feature, which is unfortunate because it's better to use InnoDB in almost all cases. I avoid using MyISAM.

You can't emulate this behavior using triggers (or any SQL statements limited to transaction scope) without locking tables on INSERT. Consider this sequence of actions:

  1. Mario starts a transaction and inserts a new row for user 4.
  2. Bill starts a transaction and inserts a new row for user 4.
  3. Mario's session fires a trigger to compute MAX(id)+1 for user 4. It gets 3.
  4. Bill's session fires a trigger to compute MAX(id)+1 for user 4. It also gets 3.
  5. Bill's session finishes his INSERT and commits.
  6. Mario's session tries to finish his INSERT, but the row with (userid=4, id=3) now exists, so Mario gets a primary key conflict.

In general, you can't control the order of execution of these steps without some kind of synchronization.
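The interleaving above can be reproduced deterministically in a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for whatever engine is in use. The table name entries and the helper next_id are hypothetical, and the two "sessions" are simulated on one connection by running both MAX(id)+1 reads before either write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " userid INTEGER NOT NULL,"
    " id INTEGER NOT NULL,"
    " PRIMARY KEY (userid, id))"
)
# User 4 already has two rows, as in the question.
conn.executemany("INSERT INTO entries VALUES (?, ?)", [(4, 1), (4, 2)])


def next_id(userid):
    """Hypothetical helper: what a naive trigger would compute."""
    row = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM entries WHERE userid = ?",
        (userid,),
    ).fetchone()
    return row[0]


# Steps 3 and 4: both sessions read before either one writes.
mario_id = next_id(4)
bill_id = next_id(4)

# Step 5: Bill's INSERT goes through first.
conn.execute("INSERT INTO entries VALUES (4, ?)", (bill_id,))

# Step 6: Mario's INSERT collides with the row Bill just wrote.
try:
    conn.execute("INSERT INTO entries VALUES (4, ?)", (mario_id,))
    conflict = False
except sqlite3.IntegrityError:
    conflict = True
```

Both sessions compute the same value because neither can see the other's pending row, which is exactly why some form of synchronization is needed.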

The solutions to this are either:

  • Get an exclusive table lock. Lock the table before attempting the INSERT. This is necessary to prevent concurrent INSERTs from creating a race condition like the one in the example above. The whole table has to be locked: because you are trying to restrict INSERTs, there is no specific existing row to lock (if you were governing access to a given row with UPDATE, you could lock just that row). But locking the table makes access to it serial, which limits your throughput.

  • Do it outside transaction scope. Generate the id number in a way that won't be hidden from two concurrent transactions. By the way, this is what AUTO_INCREMENT does. Two concurrent sessions will each get a unique id value, regardless of their order of execution or order of commit. But tracking the last generated id per userid requires access to the database, or a duplicate data store. For example, a memcached key per userid, which can be incremented atomically.
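As a rough sketch of the second option, the class below hands out per-userid numbers outside of any database transaction. It is an in-process stand-in, guarded by a threading.Lock, for an external atomic counter such as one memcached key per userid; the class name PerUserCounter and its next_id method are assumptions made for illustration, not part of any library.

```python
import threading
from collections import defaultdict


class PerUserCounter:
    """In-process stand-in for an external atomic counter per userid."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = defaultdict(int)  # userid -> last value handed out

    def next_id(self, userid):
        # The lock makes read-increment-return a single atomic step.
        with self._lock:
            self._last[userid] += 1
            return self._last[userid]


counter = PerUserCounter()
allocated = []
allocated_lock = threading.Lock()


def worker():
    # Each simulated session asks for 50 ids for user 4.
    for _ in range(50):
        value = counter.next_id(4)
        with allocated_lock:
            allocated.append(value)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every caller receives a distinct value regardless of scheduling, but the counter knows nothing about whether the corresponding INSERT later commits.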

It's relatively easy to ensure that inserts get unique values. But it's hard to ensure they will get consecutive ordinal values. Also consider:

  • What happens if you INSERT in a transaction but then roll back? You've allocated id value 3 in that transaction, and then I allocated value 4, so if you roll back and I commit, now there's a gap.
  • What happens if an INSERT fails because of other constraints on the table (e.g. another column is NOT NULL)? You could get gaps this way too.
  • If you ever DELETE a row, do you need to renumber all the following rows for the same userid? What does that do to your memcached entries if you use that solution?
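The rollback case in the first bullet can be sketched as follows, again with sqlite3 as a stand-in engine and a hypothetical entries table. The id allocator is a plain itertools.count living outside the transaction, and the two transactions run one after the other here for simplicity, although the outcome is the same when they overlap.

```python
import itertools
import sqlite3

# isolation_level=None lets the sketch issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE entries ("
    " userid INTEGER NOT NULL,"
    " id INTEGER NOT NULL,"
    " PRIMARY KEY (userid, id))"
)
conn.executemany("INSERT INTO entries VALUES (?, ?)", [(4, 1), (4, 2)])

# Allocator outside transaction scope: the next value for user 4 is 3.
next_for_user_4 = itertools.count(3)

# "You" allocate 3, insert, then roll back.
your_id = next(next_for_user_4)
conn.execute("BEGIN")
conn.execute("INSERT INTO entries VALUES (4, ?)", (your_id,))
conn.execute("ROLLBACK")

# "I" allocate 4, insert, and commit.
my_id = next(next_for_user_4)
conn.execute("BEGIN")
conn.execute("INSERT INTO entries VALUES (4, ?)", (my_id,))
conn.execute("COMMIT")

ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM entries WHERE userid = 4 ORDER BY id"
    )
]
```

The value 3 was consumed by the rolled-back transaction and is never reused, so the committed ids for user 4 are unique but no longer consecutive.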
云归处 2024-12-13 09:33:53

SQL Server should allow you to do this. If you can't implement this using a computed column (probably not - there are some restrictions), surely you can implement it in a trigger.

MySQL would also allow you to implement this via triggers.

掀纱窥君容 2024-12-13 09:33:53

In a comment you ask about efficiency. Unless you are dealing with extreme volumes, storing an 8-byte DATETIME isn't much of an overhead compared to using, for example, a 4-byte INT.

Using a timestamp also massively simplifies your data inserts, and it copes with records being deleted without creating 'holes' in your sequence.

If you DO need this, be careful with the field names. If you have uid and id in a table, I'd expect id to be unique in that table, and uid to refer to something else. Perhaps, instead, use the field names property_id and amendment_id.

In terms of implementation, there are generally two options.

1). A trigger

Implementations vary, but the logic remains the same. As you don't specify an RDBMS (other than not MS/Oracle), I'll describe only the general logic, which is simple...

  • Start a transaction (often this is implicitly already started inside triggers)
  • Find the MAX(amendment_id) for the property_id being inserted
  • Update the newly inserted value with MAX(amendment_id) + 1
  • Commit the transaction

Things to be aware of are...
- multiple records being inserted at the same time
- records being inserted with amendment_id being already populated
- updates altering existing records
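
The trigger logic above might look roughly like this in SQLite, driven from Python's sqlite3 module. This is only a sketch under assumptions: the table name amendments, the trigger name assign_amendment_id, and the note column are made up for illustration, and trigger syntax differs between engines. SQLite also serializes writers, so this sketch does not exercise concurrent inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE amendments (
        property_id  INTEGER NOT NULL,
        amendment_id INTEGER,
        note         TEXT
    );

    -- Runs once per inserted row, inside the inserting transaction.
    -- Rows that arrive with amendment_id already set are left alone.
    CREATE TRIGGER assign_amendment_id
    AFTER INSERT ON amendments
    WHEN NEW.amendment_id IS NULL
    BEGIN
        UPDATE amendments
        SET amendment_id = (
            SELECT COALESCE(MAX(amendment_id), 0) + 1
            FROM amendments
            WHERE property_id = NEW.property_id
        )
        WHERE rowid = NEW.rowid;
    END;
    """
)

# One row for property 10, then three rows for property 4.
conn.executemany(
    "INSERT INTO amendments (property_id, note) VALUES (?, ?)",
    [(10, "a"), (4, "b"), (4, "c"), (4, "d")],
)

rows = conn.execute(
    "SELECT property_id, amendment_id FROM amendments ORDER BY rowid"
).fetchall()
```

The WHEN clause is one way to handle rows inserted with amendment_id already populated; updates to existing rows are not guarded here and would need a separate rule.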

2). A Stored Procedure

If you use a stored procedure to control writes to the table, you gain a lot more control.

  • Implicitly, you know you're only dealing with one record.
  • You simply don't provide a parameter for DEFAULT fields.
  • You know what updates / deletes can and can't happen.
  • You can implement all the business logic you like without hidden triggers

I personally recommend the Stored Procedure route, but triggers do work.
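
SQLite has no stored procedures, so the sketch below uses a single Python function as a stand-in for one: all writes go through add_amendment, which handles exactly one record and computes the number itself. The function name, table layout, and note column are hypothetical. BEGIN IMMEDIATE is SQLite's way of taking the write lock up front, which plays the role of the table lock discussed in the first answer.

```python
import sqlite3

# isolation_level=None so the function controls the transaction itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE amendments ("
    " property_id INTEGER NOT NULL,"
    " amendment_id INTEGER NOT NULL,"
    " note TEXT,"
    " PRIMARY KEY (property_id, amendment_id))"
)


def add_amendment(conn, property_id, note):
    """Insert one amendment and return the number it was given."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(amendment_id), 0) + 1"
            " FROM amendments WHERE property_id = ?",
            (property_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO amendments (property_id, amendment_id, note)"
            " VALUES (?, ?, ?)",
            (property_id, next_id, note),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return next_id


first_for_10 = add_amendment(conn, 10, "first amendment for property 10")
numbers_for_4 = [add_amendment(conn, 4, f"amendment {i}") for i in range(3)]
```

Because callers never supply amendment_id, there is no way to insert a pre-populated or out-of-sequence value through this path.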

只是在用心讲痛 2024-12-13 09:33:53

It is important to get your data types right.

What you are describing is a multi-part key. So use a multi-part key. Don't try to encode everything into a magic integer; if you do, you will poison the rest of your code.

If a record is identified by (entity_id,version_number) then embrace that description and use it directly instead of mangling the meaning of your keys. You will have to write queries which constrain the version number but that's OK. Databases are good at this sort of thing.

version_number could be a timestamp, as a_horse_with_no_name suggests. This is quite a good idea. There is no meaningful performance disadvantage to using timestamps instead of plain integers. What you gain is meaning, which is more important.
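
A minimal sketch of such a multi-part key, with a timestamp as version_number, is shown here using sqlite3. The table name entity_versions, the payload column, and the sample rows are invented for illustration. The timestamps are stored as ISO-8601 text, which sorts chronologically, and the query shows one way to constrain the version number to the latest per entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entity_versions (
        entity_id      INTEGER NOT NULL,
        version_number TEXT    NOT NULL,  -- ISO-8601 timestamp
        payload        TEXT,
        PRIMARY KEY (entity_id, version_number)
    )
    """
)
conn.executemany(
    "INSERT INTO entity_versions VALUES (?, ?, ?)",
    [
        (4, "2024-12-01T09:00:00", "draft"),
        (4, "2024-12-02T14:30:00", "revised"),
        (10, "2024-12-01T11:15:00", "initial"),
    ],
)

# Latest version of each entity: join back on the per-entity maximum.
latest = conn.execute(
    """
    SELECT v.entity_id, v.version_number, v.payload
    FROM entity_versions AS v
    JOIN (
        SELECT entity_id, MAX(version_number) AS latest_version
        FROM entity_versions
        GROUP BY entity_id
    ) AS m
      ON m.entity_id = v.entity_id
     AND m.latest_version = v.version_number
    ORDER BY v.entity_id
    """
).fetchall()
```

No counter has to be maintained at all in this design, so the concurrency and gap problems described in the first answer do not arise, at the cost of version numbers that are not small consecutive integers.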

You could maintain a "latest version" table which contains, for each entity_id, only the record with the most-recent version_number. This will be more work for you, so only do it if you really need the performance.
