在这种并发情况下如何保证数据的一致性呢?

发布于 2024-08-25 15:18:13 字数 369 浏览 6 评论 0原文

问题是这样的:

  • 我有多个竞争线程(100+)需要访问一个数据库表
  • 每个线程将传递一个字符串名称 - 如果该名称存在于表中,数据库应该返回ID对于名称尚不存在的行,应插入名称并返回 id。
  • 数据库中只能有一个 name 实例 - 即。 name 必须是唯一的

如何确保线程一不会在线程二尝试插入 name1 的同时插入 name1?换句话说,如何保证并发环境中name的唯一性?这也需要尽可能高效——这有可能成为严重的瓶颈。

我正在使用 MySQL 和 Java。

谢谢

The problem is this:

  • I have multiple competing threads (100+) that need to access one database table
  • Each thread will pass a String name - where that name exists in the table, the database should return the id for the row, where the name doesn't already exist, the name should be inserted and the id returned.
  • There can only ever be one instance of name in the database - ie. name must be unique

How do I ensure that thread one doesn't insert name1 at the same time as thread two also tries to insert name1? In other words, how do I guarantee the uniqueness of name in a concurrent environment? This also needs to be as efficient as possible - this has the potential to be a serious bottleneck.

I am using MySQL and Java.

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

沫雨熙 2024-09-01 15:18:14

在数据库中的名称列上创建唯一约束。

Create a unique constraint on name column in database.

羁客 2024-09-01 15:18:14

为名称列添加唯一约束。

Add a unique constraint for the name column.

淡笑忘祈一世凡恋 2024-09-01 15:18:13

假设 name 列有唯一约束,则每个 insert 将获取一个锁。任何尝试同时插入第二次的线程都将等待,直到第一次插入成功或失败(tx 提交或回滚)。

如果第一个事务成功,第二个事务将因唯一密钥违规而失败。然后你就知道它已经存在了。

如果每笔交易有一个插入,那就没问题了。如果每个事务插入超过 1 个,则可能会出现死锁。

每个线程都会传递一个字符串名称 -
该名称存在于表中的位置,
数据库应该返回 id
名称不存在的行
已经存在,名称应该是
插入并返回 id。

总而言之,算法如下:

1 read row with name
   2.1 if found, return row id
   2.2 if not found, attempt to insert
      2.2.1 if insert succeeds, return new row id
      2.2.2 if insert fails with unique constraint violation
          2.2.2.1 read row with name
          2.2.2.2 read should succeed this time, so return row id

由于唯一索引上可能存在高争用,因此插入可能会阻塞一段时间。在这种情况下,事务可能超时。进行一些压力测试,并调整配置,直到它与您的负载正常工作。

此外,您还应该检查是否遇到唯一约束违规异常或其他异常。

再说一次,只有在每个事务有一个插入的情况下,这才有效,否则可能会死锁


另外,您可以尝试使用“select * for update”读取步骤 1 中的行。在这种情况下,它会等待并发插入提交或成功。这可以稍微减少步骤 2.2.2 由于索引争用而产生的错误量。

Assuming there is a unique constraint on the name column, each insert will acquire a lock. Any thread that attempts to insert it a second time concurrently will wait until the 1st insert either succeeds or fails (tx commit or rolls back).

If the 1st transaction succeeds, 2nd transaction will fail with with a unique key violation. Then you know it exists already.

If there is one insert per transaction, it'ok. If there are more than 1 insert per transaction, you may deadlock.

Each thread will pass a String name -
where that name exists in the table,
the database should return the id for
the row, where the name doesn't
already exist, the name should be
inserted and the id returned.

So all in all, the algo is like this:

1 read row with name
   2.1 if found, return row id
   2.2 if not found, attempt to insert
      2.2.1 if insert succeeds, return new row id
      2.2.2 if insert fails with unique constraint violation
          2.2.2.1 read row with name
          2.2.2.2 read should succeed this time, so return row id

Because there can be a high contention on the unique index, the insert may block for some time. In which case the transaction may time out. Make some stress test, and tune the configuration until it works correctly with your load.

Also, you should check if you get a unique constraint violation exception or some other exception.

And again, this works only if there is one insert per transaction, otherwise it may deadlock.


Also, you can try to read the row at step 1 with "select * for update". In this case, it waits until a concurrent insert either commits or succeeds. This can slightly reduce the amount of error at step 2.2.2 due to the contention on the index.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文