在这种并发情况下如何保证数据的一致性呢?
问题是这样的:
- 我有多个竞争线程(100+)需要访问一个数据库表
- 每个线程将传递一个
字符串名称
- 如果该名称存在于表中,数据库应该返回ID对于名称尚不存在的行,应插入名称并返回 id。 - 数据库中只能有一个
name
实例 - 即。 name 必须是唯一的
如何确保线程一不会在线程二尝试插入 name1
的同时插入 name1
?换句话说,如何保证并发环境中name
的唯一性?这也需要尽可能高效——这有可能成为严重的瓶颈。
我正在使用 MySQL 和 Java。
谢谢
The problem is this:
- I have multiple competing threads (100+) that need to access one database table
- Each thread will pass a
String name
- where that name exists in the table, the database should return the id for the row, where the name doesn't already exist, the name should be inserted and the id returned. - There can only ever be one instance of
name
in the database - ie. name must be unique
How do I ensure that thread one doesn't insert name1
at the same time as thread two also tries to insert name1
? In other words, how do I guarantee the uniqueness of name
in a concurrent environment? This also needs to be as efficient as possible - this has the potential to be a serious bottleneck.
I am using MySQL and Java.
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
在数据库中的名称列上创建唯一约束。
Create a unique constraint on name column in database.
为名称列添加唯一约束。
Add a unique constraint for the name column.
假设 name 列有唯一约束,则每个
insert
将获取一个锁。任何尝试同时插入第二次的线程都将等待,直到第一次插入成功或失败(tx 提交或回滚)。如果第一个事务成功,第二个事务将因唯一密钥违规而失败。然后你就知道它已经存在了。
如果每笔交易有一个插入,那就没问题了。如果每个事务插入超过 1 个,则可能会出现死锁。
总而言之,算法如下:
由于唯一索引上可能存在高争用,因此
插入
可能会阻塞一段时间。在这种情况下,事务可能超时。进行一些压力测试,并调整配置,直到它与您的负载正常工作。此外,您还应该检查是否遇到唯一约束违规异常或其他异常。
再说一次,只有在每个事务有一个插入的情况下,这才有效,否则可能会死锁。
另外,您可以尝试使用“
select * for update
”读取步骤 1 中的行。在这种情况下,它会等待并发插入提交或成功。这可以稍微减少步骤 2.2.2 由于索引争用而产生的错误量。Assuming there is a unique constraint on the name column, each
insert
will acquire a lock. Any thread that attempts to insert it a second time concurrently will wait until the 1stinsert
either succeeds or fails (tx commit or rolls back).If the 1st transaction succeeds, 2nd transaction will fail with with a unique key violation. Then you know it exists already.
If there is one insert per transaction, it'ok. If there are more than 1 insert per transaction, you may deadlock.
So all in all, the algo is like this:
Because there can be a high contention on the unique index, the
insert
may block for some time. In which case the transaction may time out. Make some stress test, and tune the configuration until it works correctly with your load.Also, you should check if you get a unique constraint violation exception or some other exception.
And again, this works only if there is one insert per transaction, otherwise it may deadlock.
Also, you can try to read the row at step 1 with "
select * for update
". In this case, it waits until a concurrent insert either commits or succeeds. This can slightly reduce the amount of error at step 2.2.2 due to the contention on the index.