How to deal with concurrent changes in a web application?
Here are two potential workflows I would like to perform in a web application.
Variation 1:
- user sends request
- server reads data
- server modifies data
- server saves modified data
Variation 2:
- user sends request
- server reads data
- server sends data to user
- user sends request with modifications
- server saves modified data
In each of these cases, I am wondering: what are the standard approaches to ensuring that concurrent access to this service will produce sane results? (i.e. nobody's edit gets clobbered, values correspond to some ordering of the edits, etc.)
The situation is hypothetical, but here are some details of where I would likely need to deal with this in practice:
- web application, but language unspecified
- potentially, using a web framework
- data store is a SQL relational database
- the logic involved is too complex to express well in a query e.g. value = value + 1
I feel like I would prefer not to try and reinvent the wheel here. Surely these are well known problems with well known solutions. Please advise.
Thanks.
Comments (4)
To the best of my knowledge, there is no general solution to the problem.
The root of the problem is that the user may retrieve data and stare at it on the screen for a long time before making an update and saving.
I know of three basic approaches:
When the user reads the database, lock the record, and don't release until the user saves any updates. In practice, this is wildly impractical. What if the user brings up a screen and then goes to lunch without saving? Or goes home for the day? Or is so frustrated trying to update this stupid record that he quits and never comes back?
Express your updates as deltas rather than destinations. To take the classic example, suppose you have a system that records stock in inventory. Every time there is a sale, you must subtract 1 (or more) from the inventory count.
So say the present quantity on hand is 10. User A creates a sale. Current quantity = 10. User B creates a sale. He also gets current quantity = 10. User A enters that two units are sold. New quantity = 10 - 2 = 8. Save. User B enters one unit sold. New quantity = 10 (the value he loaded) - 1 = 9. Save. Clearly, something went wrong.
Solution: Instead of writing "update inventory set quantity=9 where itemid=12345", write "update inventory set quantity=quantity-1 where itemid=12345". Then let the database queue the updates. This is very different from strategy #1, as the database only has to lock the record long enough to read it, make the update, and write it. It doesn't have to wait while someone stares at the screen.
Of course, this is only usable for changes that can be expressed as a delta. If you are, say, updating the customer's phone number, it's not going to work. (Like, old number is 555-1234. User A says to change it to 555-1235. That's a change of +1. User B says to change it to 555-1243. That's a change of +9. So total change is +10, the customer's new number is 555-1244. :-) ) But in cases like that, "last user to click the enter key wins" is probably the best you can do anyway.
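To make the inventory example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the in-memory database are my assumptions for illustration; only the two UPDATE statements come from the description above. A single connection stands in for both users, which is enough to show the ordering problem.

```python
import sqlite3

# Assumed schema for illustration: one row per item with a quantity on hand.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (itemid INTEGER PRIMARY KEY, quantity INTEGER)")
conn.execute("INSERT INTO inventory VALUES (12345, 10)")


def quantity():
    """Read the current quantity for the example item."""
    row = conn.execute("SELECT quantity FROM inventory WHERE itemid = 12345").fetchone()
    return row[0]


# "Destination" style: both users load 10, then each writes an absolute value.
loaded_by_a = quantity()
loaded_by_b = quantity()
conn.execute("UPDATE inventory SET quantity = ? WHERE itemid = 12345", (loaded_by_a - 2,))
conn.execute("UPDATE inventory SET quantity = ? WHERE itemid = 12345", (loaded_by_b - 1,))
lost_update_result = quantity()  # user A's sale of two units has been overwritten

# Reset, then apply the same two sales as deltas.
conn.execute("UPDATE inventory SET quantity = 10 WHERE itemid = 12345")
conn.execute("UPDATE inventory SET quantity = quantity - ? WHERE itemid = 12345", (2,))
conn.execute("UPDATE inventory SET quantity = quantity - ? WHERE itemid = 12345", (1,))
delta_result = quantity()  # both sales are reflected

print(lost_update_result, delta_result)
```

With the delta form, the order in which the two statements arrive no longer matters, because each one is applied to whatever value is currently stored.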
[Note: the forum is automatically renumbering my numbered lists. I'm not sure how to override this.]
To answer the question in the title: there are two general solutions for dealing with the lost update problem over HTTP.
Let's assume your application consists of two components: a front end and an API back end. And there are two concurrent users who try to perform an update on the same data.
1. Optimistic concurrency control
Implement some way of knowing which data has been updated last. Some common ways:
- LastUpdated timestamp field stored in db
- Version field stored in db
The ways mentioned above can be combined with the use of conditional HTTP headers. This can be useful if your server framework supports it out of the box.
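The Version field approach might look roughly like the sketch below. It uses Python's built-in sqlite3 module; the customer table, the load and save helpers, and the phone numbers are assumptions made up for illustration, not part of any particular framework.

```python
import sqlite3

# Assumed schema: every row carries a version number that is bumped on each write.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, phone TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, '555-1234', 1)")
conn.commit()


def load(customer_id):
    """Return (phone, version), i.e. what the server would send to the client."""
    return conn.execute(
        "SELECT phone, version FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()


def save(customer_id, new_phone, expected_version):
    """Write only if the row still has the version the client originally read."""
    cur = conn.execute(
        "UPDATE customer SET phone = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_phone, customer_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means someone else saved first


# Two users load the same record, then both try to save.
_, version_a = load(1)
_, version_b = load(1)
a_saved = save(1, "555-1235", version_a)
b_saved = save(1, "555-1243", version_b)  # rejected: version_b is stale
print(a_saved, b_saved, load(1))
```

Over HTTP, the version can travel as an ETag on the GET response and come back in an If-Match header on the update; when the check fails the server would typically answer 412 Precondition Failed and let the client reload and retry.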
Example:
2. Pessimistic concurrency control
Use (pessimistic) database locks.
Example: User 1 makes a GET request. User 2 makes the same GET request and gets an error. This is because we cannot give out potentially outdated data so the information has been locked by db.
There are big downsides with this approach.
Change your API
One can change the API so it doesn't have the lost update problem.
@jay has already mentioned delta updates as a solution. Let's say one has a number field in a model that should be incremented by 1 on each request. One implementation is a PUT endpoint which updates the model and field with the incoming number. This API has the lost update problem. Another implementation is to have an increment endpoint. This API doesn't have the lost update problem. (The API is RESTful if you view an increment as a resource and create new increments with POST.)
Another way is to change the API from PUT to PATCH. This is not a solution, but it will minimise the possibility for lost updates.
Conclusion
Use optimistic concurrency control.
When Googling "lost update problem" one might only get results about databases. Although this is the same problem, database locking is not a good solution for lost concurrent updates over HTTP. It has several downsides compared to optimistic concurrency control, as written above.
If you do not have transactions in MySQL, you can use the UPDATE command to ensure that the data is not corrupted.
If status is one, then only one process will get the result that a record was updated. In the code below, returns -1 if the update was NOT executed (if there were no rows to update).
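A minimal sketch of that idea follows. It is an assumption-laden stand-in: it uses Python's built-in sqlite3 module rather than MySQL so that it runs on its own, and the jobs table and the claim function are hypothetical names. The point is only the conditional UPDATE plus the affected-row check.

```python
import sqlite3

# Hypothetical table: status 1 means "available", status 2 means "taken".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status INTEGER)")
conn.execute("INSERT INTO jobs VALUES (1, 1)")
conn.commit()


def claim(job_id):
    """Flip status from 1 to 2; return 1 on success, -1 if no row was updated."""
    cur = conn.execute(
        "UPDATE jobs SET status = 2 WHERE id = ? AND status = 1", (job_id,)
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE actually changed.
    return 1 if cur.rowcount == 1 else -1


first = claim(1)   # the row had status 1, so this caller wins
second = claim(1)  # status is already 2, so nothing is updated
print(first, second)
```

Because the check (status = 1) and the write happen in one statement, the database decides the winner, and the loser can tell from the return value that it must not proceed.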
Things are simple in the application layer - every request is served by a different thread (or process), so unless you have state in your processing classes (services), everything is safe.
Things get more complicated when you reach the database - i.e. where the state is held. There you need transactions to ensure that everything is ok.
Transactions have a set of properties - ACID, that "guarantee database transactions are processed reliably".
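For the "server reads, modifies, saves" workflow, a transaction around the whole read-modify-write could be sketched as below. This uses Python's built-in sqlite3 module; the account table and the complex_rule function are hypothetical placeholders for logic that is too complex to express in a single UPDATE.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")


def complex_rule(balance):
    """Hypothetical application logic that does not fit in a SQL expression."""
    return balance * 2 if balance < 150 else balance + 1


def update_account(account_id):
    """Read, compute, and write inside one transaction."""
    # BEGIN IMMEDIATE takes SQLite's write lock before the read, so no other
    # writer can change the row between the SELECT and the UPDATE.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        conn.execute(
            "UPDATE account SET balance = ? WHERE id = ?",
            (complex_rule(balance), account_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # leave the row untouched on any failure
        raise


update_account(1)
update_account(1)
final_balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(final_balance)
```

On server databases such as MySQL or PostgreSQL, the usual equivalent is to read the row with SELECT ... FOR UPDATE inside the transaction. Either way, the lock is held only for the duration of one request, not while a user looks at a screen.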