How to handle concurrent changes in a web application?

Posted 2024-12-01 23:31:07 | 560 words | 7 views | 0 comments

Here are two potential workflows I would like to perform in a web application.

Variation 1:

  • user sends request
  • server reads data
  • server modifies data
  • server saves modified data

Variation 2:

  • user sends request
  • server reads data
  • server sends data to user
  • user sends request with modifications
  • server saves modified data

In each of these cases, I am wondering: what are the standard approaches to ensuring that concurrent access to this service will produce sane results? (i.e. nobody's edit gets clobbered, values correspond to some ordering of the edits, etc.)

The situation is hypothetical, but here are some details of where I would likely need to deal with this in practice:

  • web application, but language unspecified
  • potentially, using a web framework
  • data store is a SQL relational database
  • the logic involved is too complex to express well in a single query (i.e. it is not something as simple as value = value + 1)

I feel like I would prefer not to try and reinvent the wheel here. Surely these are well known problems with well known solutions. Please advise.

Thanks.


Comments (4)

静谧 2024-12-08 23:31:07

To the best of my knowledge, there is no general solution to the problem.

The root of the problem is that the user may retrieve data and stare at it on the screen for a long time before making an update and saving.

I know of three basic approaches:

  1. When the user reads the database, lock the record, and don't release until the user saves any updates. In practice, this is wildly impractical. What if the user brings up a screen and then goes to lunch without saving? Or goes home for the day? Or is so frustrated trying to update this stupid record that he quits and never comes back?

  2. Express your updates as deltas rather than destinations. To take the classic example, suppose you have a system that records stock in inventory. Every time there is a sale, you must subtract 1 (or more) from the inventory count.

So say the present quantity on hand is 10. User A creates a sale. Current quantity = 10. User B creates a sale. He also gets current quantity = 10. User A enters that two units are sold. New quantity = 10 - 2 = 8. Save. User B enters one unit sold. New quantity = 10 (the value he loaded) - 1 = 9. Save. Clearly, something went wrong.

Solution: Instead of writing "update inventory set quantity=9 where itemid=12345", write "update inventory set quantity=quantity-1 where itemid=12345". Then let the database queue the updates. This is very different from strategy #1, as the database only has to lock the record long enough to read it, make the update, and write it. It doesn't have to wait while someone stares at the screen.

Of course, this is only usable for changes that can be expressed as a delta. If you are, say, updating the customer's phone number, it's not going to work. (Like, the old number is 555-1234. User A says to change it to 555-1235. That's a change of +1. User B says to change it to 555-1243. That's a change of +9. So the total change is +10, and the customer's new number is 555-1244. :-) ) But in cases like that, "last user to click the enter key wins" is probably the best you can do anyway.
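
For what it's worth, here is a minimal JDBC sketch of the delta approach. The table and column names come from the inventory example above; the class name InventoryDao and the method sellUnits are made up for illustration, and the code assumes you already have an open java.sql.Connection from whatever driver you use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO for the inventory example; names are illustrative only.
class InventoryDao {
    // The database computes quantity - sold itself, so no value loaded earlier
    // by the application is ever written back.
    static final String SELL_SQL =
        "UPDATE inventory SET quantity = quantity - ? WHERE itemid = ?";

    // Returns true if exactly one inventory row was updated.
    static boolean sellUnits(Connection conn, int itemId, int sold) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(SELL_SQL)) {
            stmt.setInt(1, sold);
            stmt.setInt(2, itemId);
            return stmt.executeUpdate() == 1;
        }
    }
}
```

With this, user A's call sellUnits(conn, 12345, 2) and user B's call sellUnits(conn, 12345, 1) take the quantity from 10 to 7 regardless of the order in which they arrive.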

  3. On update, check that the relevant fields in the database still match your "from" value, i.e. the value you originally read. For example, say you work for a law firm negotiating contracts for your clients. You have a screen where a user can enter notes about negotiations. User A brings up a contract record. User B brings up the same contract record. User A enters that he just spoke to the other party on the phone and they are agreeable to the proposed terms. User B, who has also been trying to call the other party, enters that they are not responding to phone calls and he suspects they are stonewalling. User A clicks save, then user B clicks save. Do we want user B's comments to overwrite user A's? Probably not. Instead, we display a message telling user B that the notes have changed since he read the record, and let him see the new value before deciding whether to proceed with the save, abort, or enter something different.
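
A rough JDBC sketch of that last check, for illustration only. The contracts table, its contract_id and notes columns, and the names ContractNotesDao and saveIfUnchanged are all assumptions; nothing above specifies a schema. It also assumes notes is declared NOT NULL, because in SQL the comparison notes = ? never matches a NULL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO for the contract-notes example; schema names are assumed.
class ContractNotesDao {
    // The WHERE clause matches only if notes still holds the value this user loaded.
    static final String SAVE_SQL =
        "UPDATE contracts SET notes = ? WHERE contract_id = ? AND notes = ?";

    // Returns false if no row matched, meaning the notes were changed (or the
    // row was removed) after this user read them. The caller can then show the
    // current value and let the user decide what to do.
    static boolean saveIfUnchanged(Connection conn, long contractId,
                                   String notesAsLoaded, String newNotes) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(SAVE_SQL)) {
            stmt.setString(1, newNotes);
            stmt.setLong(2, contractId);
            stmt.setString(3, notesAsLoaded);
            return stmt.executeUpdate() == 1;
        }
    }
}
```

The check and the write happen in one statement, so there is no window between "compare" and "save" in which another user's update could slip through.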

毁虫ゝ 2024-12-08 23:31:07

To answer the question in the title: there are two general solutions for dealing with the lost update problem over HTTP.

Let's assume your application consists of two components, a front end and an API back end, and that two concurrent users try to update the same data.

1. Optimistic concurrency control

Implement some way of knowing which data has been updated last. Some common ways:

  • Using an ETag. This could be a field stored in db, but could also be computed on each update.
  • Using a LastUpdated timestamp field stored in db
  • Using a Version field stored in db

The approaches above can be combined with conditional HTTP headers (for example If-Match). This is especially useful if your server framework supports them out of the box.

Example:

  1. User 1 makes a GET request. User 2 makes the same GET request and retrieves the same data.
  2. User 2 updates the data in a PUT request.
  3. User 1 makes a PUT request and gets an error message. The API back end compared the ETag sent with the incoming data (from user 1) with the ETag of the data in the db (written by user 2), and the two did not match. This tells us that user 1 did not have the latest changes when sending the PUT request.
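
The three steps above can be sketched in plain Java. This is an in-memory stand-in, not a real HTTP handler: the class VersionedResource and its methods are invented for illustration, the ETag is simply derived from a version counter, and a real back end would keep the body and version in a database row. The status codes used are 204 (No Content) for an accepted update and 412 (Precondition Failed) for a failed If-Match check.

```java
import java.util.Objects;

// In-memory stand-in for one stored resource; all names are illustrative.
class VersionedResource {
    private String body;
    private long version = 1;

    VersionedResource(String body) { this.body = body; }

    // ETag the server would send with a GET response, derived from the version.
    synchronized String etag() { return "\"" + version + "\""; }

    synchronized String body() { return body; }

    // Handles a PUT carrying an If-Match header and returns the HTTP status.
    // synchronized makes the compare and the update one atomic step.
    synchronized int put(String ifMatch, String newBody) {
        if (!Objects.equals(etag(), ifMatch)) {
            return 412; // Precondition Failed: the client edited a stale copy
        }
        body = newBody;
        version++;
        return 204; // No Content: update accepted
    }

    public static void main(String[] args) {
        VersionedResource doc = new VersionedResource("draft");
        String seenByUser1 = doc.etag(); // step 1: both users GET the same version
        String seenByUser2 = doc.etag();
        System.out.println(doc.put(seenByUser2, "edit by user 2")); // step 2: prints 204
        System.out.println(doc.put(seenByUser1, "edit by user 1")); // step 3: prints 412
        System.out.println(doc.body()); // prints "edit by user 2"
    }
}
```

With a Version field stored in the db, the same compare-and-update can be done in a single statement along the lines of UPDATE ... SET body = ?, version = version + 1 WHERE id = ? AND version = ?, treating "0 rows updated" as the mismatch case.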

2. Pessimistic concurrency control

Use (pessimistic) database locks.

Example: User 1 makes a GET request. User 2 makes the same GET request and gets an error. This is because we cannot give out potentially outdated data so the information has been locked by db.

There are big downsides with this approach.

  • User 2 cannot read or update that data for a long time. And user 1 hasn't even done an update.
  • An alternative could be to use db transactions. A GET request begins a transaction, and it ends when the user makes a POST request or leaves the site. This could allow for more concurrency.
  • Pessimistic concurrency control requires the API back end to know when user 1 is no longer viewing some data on the front end. This is difficult. Communication probably happens over HTTP which is stateless. If I were to guess on a solution, it could be to use Websockets to maintain a stateful connection with user 1 to monitor which data she is viewing and editing. (And of course there are downsides)

Change your API

One can change the API so it doesn't have the lost update problem.

@jay has already mentioned delta updates as a solution. Let's say a model has a number field that should be incremented by 1 on each request. One implementation is a PUT endpoint that overwrites the field with the incoming number; this API has the lost update problem. Another implementation is an increment endpoint; this API does not have the lost update problem. (The API is still RESTful if you view an increment as a resource and create new increments with POST.)

Another way is to change the API from PUT to PATCH. This is not a solution, but it will minimise the possibility for lost updates.

Conclusion

Use optimistic concurrency control.

When Googling "lost update problem" one might only get results about databases. Although this is the same problem, database locking is not a good solution for lost concurrent updates over HTTP. It has several downsides compared to optimistic concurrency control, as written above.

李白 2024-12-08 23:31:07

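If you do not have transactions in MySQL, you can use a conditional UPDATE to ensure that the data is not corrupted.

UPDATE tableA SET status = 2 WHERE status = 1

If status is 1, then only one process will get the result that a record was updated; any process that runs the same statement afterwards finds no matching rows. The method below returns the number of rows updated (0 if there were no rows to update), or -1 if the statement failed with an exception.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// s is the UPDATE statement shown above.
static int tryUpdate(Connection connection, String s) {
    // try-with-resources closes the statement even if executeUpdate throws
    try (PreparedStatement query = connection.prepareStatement(s)) {
        return query.executeUpdate(); // rows changed; 0 means no row matched
    } catch (SQLException e) {
        e.printStackTrace();
        return -1; // the statement itself failed
    }
}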

左岸枫 2024-12-08 23:31:07

Things are simple in the application layer - every request is served by a different thread (or process), so unless you have state in your processing classes (services), everything is safe.

Things get more complicated when you reach the database - i.e. where the state is held. There you need transactions to ensure that everything is ok.

Transactions have a set of properties known as ACID (atomicity, consistency, isolation, durability) that "guarantee database transactions are processed reliably".
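
As a rough illustration of a transaction wrapped around a read-modify-write cycle (Variation 1 in the question), here is a JDBC sketch. Everything schema-related is an assumption: the counters table, its id and amount columns, and the names ReadModifyWrite and update are invented. It also uses SELECT ... FOR UPDATE to lock the row for the duration of the transaction; MySQL (InnoDB), PostgreSQL and Oracle support that clause, but not every database does, and whether a plain transaction without it prevents lost updates depends on the isolation level in use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.IntUnaryOperator;

// Hypothetical helper: applies arbitrary application logic to one stored value
// inside a single transaction. Schema names are assumed.
class ReadModifyWrite {
    static void update(Connection conn, int id, IntUnaryOperator logic) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start a transaction
        try {
            int current;
            // FOR UPDATE locks the row until commit or rollback, so a concurrent
            // request running the same code waits here instead of reading a stale value.
            try (PreparedStatement read = conn.prepareStatement(
                    "SELECT amount FROM counters WHERE id = ? FOR UPDATE")) {
                read.setInt(1, id);
                try (ResultSet rs = read.executeQuery()) {
                    if (!rs.next()) throw new SQLException("no such row: " + id);
                    current = rs.getInt(1);
                }
            }
            try (PreparedStatement write = conn.prepareStatement(
                    "UPDATE counters SET amount = ? WHERE id = ?")) {
                // Logic too complex to express in SQL runs here, in the application.
                write.setInt(1, logic.applyAsInt(current));
                write.setInt(2, id);
                write.executeUpdate();
            }
            conn.commit();
        } catch (SQLException | RuntimeException e) {
            conn.rollback(); // leave the row untouched on any failure
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Unlike holding a lock while a user stares at a screen, the lock here lasts only as long as one request takes to run, which is why it suits Variation 1 but not Variation 2.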
