In a BigTable datastore, how do you "lock" data for concurrency? An entity?

Posted 2024-08-12 04:28:00


I am not sure how to handle this in a BigTable datastore.

Imagine the following example (just to explain the concept. The example does not match my actual data model):

  • I have a Counter entity that keeps track of the number of Transactions in my datastore. Let's say the current 'count' is 100.
  • Now two web requests read this value at the same time.
  • Both web requests add a new Transaction
  • And finally both update the counter (to 101).

The counter value is now inaccurate. It should be 102.

Any suggestions on how to handle this situation? Can I 'lock' the counter to ensure that the second web request doesn't even read it until the first web request completes?
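The lost update described in the question can be reproduced in a few lines of plain Python (no Datastore involved); the simulation below is illustrative only, but it shows exactly why both requests end up writing 101:

```python
# Minimal sketch of the lost-update race described above: both requests
# read the counter before either one writes its result back.
store = {"counter": 100}

def handle_request(snapshot):
    # Each request computes its new value from the snapshot it read.
    return snapshot + 1

# Both requests read the value *before* either one writes it back.
read_a = store["counter"]   # request A sees 100
read_b = store["counter"]   # request B also sees 100
store["counter"] = handle_request(read_a)  # A writes 101
store["counter"] = handle_request(read_b)  # B overwrites with 101

print(store["counter"])  # 101, not the expected 102
```

The second write silently overwrites the first; no error is raised, which is what makes this class of bug hard to spot.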


Comments (3)

用心笑 2024-08-19 04:28:00


You have several options:

  • Depending on the scope of your counter and your entities, have the Transaction entities be child entities of the counter. Then, you can insert a transaction and update the counter transactionally. Bear in mind that this limits your update rate to about 1-5 QPS.
  • If your counts don't have to be 100% accurate, insert the entity and update the counter (using a single-entity transaction) separately. You can run a regular cronjob to re-count the number of entities and fix the counter if errors force it to be out of sync.
  • You could build your own limited distributed transaction support.
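Datastore transactions detect conflicting writes and retry, which is what makes the first option above safe. Here is a hedged, in-memory sketch of that retry loop (the `store`, `commit_if_unchanged`, and version field are stand-ins for illustration, not the real App Engine API):

```python
# Hedged sketch: a transaction re-reads the counter and only commits if
# nothing else committed in between, retrying on conflict. This mimics
# the optimistic-concurrency behavior of Datastore transactions.
store = {"counter": {"value": 100, "version": 0}}

def commit_if_unchanged(key, expected_version, new_value):
    """Succeed only if no other commit happened since our read."""
    row = store[key]
    if row["version"] != expected_version:
        return False  # conflict: another transaction won; caller retries
    store[key] = {"value": new_value, "version": expected_version + 1}
    return True

def increment_counter(key, retries=5):
    for _ in range(retries):
        row = store[key]  # fresh read at the start of each attempt
        if commit_if_unchanged(key, row["version"], row["value"] + 1):
            return store[key]["value"]
    raise RuntimeError("transaction contention: too many retries")

increment_counter("counter")
increment_counter("counter")
print(store["counter"]["value"])  # 102
```

Because every attempt re-reads the current value before committing, two concurrent increments can no longer collapse into one; the loser of the race simply retries. This serialization on one entity group is also why the update rate is limited to roughly 1-5 QPS.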
前事休说 2024-08-19 04:28:00


In addition to the options Nick gives, you could consider sharding the counter.

Keep multiple counters, and pick one to update in such a way that it is (ideally) impossible or (failing that) unlikely that any two requests will simultaneously pick the same shard.

You then have further options. You could do a transaction with the shard as parent (this reduces contention compared with a single counter), although you'll end up with your new Transaction entity having a parent chosen arbitrarily. Or don't bother with a transaction, in which case you'll probably have to fix the count from time to time, as with Nick's non-transaction option.

To read the total count, you add up all the shards. You won't be reading them all "at the same time", but that's usually fine. Any counter you read might increase between the moment you read it and the moment you use the value, so the value is really just a lower bound. Summing the shards is no different, except that it probably takes longer.
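The sharded-counter idea above can be sketched in a few lines (plain Python; the shard names and shard count are illustrative, not a prescribed scheme):

```python
import random

# Hedged sketch of a sharded counter: N independent counters, and each
# increment touches one shard picked at random, so two concurrent
# requests rarely contend on the same entity.
NUM_SHARDS = 20
shards = {f"counter-shard-{i}": 0 for i in range(NUM_SHARDS)}

def increment():
    # Each request updates a randomly chosen shard, so simultaneous
    # requests usually hit different shards (different entity groups).
    shard = f"counter-shard-{random.randrange(NUM_SHARDS)}"
    shards[shard] += 1

def total():
    # The total is the sum of all shards; as noted above, it is only a
    # lower bound, since shards may be incremented while we read them.
    return sum(shards.values())

for _ in range(100):
    increment()
print(total())  # 100
```

More shards means less contention but a slower (and slightly staler) total read, so the shard count is a tuning knob for that trade-off.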

放低过去 2024-08-19 04:28:00


Currently, this type of integer increment can be done with a single call to Bigtable, using a ReadModifyWriteRow request. This type of write request is called an increment-and-append.

ReadModifyWriteRow is the underlying feature; wrapper methods for it are available in the client libraries, along with code examples.

Judging by this GitHub issue timeline, the feature appears to have been available since the 2018 releases.
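What ReadModifyWriteRow buys you is that the read, the increment, and the write happen atomically on the server in one call, so no client ever acts on a stale snapshot. The in-memory sketch below models that semantics only; the row key, column name, and helper function are illustrative stand-ins, not the Bigtable client API (Bigtable does store such counters as big-endian 64-bit integers, which `struct` reproduces here):

```python
import struct

# In-memory stand-in for a Bigtable row holding a 64-bit counter cell.
rows = {b"counter#global": {b"stats:count": struct.pack(">q", 100)}}

def read_modify_write_increment(row_key, column, delta):
    """Models a server-side atomic increment: the read and the write
    happen in one step, so callers never overwrite a stale value."""
    cells = rows[row_key]
    (current,) = struct.unpack(">q", cells[column])
    cells[column] = struct.pack(">q", current + delta)
    return current + delta

read_modify_write_increment(b"counter#global", b"stats:count", 1)
new = read_modify_write_increment(b"counter#global", b"stats:count", 1)
print(new)  # 102
```

Because the increment is a single server-side operation, there is no window between read and write for another request to sneak in, which is exactly the window the original question's race exploits.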
