Using the task queue for burst processing capacity?

Posted 2024-10-30 03:55:31

I've got a situation where I want to make 1000 different queries to the datastore, do some calculations on the results of each individual query (to get 1000 separate results), and return the list of results.

I would like the list of results to be returned as the response from the same 30-second user request that started the calculation, for better client-side performance. Hah!

I have a bold plan.

Each of these operations individually will usually have no problem finishing in under a second, none of them need to write to the same entity group as any other, and none of them need any information from any of the other queries. Might it be possible to start 1000 independent tasks, each taking on one of these queries, doing its calculations, and storing the result in some sort of temporary collection of entities? The original request could wait 10 seconds, and then do a single query for the results from the datastore (maybe they all set a unique value I can query on). Any results that aren't in yet would be noticed at the client end, and the client could just ask for those values again in another ten seconds.
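For concreteness, here is a minimal sketch of what I have in mind, in Python; the `/compute_worker` handler, the `ComputationResult` model, and the `start_batch`/`collect_results` helpers are all hypothetical names, and the per-task query and calculation are omitted:

```python
from google.appengine.api import taskqueue
from google.appengine.ext import ndb


class ComputationResult(ndb.Model):
    # Temporary entity for one task's output, tagged with a shared batch id
    # so a single query can pick up everything that has landed so far.
    batch_id = ndb.StringProperty()
    index = ndb.IntegerProperty()
    value = ndb.FloatProperty()


def start_batch(batch_id, query_specs):
    # One independent task per query; no two tasks write to the same
    # entity group, so there is no write contention between them.
    for i, spec in enumerate(query_specs):
        taskqueue.add(url='/compute_worker',
                      params={'batch_id': batch_id, 'index': i, 'spec': spec})


def collect_results(batch_id):
    # A single query for whatever results exist so far; the client can
    # notice missing indexes and ask again in another ten seconds.
    return ComputationResult.query(
        ComputationResult.batch_id == batch_id).fetch(1000)
```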

The questions I hope experienced appengineers can answer are:

  • Is this ludicrous? If so, is it ludicrous for any number of tasks? Would 50 at once be reasonable?
  • I won't run into datastore contention if I'm reading the same entity 20 times a second, right? That contention stuff is all for writing?
  • Is there an easier way to get a response from a task?

Comments (2)

愁杀 2024-11-06 03:55:31

Yep, sounds pretty ludicrous :)

You shouldn't rely on the Taskqueue to operate like that. You can't rely on 1000 tasks being spawned that quickly (although they most likely will).

Why not use the Channel API to wait for your response. So your solution becomes:

  • Client sends a request to the Server
  • Server spawns N tasks to do your calculations and responds to the Client with a Channel API token
  • Client listens to the Channel using the token
  • Once all the tasks are finished, the Server pushes the response to the Client via the Channel

This would avoid any timeout issues that would very likely arise from time to time due to tasks not executing as fast as you'd like, or for some other reason.
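A rough sketch of that flow, assuming a hypothetical `client_id` per user and an `on_batch_complete` hook fired when the last task checks in (task fan-out omitted):

```python
import json

from google.appengine.api import channel


def handle_request(client_id, query_specs):
    # Steps 1-2: hand the client a channel token, then fan out the tasks
    # (task spawning omitted here).
    token = channel.create_channel(client_id)
    return token


def on_batch_complete(client_id, results):
    # Step 4: the last task to finish triggers this, pushing the combined
    # result set to the listening client over its channel.
    channel.send_message(client_id, json.dumps(results))
```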

回首观望 2024-11-06 03:55:31

The Task Queue doesn't provide firm guarantees on when a task will execute - the ETA (which defaults to the current time) is the earliest time at which it will execute, but if the queue is backed up, or there are no instances available to execute the task, it could execute much later.
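To illustrate, assuming a hypothetical `/worker` handler: the `countdown` (or `eta`) argument only sets the earliest execution time, not a deadline:

```python
from google.appengine.api import taskqueue

# Runs no *earlier* than 30 seconds from now, but if the queue is backed
# up or no instance is available it may execute much later than that.
taskqueue.add(url='/worker', countdown=30)
```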

One option would be to use Datastore Plus / NDB, which allows you to execute queries in parallel. 1000 queries is going to be very expensive, however, no matter how you execute them.
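A minimal sketch of the parallel-query approach with NDB's async API, assuming a hypothetical `Record` model and a `compute()` placeholder for the per-query calculation:

```python
from google.appengine.ext import ndb


class Record(ndb.Model):  # hypothetical model standing in for the real data
    category = ndb.StringProperty()
    amount = ndb.FloatProperty()


def compute(rows):  # placeholder for the per-query calculation
    return sum(r.amount for r in rows)


def run_parallel(categories):
    # Kick off all queries at once; fetch_async returns an ndb.Future
    # immediately instead of blocking on each query in turn.
    futures = [Record.query(Record.category == c).fetch_async(100)
               for c in categories]
    # get_result() blocks only until that particular query has finished,
    # so the queries overlap rather than run serially.
    return [compute(f.get_result()) for f in futures]
```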

Another option, as @Chris suggests, is to use the task queue with the Channel API, so you can notify the user asynchronously when the queries complete.
