How should I use Celery when the task result is very large?

Posted 2024-10-03 20:36:23


What's the best way to handle tasks executed in Celery where the result is large? I'm thinking of things like table dumps and the like, where I might be returning data in the hundreds of megabytes.

I'm thinking that the naive approach of cramming the message into the result database is not going to serve me here, much less if I use AMQP for my result backend. However, I have some of these where latency is an issue; depending on the particular instance of the export, sometimes I have to block until it returns and directly emit the export data from the task client (an HTTP request came in for the export content, it doesn't exist, but must be provided in the response to that request ... no matter how long that takes)

So, what's the best way to write tasks for this?
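For concreteness, a minimal sketch of the blocking case described above; the export_table task, the broker/backend URLs, and the placeholder dump body are all illustrative assumptions, not details from the question:

from celery import Celery

app = Celery('exports', broker='amqp://localhost', backend='rpc://')  # assumed URLs

@app.task
def export_table(table_name):
    # Placeholder body: a real task would produce the (large) table dump here.
    return '-- dump of %s --' % table_name

def serve_export(table_name):
    # The HTTP handler enqueues the export and then blocks until the worker
    # returns the result, however long that takes, so the response can carry it.
    async_result = export_table.delay(table_name)
    return async_result.get()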


Comments (2)

拥抱影子 2024-10-10 20:36:23


One option would be to have a static HTTP server running on all of your worker machines. Your task can then dump the large result to a unique file in the static root and return a URL reference to the file. The receiver can then fetch the result at its leisure.

e.g. something vaguely like this:

import socket

from celery import shared_task  # the original answer used the older bare @task decorator

@shared_task
def dump_db(db):
    # Some code to dump the DB to /srv/http/static/<db>.sql goes here
    return 'http://%s/%s.sql' % (socket.gethostname(), db)

You would of course need some means of reaping old files, as well as guaranteeing uniqueness, and probably other issues, but you get the general idea.
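On the receiving side, only the short URL string travels through the result backend; the caller then downloads the dump itself over HTTP. A minimal sketch of that consumer, assuming the dump_db task above and a made-up database name:

import urllib.request

# Enqueue the dump; the result backend only ever carries the URL string.
async_result = dump_db.delay('mydb')  # 'mydb' is an illustrative name
url = async_result.get()  # cheap: waits for a short string, not the bulk data
with urllib.request.urlopen(url) as resp:
    dump_bytes = resp.read()  # the hundreds of megabytes come over plain HTTP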

变身佩奇 2024-10-10 20:36:23


I handle this by structuring my app to write the multi-megabyte results into files, which I then mmap into memory so they are shared among all processes that use that data... This totally finesses the question of how to get the results to another machine, but if the results are that large, it sounds like these tasks are internal tasks coordinated between server processes.
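A rough sketch of that file-plus-mmap arrangement; the path and function names are made up for illustration, and this only shares data between processes on the same machine:

import mmap

RESULT_PATH = '/var/tmp/task_result.bin'  # hypothetical well-known location

def write_result(data):
    # The producer writes the multi-megabyte result to a file once...
    with open(RESULT_PATH, 'wb') as f:
        f.write(data)

def open_result():
    # ...and each consumer maps it read-only, so the OS shares the pages
    # among all processes that touch the data instead of copying it.
    with open(RESULT_PATH, 'rb') as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)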
