Celery task appears to do everything except write to the database

Posted 2024-11-26 09:51:35

I am using Django with MongoEngine, django-celery, and the MongoDB backend for Celery. I am queuing a single task. The task fetches a file from GridFS (through a MongoEngine FileField), manipulates it, and puts it back in the database.

The task runs as I expect when I call it directly, without queuing. When I queue it, it converts the files but does not write to the database.

Here's the relevant part of my settings.py:

# These are apparently defaults that I shouldn't need
BROKER_BACKEND = 'mongodb'
BROKER_HOST = "localhost"
BROKER_PORT = 27017
BROKER_USER = ""
BROKER_PASSWORD = ""
BROKER_VHOST = ""

CELERY_RESULT_BACKEND = "mongodb"
CELERY_MONGODB_BACKEND_SETTINGS = {
    "host": "localhost",
    "port": 27017,
    "database": "svg",
    "taskmeta_collection": "taskmeta",
}

import djcelery
djcelery.setup_loader()

I'm running the Celery worker like this:

 $ ./manage.py celeryd -l info

When it runs the task, Celery logs this:

[2011-07-23 16:07:11,858: INFO/MainProcess] Got task from broker: graphics.tasks.queue_convert[dfdf98ad-0669-4027-866d-c64971bb6480]
[2011-07-23 16:07:15,196: INFO/MainProcess] Task graphics.tasks.queue_convert[dfdf98ad-0669-4027-866d-c64971bb6480] succeeded in 3.33006596565s

(No errors)

Here's the task.

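from celery.task import task  # assumed Celery 2.x import path; Image and convert come from the app's own modules

@task()
def queue_convert(imageId):
    # Only the ID travels through the queue; the document is loaded inside the worker.
    image = Image.objects.get(id=imageId)
    convert(image)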

convert calls a bunch of other functions. Basically, it first reads from a FileField, manipulates that string, writes that string to a file, manipulates that file, writes the generated strings and files to other FileFields and then runs image.save().

The MongoDB logs look different depending on whether I queue the task. This is what appears in them when I use the task queue:

Sat Jul 23 16:03:26 [initandlisten] connection accepted from 127.0.0.1:39065 #801
Sat Jul 23 16:03:26 [initandlisten] connection accepted from 127.0.0.1:39066 #802
Sat Jul 23 16:03:29 [initandlisten] connection accepted from 127.0.0.1:39068 #803

This is what appears when I call convert(image) directly instead of calling queue_convert(image.id):

Sat Jul 23 16:07:13 [conn807] end connection 127.0.0.1:43630
Sat Jul 23 16:07:13 [initandlisten] connection accepted from 127.0.0.1:43633 #808
Sat Jul 23 16:07:13 [initandlisten] connection accepted from 127.0.0.1:43634 #809
Sat Jul 23 16:07:13 [conn808] end connection 127.0.0.1:43633

Any idea as to what might be going wrong?

Comments (1)

究竟谁懂我的在乎 2024-12-03 09:51:35

Update: I've thought a bit more about the problem you were having, and though it sounds like you've solved it for yourself, I'll add a couple of notes in case someone else has a similar problem.

MongoDB does not store plain JSON; it uses BSON, which extends the JSON type set with, among others, a binary data type (GridFS builds its file storage on top of that). I've only seen JSON mentioned in the Celery docs, so I'd guess some care is needed when using MongoDB with Celery and dealing with the extended type set, as it sounds like you were with images.

In the docs for the latest development version of IPython (0.11rc4), they discuss their distributed work system. Though the lingo sounds similar to Celery's, the backend may be quite different. I think Celery is relatively flexible about backends and probably allows for more security, which sounds like an issue with ZeroMQ, which IPython requires. On the database side, though, the IPython system was 'designed from the ground up around mongodb,' according to the docs, and BSON is fully supported. So if you're not too concerned with other Celery features (security, a development base tied to Django, and much more, of course), you might look into it. Again, this is by no means the rigorous evaluation that Celery and IPython both deserve, just a possible lead. IPython also integrates well with other scientific computing libraries, with built-in support for matplotlib and lots of scientific computing examples, which might interest you if you're doing image processing and treating your image data as NumPy arrays or similar.

Best of luck

Original answer:
I agree with lazerscience: it would help to have more context here. These libraries are complex enough that there are many unknowns, and it's probably not possible to answer with the rigor expected on this site.
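
One cheap way to gather more of that context, offered as a rough sketch and not as a fix: have the worker reload the document after convert has run and check whether the write is actually visible. Everything named here is my own assumption. `convert_and_verify`, the `load` callable, the `converted` field name, and the in-memory `Doc` stand-in are hypothetical and exist only so the sketch runs without MongoDB. In the real task, `load` would be something like `lambda i: Image.objects.get(id=i)`.

```python
import copy
import logging

logger = logging.getLogger("graphics.tasks")


def convert_and_verify(image_id, load, convert, field="converted"):
    """Run convert on a freshly loaded document, then reload it to confirm the write.

    load(image_id) returns a document and convert(document) is expected to end
    with document.save(). field is a hypothetical attribute that convert fills in.
    """
    image = load(image_id)
    convert(image)
    stored = load(image_id)  # second read, as another process would see it
    ok = getattr(stored, field, None) is not None
    if not ok:
        logger.warning("convert ran for %s but %r is empty after reload",
                       image_id, field)
    return ok


# In-memory stand-in for the database, only to exercise the helper.
store = {1: {"source": "<svg/>", "converted": None}}


class Doc:
    def __init__(self, doc_id):
        self.id = doc_id
        self.__dict__.update(copy.deepcopy(store[doc_id]))

    def save(self):
        data = {k: v for k, v in self.__dict__.items() if k != "id"}
        store[self.id] = copy.deepcopy(data)


def forgetful_convert(doc):
    doc.converted = doc.source.upper()  # changed in memory, never saved


def good_convert(doc):
    doc.converted = doc.source.upper()
    doc.save()
```

With a check like this in the task, "succeeded" in the worker log would at least mean the worker itself could read the result back. If the check passes in the worker but the data is missing elsewhere, the worker is probably writing somewhere other than where you are looking.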

That said, I think you may have run into a serialization problem. Celery requires that task arguments and return values be picklable, or at least serializable with whichever serializer you choose (I know it supports JSON as well, though I'm enough of a novice not to be certain whether pickle and JSON overlap entirely). I see your function only takes an ID parameter, which is good. But would the shift to GridFS mean you're trying to pickle an image? You could certainly manipulate images with Celery, but I'm not sure, especially with everything happening behind the mysterious 'convert' function, whether you may be accidentally trying to serialize something other than unicode strings, dictionaries, integers, floats, and the few other miscellaneous types your format supports. Maybe in the past you retrieved a file path to the image and manipulated the file without ever retrieving or sending more than unicode, and now you have the image itself?
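
On the question of whether pickle and JSON overlap: they don't entirely, and a quick standard-library check shows where they differ. This is only an illustration of the two serializers, not of what Celery does internally; `serializable_with` and the sample values are names I made up for the example.

```python
import json
import pickle


def serializable_with(value):
    """Return the names of the serializers that accept value."""
    accepted = []
    try:
        json.dumps(value)
        accepted.append("json")
    except (TypeError, ValueError):
        pass
    try:
        pickle.dumps(value)
        accepted.append("pickle")
    except (pickle.PicklingError, TypeError, AttributeError):
        pass
    return accepted


image_id = "4e2b1c0f9d1b2c3a4d5e6f70"      # made-up ID string
raw_bytes = b"\x89PNG\r\n\x1a\n"            # raw binary data, e.g. the start of an image file
lazy_reader = (chunk for chunk in [b"a"])   # a generator, standing in for a live stream

for name, value in [("image_id", image_id),
                    ("raw_bytes", raw_bytes),
                    ("lazy_reader", lazy_reader)]:
    print(name, serializable_with(value))
```

A plain ID string passes both serializers, raw bytes pass pickle but not JSON, and a live object such as a generator passes neither. That is why passing only the ID to the task, as the question's code does, is the safe choice.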

If I'm way off base, please cut me a little slack. I'm responding because I saw your message both here and on the MongoEngine users group and figured you were stuck and not finding a more expert opinion. You might also double-check that you have reasonably current versions of the backend software. I had a bunch of weird Celery issues at some point and found they were mostly resolved when I updated RabbitMQ. Good luck!
