Setting/getting objects with memcached

Posted 2024-12-10 16:57


In a Django Python app, I launch jobs with Celery (a task manager). When each job is launched, it returns an object (let's call it an instance of class X) that lets you check on the job and retrieve the return value or any errors thrown.
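For concreteness, a minimal sketch of the pattern described above, using the task decorator from the django-celery era of Celery (the add task itself is hypothetical); the handle Celery hands back is an AsyncResult:

from celery.task import task

@task
def add(x, y):
    return x + y

# Launching the job returns the handle (an AsyncResult; the "class X" above).
result = add.delay(2, 2)
print(result.task_id)   # unique id of the queued job
print(result.ready())   # has the job finished?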

Several people (someday, I hope) will be able to use this web interface at the same time; therefore, several instances of class X may exist at the same time, each corresponding to a job that is queued or running in parallel. It's difficult to come up with a way to hold onto these X objects because I cannot use a global variable (a dictionary that allows me to look up each X object from a key); this is because Celery uses different processes, not just different threads, so each process would modify its own copy of the global table, causing mayhem.

Subsequently, I received the great advice to use memcached to share the memory across the tasks. I got it working and was able to set and get integer and string values between processes.
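For example, with the python-memcached client, primitive values round-trip between processes as expected (a sketch assuming a local memcached on the default port):

import memcache

mc = memcache.Client(['127.0.0.1:11211'])
mc.set('job_count', 3)           # integers work
mc.set('job_name', 'alignment')  # strings work
print(mc.get('job_count'))       # -> 3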

The trouble is this: after a great deal of debugging today, I learned that memcached's set and get don't seem to work for classes. This is my best guess: Perhaps under the hood memcached serializes objects to the shared memory; class X (understandably) cannot be serialized because it points at live data (the status of the job), and so the serial version may be out of date (i.e. it may point to the wrong place) when it is loaded again.
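That guess is close: memcached clients typically pickle anything that is not a plain string or integer, and a handle wrapping live process state does not survive the round trip. One common workaround, shown here as a sketch rather than anything from the original post, is to store only the task id string, which serializes trivially, and rebuild a fresh handle from it elsewhere:

import memcache
from celery.result import AsyncResult

mc = memcache.Client(['127.0.0.1:11211'])

# Launch as in the first sketch and persist only the id, not the live handle.
result = add.delay(2, 2)
mc.set('job:latest', result.task_id)

# In another process: rebuild a handle from the stored id.
handle = AsyncResult(mc.get('job:latest'))
print(handle.state)   # read from the result backend, so it stays current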

Attempts to use a SQLite database were similarly fruitless; not only could I not figure out how to serialize objects as database fields (using my Django models.py file), I would be stuck with the same problem: the handles of the launched jobs need to stay in RAM somehow (or use some fancy OS tricks underneath), so that they update as the jobs finish or fail.
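For what it's worth, the same store-the-id-not-the-object idea applies to the database route; a hypothetical models.py sketch:

from django.db import models

class Job(models.Model):
    # Persist the Celery task id; a live handle can be rebuilt on demand.
    task_id = models.CharField(max_length=255, unique=True)
    submitted_at = models.DateTimeField(auto_now_add=True)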

My best guess is that (despite the advice that thankfully got me this far) I should be launching each job in some external queue (for instance Sun/Oracle Grid Engine). However, I couldn't come up with a good way of doing that without using a system call, which I thought may be bad style (and potentially insecure).

How do you keep track of jobs that you launch in Django or Django Celery? Do you launch them by simply putting the job arguments into a database and then have another job that polls the database and runs jobs?

Thanks a lot for your help, I'm quite lost.


Comments (1)

倦话 2024-12-17 16:57:00


I think django-celery does this work for you. Have you had a look at the tables django-celery creates? For instance, djcelery_taskstate holds all the data for a given task, such as state, worker_id and so on. For periodic tasks there is a table called djcelery_periodictask.

In a Django view you can access the TaskMeta object:

from djcelery.models import TaskMeta

# task_id is the id handed back when the job was launched
task = TaskMeta.objects.get(task_id=task_id)
print(task.status)
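Building on that, a hypothetical Django view that reports a job's status as JSON; the URL wiring and the PENDING fallback are assumptions, not part of the original answer:

import json

from django.http import HttpResponse
from djcelery.models import TaskMeta

def task_status(request, task_id):
    # A TaskMeta row appears once the result backend records the task.
    try:
        task = TaskMeta.objects.get(task_id=task_id)
        payload = {'task_id': task.task_id, 'status': task.status}
    except TaskMeta.DoesNotExist:
        payload = {'task_id': task_id, 'status': 'PENDING'}
    return HttpResponse(json.dumps(payload), content_type='application/json')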