Celery (Django) Rate Limiting
I'm using Celery to process multiple data-mining tasks. One of these tasks connects to a remote service which allows a maximum of 10 simultaneous connections per user (or in other words, it CAN exceed 10 connections globally but it CANNOT exceed 10 connections per individual job).
I THINK Token Bucket (rate limiting) is what I'm looking for, but I can't seem to find any implementation of it.
3 Answers
Celery features rate limiting, and contains a generic token bucket implementation.
Set rate limits for tasks:
http://docs.celeryproject.org/en/latest/userguide/tasks.html#Task.rate_limit
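For example, a minimal sketch (the app name, broker URL, and task are illustrative, not from the docs):

    from celery import Celery

    app = Celery('miner', broker='redis://localhost:6379/0')

    # Allow at most 10 executions of this task per minute, per worker.
    # Note this throttles execution *rate*, not simultaneous connections.
    @app.task(rate_limit='10/m')
    def fetch_remote(url):
        ...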
Or at runtime:
http://docs.celeryproject.org/en/latest/userguide/workers.html#rate-limits
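At runtime it would look something like this (assumes the app and task name from the sketch above; by default the change is broadcast to all workers):

    # Change the rate limit for running workers without restarting them.
    app.control.rate_limit('miner.fetch_remote', '10/m')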
The token bucket implementation is in Kombu.
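If you want to use the bucket directly, a rough sketch (assuming it lives at kombu.utils.limits.TokenBucket in your Kombu version):

    from time import sleep

    from kombu.utils.limits import TokenBucket

    # Refills at 10 tokens per second and holds at most 10 tokens.
    bucket = TokenBucket(fill_rate=10, capacity=10)

    def throttled(fn, *args, **kwargs):
        # Block until a token is available, then spend it on one call.
        while not bucket.can_consume(1):
            sleep(bucket.expected_time(1))
        return fn(*args, **kwargs)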
After much research I found out that Celery does not explicitly provide a way to limit the number of concurrent instances like this, and furthermore, doing so would generally be considered bad practice.
The better solution would be to download concurrently within a single task, and use Redis or Memcached to store the results and distribute them to other tasks for processing.
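A rough sketch of that approach (all names are hypothetical; assumes the redis and requests packages and a broker on localhost):

    import json
    from concurrent.futures import ThreadPoolExecutor

    import redis
    import requests

    from celery import Celery

    app = Celery('miner', broker='redis://localhost:6379/0')
    store = redis.Redis()

    @app.task
    def fetch_batch(urls):
        # A single task owns every connection to the remote service, so the
        # pool size is what enforces the 10-connections-per-job ceiling.
        with ThreadPoolExecutor(max_workers=10) as pool:
            bodies = list(pool.map(lambda u: requests.get(u).text, urls))
        for url, body in zip(urls, bodies):
            # Park the raw results in Redis for other tasks to pick up.
            store.rpush('mined:results', json.dumps({'url': url, 'body': body}))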
Although it might be bad practice, you could use a dedicated queue and limit the worker's concurrency, like:
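Something along these lines (the queue, task, and module names are illustrative; task_routes assumes Celery 4+, older versions use CELERY_ROUTES):

    # Route the connection-limited task to its own queue.
    app.conf.task_routes = {'miner.fetch_remote': {'queue': 'remote_service'}}

    # Then run exactly one worker for that queue with 10 processes, e.g.:
    #   celery -A miner worker -Q remote_service --concurrency=10
    # With a single dedicated worker, at most 10 of these tasks run at once.

Note that concurrency is capped per worker, so this only enforces the 10-connection limit if a single dedicated worker consumes that queue and each task holds one connection.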