Retrieve a list of tasks in a Celery queue

How can I retrieve a list of tasks in a queue that are yet to be processed?

像极了他 2024-11-05 08:34:23

EDIT: See other answers for getting a list of tasks in the queue.

You should look here:
Celery Guide - Inspecting Workers

Basically this:

my_app = Celery(...)

# Inspect all nodes.
i = my_app.control.inspect()

# Show the items that have an ETA or are scheduled for later processing
i.scheduled()

# Show tasks that are currently active.
i.active()

# Show tasks that have been claimed by workers
i.reserved()

Depending on what you want
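
Each of these inspect calls returns a mapping from worker name to a list of task descriptions, or None when no worker replies. A minimal sketch that prints a summary, assuming the my_app instance above:

# A minimal sketch, assuming the my_app Celery instance defined above.
i = my_app.control.inspect()

for label, result in (("active", i.active()),
                      ("reserved", i.reserved()),
                      ("scheduled", i.scheduled())):
    # Each call returns {worker_name: [task_info, ...]} or None if no worker replied.
    for worker, tasks in (result or {}).items():
        print("%s: %s has %d task(s)" % (label, worker, len(tasks)))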

眼泪也成诗 2024-11-05 08:34:23

If you are using Celery+Django, the simplest way to inspect tasks is using commands directly from your terminal in your virtual environment, or using the full path to celery:

Doc: http://docs.celeryproject.org/en/latest/userguide/workers.html?highlight=revoke#inspecting-workers

$ celery inspect reserved
$ celery inspect active
$ celery inspect registered
$ celery inspect scheduled

Also, if you are using Celery+RabbitMQ, you can inspect the list of queues using the following command:

More info: https://linux.die.net/man/1/rabbitmqctl

$ sudo rabbitmqctl list_queues
同尘 2024-11-05 08:34:23

If you are using RabbitMQ, use this in the terminal:

sudo rabbitmqctl list_queues

It will print the list of queues with the number of pending tasks. For example:

Listing queues ...
0b27d8c59fba4974893ec22d478a7093    0
0e0a2da9828a48bc86fe993b210d984f    0
[email protected] 0
11926b79e30a4f0a9d95df61b6f402f7    0
15c036ad25884b82839495fb29bd6395    1
[email protected]    0
celery  166
celeryev.795ec5bb-a919-46a8-80c6-5d91d2fcf2aa   0
celeryev.faa4da32-a225-4f6c-be3b-d8814856d1b6   0

The number in the right column is the number of tasks in the queue. Above, the celery queue has 166 pending tasks.
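
If you need that number inside Python rather than in the terminal, a small sketch that wraps the same command might look like this (the function name and the default "celery" queue are only illustrative, and it needs the same privileges as running rabbitmqctl by hand):

import subprocess

def pending_count(queue="celery"):
    # Run the same command as above and pick out the row for the given queue.
    out = subprocess.run(["sudo", "rabbitmqctl", "list_queues"],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == queue:
            return int(parts[1])
    return 0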

混吃等死 2024-11-05 08:34:23

If you don't use prioritized tasks, this is actually pretty simple if you're using Redis. To get the task counts:

redis-cli -h HOST -p PORT -n DATABASE_NUMBER llen QUEUE_NAME

But prioritized tasks use a different key in Redis, so the full picture is slightly more complicated: you need to query Redis for every priority of task. In Python (and taken from the Flower project), this looks like:

import redis
from django.conf import settings  # assumes Django settings hold the Redis connection details

PRIORITY_SEP = '\x06\x16'
DEFAULT_PRIORITY_STEPS = [0, 3, 6, 9]


def make_queue_name_for_pri(queue, pri):
    """Make a queue name for redis
    
    Celery uses PRIORITY_SEP to separate different priorities of tasks into
    different queues in Redis. Each queue-priority combination becomes a key in
    redis with names like:
    
     - batch1\x06\x163 <-- P3 queue named batch1
     
    There's more information about this in Github, but it doesn't look like it 
    will change any time soon:
     
      - https://github.com/celery/kombu/issues/422
      
    In that ticket the code below, from the Flower project, is referenced:
    
      - https://github.com/mher/flower/blob/master/flower/utils/broker.py#L135
        
    :param queue: The name of the queue to make a name for.
    :param pri: The priority to make a name with.
    :return: A name for the queue-priority pair.
    """
    if pri not in DEFAULT_PRIORITY_STEPS:
        raise ValueError('Priority not in priority steps')
    return '{0}{1}{2}'.format(*((queue, PRIORITY_SEP, pri) if pri else
                                (queue, '', '')))


def get_queue_length(queue_name='celery'):
    """Get the number of tasks in a celery queue.
    
    :param queue_name: The name of the queue you want to inspect.
    :return: the number of items in the queue.
    """
    priority_names = [make_queue_name_for_pri(queue_name, pri) for pri in
                      DEFAULT_PRIORITY_STEPS]
    r = redis.StrictRedis(
        host=settings.REDIS_HOST,
        port=settings.REDIS_PORT,
        db=settings.REDIS_DATABASES['CELERY'],
    )
    return sum([r.llen(x) for x in priority_names])
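
A hedged usage example, assuming the Django settings referenced above (REDIS_HOST, REDIS_PORT, REDIS_DATABASES) are configured:

print(get_queue_length())          # the default "celery" queue
print(get_queue_length("batch1"))  # "batch1" is just a hypothetical queue name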

If you want to get an actual task, you can use something like:

redis-cli -h HOST -p PORT -n DATABASE_NUMBER lrange QUEUE_NAME 0 -1

From there you'll have to deserialize the returned list. In my case I was able to accomplish this with something like:

import base64
import json
import pickle

r = redis.StrictRedis(
    host=settings.REDIS_HOST,
    port=settings.REDIS_PORT,
    db=settings.REDIS_DATABASES['CELERY'],
)
l = r.lrange('celery', 0, -1)
pickle.loads(base64.b64decode(json.loads(l[0])['body']))

Just be warned that deserialization can take a moment, and you'll need to adjust the commands above to work with various priorities.

海夕 2024-11-05 08:34:23

To retrieve the task count for a queue from the broker, use this:

from amqplib import client_0_8 as amqp
conn = amqp.Connection(host="localhost:5672", userid="guest",
                       password="guest", virtual_host="/", insist=False)
chan = conn.channel()
# queue_declare with passive=True returns the queue name, the number of
# messages waiting in it, and the number of consumers attached to it.
name, jobs, consumers = chan.queue_declare(queue="queue_name", passive=True)
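
Here jobs is the number of messages still waiting in the queue and consumers is the number of consumers attached to it, so a quick check might be:

print("%d tasks waiting in %s (%d consumers)" % (jobs, name, consumers))
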
过潦 2024-11-05 08:34:23

A copy-paste solution for Redis with json serialization:

def get_celery_queue_items(queue_name):
    import base64
    import json  

    # Get a configured instance of a celery app:
    from yourproject.celery import app as celery_app

    with celery_app.pool.acquire(block=True) as conn:
        tasks = conn.default_channel.client.lrange(queue_name, 0, -1)
        decoded_tasks = []

    for task in tasks:
        j = json.loads(task)
        body = json.loads(base64.b64decode(j['body']))
        decoded_tasks.append(body)

    return decoded_tasks

It works with Django. Just don't forget to change yourproject.celery.
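
A hedged usage sketch; the exact layout of each decoded body depends on the task message protocol your Celery version uses:

for body in get_celery_queue_items("celery"):
    print(body)  # typically contains the task's args/kwargs, depending on the protocol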

踏雪无痕 2024-11-05 08:34:23

This worked for me in my application:

def get_queued_jobs(queue_name):
    connection = <CELERY_APP_INSTANCE>.connection()

    try:
        channel = connection.channel()
        name, jobs, consumers = channel.queue_declare(queue=queue_name, passive=True)
        active_jobs = []

        def dump_message(message):
            active_jobs.append(message.properties['application_headers']['task'])

        channel.basic_consume(queue=queue_name, callback=dump_message)

        for job in range(jobs):
            connection.drain_events()

        return active_jobs
    finally:
        connection.close()

active_jobs will be a list of strings that correspond to tasks in the queue.

Don't forget to swap out CELERY_APP_INSTANCE with your own.

Thanks to @ashish for pointing me in the right direction with his answer here: https://stackoverflow.com/a/19465670/9843399

南汐寒笙箫 2024-11-05 08:34:23

The celery inspect module appears to only be aware of the tasks from the workers' perspective. If you want to view the messages that are in the queue (yet to be pulled by the workers), I suggest using pyrabbit, which can interface with the RabbitMQ HTTP API to retrieve all kinds of information from the queue.

An example can be found here:
Retrieve queue length with Celery (RabbitMQ, Django)
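
pyrabbit talks to the RabbitMQ management plugin's HTTP API, which you can also call directly. A minimal sketch with requests, assuming the management plugin is enabled on the default port 15672 with the default guest credentials:

import requests

def queue_depth(queue="celery", vhost="%2F"):  # %2F is the URL-encoded default vhost "/"
    resp = requests.get(
        "http://localhost:15672/api/queues/%s/%s" % (vhost, queue),
        auth=("guest", "guest"),
    )
    resp.raise_for_status()
    return resp.json()["messages"]  # messages still sitting in the queue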

↙温凉少女 2024-11-05 08:34:23

I think the only way to get the tasks that are waiting is to keep a list of tasks you started and let the task remove itself from the list when it's started.

With rabbitmqctl and list_queues you can get an overview of how many tasks are waiting, but not the tasks themselves: http://www.rabbitmq.com/man/rabbitmqctl.1.man.html

If what you want includes tasks that are being processed but are not finished yet, you can keep a list of your tasks and check their states:

from tasks import add
result = add.delay(4, 4)

result.ready() # True if finished

Or you let Celery store the results with CELERY_RESULT_BACKEND and check which of your tasks are not in there.
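
A minimal sketch of the keep-a-list idea, assuming the add task from above and a configured result backend:

from tasks import add

submitted = [add.delay(n, n) for n in range(10)]

# ready() needs a result backend; unfinished results are still queued or running.
not_finished = [r for r in submitted if not r.ready()]
print(len(not_finished), "tasks not finished yet")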

ι不睡觉的鱼゛ 2024-11-05 08:34:23

As far as I know, Celery does not provide an API for examining tasks that are waiting in the queue. This is broker-specific. If you use Redis as a broker, for example, then examining tasks that are waiting in the celery (default) queue is as simple as:

  1. connect to the broker
  2. list the items in the celery list (the LRANGE command, for example)

Keep in mind that these are tasks WAITING to be picked up by available workers. Your cluster may have some tasks running - those will not be in this list as they have already been picked up.

The process of retrieving tasks in a particular queue is broker-specific.
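
With the redis-py client those two steps are a couple of lines. A sketch, assuming Redis on localhost and the default celery queue:

import redis

r = redis.Redis(host="localhost", port=6379, db=0)
print(r.llen("celery"))          # number of waiting messages
print(r.lrange("celery", 0, 9))  # the first ten raw, still-serialized messages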

七七 2024-11-05 08:34:23

I've come to the conclusion the best way to get the number of jobs on a queue is to use rabbitmqctl as has been suggested several times here. To allow any chosen user to run the command with sudo I followed the instructions here (I did skip editing the profile part as I don't mind typing in sudo before the command.)

I also grabbed jamesc's grep and cut snippet and wrapped it up in subprocess calls.

from subprocess import Popen, PIPE
p1 = Popen(["sudo", "rabbitmqctl", "list_queues", "-p", "[name of your virtual host]"], stdout=PIPE)
p2 = Popen(["grep", "-e", r"^celery\s"], stdin=p1.stdout, stdout=PIPE)
p3 = Popen(["cut", "-f2"], stdin=p2.stdout, stdout=PIPE)
p1.stdout.close()
p2.stdout.close()
print("number of jobs on queue: %i" % int(p3.communicate()[0]))
穿透光 2024-11-05 08:34:23
# current_celery_app is your configured Celery application instance.
inspector = current_celery_app.control.inspect()
# Each inspect call returns {worker_name: [task_info, ...]}; this takes the
# first (only) worker's list.
scheduled = list(inspector.scheduled().values())[0]
active = list(inspector.active().values())[0]
reserved = list(inspector.reserved().values())[0]
registered = list(inspector.registered().values())[0]
# Search the combined lists for a task with the given job_id.
lst = [*scheduled, *active, *reserved]
for i in lst:
    if job_id == i['id']:
        print("Job found")
手心的海 2024-11-05 08:34:23

To get the number of tasks on a queue you can use the flower library; here is a simplified example:

import asyncio
from flower.utils.broker import Broker
from django.conf import settings

def get_queue_length(queue):
    broker = Broker(settings.CELERY_BROKER_URL)
    queues_result = broker.queues([queue])
    res = asyncio.run(queues_result) or [{ "messages": 0 }]
    length = res[0].get('messages', 0)
    return length
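
Called like this, assuming the default celery queue:

print(get_queue_length("celery"))
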
痴情 2024-11-05 08:34:23

If you control the code of the tasks then you can work around the problem by letting a task trigger a trivial retry the first time it executes, then checking inspect().reserved(). The retry registers the task with the result backend, and celery can see that. The task must accept self or context as first parameter so we can access the retry count.

@task(bind=True)
def mytask(self):
    if self.request.retries == 0:
        # MyTrivialError is a placeholder exception defined elsewhere in your code
        raise self.retry(exc=MyTrivialError(), countdown=1)
    ...

This solution is broker-agnostic, i.e. you don't have to worry about whether you are using RabbitMQ or Redis to store the tasks.

EDIT: after testing I've found this to be only a partial solution. The size of reserved is limited to the prefetch setting for the worker.

看轻我的陪伴 2024-11-05 08:34:23

I found a usecase in the Flower codebase for getting the broker queue length.
It's as fast as direct broker access.

app = Celery("tasks")

from flower.utils.broker import Broker
broker = Broker(
    app.connection(connect_timeout=1.0).as_uri(include_password=True),
    broker_options=app.conf.broker_transport_options,
    broker_use_ssl=app.conf.broker_use_ssl,
)

async def queue_length():
    queues = await broker.queues(["celery"])
    return queues[0].get("messages")
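
Since queue_length is a coroutine, one way to call it from synchronous code is:

import asyncio

print(asyncio.run(queue_length()))
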
梨涡少年 2024-11-05 08:34:23
# Note: celery.task.control is the pre-5.0 import path; on newer Celery
# versions use app.control.inspect() on your Celery app instance instead.
from celery.task.control import inspect

def key_in_list(k, l):
    return bool([True for i in l if k in i.values()])

def check_task(task_id):
    task_value_dict = inspect().active().values()
    for task_list in task_value_dict:
        if key_in_list(task_id, task_list):
            return True
    return False
稳稳的幸福 2024-11-05 08:34:23

Here is what works for me, without removing messages from the queue:

def get_broker_tasks() -> list:
    # queue_name is assumed to be defined elsewhere; use your queue's name.
    conn = <CELERY_APP_INSTANCE>.connection()

    try:
        simple_queue = conn.SimpleQueue(queue_name)
        queue_size = simple_queue.qsize()
        messages = []

        for i in range(queue_size):
            message = simple_queue.get(block=False)
            messages.append(message)

        return messages
    except:
        messages = []
        return messages
    finally:
        print("Close connection")
        conn.close()

Don't forget to swap out CELERY_APP_INSTANCE with your own.

@Owen: Hope my solution meet your expectations.

ˉ厌 2024-11-05 08:34:23
def get_queue_length(total_tasks: int, queue_name: str, node_name: str):
    queue_size = 0
    inspector = app.control.inspect()
    stats = inspector.stats()
    if stats is not None:
        if f"celery@{node_name}" in stats.keys():
            total = stats[f"celery@{node_name}"]["total"]
            if queue_name in total.keys():
                active_tasks = total[queue_name] 
                if int(total_tasks) > int(active_tasks):
                    queue_size = total_tasks - active_tasks
    return queue_size

This leverages celery's control and inspect commands but also keeps an eye on the tasks that have been submitted.

This alone doesn't really work unless you have some sort of loop that is enqueueing items, like the following:

import time

total_tasks = 0
max_queue_length = 100  # choose your number
queue = "celery_queue"
full_queue_name = "YourCeleryApp.your_celery_queue_name"
for item in list_of_tasks:
    total_tasks += 1
    queue_length = get_queue_length(total_tasks=total_tasks, queue_name=full_queue_name, node_name=node_name)
    while int(queue_length) >= max_queue_length:
        time.sleep(10)
        queue_length = get_queue_length(total_tasks=total_tasks, queue_name=full_queue_name, node_name=node_name)
    your_celery_task.apply_async(kwargs={}, queue=queue)

With this what's happening is the following:

  1. Keep track of how many items have been submitted
  2. The above code will get the total, which is the number of tasks that have been processed by a specific worker in a particular queue.
  3. We check whether the number of total tasks submitted is greater than our active_tasks or the tasks that have been processed by celery.

What this means is that if there are 50 tasks submitted and 30 have been processed, then there are 50 - 30 = 20 tasks in the queue.

把回忆走一遍 2024-11-05 08:34:23

With subprocess.run:

import subprocess
import re

def count_active_tasks():  # hypothetical wrapper; "my_proj" is your Celery project name
    active_process_txt = subprocess.run(['celery', '-A', 'my_proj', 'inspect', 'active'],
                                        stdout=subprocess.PIPE).stdout.decode('utf-8')
    return len(re.findall(r'worker_pid', active_process_txt))

Be careful to change my_proj to your own project's name.
