How to set the Gunicorn worker number in a Kubernetes pod?

Posted on 2025-01-09 09:30:35

I'm running a Flask application with Gunicorn and the gevent worker class. In my own test environment, I follow the official guideline of multiprocessing.cpu_count() * 2 + 1 to set the number of workers.

Suppose I want to run the application in a Kubernetes pod with resources like these:

resources:
  limits:
    cpu: "10"
    memory: "5Gi"
  requests:
    cpu: "3"
    memory: "3Gi"

How do I calculate the worker number? Should I use the CPU limit or the CPU request?


PS. I launch the application from a binary packaged by PyInstaller, in essence flask run (python script.py), and start Gunicorn in the main thread:

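import sys

# Assumption: run() is Gunicorn's command-line entry point, gunicorn.app.wsgiapp.run.
from gunicorn.app.wsgiapp import run as gunicorn_run

# `config` is the application's own settings module (its import is not shown).


def run():
    ...
    if config.RUN_MODEL == 'GUNICORN':
        # sys.argv entries must be strings, so numeric settings are converted.
        sys.argv += [
            "--worker-class", "gevent",  # "event" is not a Gunicorn worker class
            "-w", str(config.GUNICORN_WORKER_NUMBER),
            "--worker-connections", str(config.GUNICORN_WORKER_CONNECTIONS),
            "--access-logfile", "-",
            "--error-logfile", "-",
            "-b", "0.0.0.0:8001",
            "--max-requests", str(config.GUNICORN_MAX_REQUESTS),
            "--max-requests-jitter", str(config.GUNICORN_MAX_REQUESTS_JITTER),
            "--timeout", str(config.GUNICORN_TIMEOUT),
            "--access-logformat", '%(t)s %(l)s %(u)s "%(r)s" %(s)s %(M)sms',
            "app.app_runner:app",
        ]
    # Gunicorn parses the extended sys.argv exactly as the `gunicorn` command would.
    sys.exit(gunicorn_run())


if __name__ == "__main__":
    run()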

PS. Whether I set the worker number from the CPU limit (10*2+1=21) or the CPU request (3*2+1=7), performance still falls short of my expectations. Any suggestions for improving performance are welcome.

Comments (1)

司马昭之心 2025-01-16 09:30:35

How do I calculate the worker number? Should I use the CPU limit or the CPU request?

It depends on your situation. First, look at the documentation about requests and limits (the example there is for memory, but the same applies to CPU).

If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a container to use more resource than its request for that resource specifies. However, a container is not allowed to use more than its resource limit.

For example, if you set a memory request of 256 MiB for a container, and that container is in a Pod scheduled to a Node with 8GiB of memory and no other Pods, then the container can try to use more RAM.

If you set a memory limit of 4GiB for that container, the kubelet (and container runtime) enforce the limit. The runtime prevents the container from using more than the configured resource limit. For example: when a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out of memory (OOM) error.
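A side note on CPU, offered as a sketch and not as part of the quoted documentation: a CPU limit is enforced by throttling the container, not by killing it, and multiprocessing.cpu_count() inside the pod generally reports the node's CPUs, not the pod's limit. Assuming the container runs under cgroup v2 and the limit is exposed at /sys/fs/cgroup/cpu.max (an assumption; cgroup v1 uses different files), the effective limit could be read roughly like this. The function names are illustrative only.

```python
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>"."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit is set for this cgroup
    return int(quota) / int(period)


def container_cpu_limit(path: str = "/sys/fs/cgroup/cpu.max") -> Optional[float]:
    """Return the CPU limit in cores, or None if it cannot be determined."""
    try:
        return parse_cpu_max(Path(path).read_text())
    except (OSError, ValueError):
        # File missing (not cgroup v2) or unexpected content.
        return None


limit = container_cpu_limit()
# Fall back to None when no limit is visible; the caller then picks a default.
workers = int(limit * 2) + 1 if limit else None
print(limit, workers)
```

With limits.cpu set to "10", cpu.max would typically contain a quota ten times the period, so parse_cpu_max would return 10.0.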

Answering your question: first of all, you need to know how many resources (e.g. CPU) your application needs. The request is the minimum amount of CPU the application must receive. You have to determine this value yourself: find out the minimum CPU the application needs to run properly, and set the request to that. If your application performs better when it receives more CPU, consider adding a limit (the maximum amount of CPU the application can receive). If you want to size the worker number for the highest performance, use the limit to calculate it. If, on the other hand, you want the application to run smoothly (perhaps not as fast as possible, but consuming fewer resources), use the request.
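To make the arithmetic concrete, here is a minimal sketch that applies the (2 x cores) + 1 rule from the question to a Kubernetes CPU quantity. The helper names parse_cpu_quantity and gunicorn_workers are hypothetical, and only whole or decimal cores ("10", "0.5") and millicores ("500m") are handled.

```python
from decimal import Decimal


def parse_cpu_quantity(quantity: str) -> Decimal:
    """Convert a Kubernetes CPU quantity such as "10" or "500m" to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # Millicores: "500m" means half a core.
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def gunicorn_workers(cpu_quantity: str) -> int:
    """Apply the (2 x cores) + 1 rule of thumb to a CPU quantity."""
    cores = parse_cpu_quantity(cpu_quantity)
    return int(cores * 2) + 1


print(gunicorn_workers("10"))  # limits.cpu   -> 21
print(gunicorn_workers("3"))   # requests.cpu -> 7
```

The result could then be passed to Gunicorn's -w option, for example through an environment variable that the deployment sets next to the resources block.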
