Pub/Sub pull request count drops drastically on Kubernetes pods on GCP

Posted 2025-01-14 01:19:09

I have ~5M messages (~7GB total) backlogged on my GCP Pub/Sub subscription and want to pull as many of them as possible. I am using synchronous pull with the settings below, waiting 3 minutes to pile up messages and then sending them to another database.

    defaultSettings := &pubsub.ReceiveSettings{
        MaxExtension:           10 * time.Minute, // keep extending ack deadlines for up to 10 minutes
        MaxOutstandingMessages: 100000,           // hold at most 100k unacked messages at a time
        MaxOutstandingBytes:    128e6,            // 128 MB of unacked message data at a time
        NumGoroutines:          1,                // goroutines used for pulling
        Synchronous:            true,             // use unary Pull requests instead of StreamingPull
    }
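
For context, this is roughly how each 3-minute round is driven (a simplified sketch, not the exact production code; the project ID, subscription ID, and the final database write are placeholders):

    package main

    import (
        "context"
        "log"
        "sync"
        "time"

        "cloud.google.com/go/pubsub"
    )

    func main() {
        ctx := context.Background()
        client, err := pubsub.NewClient(ctx, "my-project") // placeholder project ID
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        sub := client.Subscription("my-subscription") // placeholder subscription ID
        sub.ReceiveSettings = pubsub.ReceiveSettings{ // same settings as above
            MaxExtension:           10 * time.Minute,
            MaxOutstandingMessages: 100000,
            MaxOutstandingBytes:    128e6,
            NumGoroutines:          1,
            Synchronous:            true,
        }

        // Pull for 3 minutes, pile up whatever arrives, then write the batch out.
        roundCtx, cancel := context.WithTimeout(ctx, 3*time.Minute)
        defer cancel()

        var mu sync.Mutex
        var batch []*pubsub.Message
        err = sub.Receive(roundCtx, func(_ context.Context, m *pubsub.Message) {
            mu.Lock()
            batch = append(batch, m)
            mu.Unlock()
            m.Ack()
        })
        if err != nil {
            log.Println("receive:", err)
        }
        log.Printf("pulled %d messages this round", len(batch))
        // the batch is written to the other database here
    }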

The problem is that with around 5 pods on my Kubernetes cluster, each pod is able to pull nearly ~90k messages in almost every round (3-minute period). However, when I increase the number of pods to 20, each pod can still retrieve ~90k messages in the first or second round, but after a while the pull request count drastically drops and each pod receives only ~1k-5k messages per round. I have investigated the Go library's synchronous pull mechanism and know that until messages are successfully acked you cannot request new ones, so the pull request count may drop to avoid exceeding MaxOutstandingMessages. But I am scaling my pods down to zero and starting fresh pods while there are still millions of unacked messages in the subscription, and they still get a very low number of messages in 3 minutes, whether there are 5 or 20 pods. After around 20-30 minutes they again receive ~90k messages each, and then after a while drop back to very low levels (checking from the metrics page). Another interesting thing is that while my fresh pods receive very few messages, my local computer connected to the same subscription gets ~90k messages in each round.

I have read the Pub/Sub quotas and limits page; the bandwidth quotas are extremely high (240,000,000 kB per minute, i.e. 4 GB/s, in large regions). I have tried a lot of things but can't understand why the pull request count drops massively when I start fresh pods. Is there some connection or bandwidth limitation for Kubernetes cluster nodes on GCP or on the Pub/Sub side? Receiving messages in high volume is critical for my task.

Comments (1)

静谧 2025-01-21 01:19:09

If you are using synchronous pull, I suggest switching to StreamingPull for Pub/Sub usage at your scale.

Note that to achieve low message delivery latency with synchronous pull, it is important to have many simultaneously outstanding pull requests. As the throughput of the topic increases, more pull requests are necessary. In general, asynchronous pull is preferable for latency-sensitive applications.
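
A minimal sketch of what switching to streaming (asynchronous) pull looks like with the Go client; the project and subscription names are placeholders, and the concurrency numbers are only a starting point to tune:

    package main

    import (
        "context"
        "log"
        "time"

        "cloud.google.com/go/pubsub"
    )

    func main() {
        ctx := context.Background()
        client, err := pubsub.NewClient(ctx, "my-project") // placeholder project ID
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        sub := client.Subscription("my-subscription") // placeholder subscription ID
        sub.ReceiveSettings = pubsub.ReceiveSettings{
            MaxExtension:           10 * time.Minute,
            MaxOutstandingMessages: 100000,
            MaxOutstandingBytes:    128e6,
            NumGoroutines:          4,     // several StreamingPull streams per pod; tune for your load
            Synchronous:            false, // StreamingPull: messages arrive over long-lived streams
        }

        // Receive keeps the streams open and invokes the callback concurrently
        // until the context is cancelled.
        err = sub.Receive(ctx, func(_ context.Context, m *pubsub.Message) {
            // process / forward the message, then ack it
            m.Ack()
        })
        if err != nil {
            log.Fatal(err)
        }
    }

Because the streams stay open, each pod keeps receiving messages continuously instead of waiting on individual pull requests.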

It is expected that, for a high throughput scenario and synchronous pull, there should always be many idle requests.

A synchronous pull request establishes a connection to one specific server (process). A high throughput topic is handled by many servers. Messages coming in will go to only a few servers, from 3 to 5. Those servers should have an idle process already connected, to be able to quickly forward messages.
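
To make the "many outstanding requests" point concrete, here is a rough sketch of keeping several synchronous pull requests open at once using the lower-level apiv1 client; the import paths, subscription name, and counts are assumptions to adapt:

    package main

    import (
        "context"
        "log"
        "sync"

        pubsubv1 "cloud.google.com/go/pubsub/apiv1"
        pubsubpb "cloud.google.com/go/pubsub/apiv1/pubsubpb"
    )

    func main() {
        ctx := context.Background()
        client, err := pubsubv1.NewSubscriberClient(ctx)
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        const subName = "projects/my-project/subscriptions/my-subscription" // placeholder

        // Keep 10 pull requests outstanding at all times: each goroutine blocks
        // on Pull, acks what it received, and immediately issues the next request.
        var wg sync.WaitGroup
        for i := 0; i < 10; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for {
                    resp, err := client.Pull(ctx, &pubsubpb.PullRequest{
                        Subscription: subName,
                        MaxMessages:  1000,
                    })
                    if err != nil {
                        log.Println("pull:", err)
                        return
                    }
                    if len(resp.ReceivedMessages) == 0 {
                        continue
                    }
                    ackIDs := make([]string, 0, len(resp.ReceivedMessages))
                    for _, rm := range resp.ReceivedMessages {
                        // process rm.Message.Data here, then collect the ack ID
                        ackIDs = append(ackIDs, rm.AckId)
                    }
                    if err := client.Acknowledge(ctx, &pubsubpb.AcknowledgeRequest{
                        Subscription: subName,
                        AckIds:       ackIDs,
                    }); err != nil {
                        log.Println("ack:", err)
                    }
                }
            }()
        }
        wg.Wait()
    }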

This behavior conflicts with CPU-based scaling: idle connections don't cause CPU load. At a minimum, there should be well over 10 threads per pod for CPU-based scaling to work.

Also, you can use a Horizontal Pod Autoscaler (HPA) configured for the GKE pods consuming from Pub/Sub. With the HPA, you can configure scaling on CPU usage.

My last recommendation would be to consider Dataflow for your workload, consuming from Pub/Sub.
