http4s Blaze client builder wait queue full failure
We have a use case where, for a single incoming request, a microservice has to make many (nearly 1000 in the worst case) outgoing HTTP calls to other microservices to GET some details. Our service is built using Scala, http4s and Cats Effect, and uses the http4s-blaze-client library for making outbound HTTP calls.
Currently in production we are seeing the failures org.http4s.client.WaitQueueFullFailure: Wait queue is full
and org.http4s.client.PoolManager: Max wait queue limit of 1024 reached, not scheduling.
Once the service goes into this state, it never recovers and we are completely blocked.
Below is the Blaze Client configuration we are using:
import scala.concurrent.ExecutionContext.global
import scala.concurrent.duration._
import org.http4s.client.blaze.BlazeClientBuilder
import org.http4s.client.middleware.{RequestLogger, ResponseLogger}

// note: maxTotalConnections is left at its default here
BlazeClientBuilder[F](global)
  .withMaxWaitQueueLimit(1024)
  .withRequestTimeout(20.seconds)
  .resource
  .map { client =>
    ResponseLogger(logHeaders = false, logBody = true)(
      RequestLogger(logHeaders = true, logBody = true, redactHeadersWhen = Middleware.SensitiveHeaders)(client)
    )
  }
Initially we were using the default setting of 256 for the max wait queue limit, but then decided to increase it to 512 and then to 1024. Currently even 1024 is not enough.
I am not sure if this happens when the outbound HTTP requests are slow or time out. There is a possibility that the API responses are sometimes slow (but they still return within the 20-second timeout that we set). But I do not have sufficient evidence to claim that this is the case here.
We are currently using version http4s-blaze-client_2.13:0.21.0-M6.
I am not sure if increasing the wait queue size further would help. Is it possible to implement custom logic within the service to check the wait queue size and wait before submitting the request to the client?
Please advise how to get around this issue. Any help would be really appreciated.
1 Answer
Well, according to the comments, maxWaitQueueLimit is simply the “maximum number of requests waiting for a connection at any specific time”. So what would be the point of checking the wait queue size and waiting if it's full? http4s is already doing the waiting for you. The main difference is that if you implement the waiting yourself (e.g. by using a Semaphore and acquiring a permit every time you perform an HTTP request), then there is no limit on how many requests can be waiting. And that means that when there's high load on your server, you'll run out of memory and crash. This is presumably what the maxWaitQueueLimit is supposed to prevent in the first place.
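For illustration, here is a minimal sketch of what that self-managed waiting might look like, using the cats-effect 2 Semaphore (matching the 0.21.x http4s line in the question). limitedClient and maxConcurrent are hypothetical names, not an http4s API. Note that, unlike maxWaitQueueLimit, nothing here bounds how many callers can pile up on the semaphore:

import cats.effect.{Concurrent, Resource}
import cats.effect.concurrent.Semaphore
import cats.implicits._
import org.http4s.client.Client

// Hypothetical helper: wraps an existing Client so that at most
// `maxConcurrent` requests run at once. Every other request simply
// suspends on `sem.acquire` -- with no upper bound on how many can
// queue up, which is exactly the memory risk described above.
def limitedClient[F[_]: Concurrent](
    underlying: Client[F],
    maxConcurrent: Long
): F[Client[F]] =
  Semaphore[F](maxConcurrent).map { sem =>
    Client[F] { req =>
      // hold a permit for the lifetime of the response Resource,
      // so it is released only after the body has been consumed
      Resource.make(sem.acquire)(_ => sem.release) *> underlying.run(req)
    }
  }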
Now, when you perform a lot of requests, they all end up in the http4s wait queue at first, except for those that can find a connection. The default for maxTotalConnections is 10, so when you fire off 1000 requests, 990 will end up in the wait queue. If in that moment another request comes in that triggers more than 34 outgoing calls, you've already overflowed the 1024-slot wait queue. Increasing the maxWaitQueueLimit much further seems perfectly reasonable to me given your situation. Assuming you can't somehow reduce the number of required HTTP requests, that is.
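Concretely, that tuning could look like the sketch below. The specific numbers are assumptions, not recommendations; they would need to be sized against your real load (roughly, expected concurrent incoming requests times the ~1000 outgoing calls each can trigger):

import scala.concurrent.ExecutionContext.global
import scala.concurrent.duration._
import org.http4s.client.blaze.BlazeClientBuilder

// Illustrative numbers only: raise the connection pool above the
// default of 10 so fewer requests queue at all, and size the wait
// queue for the worst-case fan-out (~1000 calls per incoming request).
BlazeClientBuilder[F](global)
  .withMaxTotalConnections(64)    // default: 10
  .withMaxWaitQueueLimit(4096)    // default: 256
  .withRequestTimeout(20.seconds)
  .resource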