WebRequest time-out with F#-style asynchronous workflows
For a broader context, here is my code, which downloads a list of URLs.
It seems to me that there is no good way to handle timeouts in F# when fetching URLs in the use! response = request.AsyncGetResponse() style. I have pretty much everything working as I'd like it to (error handling and asynchronous request and response downloading), save for the problem that occurs when a website takes a long time to respond. My current code just hangs indefinitely. I've tried it on a PHP script I wrote that waits 300 seconds. It waited the whole time.
I have found "solutions" of two sorts, both of which are undesirable.
AwaitIAsyncResult + BeginGetResponse
This is the approach from ildjarn's answer on this other Stack Overflow question. The problem with it is that if you have queued many asynchronous requests, some are artificially blocked on AwaitIAsyncResult. In other words, the call to make the request has been made, but something behind the scenes is blocking the call. This causes the time-out on AwaitIAsyncResult to be triggered prematurely when many concurrent requests are made. My guess is a limit on the number of requests to a single domain or just a limit on total requests.
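For readers who have not seen the linked answer, the pattern is roughly the following. This is a minimal sketch, not ildjarn's exact code: the helper name tryGetResponse and the timeoutMs parameter are made up for illustration. Note that the wait starts as soon as BeginGetResponse returns, so any time a request might spend queued internally would count against the timeout, which would be consistent with the premature time-outs described above.

```fsharp
open System
open System.Net

/// Hypothetical helper: Some response on success, None when the wait times out.
let tryGetResponse (timeoutMs: int) (request: WebRequest) : Async<WebResponse option> =
    async {
        // Start the request through the Begin/End (APM) pair.
        let pending = request.BeginGetResponse(null, null)
        // Wait for completion without blocking a thread; false means the wait timed out.
        let! completed = Async.AwaitIAsyncResult(pending, timeoutMs)
        if completed then
            // May still raise WebException for HTTP or network errors.
            return Some (request.EndGetResponse pending)
        else
            // Give up on the request so its connection is released.
            request.Abort()
            return None
    }
```

A caller would match on the result and dispose the response in the Some case.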
To support my suspicion I wrote a little WPF application to draw a timeline of when the requests seem to be starting and ending. In my code linked above, notice that the timer starts and stops on lines 49 and 54 (calling line 10). Here is the resulting timeline image.
When I move the timer start to after the initial response (so I am only timing the downloading of the contents), the timeline looks a lot more realistic. Note, these are two separate runs, but no code change aside from where the timer is started. Instead of having the startTime measured directly before use! response = request.AsyncGetResponse(), I have it directly afterwards.
To further support my claim, I made a timeline with Fiddler2. Here is the resulting timeline. Clearly the requests aren't starting exactly when I tell them to.
GetResponseStream in a new thread
In other words, the synchronous request and download calls are made on a secondary thread. This does work, since GetResponseStream respects the Timeout property on the WebRequest object. But in the process we lose the benefit of waiting asynchronously: a thread sits blocked for the whole time the request is on the wire and the response hasn't come back yet. We might as well write it in C#... ;)
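As a rough illustration of this second option, the sketch below hops to a dedicated thread and then uses only blocking calls. The function name downloadOnNewThread and its parameters are assumptions made for the example, not names from the linked code.

```fsharp
open System.IO
open System.Net

/// Hypothetical helper: downloads a URL with blocking calls on a dedicated thread.
let downloadOnNewThread (timeoutMs: int) (url: string) : Async<string> =
    async {
        // Leave the thread pool; everything below blocks the new thread.
        do! Async.SwitchToNewThread()
        let request = WebRequest.Create(url)
        // Timeout bounds the synchronous GetResponse call.
        request.Timeout <- timeoutMs
        use response = request.GetResponse()
        use reader = new StreamReader(response.GetResponseStream())
        return reader.ReadToEnd()
    }
```

A timed-out request surfaces as an exception, so callers would wrap the call in try/with or Async.Catch. The cost is one blocked thread per in-flight request.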
Questions
- Is this a known problem?
- Is there any good solution that takes advantage of F# asynchronous workflows and still allows timeouts and error handling?
- If the problem is really that I am making too many requests at once, then would the best way to limit the number of requests be to use a Semaphore(5, 5) or something like that?
- Side question: if you've looked at my code, can you see any stupid things I've done and could fix?
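To make the semaphore idea concrete, here is one way it could look inside async workflows. This is only a sketch under assumptions: it swaps in SemaphoreSlim (rather than Semaphore) because SemaphoreSlim offers WaitAsync, and the name throttle is invented for the example.

```fsharp
open System.Threading

/// Hypothetical helper: runs all jobs, but at most maxConcurrent at a time.
let throttle (maxConcurrent: int) (jobs: seq<Async<'T>>) : Async<'T[]> =
    async {
        use gate = new SemaphoreSlim(maxConcurrent, maxConcurrent)
        let guarded (job: Async<'T>) =
            async {
                // Wait for a free slot without blocking a thread.
                do! gate.WaitAsync() |> Async.AwaitTask
                try
                    return! job
                finally
                    // Free the slot even if the job fails.
                    gate.Release() |> ignore
            }
        // Results come back in the same order as the input jobs.
        return! jobs |> Seq.map guarded |> Async.Parallel
    }
```

With something like this, a list of download workflows could be run as urls |> Seq.map download |> throttle 5, where download stands for whatever per-URL workflow you already have.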
If there is anything you are confused about, please let me know.
Comments (2)
AsyncGetResponse simply ignores any timeout value you set... here's a solution we just cooked up:
I.e., we're delegating to another agent and calling it with an async timeout... if we do not get a reply from the agent in the specified amount of time, we abort the request and move on.
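The answer's own code is not reproduced here. The following is only a guess at the shape of such a solution, built on MailboxProcessor and PostAndTryAsyncReply; the names FetchMessage, fetchAgent and tryFetch are hypothetical.

```fsharp
open System.Net

/// Hypothetical message type: a request plus the channel for posting its response.
type FetchMessage = Fetch of WebRequest * AsyncReplyChannel<WebResponse>

/// Agent that starts every request it receives and replies when a response arrives.
let fetchAgent =
    MailboxProcessor<FetchMessage>.Start(fun inbox ->
        let rec loop () =
            async {
                let! (Fetch (request, reply)) = inbox.Receive()
                let work =
                    async {
                        try
                            let! response = request.AsyncGetResponse()
                            reply.Reply response
                        with _ ->
                            // Aborted or failed: send no reply, so the caller's timeout fires.
                            ()
                    }
                // Run each request separately so a slow site does not stall the mailbox.
                Async.Start work
                return! loop ()
            }
        loop ())

/// Hypothetical helper: None if no reply arrives within timeoutMs.
let tryFetch (timeoutMs: int) (url: string) : Async<WebResponse option> =
    async {
        let request = WebRequest.Create(url)
        let! reply = fetchAgent.PostAndTryAsyncReply((fun channel -> Fetch (request, channel)), timeoutMs)
        // No reply in time: abort the request and move on.
        if reply.IsNone then request.Abort()
        return reply
    }
```

One limitation of this sketch is that a request that fails outright is only reported once the timeout elapses, because the agent never replies for it.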
I see my other answer may fail to answer your particular question... here's another implementation of a task limiter that doesn't require the use of a semaphore.
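The answer's implementation is not reproduced here either. As an illustration of what a semaphore-free limiter can look like, the sketch below keeps the running count inside a MailboxProcessor. LimiterMessage and createLimiter are invented names, and this is not necessarily how the answerer did it.

```fsharp
open System.Collections.Generic

/// Hypothetical messages: submit a job, or report that a job has finished.
type LimiterMessage =
    | Enqueue of Async<unit>
    | Finished

/// Hypothetical limiter: an agent that runs at most maxConcurrent jobs at a time.
let createLimiter (maxConcurrent: int) =
    MailboxProcessor<LimiterMessage>.Start(fun inbox ->
        let waiting = Queue<Async<unit>>()
        let start (job: Async<unit>) =
            async {
                // Async.Catch keeps a failing job from escaping; its error is dropped here.
                let! _ = Async.Catch job
                inbox.Post Finished
            }
            |> Async.Start
        let rec loop (running: int) =
            async {
                let! message = inbox.Receive()
                match message with
                | Enqueue job when running < maxConcurrent ->
                    start job
                    return! loop (running + 1)
                | Enqueue job ->
                    // All slots are busy: park the job until one frees up.
                    waiting.Enqueue job
                    return! loop running
                | Finished when waiting.Count > 0 ->
                    // Hand the freed slot straight to the next waiting job.
                    start (waiting.Dequeue())
                    return! loop running
                | Finished ->
                    return! loop (running - 1)
            }
        loop 0)
```

Jobs are submitted with limiter.Post(Enqueue job). Because all state lives inside the agent's message loop, no locking is needed.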