HttpWebRequest and I/O completion ports
I'm working on an application in which one type of message has to hit a database and another type has to call an external XML API.
I have to process A LOT of messages, and one of the big challenges is getting the HttpWebRequest class to perform well. I started out using the standard synchronous methods and running everything on the thread pool. This was not good.
After a bit of reading I saw that the recommended approach is to use the Begin/End methods to delegate the work to I/O completion ports, freeing up the thread pool and yielding better performance. That doesn't seem to be the case. Performance is marginally better, but I certainly can't see the I/O completion ports being used much compared to the thread pool.
I have a thread that loops and reports the available worker threads and completion port threads in the thread pool. Completion port usage is always very low (the most I've seen in use is 9), while I'm always using about 120 worker threads (sometimes more). I use the Begin/End pattern for all methods in HttpWebRequest:
Begin/EndGetRequestStream
Begin/EndWrite (Stream)
Begin/EndGetResponse
Begin/EndRead (Stream)
Am I doing it right? Am I missing something? I sometimes have up to 2048 HTTP connections open simultaneously (according to netstat output), so why would the completion port numbers be so low?
If anyone could give some serious advice on managing worker threads, completion ports and HttpWebRequest, it would be hugely appreciated!
EDIT: Is .NET a reasonable tool for this? Can I get a high volume of HTTP connections working with .NET and the System.Net stack? It's been suggested that I use something like WinHttp (or some other C++ library) and P/Invoke it from .NET, but this isn't something I especially want to do!
Comments (3)
The way I understand it, you don't tie up an I/O completion port thread for the whole time an asynchronous request is outstanding. It's only "busy" when data has been returned and is being processed on the corresponding thread. Hopefully you don't have very much work to do in the callback, which would explain why you don't have many in use at any one time.
Are you actually getting poor performance though? Is your cause for concern merely the low numbers? Are you getting the throughput you'd expect?
One problem you may have is that the HTTP connection pool for any one host is relatively small. If you have hundreds of requests to the same machine, then by default only 2 requests will actually be made at a time, to avoid DoS-attacking the host in question (and to get the benefits of keep-alive). You can increase this programmatically or using app.config. Of course, this may not be an issue in your case, either because you've already fixed the problem or because all your requests are to different hosts. (If netstat is showing 2048 connections then that doesn't sound bad.)
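As a minimal sketch of the programmatic route mentioned above, assuming the classic .NET Framework `System.Net` stack: `ServicePointManager.DefaultConnectionLimit` sets the per-host cap for connections created afterwards. The class name `ConnectionLimitSketch` and the value 200 are illustrative assumptions, not recommendations.

```csharp
using System;
using System.Net;

public static class ConnectionLimitSketch
{
    // Sets the default per-host connection cap and returns the value now in effect.
    // It applies to ServicePoint objects created after this call, so it is usually
    // set once at startup, before any request is issued.
    public static int Raise(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        Console.WriteLine("Before: " + ServicePointManager.DefaultConnectionLimit);
        // 200 is an arbitrary illustrative value; tune it for the hosts you call.
        Console.WriteLine("After:  " + Raise(200));
    }
}
```

If only one host needs a higher cap, `ServicePointManager.FindServicePoint(uri).ConnectionLimit` can be set for that host instead of changing the global default.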
Maybe your EndRead callbacks should only write the result to a thread-safe queue, which you then read from a small number of worker threads under your own control. Alternatively (or additionally), use the fact that HttpWebRequest signals a waitable object when it is done, and write your own logic to wait on all the outstanding requests from a single thread or a small number of threads.
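A rough sketch of the queue idea, assuming .NET 4 or later so that `BlockingCollection<T>` is available. The names `ResultQueueSketch` and `Run` are hypothetical, and the loop that calls `queue.Add` stands in for the real EndRead callbacks, which would enqueue the response body and return immediately.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class ResultQueueSketch
{
    // Pushes itemCount fake responses through a thread-safe queue that is drained
    // by consumerCount dedicated threads. Returns how many items were processed.
    public static int Run(int itemCount, int consumerCount)
    {
        int processed = 0;
        using (var queue = new BlockingCollection<string>())
        {
            var consumers = new Thread[consumerCount];
            for (int i = 0; i < consumerCount; i++)
            {
                consumers[i] = new Thread(() =>
                {
                    // Blocks while the queue is empty; the loop ends once
                    // CompleteAdding has been called and the queue is drained.
                    foreach (string body in queue.GetConsumingEnumerable())
                    {
                        // Placeholder for the real parsing or database work.
                        Interlocked.Increment(ref processed);
                    }
                });
                consumers[i].Start();
            }

            // Stand-in for the EndRead callbacks: enqueue only, no heavy work.
            for (int i = 0; i < itemCount; i++)
            {
                queue.Add("response " + i);
            }
            queue.CompleteAdding();

            foreach (Thread t in consumers)
            {
                t.Join();
            }
        }
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000, 4));
    }
}
```

The point of the design is that the completion callbacks stay short, while the number of threads doing the heavy processing is fixed by you rather than by the thread pool's growth heuristics.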
Having only 9 completion port threads actually means you're probably using them correctly and efficiently. I'm going to assume that the machine you're running on has either 8 cores or 4 hyperthreaded cores which means that the OS will try to keep up to 8 active (not sleeping/blocking/waiting) completion port threads at any time.
If one of the running threads becomes inactive (sleep/block/wait) and there are additional work items to process, then an additional thread will be created to keep the active count at 8. If you see 9 threads, that means that you are introducing virtually no blocking in the methods on your completion port threads and actually doing CPU work with them.
If you have 8 threads actively doing CPU-bound work on 8 cores, then adding more threads will only slow things down, because time is wasted on context switching between them.
What you should be looking into is why you have 120 other threads and what they are doing.
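To watch both counts while investigating, a small sketch along these lines may help. It derives the in-use numbers by subtracting `ThreadPool.GetAvailableThreads` from `ThreadPool.GetMaxThreads`; the class and method names are hypothetical.

```csharp
using System;
using System.Threading;

public static class PoolUsageSketch
{
    // Computes how many worker threads and I/O completion threads the
    // thread pool currently reports as in use (maximum minus available).
    public static void GetInUse(out int workers, out int ioThreads)
    {
        int maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);
        workers = maxWorkers - freeWorkers;
        ioThreads = maxIo - freeIo;
    }

    public static void Main()
    {
        int workers, ioThreads;
        GetInUse(out workers, out ioThreads);
        Console.WriteLine("Worker threads in use: " + workers);
        Console.WriteLine("I/O completion threads in use: " + ioThreads);
    }
}
```

These figures only say how many threads are busy, not why. Breaking into the process with a debugger and looking at the call stacks of the busy worker threads is one way to find out what they are blocked on.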