Debugging the network throughput of a minimal Winsock2 application
I have a very simple Winsock2 TCP client - full listing below - which simply blasts a bunch of bytes. However, it's running very slowly over the network; the data just trickles by.
Here's what I've tried and found (both Windows PCs are on the same LAN):
- Running this app from one machine to the other is slow - it takes ~50s to send 8MB.
- Two different servers - netcat and a custom-written one (just as simple as the below client) - yielded the same results.
- taskmgr shows both the CPU and network being barely-utilized.
- Running this app with the server on the same machine is fast - it takes ~1-2s to send 8MB.
- A different client, netcat, works just fine - it takes ~7s to send 20MB of data. (I used the nc that comes with Cygwin.)
- Varying the buffer size (1*4096, 16*4096, and 128*4096) made little difference.
- Running almost the same code on Linux boxes on a different LAN worked just fine.
- Adding a bunch of print statements around the send call shows that we spend most of our time blocking on it.
- On the server side, we see a bunch of receives of <= 4K chunks (regardless of what size buffers the sender is pushing). However, this happens with other clients as well, like netcat, which runs at full speed.
Any ideas? Thanks in advance for any tips.
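To put the timings above on a common scale, here is a minimal sketch that converts them to Mbit/s. It assumes "MB" means binary megabytes (consistent with bytecount = 8388608 in the listing); the function name is only illustrative.

```cpp
// Convert a transfer of `mebibytes` MiB completed in `seconds` into Mbit/s.
// Assumes 1 MiB = 1024 * 1024 bytes and 1 Mbit = 1,000,000 bits.
double throughput_mbps(double mebibytes, double seconds) {
    const double bits = mebibytes * 1024.0 * 1024.0 * 8.0;
    return bits / seconds / 1e6;
}
```

Under that assumption, throughput_mbps(8, 50) comes to roughly 1.3 Mbit/s for the slow client, while throughput_mbps(20, 7) comes to roughly 24 Mbit/s for netcat over the same LAN.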
#include <winsock2.h>
#include <cstring>
#include <iostream>
using namespace std;

enum { bytecount = 8388608 };  // total number of bytes to send (8 MB)
enum { bufsz = 16*4096 };      // bytes handed to each send() call

int main(int argc, TCHAR* argv[])
{
    WSADATA wsaData;
    WSAStartup(MAKEWORD(2,2), &wsaData);

    // Destination address of the server.
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(9898);
    sa.sin_addr.s_addr = inet_addr("157.54.144.70");
    if (sa.sin_addr.s_addr == INADDR_NONE) {
        cerr << "inet_addr: " << WSAGetLastError() << endl;
        return 1;
    }

    // Fill the send buffer with an arbitrary byte pattern.
    char *blob = new char[bufsz];
    for (int i = 0; i < bufsz; ++i) blob[i] = (char) i;

    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_IP);
    if (s == INVALID_SOCKET) {
        cerr << "socket: " << WSAGetLastError() << endl;
        return 1;
    }

    int res = connect(s, reinterpret_cast<sockaddr*>(&sa), sizeof sa);
    if (res != 0) {
        cerr << "connect: " << WSAGetLastError() << endl;
        return 1;
    }

    // Keep sending the same buffer until bytecount bytes have gone out.
    int sent;
    for (int j = 0; j < bytecount; j += sent) {
        sent = send(s, blob, bufsz, 0);
        if (sent < 0) {
            cerr << "send: " << WSAGetLastError() << endl;
            return 1;
        }
    }

    closesocket(s);
    return 0;
}
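The send loop in the listing passes the full buffer to every call and only uses the return value to advance the byte counter. A more defensive variant, sketched below, continues from wherever a partial send stopped. The helper name send_all and the send_fn callback are hypothetical, introduced here so the loop can be exercised without a real socket. This is not offered as an explanation for the slowdown, only as a tidier loop.

```cpp
#include <climits>
#include <cstddef>

// Calls send_fn(ptr, len) repeatedly until `total` bytes have been accepted.
// send_fn is a hypothetical wrapper (for example around Winsock send()) that
// returns the number of bytes accepted, or a negative value on error.
// Returns the number of bytes sent, or -1 if send_fn reported an error.
template <typename SendFn>
long long send_all(SendFn send_fn, const char* data, std::size_t total) {
    std::size_t done = 0;
    while (done < total) {
        std::size_t remaining = total - done;
        // send() takes an int length, so clamp very large requests.
        const int chunk = remaining > static_cast<std::size_t>(INT_MAX)
                              ? INT_MAX
                              : static_cast<int>(remaining);
        const int n = send_fn(data + done, chunk);
        if (n < 0) return -1;  // error reported by the underlying call
        if (n == 0) break;     // nothing accepted; stop rather than spin
        done += static_cast<std::size_t>(n);
    }
    return static_cast<long long>(done);
}
```

With the client above, a call could look like send_all([&](const char* p, int n) { return send(s, p, n, 0); }, blob, bufsz), repeated until bytecount bytes have been sent.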
Here are the things you can do to get a better picture.
The application looks fine, and you said it works fine on Linux.
I don't know whether this will help you, but I would compare:
1) The MTU values on the Windows and Linux systems.
2) The TCP receive buffer size on Windows and Linux.
3) Whether the network card speed is the same on both systems.
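For the buffer-size comparison in point 2, one option is to ask the socket itself. The sketch below reads an integer SOL_SOCKET option such as SO_RCVBUF or SO_SNDBUF with getsockopt, which exists on both Winsock and POSIX sockets. The helper name get_buffer_size and the two typedefs are assumptions made for illustration. On Windows, WSAStartup must have been called before the socket is created, and the program needs to link against ws2_32.

```cpp
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
typedef int optlen_t;
#else
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_t;
typedef socklen_t optlen_t;
#endif

// Returns the value of an integer SOL_SOCKET option (e.g. SO_RCVBUF),
// or -1 if getsockopt fails.
int get_buffer_size(socket_t s, int option) {
    int value = 0;
    optlen_t len = sizeof value;
    if (getsockopt(s, SOL_SOCKET, option,
                   reinterpret_cast<char*>(&value), &len) != 0) {
        return -1;
    }
    return value;
}
```

Calling get_buffer_size(s, SO_RCVBUF) on the server socket and get_buffer_size(s, SO_SNDBUF) on the client socket, on both the Windows and the Linux machines, gives numbers that can be compared directly. Note that the reported values are what the OS chose and may differ from any size previously requested.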
I watched packets going by using Microsoft Network Monitor (netmon) with the nice TCP Analyzer visualizer, and it turned out that tons of packets were getting lost and needing to be retransmitted - hence the slow speeds, because of retransmission timeouts (RTOs).
A colleague helped me debug this:
I tried using a third box as the sender (identical to the original sender, a Shuttle XPC with an nForce chipset), and this worked smoothly - TCP Analyzer showed very smooth-running TCP sessions. This suggests to me that the problem was actually due to a buggy NIC/driver on the original sender box, or a bad Ethernet cable.
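As a back-of-the-envelope check on the RTO explanation, the sketch below estimates how many timeouts it would take to stretch the transfer to the observed duration. The per-timeout stall is an assumption, not a measured value: real RTOs adapt to the round-trip time and back off exponentially, so the result is only an order-of-magnitude figure. The function name is illustrative.

```cpp
// Rough count of retransmission timeouts needed to explain a slow transfer.
// observed_s: measured transfer time in seconds.
// ideal_s:    estimated transfer time without losses, in seconds.
// rto_s:      ASSUMED average stall per timeout (hypothetical, not measured).
double estimated_timeouts(double observed_s, double ideal_s, double rto_s) {
    if (rto_s <= 0.0 || observed_s <= ideal_s) return 0.0;
    return (observed_s - ideal_s) / rto_s;
}
```

At netcat's rate of 20MB in about 7s, 8MB would take about 2.8s. With an assumed stall of 0.3s per timeout, estimated_timeouts(50.0, 2.8, 0.3) gives roughly 157, so on the order of a hundred or more timeouts over the 8MB transfer would be enough to account for the ~50s observed.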