Perf-debugging network throughput of a minimal Winsock2 application

Posted 2024-08-08 04:59:38

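I have a very simple Winsock2 TCP client (full listing below) that simply blasts a bunch of bytes at a server. However, it runs very slowly over the network; the data just trickles by.

Here's what I've tried and found (both Windows PCs are on the same LAN):

Running this app from one machine to the other is slow: it takes about 50 s to send 8 MB.
Two different servers, netcat and a custom-written one (just as simple as the client below), yielded the same results.
taskmgr shows both the CPU and the network barely utilized.
Running this app with the server on the same machine is fast: it takes about 1-2 s to send 8 MB.
A different client, netcat, works just fine: it takes about 7 s to send 20 MB of data. (I used the nc that comes with Cygwin.)
Varying the buffer size (1*4096, 16*4096, and 128*4096) made little difference.
Running almost the same code on Linux boxes on a different LAN worked just fine.
Adding a bunch of print statements around the send call shows that we spend most of our time blocked in it.
On the server side, we see a bunch of receives in chunks of <= 4K (regardless of the buffer size the sender is pushing). However, this also happens with other clients, such as netcat, which runs at full speed.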
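
To put the timings above on a common scale, the arithmetic can be sketched as follows. The helper name `throughput_mbps` is made up for illustration, and the sketch assumes 1 MB = 2^20 bytes and 1 Mbit = 10^6 bits; it is a back-of-the-envelope conversion, not a measurement.

```cpp
#include <cassert>

// Average throughput in megabits per second (assumes 1 Mbit = 1,000,000 bits).
// Hypothetical helper, not part of the original client.
double throughput_mbps(double bytes, double seconds) {
  assert(seconds > 0.0);
  return bytes * 8.0 / seconds / 1e6;
}

// Sizes used in the question, assuming 1 MB = 1,048,576 bytes.
const double kEightMB  = 8.0  * 1048576.0;   // 8,388,608 bytes
const double kTwentyMB = 20.0 * 1048576.0;   // 20,971,520 bytes
```

Under those assumptions, `throughput_mbps(kEightMB, 50.0)` comes to roughly 1.3 Mbit/s for this client, while `throughput_mbps(kTwentyMB, 7.0)` comes to roughly 24 Mbit/s for netcat, about 18 times faster over the same link.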

Any ideas? Thanks in advance for any tips.

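#include <winsock2.h>
#include <cstring>
#include <iostream>

using namespace std;

enum { bytecount = 8388608 };  // total number of bytes to send (8 MB)
enum { bufsz = 16*4096 };      // size of the buffer handed to each send() call

int main()
{
  // Initialize Winsock 2.2 and check that it actually succeeded.
  WSADATA wsaData;
  int err = WSAStartup(MAKEWORD(2,2), &wsaData);
  if (err != 0) {
    cerr << "WSAStartup: " << err << endl;
    return 1;
  }

  struct sockaddr_in sa;
  memset(&sa, 0, sizeof sa);
  sa.sin_family = AF_INET;
  sa.sin_port = htons(9898);
  // Note: newer Windows SDKs flag inet_addr as deprecated in favor of inet_pton.
  sa.sin_addr.s_addr = inet_addr("157.54.144.70");
  if (sa.sin_addr.s_addr == INADDR_NONE) {
    cerr << "inet_addr: invalid address" << endl;
    return 1;
  }

  // Fill the send buffer with a repeating byte pattern.
  char *blob = new char[bufsz];
  for (int i = 0; i < bufsz; ++i) blob[i] = (char) i;

  SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
  if (s == INVALID_SOCKET) {
    cerr << "socket: " << WSAGetLastError() << endl;
    return 1;
  }

  int res = connect(s, reinterpret_cast<sockaddr*>(&sa), sizeof sa);
  if (res != 0) {
    cerr << "connect: " << WSAGetLastError() << endl;
    return 1;
  }

  // send() may accept fewer bytes than requested, so advance by its return
  // value and never ask for more than what is left to send.
  int sent;
  for (int j = 0; j < bytecount; j += sent) {
    int chunk = (bytecount - j < bufsz) ? (bytecount - j) : bufsz;
    sent = send(s, blob, chunk, 0);
    if (sent == SOCKET_ERROR) {
      cerr << "send: " << WSAGetLastError() << endl;
      return 1;
    }
  }

  closesocket(s);
  delete[] blob;
  WSACleanup();

  return 0;
}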



无力看清 2024-08-15 04:59:38

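Here are some things you can do to get a better picture.

Check how much time is spent in the connect and send API calls, and see whether the connect call is the problem. You can do this with a profiler, but if your application is that slow you will be able to see it while debugging.
Try running Wireshark (formerly Ethereal) to dump the network traffic, so you can see whether the TCP packets are being transmitted with a delay. If responses come back quickly, the problem is confined to your system. If you see delays, it may be a routing/network problem.
Run "route print" to check how your PC routes traffic to the destination machine (157.54.144.70). You will be able to see whether a gateway is used and to check the priority of the different routes.
Try sending smaller chunks (that is, change "bufsz" to 1024). Is there a correlation between performance and buffer size?
Check whether an antivirus or firewall application is installed, and make sure it is turned off. You can also try running the same application in Safe Mode with Networking.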
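
The first suggestion, timing the connect and send calls, could be sketched like this. `time_call_ms` and `CallStats` are hypothetical helper names, not part of Winsock; the sketch uses only the standard `<chrono>` clock, so it does not depend on any particular profiler.

```cpp
#include <chrono>
#include <utility>

// Runs fn() once and returns the elapsed wall-clock time in milliseconds.
// Hypothetical helper for illustration.
template <typename Fn>
double time_call_ms(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<Fn>(fn)();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Accumulates per-call timings so that a few very slow calls stand out
// from the average.
struct CallStats {
  int calls = 0;
  double total_ms = 0.0;
  double max_ms = 0.0;

  void add(double ms) {
    ++calls;
    total_ms += ms;
    if (ms > max_ms) max_ms = ms;
  }
};
```

In the client, the send call would be wrapped as `stats.add(time_call_ms([&] { sent = send(s, blob, bufsz, 0); }));`. The same wrapper can be reused with different `bufsz` values to look for a correlation between buffer size and time per call.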


他不在意 2024-08-15 04:59:38

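The application looks fine, and you say it runs well on Linux.
I don't know whether this will help, but I would compare:
1) The MTU values of the Windows and Linux systems.
2) The TCP receive memory (receive buffer) size on Windows and on Linux.
3) Whether the network cards in both systems run at the same speed.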
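
For point 2, one way to read a socket's buffer sizes from code is `getsockopt` with `SO_RCVBUF` and `SO_SNDBUF`, which exist in both Winsock and the Linux sockets API. This is a minimal sketch: `socket_buffer_size`, `socket_t`, and `optlen_t` are hypothetical names, and on Windows it assumes `WSAStartup` has already succeeded. It reports only the per-socket buffer, which may not be the only receive-side setting that matters.

```cpp
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
typedef int optlen_t;
#else
#include <sys/socket.h>
typedef int socket_t;
typedef socklen_t optlen_t;
#endif

// Returns the size in bytes of the socket buffer selected by `which`
// (SO_RCVBUF or SO_SNDBUF), or -1 if the query fails.
// Hypothetical helper for illustration.
int socket_buffer_size(socket_t s, int which) {
  int value = 0;
  optlen_t len = sizeof value;
  if (getsockopt(s, SOL_SOCKET, which,
                 reinterpret_cast<char*>(&value), &len) != 0) {
    return -1;
  }
  return value;
}
```

Calling `socket_buffer_size(s, SO_RCVBUF)` on the receiver and `socket_buffer_size(s, SO_SNDBUF)` on the sender, on each operating system, gives numbers that can be compared directly.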


蓝海似她心 2024-08-15 04:59:38

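I watched the packets go by using Microsoft Network Monitor (netmon) with the nice TCP Analyzer visualizer, and it turned out that tons of packets were getting lost and needing to be retransmitted. Hence the slow speeds, because of retransmission timeouts (RTOs).

A colleague helped me debug this:

Well, from this trace on the receiver side, it definitely looks like some packets are not making it through to the receiver. I also see what appear to be some mangled packets (things like partial TCP headers, etc.) in these traces.
Even in the "good" trace (the receiver's view of the netcat client), I see some mangled packets (wrong TCP data length, etc.). The errors aren't as frequent as in the other trace, however.
Given that these machines are on the same subnet, there is no router in the way that could be dropping packets. That leaves the two NICs, the Ethernet cables, and the Ethernet switches. You could try to isolate the bad machine by adding a third machine into the mix and running the same test with the new machine replacing first the sender and then the receiver. Use a different physical port for the third machine. If either of the original machines has a switch between it and the floor jack, try removing that switch from the equation. You could also try an Ethernet crossover cable between the original two machines (or a different Ethernet switch that you plug the two machines into directly) and see if the problem persists.
Since the problem appears to depend on packet content, I doubt the problem is in the cabling. Given that the sender has an NVidia nForce chipset Ethernet and the receiver has a Broadcom Ethernet, my money is on the sender's NIC being the culprit. If it does seem to be the fault of a particular NIC, try turning off special features of the NIC such as checksum offloading or large-send offload.

I tried using a third box as the sender (identical to the original sender, a Shuttle XPC with an nForce chipset), and this worked smoothly: TCP Analyzer showed very smooth-running TCP sessions. This suggests to me that the problem was actually due to a buggy NIC/driver on the original sender box, or a bad Ethernet cable.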
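
To get a feel for how retransmission timeouts could account for the slowdown, here is a rough estimate. Everything numeric in it is an assumption: the loss-free transfer time is taken from netcat's rate in the question (20 MB in 7 s, so about 2.8 s for 8 MB), and the 0.3 s cost per timeout is purely illustrative. The sketch also ignores exponential backoff and fast retransmit, and the helper names are made up.

```cpp
#include <cassert>

// Seconds the transfer spent stalled, beyond what a loss-free run would need.
// Hypothetical helper for illustration.
double stalled_seconds(double observed_s, double loss_free_s) {
  assert(observed_s >= loss_free_s);
  return observed_s - loss_free_s;
}

// Number of timeouts needed to explain the stall if each one costs rto_s
// seconds. rto_s is an assumed value, not one measured in the traces.
double timeouts_needed(double stalled_s, double rto_s) {
  assert(rto_s > 0.0);
  return stalled_s / rto_s;
}
```

With 50 s observed, about 2.8 s loss-free, and 0.3 s per timeout, `timeouts_needed(stalled_seconds(50.0, 2.8), 0.3)` comes to roughly 157 timeouts. That is small compared with the several thousand packets an 8 MB transfer needs at a typical 1500-byte Ethernet MTU, so a modest loss rate would be enough to produce the slowdown if each loss stalls the connection for a full timeout.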

