How to tune TCP for high-performance one-way transmission?

Posted on 2024-10-19 12:47:40

My (network) clients each send 50 to 100 KB data packets to my server every 200 ms. There are up to 300 clients. The server sends nothing back to the clients. The server (dedicated) and the clients are on the same LAN. How can I tune the TCP configuration for better performance? The server runs Windows Server 2003 or 2008; the clients run Windows 2000 and up.

For example, the TCP window size: does changing this parameter help? Anything else? Are there any special socket options?

[EDIT]: Actually, in some modes the packets can be up to 5 MB.

Comments (3)

夜雨飘雪 2024-10-26 12:47:40

I did a study on this a couple of years ago with 1700 data points. The conclusion was that the single best thing you can do is configure an enormous socket receive buffer (e.g. 512 KB) at the receiver. Set it on the listening socket, so that it is inherited by the accepted sockets and is already in place while they are handshaking. That in turn allows TCP window scaling to be negotiated during the handshake, which lets the client learn of a window size greater than 64 KB. The enormous window basically lets the client transmit at the maximum possible rate, limited only by congestion avoidance rather than by a closed receive window.
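As a rough illustration of the ordering described above, here is a minimal Python sketch (not the original poster's code). It sets `SO_RCVBUF` on the listening socket before `listen()`, then reads the value back from an accepted socket. The helper names `make_listener` and `accepted_rcvbuf`, the loopback address, and the backlog of 128 are assumptions made for the example. The operating system may cap, round, or otherwise adjust the requested size, so the value read back is not guaranteed to equal the request.

```python
import socket

RCVBUF_BYTES = 512 * 1024  # the 512k figure suggested in the answer


def make_listener(host="127.0.0.1", port=0, rcvbuf=RCVBUF_BYTES):
    """Create a listening TCP socket with a large receive buffer.

    The buffer is requested before listen(), so it is already in place
    when the handshake for each incoming connection takes place.
    (Hypothetical helper; host, port and backlog are example values.)
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    srv.bind((host, port))
    srv.listen(128)
    return srv


def accepted_rcvbuf(srv):
    """Accept one connection and report the receive buffer size it reports."""
    conn, _addr = srv.accept()
    try:
        return conn.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        conn.close()
```

Comparing the value returned by `accepted_rcvbuf` with the one requested is a quick way to see what the platform actually granted.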

老旧海报 2024-10-26 12:47:40

What OS?
IPv4 or v6?
Why such a large dump? Why can't it be broken down?

Assuming a solid, stable, low bandwidth-delay product, you can adjust things like in-flight sizing, initial window size, and MTU (depending on the data, IP version, and mode [TCP/UDP]).
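For readers unfamiliar with the term, the bandwidth-delay product is simply link bandwidth multiplied by round-trip time, and it gives the amount of data that has to be in flight to keep the link busy. The sketch below works it out in Python. The gigabit link speed and the round-trip times are assumed example figures, not measurements from the question, and the function names are hypothetical.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_us):
    """Bandwidth-delay product in bytes (integer arithmetic).

    bandwidth_bits_per_s: link speed in bits per second
    rtt_us: round-trip time in microseconds
    """
    return bandwidth_bits_per_s * rtt_us // 8_000_000


def needs_window_scaling(bdp):
    """True if the BDP exceeds the 65535-byte unscaled TCP window limit."""
    return bdp > 65535


# Assumed example: 1 Gbit/s LAN with a 0.5 ms round trip
lan_bdp = bdp_bytes(1_000_000_000, 500)     # 62500 bytes
# Assumed example: same link with a 1 ms round trip
slow_bdp = bdp_bytes(1_000_000_000, 1000)   # 125000 bytes
```

Under these assumed figures, the first case fits within an unscaled 64 KB window and the second does not.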

You could also round-robin or balance the inputs, so that less time is spent on interrupts from the NIC. Binding is an option as well.

5 MB per /packet/? That's a pretty poor design. I would think it would lead to a lot of segment retransmissions, and a LOT of kernel/stack memory being used for sequence reconstruction and retransmits (accept wait time, etc.).
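On the question of breaking the data down: TCP delivers a byte stream, so a large application-level "packet" is carried as many segments in any case, and the receiver can read it in bounded chunks. The following Python sketch shows one common way to do that with a length prefix. The 4-byte header format, the 64 KB chunk size, and all function names are assumptions for illustration, not something taken from the question.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # assumed format: 4-byte big-endian length prefix
CHUNK = 64 * 1024             # assumed upper bound per recv call


def send_message(sock, payload):
    """Send one length-prefixed message; the stack splits it into segments."""
    sock.sendall(HEADER.pack(len(payload)))
    sock.sendall(payload)


def recv_exact(sock, n):
    """Read exactly n bytes, at most CHUNK bytes per recv_into call."""
    buf = bytearray(n)
    view = memoryview(buf)
    got = 0
    while got < n:
        r = sock.recv_into(view[got:], min(CHUNK, n - got))
        if r == 0:
            raise ConnectionError("peer closed before the message was complete")
        got += r
    return bytes(buf)


def recv_message(sock):
    """Read the length prefix, then the message body."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def loopback_roundtrip(payload):
    """Demo helper: send payload through a local socket pair and read it back."""
    a, b = socket.socketpair()
    try:
        sender = threading.Thread(target=send_message, args=(a, payload))
        sender.start()
        received = recv_message(b)
        sender.join()
        return received
    finally:
        a.close()
        b.close()
```

The sender runs in its own thread in the demo because a large `sendall` blocks until the receiver drains the buffer.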

(Is that even possible?)

寄风 2024-10-26 12:47:40

Since all clients are on the LAN, you might try enabling "jumbo frames" (you need to run a netsh command for that; I would have to google the precise command, but there are plenty of how-tos).
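To give a feel for what jumbo frames change, this small Python sketch counts the TCP segments needed for one 100 KB message at a standard 1500-byte MTU and at a 9000-byte jumbo MTU. It assumes IPv4 with 40 bytes of IP plus TCP headers and no TCP options, and `segments_needed` is a hypothetical helper name; real numbers will differ slightly when options such as timestamps are in use.

```python
def segments_needed(message_bytes, mtu, header_bytes=40):
    """Number of full-or-partial TCP segments needed for one message.

    header_bytes=40 assumes a 20-byte IPv4 header and a 20-byte TCP
    header with no options.
    """
    mss = mtu - header_bytes              # payload bytes per segment
    return -(-message_bytes // mss)       # ceiling division


standard = segments_needed(100 * 1024, 1500)  # 71 segments
jumbo = segments_needed(100 * 1024, 9000)     # 12 segments
```

Fewer segments per message means fewer frames for the NIC and the stack to process, which is the benefit the suggestion is aiming at.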

At the application layer, you could use TransmitFile, which is the Windows equivalent of sendfile and works very well under Windows Server 2003 (it is artificially rate-limited on non-server editions, but that won't be a problem for you). Note that you can use a memory-mapped file if you generate the data on the fly.

As for tuning parameters, increasing the send buffer will likely not give you any benefit, though increasing the receive buffer may help in some cases, because it reduces the likelihood of packets being dropped when the receiving application does not handle the incoming data fast enough. A bigger TCP window size (a registry setting) may help, as it allows the sender to send more data before having to block until ACKs arrive.

Raising the program's working set quota may be worth considering. It costs you nothing and may be an advantage, since the kernel needs to lock pages when sending them. Being allowed to have more pages locked might make things faster (or it might not, but it won't hurt either; the defaults are ridiculously low anyway).
