Sockets stuck in CLOSE_WAIT state

Posted 2024-12-19 21:07:07

I am getting sockets stuck in close_wait when two of my daemons speak to each other. After having read different questions and blog entries on the subject, I have verified that I am closing the socket from both sides (originator and receiver).

The model goes as follows:

Sender:
establish connection, send data, wait for confirmation, close connection

Receiver:
receive connection, read data, send confirmation, close connection

Can anyone tell me what I'm doing wrong? Note: I am using close() to close the connections right now. I have tried using shutdown as well and it hasn't changed things. Any hints would be greatly appreciated.

EDIT:
Shortly after closing the socket, the receiving daemon forks. I have tried passing the file descriptor to the function that forks and explicitly closing it again in the child process, but this did not fix my problem. Is there any other way that forking could affect this process? Note that the sending daemon does not fork.
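
For reference, a rough sketch of the receiver flow described above (serve_one, the single recv() framing and the "OK" confirmation are placeholders, not the actual daemon code). The point it illustrates is that after fork() every open descriptor exists in both processes, and a TCP socket only leaves CLOSE_WAIT once the last copy of its fd is closed:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Placeholder sketch of the receiver: accept, read, acknowledge,
     * close, then fork. */
    static void serve_one(int listen_fd)
    {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0) {
            perror("accept");
            return;
        }

        char buf[4096];
        ssize_t n = recv(conn_fd, buf, sizeof buf, 0);   /* read data         */
        if (n > 0)
            send(conn_fd, "OK", 2, 0);                   /* send confirmation */

        if (close(conn_fd) < 0)                          /* close in the parent */
            perror("close(conn_fd)");

        pid_t pid = fork();
        if (pid == 0) {
            /* Child: it inherits listen_fd (and conn_fd, had it still been
             * open).  Anything not closed here keeps the underlying socket
             * alive on the child's behalf. */
            close(listen_fd);
            /* ... child work ... */
            _exit(0);
        }
        /* parent continues its accept loop */
    }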

Comments (4)

绿萝 2024-12-26 21:07:07

After looking in wireshark, I saw that the final FIN_ACK said:

"[TCP ACKed lost segment] [TCP previous segment lost] ..."

It turns out that my problem was caused by having both daemons running on the same box (something we had added for testing). After trying again on multiple boxes, we no longer get this problem.

一袭水袖舞倾城 2024-12-26 21:07:07

When an application has an open socket and, after some sends and receives, it receives a FIN from its peer, the socket goes into the CLOSE_WAIT state. It can remain in that state forever until you explicitly call close(). Make sure you are actually passing the right FD to close().
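
In code, the peer's FIN is what recv() returning 0 means; a minimal sketch (assuming conn_fd is the connected socket) of draining the connection and leaving CLOSE_WAIT:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* recv() returning 0 means the peer sent FIN.  From that moment the
     * local end sits in CLOSE_WAIT and only leaves it when close() is
     * called on the right descriptor. */
    void drain_and_close(int conn_fd)
    {
        char buf[4096];
        ssize_t n;

        while ((n = recv(conn_fd, buf, sizeof buf, 0)) > 0)
            ;                        /* consume or discard the data */

        if (n < 0)
            perror("recv");

        if (close(conn_fd) < 0)      /* this is what ends CLOSE_WAIT */
            perror("close");
    }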

一紙繁鸢 2024-12-26 21:07:07

In my (short) experience, it's very possible that you're closing the wrong fd, or even not reaching the "close" statement at all. I stumbled upon the latter, and the first clue was that my application became a zombie instead of closing (specifically, a simple printf right before the close statement made it all go to hell).

Might be worth your time to check the task manager / jobs / system monitor / whatever process view is relevant to your OS.
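
One cheap way to check both suspicions (the close never runs, or it runs on the wrong fd) is a logging wrapper around close(); a hypothetical sketch:

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical helper: log which descriptor is being closed and whether
     * close() succeeded, so the daemon's log makes it obvious when the close
     * statement is never reached or is given the wrong fd. */
    static int close_logged(int fd, const char *tag)
    {
        int rc = close(fd);

        if (rc < 0)
            fprintf(stderr, "close(%d) [%s] failed: %s\n", fd, tag, strerror(errno));
        else
            fprintf(stderr, "close(%d) [%s] ok\n", fd, tag);

        return rc;
    }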

朮生 2024-12-26 21:07:07

Actually these are quite common problems in multi-threaded server applications.
There are two things you could do to resolve this problem:

  1. Set FD_CLOEXEC on the sockets.
  2. Use setsockopt() to enable SO_KEEPALIVE on the sockets.

The implementation of both of the above can look a little different on *NIX and on Windows, but the difference is only in the API details, not in the idea.

I would recommend implementing both of the above measures.
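
A minimal POSIX sketch of both measures, assuming fd is an already created TCP socket:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/socket.h>

    /* Mark the socket close-on-exec and enable TCP keepalive.
     * Note: FD_CLOEXEC closes the descriptor on exec*(), not on fork();
     * a forked child that never execs still has to close inherited
     * sockets itself. */
    static int harden_socket(int fd)
    {
        int flags = fcntl(fd, F_GETFD);
        if (flags < 0 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0) {
            perror("fcntl(FD_CLOEXEC)");
            return -1;
        }

        int on = 1;
        if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0) {
            perror("setsockopt(SO_KEEPALIVE)");
            return -1;
        }
        return 0;
    }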

However, if you cannot modify the code, you could use libkeepalive.
