TcpListener stops accepting connections or accepts already-disconnected ones

Posted 2024-11-06 18:35:59 · 675 words · 4 views · 0 comments

We are currently experiencing a problem with a self-written server application running on Windows (it occurs on different versions). The server listens on a TCP port, accepts connections, exchanges some data and then closes the connections again. About 100 clients connect from time to time.

Sometimes the server stops working: the log files show that connections are still accepted, but a socket error (10054 - Connection reset by peer) occurs at the first read attempt. I don't think it is a client issue, because it suddenly stops working for all clients.

We have now found out that the same problem occurs with our old server software, which is even written in another programming language. So it doesn't seem to be an error in our program. I think it has to be some kind of OS / firewall issue? Of course, firewalls have been deactivated, but that hasn't solved the issue yet.

Any ideas where to look? Wireshark logs will follow soon.

Excerpt from the log (Timestamp, Thread Id, message)

11:37:56.137 T#3960 Connection from 10.21.13.3
11:37:56.138 T#3960 Client Exception: Socket Error # 10054
Connection reset by peer.
11:37:56.138 T#3960 ClientDisconnected
11:38:00.294 T#4144 Connection from 10.21.13.3

You can see that the exception occurs at almost the same time as the connection is accepted. In this case the client reconnects after a few seconds.

Comments (6)

诗化ㄋ丶相逢 2024-11-13 18:35:59

A "stateful" firewall or NAT keeps track of connections, and ought to send RSTs for connections it doesn't know about. If the firewall loses track of connections for some reason, then you'll probably see random connections being reset.

Our router at work does this — it forgets about connections when the PPP connection dies, which is remarkably unhelpful when it rains and the DSL restart takes a bit too long. However, instead of resetting connections, it just drops packets (even more unhelpful!).

痴意少年 2024-11-13 18:35:59

Sounds like a firewall or routing issue - maybe stale connections get disconnected after a timeout period. Are you using a ping/keepalive inside your protocol?
Otherwise, you could use Wireshark to see what is going on.
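
As a rough illustration of the keepalive suggestion above, here is a minimal Python sketch that turns on TCP-level keepalive for a socket. The server in the question is not written in Python, and the helper name `enable_keepalive` is hypothetical. Probe timing is left at the operating system's defaults, which are often long, so an application-level ping may still be needed.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to probe an idle TCP connection so a dead peer is noticed.

    Hypothetical helper for illustration; probe interval and count are
    left at the operating system's defaults.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        enable_keepalive(s)
        # A non-zero value means keepalive is enabled on this socket.
        print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

On the server side the same call would be applied to each socket returned by `accept()`.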

旧话新听 2024-11-13 18:35:59

First, thanks for the many hints - I'm afraid the problem was a completely different one, which you couldn't possibly have solved by reading my question.

The server application uses log4net, configured with a log file and ImmediateFlush = true. If every log statement is written directly to the file and multiple socket connections occur, this slows down the whole application.
The server needed about a minute to really accept the connection. This was far longer than the timeout on the client side. So the log only showed "accepted" followed by "disconnected" - even the log was delayed!

Sorry for the inconvenience...
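
One common way to keep file writes from stalling the accept path is to hand log records to a background writer. The server here uses log4net, so the following is not its configuration. It is only a Python sketch of the general idea, using the standard library's `QueueHandler` and `QueueListener`. The function name `make_queued_logger`, the logger name, the log path and the format string are all assumptions made for illustration.

```python
import logging
import logging.handlers
import queue


def make_queued_logger(name: str, path: str):
    """Return (logger, listener); the file is written on the listener's thread.

    Hypothetical helper: connection-handling threads only put records on an
    in-memory queue, so a slow disk does not delay accept() or recv().
    """
    log_queue = queue.Queue()  # unbounded queue between handlers and writer

    file_handler = logging.FileHandler(path, encoding="utf-8")
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s T#%(thread)d %(message)s")
    )

    # The listener drains the queue on its own thread and writes to the file.
    listener = logging.handlers.QueueListener(log_queue, file_handler)
    listener.start()

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    return logger, listener


if __name__ == "__main__":
    logger, listener = make_queued_logger("tcp_server", "server.log")
    logger.info("Connection from %s", "10.21.13.3")
    logger.info("ClientDisconnected")
    listener.stop()  # processes the remaining records, then stops the thread
```

The trade-off is that records still in the queue are lost if the process dies abruptly, which is the usual price of not flushing on every statement.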

自控 2024-11-13 18:35:59

Have you tried changing the backlog and then seeing how much time passes, or how many clients are served, before this problem occurs?
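
For readers unfamiliar with the term, the backlog is the length of the queue of pending connections that the OS holds for a listening socket until the application calls accept. The Python sketch below shows where that value is set. The helper name `open_listener` and the value 128 are assumptions for illustration, and the operating system may silently cap whatever is requested.

```python
import socket


def open_listener(host: str, port: int, backlog: int) -> socket.socket:
    """Create a listening TCP socket with an explicit backlog (hypothetical helper)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    # backlog is a hint for the pending-connection queue; the OS may cap it.
    srv.listen(backlog)
    return srv


if __name__ == "__main__":
    srv = open_listener("127.0.0.1", 0, 128)  # port 0: let the OS pick a free port
    print("listening on", srv.getsockname())
    srv.close()
```

Trying a few different backlog values while the server is under load would show whether the pending queue plays any role in the symptom.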

べ映画 2024-11-13 18:35:59

You don't say what Windows versions you're using for the server, but you should be aware that the Windows TCP/IP stack behaves differently in server and client OSes. There are limits on how many simultaneous incoming connections a client OS will allow, and they are significantly lower than you might expect.

一张白纸 2024-11-13 18:35:59

What do the logs look like from the client side?

The error states that the client is dropping the connection. If you see the same error on the client side, then a firewall or proxy is dropping the connection (both sides seeing the opposite side drop the connection is indicative of a proxy or firewall).

If the error is not present on the client side, then I would say that the client side is where you will see the actual error.
