ObjectInputStream.readObject()在socket通信过程中永远挂起

发布于 2024-12-26 10:24:58 字数 3120 浏览 3 评论 0原文

我在linux系统上遇到了一个socket通信的问题,通信过程如下:客户端发送消息要求服务器执行计算任务,任务完成后等待服务器返回结果消息。

但是如果任务花费的时间较长,比如40分钟左右,客户端就会挂起等待结果消息,即使从服务器端来说,结果消息已经写入socket来响应客户端,但通常可以如果任务花费的时间很少,例如一分钟,则接收结果消息。另外,该问题仅发生在客户环境中,在我们的测试环境中通信过程正常。

我怀疑这个问题的原因是客户环境和测试环境之间的套接字默认超时值不同,但是这两个环境以及客户端和服务器上的以下值是相同的。

getSoTimeout:0
getReceiveBufferSize:43690
getSendBufferSize:8192
getSoLinger:-1
getTrafficClass:0
getKeepAlive:false
getTcpNoDelay:false

客户端上的代码是这样的:

Message msg = null;
ObjectInputStream in = client.getClient().getInputStream();
//if no message readObject() will hang here
while ( true ) {
  try {
   Object recObject = in.readObject();
   System.out.println("Client received msg.");
   msg = (Message)recObject; 
   return msg;
       }catch (Exception e) {
    e.printStackTrace();
    return null;
   }
}

服务器上的代码是这样的,

ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
  MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
  socketOutStream.writeObject(msgJobComplete);
  }catch(Exception e) {
    e.printStackTrace();
  }

为了解决这个问题,我添加了刷新和重置方法,但问题仍然存在:

ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
   MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
   socketOutStream.flush();
   logger.debug("AbstractJob#reply to the socket");
   socketOutStream.writeObject(msgJobComplete);
   socketOutStream.reset();
   socketOutStream.flush();
   logger.debug("AbstractJob#after Flush Reply");
 }catch(Exception e) {
    e.printStackTrace();
            logger.error("Exception when sending MessageJobComplete."+e.getMessage());
 }

所以有人知道我下一步应该做什么来解决这个问题问题。 我猜测是环境设置的原因,但我不知道环境因素会影响socket通信?

而使用Tcp/Ip协议进行通信的socket,问题与长时间任务有关,那么tcp的什么值会影响socket通信的超时时间呢?

经过对日志的分析,我发现消息写入套接字后,没有抛出/捕获异常。但总是在15分钟后,服务器端用于接受客户端请求的objectInputStream.readObject()代码片段中出现异常。然而socket.getSoTimeout的值为0,所以很奇怪的是抛出了Timed out Exception。

{2012-01-09  17:44:13,908} ERROR java.net.SocketException: Connection timed out
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:146)
   at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
   at sun.security.ssl.InputRecord.read(InputRecord.java:350)
   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:809)
   at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:766)
   at sun.security.ssl.AppInputStream.read(AppInputStream.java:94)
   at sun.security.ssl.AppInputStream.read(AppInputStream.java:69)
   at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
   at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
   at  java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)

那么为什么会抛出连接超时异常呢?

I have encountered a problem of socket communication on linux system, the communication process is like below: client send a message to ask the server to do a compute task, and wait for the result message from server after the task completes.

But the client would hangs up to wait for the result message if the task costs a long time such as about 40 minutes even though from the server side, the result message has been written to the socket to respond to the client, but it could normally receive the result message if the task costs little time, such as one minute. Additionally, this problem only happens on customer environment, the communication process behaves normally in our testing environment.

I have suspected the cause to this problem is the default timeout value of socket is different between customer environment and testing environment, but the follow values are identical on these two environment, and both Client and server.

getSoTimeout:0
getReceiveBufferSize:43690
getSendBufferSize:8192
getSoLinger:-1
getTrafficClass:0
getKeepAlive:false
getTcpNoDelay:false

the codes on CLient are like:

Message msg = null;
ObjectInputStream in = client.getClient().getInputStream();
//if no message readObject() will hang here
while ( true ) {
  try {
   Object recObject = in.readObject();
   System.out.println("Client received msg.");
   msg = (Message)recObject; 
   return msg;
       }catch (Exception e) {
    e.printStackTrace();
    return null;
   }
}

the codes on server are like,

ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
  MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
  socketOutStream.writeObject(msgJobComplete);
  }catch(Exception e) {
    e.printStackTrace();
  }

in order to solve this problem, i have added the flush and reset method, but the problem still exists:

ObjectOutputStream socketOutStream = getSocketOutputStream();
try {
   MessageJobComplete msgJobComplete = new MessageJobComplete(reportFile, outputFile );
   socketOutStream.flush();
   logger.debug("AbstractJob#reply to the socket");
   socketOutStream.writeObject(msgJobComplete);
   socketOutStream.reset();
   socketOutStream.flush();
   logger.debug("AbstractJob#after Flush Reply");
 }catch(Exception e) {
    e.printStackTrace();
            logger.error("Exception when sending MessageJobComplete."+e.getMessage());
 }

so do anyone knows what the next steps i should do to solve this problem.
I guess the cause is the environment setting, but I do not know what the environment factors would affect the socket communication?

And the socket using the Tcp/Ip protocal to communicate, the problem is related with the long time task, so what values about tcp would affect the timeout of socket communication?

After my analysis about the logs, i found after the message are written to the socket, there were no exceptions are thrown/caught. But always after 15 minutes, there are exceptions in the objectInputStream.readObject() codes snippet of Server Side which is used to accept the request from client. However, socket.getSoTimeout value is 0, so it is very strange that the a Timed out Exception was thrown.

{2012-01-09  17:44:13,908} ERROR java.net.SocketException: Connection timed out
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:146)
   at sun.security.ssl.InputRecord.readFully(InputRecord.java:312)
   at sun.security.ssl.InputRecord.read(InputRecord.java:350)
   at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:809)
   at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:766)
   at sun.security.ssl.AppInputStream.read(AppInputStream.java:94)
   at sun.security.ssl.AppInputStream.read(AppInputStream.java:69)
   at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2265)
   at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2558)
   at  java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2568)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1314)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:368)

so why the Connection Timed out exceptions are thrown?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

起风了 2025-01-02 10:24:58

这个问题就解决了。使用 tcpdump 捕获消息流。我发现在应用程序级别,ObjectOutputStream.writeObject()方法被调用,在tcp级别,多次发现[TCP ReTransmission]

因此,我得出结论,连接可能已失效,尽管使用 netstat -an 命令,tcp 连接状态仍然是ESTABLISHED

所以我编写了一个测试应用程序来定期发送测试消息作为来自服务器的心跳消息。然后这个问题就消失了。

This problem is solved. using the tcpdump to capture the messages flows. I have found that while in the application level, ObjectOutputStream.writeObject() method was invoked, in the tcp level, many times [TCP ReTransmission] were found.

So, I concluded that the connection is possibly be dead, although using the netstat -an command the tcp connection state still was ESTABLISHED.

So I wrote a testing application to periodically sent Testing messages as the heart-beating messages from the Server. Then this problem disappeared.

聚集的泪 2025-01-02 10:24:58

java.io.InputStreamread() 方法是阻塞调用,这意味着如果它们在以下情况被调用,它们将“永远”等待流中没有数据可供读取。

如果服务器没有响应,这完全是预期的行为,并且根据 javadoc 中发布的合同。

如果您想要非阻塞读取,请使用 java.nio.* 类。

The read() methods of java.io.InputStream are blocking calls., which means they wait "forever" if they are called when there is no data in the stream to read.

This is completely expected behaviour and as per the published contract in javadoc if the server does not respond.

If you want a non-blocking read, use the java.nio.* classes.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文