Java Solaris NIO OP_CONNECT problem

Posted on 2024-11-17 23:36:34

I have a Java client that connects to a C++ server over TCP sockets using Java NIO. This works under Linux, AIX and HP/UX, but under Solaris the OP_CONNECT event never fires.

Further details:

Selector.select() is returning 0, and the 'selected key set' is empty.
The issue only occurs when connecting to the local machine (via loopback or ethernet interface), but works when connecting to a remote machine.
I have confirmed the issue on two different Solaris 10 machines: a physical SPARC and a virtual x64 (VMware), using both JDK versions 1.6.0_21 and 1.6.0_26.

Here is some test code which demonstrates the issue:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class NioTest3
{
    public static void main(String[] args)
    {
        int i, tcount = 1, open = 0;
        String[] addr = args[0].split(":");
        int port = Integer.parseInt(addr[1]);
        if (args.length == 2)
            tcount = Integer.parseInt(args[1]);
        InetSocketAddress inetaddr = new InetSocketAddress(addr[0], port);
        try
        {
            Selector selector = Selector.open();
            SocketChannel channel;
            for (i = 0; i < tcount; i++)
            {
                channel = SocketChannel.open();
                channel.configureBlocking(false);
                channel.register(selector, SelectionKey.OP_CONNECT);
                channel.connect(inetaddr);
            }
            open = tcount;
            while (open > 0)
            {
                int selected = selector.select();
                System.out.println("Selected=" + selected);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext())
                {
                    SelectionKey key = it.next();
                    it.remove();
                    channel = (SocketChannel)key.channel();
                    if (key.isConnectable())
                    {
                        System.out.println("isConnectable");
                        if (channel.finishConnect())
                        {
                            System.out.println(formatAddr(channel) + " connected");
                            key.interestOps(SelectionKey.OP_WRITE);
                        }
                    }
                    else if (key.isWritable())
                    {
                        System.out.println(formatAddr(channel) + " isWritable");
                        String message = formatAddr(channel) + " the quick brown fox jumps over the lazy dog";
                        ByteBuffer buffer = ByteBuffer.wrap(message.getBytes());
                        channel.write(buffer);
                        key.interestOps(SelectionKey.OP_READ);
                    }
                    else if (key.isReadable())
                    {
                        System.out.println(formatAddr(channel) + " isReadable");
                        ByteBuffer buffer = ByteBuffer.allocate(1024);
                        channel.read(buffer);
                        buffer.flip();
                        byte[] bytes = new byte[buffer.remaining()];
                        buffer.get(bytes);
                        String message = new String(bytes);
                        System.out.println(formatAddr(channel) + " read: '" + message + "'");
                        channel.close();
                        open--;
                    }
                }
            }

        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
    
    static String formatAddr(SocketChannel channel)
    {
        return Integer.toString(channel.socket().getLocalPort());
    }
}

You can run this using the command line:

java -cp . NioTest3 <ipaddr>:<port> <num-connections>

Where port should be 7 if you are running against a real echo service; i.e.:

java -cp . NioTest3 127.0.0.1:7 5

If you cannot get a real echo service running then the source to one is here. Compile the echo server under Solaris with:

$ cc -o echoserver echoserver.c -lsocket -lnsl

and run it like this:

$ ./echoserver 8007 > out 2>&1 &

This has been reported to Sun as a bug.

緦唸λ蓇 2024-11-24 23:36:34

Your bug report has been closed as 'not a bug', with an explanation. You are ignoring the result of connect(), which if true means that OP_CONNECT will never fire, because the channel is already connected. You only need the whole OP_CONNECT/finishConnect() megillah if it returns false. So you shouldn't even register OP_CONNECT unless connect() returns false, let alone register it before you've even called connect().

Further remarks:

Under the hood, OP_CONNECT and OP_WRITE are the same thing, which explains part of it.

As you have a single thread for this, the workaround would be to do the connect in blocking mode, then switch to non-blocking for the I/O.
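
The blocking-connect workaround described above might look like the following minimal sketch. The class name `BlockingConnectSketch`, the helper `connectThenRegister()` and the local listener in `main()` are assumptions made for illustration; they are not part of the original code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class BlockingConnectSketch
{
    // Hypothetical helper: connects in blocking mode, then switches the channel
    // to non-blocking mode and registers it with the selector for the given ops.
    static SelectionKey connectThenRegister(Selector selector, InetSocketAddress addr, int ops)
        throws IOException
    {
        SocketChannel channel = SocketChannel.open(); // new channels start in blocking mode
        try
        {
            channel.connect(addr);                    // returns only once connected, or throws
            channel.configureBlocking(false);         // register() requires non-blocking mode
            return channel.register(selector, ops);   // OP_CONNECT is never needed here
        }
        catch (IOException e)
        {
            channel.close();                          // connection failed: release the socket
            throw e;
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Assumed local listener so the sketch runs without an external echo service.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        InetSocketAddress addr = new InetSocketAddress("127.0.0.1", server.socket().getLocalPort());

        Selector selector = Selector.open();
        SelectionKey key = connectThenRegister(selector, addr, SelectionKey.OP_WRITE);
        System.out.println("connected=" + ((SocketChannel) key.channel()).isConnected());

        key.channel().close();
        selector.close();
        server.close();
    }
}
```

The trade-off is that the connecting thread stalls for the duration of each connect, which is only acceptable when a single thread owns all of the connections anyway.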

Are you doing the select() after registering the channel with the Selector?

The correct way of handling non-blocking connect is as follows:

channel.configureBlocking(false);
if (!channel.connect(...))
{
    channel.register(sel, SelectionKey.OP_CONNECT, ...); // ... is the attachment, or absent
}
// else channel is connected, maybe register for OP_READ ...
// select() loop runs ...
// Process the ready keys ...
if (key.isConnectable())
{
  if (channel.finishConnect())
  {
     key.interestOps(0); // or SelectionKey.OP_READ or OP_WRITE, whatever is appropriate
  }
}
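
For reference, here is one way the fragment above could be filled out into something runnable. This is only a sketch: the class `NonBlockingConnectSketch`, the helpers `startConnect()` and `awaitConnected()`, and the local listener in `main()` are hypothetical names introduced here, not part of the original code or of any library.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NonBlockingConnectSketch
{
    // Starts a non-blocking connect. OP_CONNECT is registered only when connect()
    // returned false; otherwise the channel is already connected and is registered
    // directly for the ops the caller wants next.
    static SelectionKey startConnect(Selector selector, InetSocketAddress addr, int opsWhenConnected)
        throws IOException
    {
        SocketChannel channel = SocketChannel.open();
        try
        {
            channel.configureBlocking(false);
            boolean connected = channel.connect(addr);
            return channel.register(selector, connected ? opsWhenConnected : SelectionKey.OP_CONNECT);
        }
        catch (IOException e)
        {
            channel.close();
            throw e;
        }
    }

    // Runs the select loop until the key's channel is connected.
    // finishConnect() returning false means "not finished yet"; an exception means failure.
    static void awaitConnected(Selector selector, SelectionKey key, int opsWhenConnected)
        throws IOException
    {
        SocketChannel target = (SocketChannel) key.channel();
        while (!target.isConnected())
        {
            selector.select();
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext())
            {
                SelectionKey ready = it.next();
                it.remove();
                if (ready.isConnectable() && ((SocketChannel) ready.channel()).finishConnect())
                {
                    ready.interestOps(opsWhenConnected); // stop selecting on OP_CONNECT
                }
            }
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Assumed local listener so the sketch runs without an external echo service.
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
        InetSocketAddress addr = new InetSocketAddress("127.0.0.1", server.socket().getLocalPort());

        Selector selector = Selector.open();
        SelectionKey key = startConnect(selector, addr, SelectionKey.OP_WRITE);
        awaitConnected(selector, key, SelectionKey.OP_WRITE);
        System.out.println("connected=" + ((SocketChannel) key.channel()).isConnected());

        key.channel().close();
        selector.close();
        server.close();
    }
}
```

Because both outcomes of connect() are handled, the same code should behave identically whether the local stack completes a loopback connect immediately or reports it as still in progress.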

A few non-exhaustive comments after reviewing your extended code:

Closing a channel cancels the key. You don't need to do both.
The non-static removeInterest() method is incorrectly implemented.
TYPE_DEREGISTER_OBJECT also closes the channel. Not sure if that is what you really intended. I would have thought it should just cancel the key, and there should be a separate operation for closing the channel.
You have gone way overboard on the small methods and exception handling. addInterest() and removeInterest() are good examples. They catch exceptions, log them, then proceed as though the exception hadn't happened, when all they actually do is set or clear a bit: one line of code. And on top of that many of them have both static and non-static versions. Same goes for all the little methods that call key.cancel(), channel.close(), etc. There is no point to all this, it is just clocking up lines of code. It only adds obscurity and makes your code harder to understand. Just do the operation required inline and have a single catcher at the bottom of the select loop.
If finishConnect() returns false it isn't a connection failure, it just hasn't completed yet. If it throws an Exception, that is a connection failure.
You are registering for OP_CONNECT and OP_READ at the same time. This doesn't make sense, and it may cause problems. There is nothing to read until OP_CONNECT has fired. Just register for OP_CONNECT at first.
You are allocating a ByteBuffer per read. This is very wasteful. Use the same one for the life of the connection.
You are ignoring the result of read(). It can be zero. It can be -1, indicating EOS, on which you must close the channel. You are also assuming you will get an entire application message in a single read. You can't assume that. That's another reason why you should use a single ByteBuffer for the life of the connection.
You are ignoring the result of write(). It could be less than buffer.remaining() was when you called it. It can be zero.
You could simplify this a lot by making the NetSelectable the key attachment. Then you could do away with several things, including for example the channel map, and the assert, because the channel of the key must always be equal to the channel of the attachment of the key.
I would also definitely move the finishConnect() code to the NetSelector, and have connectEvent() just be a success/failure notification. You don't want to spread this kind of stuff around. Do the same with readEvent(), i.e. do the read itself in NetSelector, with a buffer supplied by the NetSelectable, and just notify the NetSelectable of the read result: count or -1 or exception. Ditto on write: if the channel is writable, get something to write from the NetSelectable, write it in the NetSelector, and notify the result. You could have the notification callbacks return something to indicate what to do next, e.g. close the channel.
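
The points about read() and write() results and about keeping one ByteBuffer per connection can be illustrated with a small sketch. Everything named here is an assumption for illustration: the class `ChannelIoSketch`, its three helpers, and in particular the newline-delimited message framing, which the original protocol may not use.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.Charset;

public class ChannelIoSketch
{
    // Reads into the connection's long-lived buffer. The count may be 0 or only part
    // of a message. Returns true on end of stream (-1), when the caller must close.
    static boolean readInto(ReadableByteChannel channel, ByteBuffer connectionBuffer)
        throws IOException
    {
        int count = channel.read(connectionBuffer);
        return count == -1;
    }

    // Assumed framing: each message ends with '\n'. Returns one complete message,
    // or null if the buffer does not hold a whole message yet. Leftover bytes stay
    // in the buffer for the next read, which is why the buffer must outlive one read.
    static String nextMessage(ByteBuffer connectionBuffer)
    {
        connectionBuffer.flip();                      // switch from filling to draining
        try
        {
            for (int i = 0; i < connectionBuffer.limit(); i++)
            {
                if (connectionBuffer.get(i) == '\n')
                {
                    byte[] bytes = new byte[i];
                    connectionBuffer.get(bytes);      // the message body
                    connectionBuffer.get();           // consume the delimiter
                    return new String(bytes, Charset.forName("US-ASCII"));
                }
            }
            return null;
        }
        finally
        {
            connectionBuffer.compact();               // keep the remainder, back to filling
        }
    }

    // Writes as much as the channel accepts, which may be less than remaining() or 0.
    // Returns true once the buffer is drained; otherwise keep OP_WRITE interest set
    // and call again when the key is next writable.
    static boolean writeFrom(WritableByteChannel channel, ByteBuffer pending)
        throws IOException
    {
        channel.write(pending);
        return !pending.hasRemaining();
    }
}
```

In a select loop, a true result from readInto() would lead to channel.close(), and a false result from writeFrom() would leave OP_WRITE in the key's interest set instead of switching straight to OP_READ.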

But really this is all five times as complex as it needs to be, and the fact that you have this bug proves it. Simplify your head.

你是我的挚爱i 2024-11-24 23:36:34

I have worked around this bug using the following:

If Selector.select() returns 0 (and didn't time out, if the timeout version was used) then:

Iterate over the keys registered with the selector via selector.keys().iterator() (remembering not to call iterator.remove()).
If OP_CONNECT interest has been set on the key, call channel.finishConnect() and do whatever would have been done if isConnectable() had returned true.

For example:

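// Fragment from inside the select loop; selected, elapsed, timeout, keyIter, key
// and channel are declared elsewhere.
if (selected == 0 && elapsed < timeout)
{
    // select() returned without selecting anything and without timing out:
    // poll every registered key that is still waiting for OP_CONNECT.
    keyIter = selector.keys().iterator();
    while (keyIter.hasNext())
    {
        key = keyIter.next();
        if (key.isValid())
        {
            channel = (SocketChannel)key.channel();
            if (channel != null)
            {
                if ((key.interestOps() & SelectionKey.OP_CONNECT) != 0)
                {
                    // false means the connect is still pending; an exception means it failed
                    if (channel.finishConnect())
                    {
                        key.interestOps(0);
                    }
                }
            }
        }
    }
}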

This has been reported to Sun as a bug.
