Java: Converting String to and from ByteBuffer and associated problems

Posted on 2024-08-01 13:15:17

I am using Java NIO for my socket connections, and my protocol is text based, so I need to be able to convert Strings to ByteBuffers before writing them to the SocketChannel, and convert the incoming ByteBuffers back to Strings. Currently, I am using this code:

public static Charset charset = Charset.forName("UTF-8");
public static CharsetEncoder encoder = charset.newEncoder();
public static CharsetDecoder decoder = charset.newDecoder();

public static ByteBuffer str_to_bb(String msg){
  try{
    return encoder.encode(CharBuffer.wrap(msg));
  }catch(Exception e){e.printStackTrace();}
  return null;
}

public static String bb_to_str(ByteBuffer buffer){
  String data = "";
  try{
    int old_position = buffer.position();
    data = decoder.decode(buffer).toString();
    // reset buffer's position to its original so it is not altered:
    buffer.position(old_position);  
  }catch (Exception e){
    e.printStackTrace();
    return "";
  }
  return data;
}

This works most of the time, but I question whether this is the preferred (or simplest) way to do each direction of this conversion, or if there is another way to try. Occasionally, and seemingly at random, calls to encode() and decode() will throw a java.lang.IllegalStateException: Current state = FLUSHED, new state = CODING_END exception, or similar, even if I am using a new ByteBuffer object each time a conversion is done. Do I need to synchronize these methods? Is there a better way to convert between Strings and ByteBuffers? Thanks!

Comments (3)

初见终念 2024-08-08 13:15:17

Adamski's answer is a good one: it describes the steps in an encoding operation when using the general encode method (the one that takes a byte buffer as one of its inputs).

However, the method in question (in this discussion) is a variant of encode - encode(CharBuffer in). This is a convenience method that implements the entire encoding operation. (Please see java docs reference in P.S.)

As per the docs, this method should not be invoked if an encoding operation is already in progress, which is what is happening in ZenBlender's code: a static encoder/decoder is shared in a multi-threaded environment.

Personally, I like to use convenience methods (over the more general encode/decode methods) as they take away the burden by performing all the steps under the covers.

ZenBlender and Adamski have already suggested multiple options for doing this safely in their comments. Listing them all here:

  • Create a new encoder/decoder object when needed for each operation (not efficient as it could lead to a large number of objects). OR,
  • Use a ThreadLocal to avoid creating new encoder/decoder for each operation. OR,
  • Synchronize the entire encoding/decoding operation (this might not be preferred unless sacrificing some concurrency is ok for your program)
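
The ThreadLocal option could look roughly like the sketch below. It assumes Java 8 or later (for ThreadLocal.withInitial and StandardCharsets); the class name ThreadLocalCodec and its method names are made up for illustration and do not come from the original code.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one encoder/decoder per thread, so coder state is never shared.
public final class ThreadLocalCodec {
    private static final ThreadLocal<CharsetEncoder> ENCODER =
            ThreadLocal.withInitial(() -> StandardCharsets.UTF_8.newEncoder());
    private static final ThreadLocal<CharsetDecoder> DECODER =
            ThreadLocal.withInitial(() -> StandardCharsets.UTF_8.newDecoder());

    private ThreadLocalCodec() {}

    public static ByteBuffer strToBb(String msg) throws CharacterCodingException {
        // The convenience method encode(CharBuffer) resets the encoder before encoding.
        return ENCODER.get().encode(CharBuffer.wrap(msg));
    }

    public static String bbToStr(ByteBuffer buffer) throws CharacterCodingException {
        // Decode a duplicate so the caller's position and limit stay untouched.
        return DECODER.get().decode(buffer.duplicate()).toString();
    }
}
```

Compared with synchronizing, this trades a little memory per thread for no lock contention. Unlike the original helpers, it lets CharacterCodingException propagate instead of returning null or an empty string.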

P.S.

java docs references:

  1. Encode (convenience) method: http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CharsetEncoder.html#encode%28java.nio.CharBuffer%29
  2. General encode method: http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CharsetEncoder.html#encode%28java.nio.CharBuffer,%20java.nio.ByteBuffer,%20boolean%29

牛↙奶布丁 2024-08-08 13:15:17

Unless things have changed, you're better off with

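import java.nio.ByteBuffer;
import java.nio.charset.Charset;

public static ByteBuffer str_to_bb(String msg, Charset charset){
    return ByteBuffer.wrap(msg.getBytes(charset));
}

public static String bb_to_str(ByteBuffer buffer, Charset charset){
    if(buffer.hasArray()) {
        // array() exposes the whole backing array, so decode only position..limit
        return new String(buffer.array(), buffer.arrayOffset() + buffer.position(),
                buffer.remaining(), charset);
    }
    // read through a duplicate so the caller's position is left untouched
    byte[] bytes = new byte[buffer.remaining()];
    buffer.duplicate().get(bytes);
    return new String(bytes, charset);
}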

Usually buffer.hasArray() will be either always true or always false depending on your use case. In practice, unless you really want it to work under any circumstances, it's safe to optimize away the branch you don't need.
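
One detail worth keeping in mind with the hasArray() branch: ByteBuffer.array() returns the whole backing array, regardless of the buffer's position, limit, or array offset. The sketch below illustrates the difference on a partially consumed buffer; the class BackingArrayDemo and the method readable are hypothetical names used only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public final class BackingArrayDemo {
    private BackingArrayDemo() {}

    // Decodes only the readable bytes (position..limit) without moving the position.
    public static String readable(ByteBuffer buffer, Charset charset) {
        if (buffer.hasArray()) {
            return new String(buffer.array(),
                    buffer.arrayOffset() + buffer.position(),
                    buffer.remaining(), charset);
        }
        byte[] bytes = new byte[buffer.remaining()];
        buffer.duplicate().get(bytes); // duplicate() shares content but has its own position
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);   // heap buffer: hasArray() is true
        heap.put("hello world".getBytes(StandardCharsets.UTF_8));
        heap.flip();                                 // position 0, limit 11
        heap.position(6);                            // pretend "hello " was already consumed

        // Only the remaining bytes are decoded.
        System.out.println(readable(heap, StandardCharsets.UTF_8));
        // The raw backing array still holds all 16 bytes, consumed and unused ones included.
        System.out.println(new String(heap.array(), StandardCharsets.UTF_8).length());

        ByteBuffer direct = ByteBuffer.allocateDirect(16); // direct buffer: no accessible array
        direct.put("hello world".getBytes(StandardCharsets.UTF_8));
        direct.flip();
        direct.position(6);
        System.out.println(readable(direct, StandardCharsets.UTF_8));
    }
}
```

If your buffers are always freshly wrapped arrays that are exactly full, the plain array() call gives the same result; the offset arithmetic only matters for sliced, partially read, or oversized buffers.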

往日情怀 2024-08-08 13:15:17

Check out the CharsetEncoder and CharsetDecoder API descriptions - You should follow a specific sequence of method calls to avoid this problem. For example, for CharsetEncoder:

  1. Reset the encoder via the reset method, unless it has not been used before;
  2. Invoke the encode method zero or more times, as long as additional input may be available, passing false for the endOfInput argument and filling the input buffer and flushing the output buffer between invocations;
  3. Invoke the encode method one final time, passing true for the endOfInput argument; and then
  4. Invoke the flush method so that the encoder can flush any internal state to the output buffer.
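
As a rough illustration, the sequence above might be written as follows for a message that is fully available up front, so step 2 runs zero times. The class name StepwiseEncode is hypothetical, and sizing the output buffer from maxBytesPerChar() is a simplification chosen for this sketch so that overflow never needs handling.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public final class StepwiseEncode {
    private StepwiseEncode() {}

    public static ByteBuffer encode(CharsetEncoder encoder, String msg)
            throws CharacterCodingException {
        CharBuffer in = CharBuffer.wrap(msg);
        // Worst-case output size, so this sketch never has to grow the buffer.
        ByteBuffer out = ByteBuffer.allocate(
                (int) Math.ceil(msg.length() * (double) encoder.maxBytesPerChar()));

        encoder.reset();                                    // step 1: reset (harmless on a new encoder)
        // step 2 (encode with endOfInput = false) is skipped: all input is already here
        CoderResult result = encoder.encode(in, out, true); // step 3: final encode call
        if (!result.isUnderflow()) {
            result.throwException();                        // malformed or unmappable input
        }
        result = encoder.flush(out);                        // step 4: flush internal state
        if (!result.isUnderflow()) {
            result.throwException();
        }
        out.flip();                                         // make the encoded bytes readable
        return out;
    }
}
```

Because every call starts with reset(), the same encoder instance can be reused for consecutive messages, as long as only one thread uses it at a time.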

By the way, this is the same approach I use with NIO. Some of my colleagues convert each char directly to a byte, knowing they only ever use ASCII, which I imagine is probably faster.
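
For reference, the direct ASCII conversion could be sketched like this. The class AsciiCodec is a hypothetical name, and rejecting non-ASCII input with an exception is a choice made for this example rather than something the colleagues' code is known to do. Whether it is actually faster than a CharsetEncoder would need measuring.

```java
import java.nio.ByteBuffer;

public final class AsciiCodec {
    private AsciiCodec() {}

    public static ByteBuffer strToBb(String msg) {
        byte[] bytes = new byte[msg.length()];
        for (int i = 0; i < bytes.length; i++) {
            char c = msg.charAt(i);
            if (c > 0x7F) {
                throw new IllegalArgumentException("non-ASCII character at index " + i);
            }
            bytes[i] = (byte) c; // safe: ASCII code points fit in 7 bits
        }
        return ByteBuffer.wrap(bytes);
    }

    public static String bbToStr(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // leave the caller's position alone
        char[] chars = new char[view.remaining()];
        for (int i = 0; i < chars.length; i++) {
            byte b = view.get();
            if (b < 0) {
                throw new IllegalArgumentException("non-ASCII byte at offset " + i);
            }
            chars[i] = (char) b;
        }
        return new String(chars);
    }
}
```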
