当前位置：文江博客话题详情

对于长度超过 532 字节的压缩数据，GZIPInputStream 解压效果不佳

发布于 2024-07-19 11:26:51 字数 107 浏览 6 评论 0原文

我在java中使用gZipInputStream创建了压缩和解压缩它对于少量数据工作正常，但如果压缩后的数据长度大于 532，那么我的解压就无法正常工作。

谢谢巴比

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

甜柠檬 2024-07-26 11:26:52

重申一下其他人所说的：

通常情况是 str.length() != str.getBytes().length()。许多操作系统使用可变长度编码（例如 UTF-8、UTF-16 或 Windows-949）。
使用 OutputStream.close 确保所有数据正确写入的方法。
使用的返回值InputStream.read 查看已读取了多少字节。不保证一次性读取所有数据。
使用时要小心用于编码/解码的字符串类。

字符串压缩/解压缩方法

  private static byte[] compress(String str, Charset charset) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try {
      OutputStream deflater = new GZIPOutputStream(buffer);
      deflater.write(str.getBytes(charset));
      deflater.close();
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return buffer.toByteArray();
  }

  private static String decompress(byte[] data,
      Charset charset) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    ByteArrayInputStream in = new ByteArrayInputStream(data);
    try {
      InputStream inflater = new GZIPInputStream(in);
      byte[] bbuf = new byte[256];
      while (true) {
        int r = inflater.read(bbuf);
        if (r < 0) {
          break;
        }
        buffer.write(bbuf, 0, r);
      }
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return new String(buffer.toByteArray(), charset);
  }

  public static void main(String[] args) throws IOException {
    StringBuilder sb = new StringBuilder();
    while (sb.length() < 10000) {
      sb.append("write the data here \u00A3");
    }
    String str = sb.toString();
    Charset utf8 = Charset.forName("UTF-8");
    byte[] compressed = compress(str, utf8);

    System.out.println("String len=" + str.length());
    System.out.println("Encoded len="
        + str.getBytes(utf8).length);
    System.out.println("Compressed len="
        + compressed.length);

    String decompressed = decompress(compressed, utf8);
    System.out.println(decompressed.equals(str));
  }

（请注意，因为这些是内存中的流，所以我不是严格如何打开或关闭它们。）

To reiterate what others have said:

It is often the case that str.length() != str.getBytes().length(). Many operating systems use a variable-length encoding (like UTF-8, UTF-16 or Windows-949).
Use OutputStream.close methods to ensure that all data is written correctly.
Use the return value of the InputStream.read to see how many bytes have been read. There is no guarantee that all data will be read in one go.
Be careful when using the String class for encoding/decoding.

String compression/decompression methods

  private static byte[] compress(String str, Charset charset) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try {
      OutputStream deflater = new GZIPOutputStream(buffer);
      deflater.write(str.getBytes(charset));
      deflater.close();
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return buffer.toByteArray();
  }

  private static String decompress(byte[] data,
      Charset charset) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    ByteArrayInputStream in = new ByteArrayInputStream(data);
    try {
      InputStream inflater = new GZIPInputStream(in);
      byte[] bbuf = new byte[256];
      while (true) {
        int r = inflater.read(bbuf);
        if (r < 0) {
          break;
        }
        buffer.write(bbuf, 0, r);
      }
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
    return new String(buffer.toByteArray(), charset);
  }

  public static void main(String[] args) throws IOException {
    StringBuilder sb = new StringBuilder();
    while (sb.length() < 10000) {
      sb.append("write the data here \u00A3");
    }
    String str = sb.toString();
    Charset utf8 = Charset.forName("UTF-8");
    byte[] compressed = compress(str, utf8);

    System.out.println("String len=" + str.length());
    System.out.println("Encoded len="
        + str.getBytes(utf8).length);
    System.out.println("Compressed len="
        + compressed.length);

    String decompressed = decompress(compressed, utf8);
    System.out.println(decompressed.equals(str));
  }

(Note that because these are in-memory streams, I am not being strict about how I open or close them.)

回复收藏 0 原文