GZIPInputStream decompression fails when the compressed data is longer than 532 bytes

Posted on 2024-07-19 11:26:51

I have implemented compression and decompression in Java using GZIPInputStream.
It works fine for small amounts of data, but if the data length after compression becomes greater than 532, my decompression does not work correctly.

Thanks
Bapi

Comments (3)

甜柠檬 2024-07-26 11:26:52

To reiterate what others have said:

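  • It is often the case that str.length() != str.getBytes().length. Many operating systems use a variable-length encoding (like UTF-8, UTF-16 or Windows-949) by default, so one char does not always map to one byte.
  • Call OutputStream.close() to ensure that all data is written correctly. GZIPOutputStream only writes its remaining compressed data and trailer when it is finished or closed.
  • Use the return value of InputStream.read() to see how many bytes have actually been read. There is no guarantee that all data will be read in one go.
  • Be careful when using the String class for encoding/decoding: pass an explicit charset rather than relying on the platform default.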
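
The first point can be checked directly. This is only a small sketch; the class name EncodingLengths and the helper utf8Length are made up for illustration, and UTF-8 is assumed as the target encoding.

```java
import java.nio.charset.StandardCharsets;

final class EncodingLengths {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "price: \u00A3";          // ends with the pound sign
        System.out.println(s.length());      // 8 (UTF-16 code units)
        System.out.println(utf8Length(s));   // 9 (the pound sign takes two bytes in UTF-8)
    }
}
```

Sizing a byte buffer from str.length() would therefore be one byte short for this string.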

String compression/decompression methods

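  import java.io.ByteArrayInputStream;
  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import java.io.InputStream;
  import java.io.OutputStream;
  import java.nio.charset.Charset;
  import java.util.zip.GZIPInputStream;
  import java.util.zip.GZIPOutputStream;

  // The class name is illustrative; the methods are what matter.
  public class GzipRoundTrip {

    private static byte[] compress(String str, Charset charset) {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try {
        OutputStream deflater = new GZIPOutputStream(buffer);
        // Encode with an explicit charset, never the platform default.
        deflater.write(str.getBytes(charset));
        // close() flushes the remaining compressed data and the gzip trailer.
        deflater.close();
      } catch (IOException e) {
        throw new IllegalStateException(e);
      }
      return buffer.toByteArray();
    }

    private static String decompress(byte[] data,
        Charset charset) {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      ByteArrayInputStream in = new ByteArrayInputStream(data);
      try {
        InputStream inflater = new GZIPInputStream(in);
        byte[] bbuf = new byte[256];
        // Keep reading until end of stream; one read() may return only part of the data.
        while (true) {
          int r = inflater.read(bbuf);
          if (r < 0) {
            break;
          }
          // Copy only the r bytes that were actually read.
          buffer.write(bbuf, 0, r);
        }
      } catch (IOException e) {
        throw new IllegalStateException(e);
      }
      // Decode with the same charset that was used for encoding.
      return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
      // Build a string of at least 10000 chars that contains a non-ASCII character.
      StringBuilder sb = new StringBuilder();
      while (sb.length() < 10000) {
        sb.append("write the data here \u00A3");
      }
      String str = sb.toString();
      Charset utf8 = Charset.forName("UTF-8");
      byte[] compressed = compress(str, utf8);

      System.out.println("String len=" + str.length());
      System.out.println("Encoded len="
          + str.getBytes(utf8).length);
      System.out.println("Compressed len="
          + compressed.length);

      String decompressed = decompress(compressed, utf8);
      System.out.println(decompressed.equals(str));
    }
  }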

(Note that because these are in-memory streams, I am not being strict about how I open or close them.)
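
If the streams wrapped files or sockets instead of byte arrays, closing them reliably would matter more. One possible stricter variant uses try-with-resources. This is a sketch that assumes Java 8 or later; the class name GzipStrings is hypothetical, and UncheckedIOException is just one choice for wrapping the checked exception.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipStrings {
    static byte[] compress(String str, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // The stream is closed when the try block exits, which writes the gzip trailer.
        try (OutputStream out = new GZIPOutputStream(buffer)) {
            out.write(str.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Read the buffer only after the gzip stream has been closed.
        return buffer.toByteArray();
    }

    static String decompress(byte[] data, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[256];
            int n;
            // read() returns -1 at end of stream; otherwise n bytes of chunk are valid.
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String original = "write the data here \u00A3";
        byte[] gz = compress(original, StandardCharsets.UTF_8);
        System.out.println(decompress(gz, StandardCharsets.UTF_8).equals(original));
    }
}
```

The important detail is that toByteArray() is called after the try block, so the compressed data is complete by then.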

无所谓啦 2024-07-26 11:26:52

I would suggest you use gCompress.close(), not finish().

I would also suggest that you cannot rely on str.length() being long enough to read. There is a risk the data could be longer, in which case the String will be truncated.

You also ignore the return value of read(). A single read() call is only guaranteed to read at least one byte and is unlikely to read exactly str.length() bytes of data, so you are likely to end up with lots of trailing NUL bytes (\0). Instead, you could expect to read str.getBytes().length bytes.
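
The difference between one read() call and a read loop can be sketched as follows. The class ReadLoopDemo and its method names are invented for this illustration, and how many bytes a single read() delivers depends on the JDK's internal buffering, so the sketch prints that number rather than assuming it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class ReadLoopDemo {
    // Compresses raw bytes; closing the stream finishes the gzip data.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(buffer)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // A single read() call: returns however many bytes it happened to deliver.
    static int singleRead(byte[] gz, int expected) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            return in.read(new byte[expected]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Repeats read() until the buffer is full or the stream ends.
    static int loopedRead(byte[] gz, int expected) {
        byte[] out = new byte[expected];
        int total = 0;
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            while (total < expected) {
                int n = in.read(out, total, expected - total);
                if (n < 0) {
                    break; // end of stream before the buffer was filled
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[5000];
        new Random(42).nextBytes(raw); // poorly compressible test data
        byte[] gz = gzip(raw);
        System.out.println("single read: " + singleRead(gz, raw.length));
        System.out.println("looped read: " + loopedRead(gz, raw.length));
    }
}
```

If the first number is smaller than the second, the rest of the buffer in the single-read case stays zero-filled, which is where the trailing \0 bytes come from.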

单调的奢华 2024-07-26 11:26:52

Looks like a character encoding/decoding problem to me. One should use Readers/Writers to write Strings, rather than e.g. String.getBytes(). Using new String(byte[]) constructs is not the proper way.

You really should use a loop to read, and check the returned number of bytes read, to ensure everything is read back!
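
One way to follow the Reader/Writer suggestion is to wrap the gzip streams in an OutputStreamWriter and an InputStreamReader. This is a sketch only: the class name GzipText and its methods are hypothetical, and UTF-8 is assumed as the encoding on both sides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipText {
    // Writes text through a Writer so the charset is applied in one place.
    static byte[] write(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(
                new GZIPOutputStream(buffer), StandardCharsets.UTF_8)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the Writer also closed the gzip stream, so the data is complete.
        return buffer.toByteArray();
    }

    // Reads characters back in a loop until the Reader reports end of stream.
    static String read(byte[] gz) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gz)), StandardCharsets.UTF_8)) {
            char[] chunk = new char[256];
            int n;
            while ((n = r.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "write the data here \u00A3";
        System.out.println(read(write(original)).equals(original));
    }
}
```

With this layout no code ever has to guess a byte count from a character count, because the Reader decodes whatever the gzip stream delivers.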
