GZIP string compression fails to decompress the '£' character

Posted on 2025-01-05 22:27:22

I have the following code which we use to compress Strings (with error and resource handling removed for clarity):

import java.util.zip.GZIPInputStream;
import java.io.*;
import java.util.zip.GZIPOutputStream;
import org.apache.commons.io.IOUtils;
import com.Ostermiller.util.Base64;

//Code to compress the string
ByteArrayOutputStream output = new ByteArrayOutputStream(65536);
BufferedWriter writer = new BufferedWriter(
           new OutputStreamWriter(new GZIPOutputStream(output)));
writer.write(stringContents);
String compressedString =  new String(Base64.encode(output.toByteArray()));

...

//Code to decompress the string
byte[] compressedData = Base64.decode(compressedString.getBytes());
BufferedInputStream reader = new BufferedInputStream(
           new GZIPInputStream(new ByteArrayInputStream(compressedData)));
String uncompressedString = IOUtils.toString(reader, "UTF-8");

We are encountering an error when trying to encode and then decode strings with a '£' in them. Specifically, the string compresses OK, but when trying to decompress the string we get the following stack trace:

sun.io.MalformedInputException
at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java(Compiled Code))
at sun.nio.cs.StreamDecoder$ConverterSD.convertInto(StreamDecoder.java:287)
at sun.nio.cs.StreamDecoder$ConverterSD.implRead(StreamDecoder.java:337)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:223)
at java.io.InputStreamReader.read(InputStreamReader.java:208)
at java.io.Reader.read(Reader.java:113)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1128)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)

Can anyone tell me the error of my ways and how I might fix this situation? Is there a better way to be doing this? Many thanks in advance.

孤独岁月 2025-01-12 22:27:22

You should specify the character encoding when you compress the data:

BufferedWriter writer = new BufferedWriter(
           new OutputStreamWriter(new GZIPOutputStream(output), "UTF-8"));

If you don't, text is converted to bytes according to the system default character encoding, which in your case is not UTF-8.
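For reference, here is a minimal round-trip sketch with the encoding made explicit on both sides. It makes a few assumptions that are not in the original post: it uses `java.util.Base64` (Java 8+) in place of the Ostermiller `Base64` class, a plain read loop in place of `IOUtils.toString`, and the class name `GzipStringRoundTrip` is only illustrative. It also closes the writer before calling `toByteArray()`, since the gzip stream is only complete once it has been finished.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, named for illustration only.
public class GzipStringRoundTrip {

    // Encode the text as UTF-8, gzip it, then Base64-encode the result.
    static String compress(String text) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        // try-with-resources closes the writer, which finishes the gzip
        // stream before the bytes are read from the buffer.
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(output), StandardCharsets.UTF_8)) {
            writer.write(text);
        }
        return Base64.getEncoder().encodeToString(output.toByteArray());
    }

    // Reverse the steps: Base64-decode, gunzip, then decode the bytes as UTF-8.
    static String decompress(String compressed) throws IOException {
        byte[] data = Base64.getDecoder().decode(compressed);
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "Price: \u00A3100"; // \u00A3 is the pound sign
        String roundTripped = decompress(compress(original));
        System.out.println(original.equals(roundTripped)); // prints "true"
    }
}
```

The important part is that the same charset is named when writing and when reading; UTF-8 is used here because that is what the decompression code in the question already expects.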


About the author: 猫性小仙女
