Gunzipping the Contents of a URL - Java

Posted 2024-11-25 05:27:32

So as the title suggests, I'm trying to get and gunzip a string from an HTTP request.

urlConn = url.openConnection();
int len = CONTENT_LENGTH;
byte[] gbytes = new byte[len];
gbuffer = new GZIPInputStream(urlConn.getInputStream(), len);
System.out.println(gbuffer.read(gbytes)+"/"+len);
System.out.println(gbytes);
result = new String(gbytes, "UTF-8");
gbuffer.close();
System.out.println(result);

With some URLs, it works fine. I get output like this:

42/42
[B@96e8209
The entire 42 bytes of my data. Abcdefghij.

With others, it gives me something like the following output:

22/77
[B@1d94882
The entire 77 bytes of

As you can see, the first twenty-odd bytes of data are very similar, if not identical, across these URLs, so they shouldn't be causing the problem. I really can't pin it down. Increasing CONTENT_LENGTH doesn't help, and streams both larger and smaller than the ones giving me trouble work fine.

EDIT: The issue also does not lie within the raw gzipped data, as Cocoa and Python both gunzip it without issue.

EDIT: Solved. Including final code:

urlConn = url.openConnection();
int offset = 0, len = CONTENT_LENGTH;
byte[] gbytes = new byte[len];
gbuffer = new GZIPInputStream(urlConn.getInputStream(), len);
while (offset < len)
{
    int read = gbuffer.read(gbytes, offset, len - offset);
    if (read < 0) break; // EOF before len bytes; avoid an infinite loop
    offset += read;
}
result = new String(gbytes, "UTF-8");
gbuffer.close();
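The manual offset loop above is essentially what `DataInputStream.readFully` already does internally, so when the decompressed length is known up front it can replace the loop. A minimal sketch (the class name and the in-memory payload standing in for `urlConn.getInputStream()` are illustrative, not from the original post):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ReadFullyDemo {
    public static void main(String[] args) throws IOException {
        // Build a small gzip payload in memory, standing in for urlConn.getInputStream().
        byte[] data = "Abcdefghij".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(baos)) {
            gz.write(data);
        }

        int len = data.length; // plays the role of CONTENT_LENGTH (the decompressed size)
        byte[] gbytes = new byte[len];
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(baos.toByteArray())))) {
            in.readFully(gbytes); // loops internally until len bytes arrive, or throws EOFException
        }
        System.out.println(new String(gbytes, StandardCharsets.UTF_8)); // prints "Abcdefghij"
    }
}
```

A bonus of `readFully` is that it throws `EOFException` if the stream ends before `len` bytes, surfacing a short read as an error instead of silently truncating the string.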

Comments (2)

ζ澈沫 2024-12-02 05:27:32

It's possible that the data isn't available in the stream. The first println() you have says you've only read 22 bytes, so only 22 bytes were available when you called read(). You can try looping until you've read CONTENT_LENGTH worth of bytes. Maybe something like:

int index = 0;
int bytesRead = gbuffer.read(gbytes);
while(bytesRead>0 && index<len) {
    index += bytesRead;
    bytesRead = gbuffer.read(gbytes,index,len-index);
}
风苍溪 2024-12-02 05:27:32

GZIPInputStream.read() is not guaranteed to read all data in one call. You should use a loop:

byte[] buf = new byte[1024];
int len = 0, total = 0;
while ((len = gbuffer.read(buf)) > 0) {
    total += len;
    // do something with data
}
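The buffer loop above becomes a complete round trip once the chunks are collected in a `ByteArrayOutputStream`, which removes any need to know the decompressed length in advance. A self-contained sketch (class and method names are illustrative; the in-memory payload stands in for a real HTTP stream, and the whole payload is assumed to fit in memory):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GunzipDemo {
    // Read a gzip stream fully, without assuming the decompressed length is known.
    static String gunzipToString(InputStream in) throws IOException {
        try (GZIPInputStream gz = new GZIPInputStream(in);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[1024];
            int n;
            // read() may return fewer bytes than requested; loop until EOF
            while ((n = gz.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
            return out.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        // Round trip: gzip a string in memory, then gunzip it back.
        String original = "The entire 77 bytes of my data. Abcdefghij.";
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzOut = new GZIPOutputStream(baos)) {
            gzOut.write(original.getBytes(StandardCharsets.UTF_8));
        }
        String result = gunzipToString(new ByteArrayInputStream(baos.toByteArray()));
        System.out.println(result.equals(original)); // prints "true"
    }
}
```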