
Problem inflating a byte[] in Java?

Posted 2024-10-10 12:30:35


I ran into an issue that I can't figure out. Here is the problem:
I have some data in a Blob column in a DB2/Linux environment. The Blob was written to DB2 after the byte[] was compressed using JDK compression (the code that does this runs in a Linux environment).
I am trying to write a simple program that reads some of this data, decompresses it (using the JDK), and creates a String from the decompressed byte array in a Windows environment (my development environment). The issue is that after I decompress the Blob (byte[]), the decompressed byte array is usually 1-3 bytes longer than expected. By "expected" I mean the offset and length fields that are also stored in the database: the decompressed byte array is usually a few bytes longer than the length stored in the database. So if I create a String from the decompressed byte array and then create a second String with substring(offset, length), using the offset and length fields from the database, the second String is shorter than the first.

An example:
the database record contains a blob, offset: 0, length: 260,409
after decompressing the blob:

 compressedByte.length  - 71,212
 decompressedByte.length   - 260,412
 new String(decompressedByte).length()  - 260,412
 new String(decompressedByte).substring(0, 260409).length() - 260,409

For some other input records, the difference I am seeing is anywhere between 1 and 3 bytes.

I am puzzled by this issue and wondering if anyone could suggest tips for further debugging. Could this be related to how the bytes are stored or written in the Linux environment and how they are read in Windows? Thanks for your help.


甩你一脸翔 2024-10-17 12:30:35


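I suspect the default encoding is different between the two systems.

import java.nio.charset.StandardCharsets;

// on the Linux box: encode with an explicit charset instead of the platform default
byte[] blob = str.getBytes(StandardCharsets.UTF_8);

// in your code: decode with the same explicit charset
String str = new String(blob, StandardCharsets.UTF_8);

Or at least find out what the default encoding on the Linux box is (normally UTF-8) and skip step 1.

A really good explanation of what could be happening here is on Joel on Software: joelonsoftware.com/articles/Unicode.html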
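To make the encoding point concrete, here is a minimal sketch of the whole round trip. It rests on assumptions the question does not confirm: that the Linux side used java.util.zip.Deflater, that the text was encoded as UTF-8, and that the stored length was counted in characters. The class name CharsetRoundTrip and the sample string are made up for illustration, and ISO-8859-1 stands in for a single-byte Windows default such as windows-1252.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not taken from the question.
class CharsetRoundTrip {

    // Compress with Deflater (assumed; the question only says "JDK compression").
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate the compressed bytes back into the original byte array.
    static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 \u20ac"; // 6 characters, two of them non-ASCII
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(utf8));

        System.out.println(original.length());                                        // 6
        System.out.println(restored.length);                                          // 9
        System.out.println(new String(restored, StandardCharsets.UTF_8).length());      // 6
        System.out.println(new String(restored, StandardCharsets.ISO_8859_1).length()); // 9
    }
}
```

In this sketch the inflated array is 3 bytes longer than the character count, and decoding it with a single-byte charset yields a String that is 3 characters too long. If the real data behaves the same way, the inflate step is fine and only the charset passed to new String(...) needs to match the one used when the bytes were produced.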


我ぃ本無心為│何有愛 2024-10-17 12:30:35

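String is not a general-purpose holder of bytes. Undoubtedly, the default character encoding differs between your DB2/Linux environment and your Windows environment, which causes the conversions back and forth between bytes and characters to differ.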
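A small sketch of why a String cannot safely carry arbitrary bytes: pushing raw bytes through a String and back can change them. The class name BytesThroughString and the sample bytes are made up for illustration, and UTF-8 is chosen here only because its handling of invalid input is easy to show; other charsets lose data in their own ways.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the thread.
class BytesThroughString {

    // Decode the bytes as UTF-8 text, then encode the text back to UTF-8 bytes.
    static byte[] roundTrip(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF can never appear in valid UTF-8, so the decoder substitutes U+FFFD.
        byte[] raw = {(byte) 0x41, (byte) 0xFF, (byte) 0x42};
        byte[] back = roundTrip(raw);

        System.out.println(raw.length);               // 3
        System.out.println(back.length);              // 5
        System.out.println(Arrays.equals(raw, back)); // false
    }
}
```

The single invalid byte comes back as the three-byte encoding of U+FFFD, so the array grows by two bytes. Binary data such as compressed blobs is therefore best kept as byte[] from end to end, with a String created only from bytes that are known to be text in a known charset.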


About the author

找个人就嫁了吧
