What character encoding does ObjectOutputStream's writeObject method use?

Posted 2024-10-07 03:42:27 · 370 words · 16 views · 0 comments

I read that Java uses UTF-16 encoding internally. That is, I understand that if I have something like String var = "जनमत"; then "जनमत" will be encoded in UTF-16 internally. So, if I dump this variable to a file as shown below:

FileOutputStream fileOut = new FileOutputStream("output.xyz");
ObjectOutputStream out = new ObjectOutputStream(fileOut);
out.writeObject(var);
out.close(); // flush buffered data and release the file

will the string "जनमत" be encoded as UTF-16 in the file "output.xyz"? Also, if I later read the file "output.xyz" back via ObjectInputStream, will I get the UTF-16 representation of the variable?

Thanks.

黎夕旧梦 2024-10-14 03:42:27

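So, if I dump this variable to a file... will the string "जनमत" be encoded as UTF-16 in the file "output.xyz"?

The encoding of the string in the file will be whatever format ObjectOutputStream chooses to write. You should treat the file as a black box that can only be read by ObjectInputStream. (Seriously: even though the format is, IIRC, well documented, if you want to read it with other tools you should serialize the object yourself to XML, JSON, or some other format.)

Also, if I later read the file "output.xyz" back via ObjectInputStream, will I get the UTF-16 representation of the variable?

If you read the file with ObjectInputStream, you will get back a copy of the original object. That will be a java.lang.String, which is just a sequence of characters (not bytes). If you want the bytes in a particular encoding such as UTF-16, you can get them by passing the appropriate Charset to the getBytes() method (though I suspect you don't actually need to).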
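As a minimal sketch of that round trip (the class name RoundTripDemo and the in-memory buffer used in place of output.xyz are assumptions made for illustration), the following writes the string with writeObject, reads it back with readObject, and then asks the String for its UTF-16 bytes explicitly. The literal spells "जनमत" with Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Serializes s with writeObject and immediately deserializes it again.
    static String roundTrip(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (String) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Explicitly encodes s as big-endian UTF-16 without a byte order mark.
    static byte[] utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String var = "\u091C\u0928\u092E\u0924"; // the four-character string from the question
        String copy = roundTrip(var);
        System.out.println(copy.equals(var));        // true
        System.out.println(utf16Bytes(copy).length); // 8: four chars, two bytes each
    }
}
```

The encoding only becomes visible at the moment getBytes is called with a Charset; the deserialized String itself is just characters.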

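In short, don't worry too much about the internal details of serialization. If you need to know exactly what is in the file, create the file yourself; if you are just curious, trust the JVM to do the right thing.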

牵你手 2024-10-14 03:42:27

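Close. Internally a String is a sequence of 16-bit char values that are UTF-16 code units; because a String may contain unpaired surrogates, it behaves more like UCS-2 plus surrogate pairs than strictly validated UTF-16. Either way, it uses 2 bytes for most characters, and a sequence of 2 chars (that is, 4 bytes) for some rarely used code points.

ObjectOutputStream uses something called "modified UTF-8". It is like UTF-8, except that the zero character (U+0000) is written as the 2-byte sequence 0xC0 0x80. That sequence is not legal in standard UTF-8 (because of the shortest-form uniqueness requirement), but it naturally decodes back to the value 0. Supplementary characters are also written as two 3-byte sequences, one per surrogate, rather than as a single 4-byte sequence.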
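The difference from standard UTF-8 can be seen with a short sketch. DataOutputStream.writeUTF produces the same modified UTF-8 encoding, prefixed by a 2-byte length; the class name ModifiedUtf8Demo and its helper methods are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {
    // Returns the modified UTF-8 payload of s, without writeUTF's 2-byte length prefix.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String nul = "\u0000";
        String supplementary = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        System.out.println(hex(modifiedUtf8(nul)));                    // C0 80
        System.out.println(hex(nul.getBytes(StandardCharsets.UTF_8))); // 00

        System.out.println(modifiedUtf8(supplementary).length);                    // 6
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}
```

For strings that contain neither U+0000 nor supplementary characters, such as "जनमत", the two encodings produce identical bytes.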

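But what you are really asking is "does it work, so that if I write a String I read back a String?", and the answer is yes. The JDK encodes correctly when it writes the bytes out and decodes when it reads them back.

For what it's worth, you may be better off using the "writeUTF()" method for Strings, since I think the resulting output is more compact. But "writeObject()" works too; it just needs more metadata.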
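Whether writeUTF() really produces smaller output is easy to measure rather than guess. The sketch below (the class name WriteUtfVsWriteObject and its helpers are assumptions for illustration) writes the same string both ways through an ObjectOutputStream and prints the resulting sizes; no particular result is asserted here, since the framing overhead depends on which stream class is used. Note that data written with writeUTF() must be read back with readUTF(), not readObject(), and that the DataOutput.writeUTF contract limits the encoded string to 65535 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class WriteUtfVsWriteObject {
    // Serializes s as an object (type code plus length plus modified UTF-8).
    static byte[] viaWriteObject(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Writes s as primitive data using writeUTF.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads back a string that was written with writeUTF.
    static String readBackUTF(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String var = "\u091C\u0928\u092E\u0924"; // the string from the question
        System.out.println("writeObject bytes: " + viaWriteObject(var).length);
        System.out.println("writeUTF bytes:    " + viaWriteUTF(var).length);
        System.out.println(readBackUTF(viaWriteUTF(var)).equals(var));
    }
}
```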

海拔太高太耀眼 2024-10-14 03:42:27

Just to add to this: the private method ObjectOutputStream.writeString() determines the UTF length of the given string and writes it in either "standard" UTF format or "long" UTF format, where "long" is described in the javadoc as follows:

"Long" UTF format is identical to standard UTF, except that it uses an 8 byte header (instead of the standard 2 bytes) to convey the UTF encoding length.

I got this from the JDK source code:

private void writeString(String str, boolean unshared) throws IOException {
    handles.assign(unshared ? null : str);
    long utflen = bout.getUTFLength(str);
    if (utflen <= 0xFFFF) {
        bout.writeByte(TC_STRING);
        bout.writeUTF(str, utflen);
    } else {
        bout.writeByte(TC_LONGSTRING);
        bout.writeLongUTF(str, utflen);
    }
}

and on the writeObject(Object obj) path (in the private helper writeObject0) there is a check:

if (obj instanceof String) {
    writeString((String) obj, unshared);
}
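The switch between the two formats can be observed from outside without touching the private method. The sketch below relies on the public constants in java.io.ObjectStreamConstants and on the fact that the type-code byte follows the 4-byte stream header; the class name StringTypeCodeDemo and its helpers are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;
import java.util.Arrays;

class StringTypeCodeDemo {
    // Returns the type-code byte that follows the 4-byte stream header
    // (2-byte magic number plus 2-byte version).
    static byte typeCode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray()[4];
    }

    // Builds a string of n ASCII 'a' characters (one encoded byte each).
    static String repeatA(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    public static void main(String[] args) {
        // Encoded length 3 fits in the 2-byte length field.
        System.out.println(typeCode("abc") == ObjectStreamConstants.TC_STRING);               // true
        // Encoded length 70000 exceeds 0xFFFF, so the long format is used.
        System.out.println(typeCode(repeatA(70_000)) == ObjectStreamConstants.TC_LONGSTRING); // true
    }
}
```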
