Avoiding printing the Unicode replacement character in Java

Posted on 2024-08-13 14:41:27

In Java, why does Character.toString((char) 65533) print out this symbol: � ?

I have a Java program which prints these characters all over the place. It's a big program. Any ideas on what I can do to avoid this?

Comments (5)

挖个坑埋了你 2024-08-20 14:41:27

One of the most likely scenarios is that you are reading ISO-8859 data (for example ISO-8859-1) using the UTF-8 character set. Whenever the decoder comes across a sequence of bytes that is not valid UTF-8, it replaces that sequence with the � symbol.

Check your input streams, and ensure that you read them using the correct character set.
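The mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the asker's code: the class name WrongCharsetDemo, its helper methods, and the sample word are made up for illustration. It encodes a string as ISO-8859-1 and then decodes the same bytes once with the wrong charset and once with the right one.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative assumptions.
class WrongCharsetDemo {
    // Decodes the bytes as UTF-8; malformed sequences are replaced with U+FFFD.
    static String decodeAsUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Decodes the bytes as ISO-8859-1, where every byte maps to exactly one char.
    static String decodeAsLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e; in ISO-8859-1 that letter is the single byte 0xE9.
        byte[] latin1Bytes = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decodeAsUtf8(latin1Bytes);   // 0xE9 alone is not valid UTF-8
        String right = decodeAsLatin1(latin1Bytes); // round-trips cleanly

        System.out.println(wrong.indexOf('\uFFFD') >= 0); // true
        System.out.println(right.equals("caf\u00E9"));    // true
    }
}
```

If the replacement character shows up only for accented or non-ASCII letters, a charset mismatch like this one is a likely suspect.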

妳是的陽光 2024-08-20 14:41:27

In java, why does Character.toString((char) 65533) print out this symbol: � ?

Because exactly this character is the one associated with that code point. It does not display a random character as you seem to think.
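A quick way to check this is to inspect the value directly. The sketch below is only an illustration (the class and method names are invented); it shows that 65533 is 0xFFFD in hexadecimal and asks the JDK for the Unicode name of that code point.

```java
// Hypothetical demo class; the names here are illustrative assumptions.
class ReplacementCharDemo {
    // Same expression as in the question, wrapped in a method for reuse.
    static String fromCodeUnit(int value) {
        return Character.toString((char) value);
    }

    public static void main(String[] args) {
        String s = fromCodeUnit(65533);

        // 65533 decimal is FFFD hexadecimal.
        System.out.println(Integer.toHexString(s.charAt(0)));    // fffd

        // Character.getName returns the Unicode name of a code point.
        System.out.println(Character.getName(s.codePointAt(0))); // REPLACEMENT CHARACTER
    }
}
```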

I have a java program which prints these characters all over the place. It's a big program. Any ideas on what I can do to avoid this?

Your problem lies somewhere else. At the very least, it boils down to this: every step that involves byte-char conversions (storing text in a file/db, reading text from a file/db, manipulating text, transferring text, displaying text, etcetera) should be set to use UTF-8.
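As a rough sketch of what "every step" can look like in code, the example below names the charset explicitly at each byte-char boundary instead of relying on the platform default. The class ExplicitUtf8Demo and its methods are hypothetical, and in-memory streams stand in for whatever files, sockets, or database drivers the real program uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative assumptions.
class ExplicitUtf8Demo {
    // chars -> bytes: the writer is told to encode as UTF-8.
    static byte[] write(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // bytes -> chars: the reader is told to decode as UTF-8.
    static String read(byte[] data) {
        StringBuilder result = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] chunk = new char[1024];
            int count;
            while ((count = reader.read(chunk)) != -1) {
                result.append(chunk, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00E9 \u4F60\u597D";
        String roundTripped = read(write(text));

        // Displaying text is a conversion step too, so stdout gets an explicit encoding.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(roundTripped.equals(text)); // true
        out.println(roundTripped);
    }
}
```

Whether the second line renders correctly still depends on the terminal or viewer decoding the output as UTF-8.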

What catches my eye is that Java does absolutely nothing special with 0xFFFD; when encoding, it just replaces unmappable chars with a question mark ?, yet you keep insisting that 0xFFFD comes from Java. I know that Firefox does exactly what you said, so are you maybe confusing "Firefox" with "Java"?
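The question-mark behaviour can be observed when a string is encoded to a charset that cannot represent one of its characters. The following is a small sketch with made-up names (UnmappableDemo, roundTripThroughAscii), using US-ASCII as the target charset only because it is guaranteed to lack accented letters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative assumptions.
class UnmappableDemo {
    // Encodes to US-ASCII and decodes again; String.getBytes(Charset) replaces
    // characters the charset cannot represent with its replacement byte, '?'.
    static String roundTripThroughAscii(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The accented e has no US-ASCII encoding, so it comes back as '?'.
        System.out.println(roundTripThroughAscii("caf\u00E9")); // caf?
    }
}
```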

If this is true and you're actually talking about a Java web application, then you need to set at least the HTTP response encoding to UTF-8. You can do that by putting <%@ page pageEncoding="UTF-8" %> at the top of the JSP page in question. You may find this article useful to get more background information and a detailed overview of all steps and solutions you need to apply to solve this "Unicode problem".

許願樹丅啲祈禱 2024-08-20 14:41:27

U+FFFD is not an ordinary text character; it is the Unicode replacement character. Hence, the code is logically incorrect. The intended use of the replacement character is to be substituted for bad input, not to be put into a string on purpose (as (char) 65533 does).

How to fix it: don't put junk in strings. Strings are for text. Bytes are for random binary data.
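One way to keep junk out of strings is to reject bad bytes at the point where they are decoded, rather than letting them be silently replaced. The sketch below is one possible approach under that assumption, with hypothetical names (StrictDecodeDemo, decodeStrictUtf8, isValidUtf8); it configures a CharsetDecoder to report errors instead of substituting U+FFFD.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative assumptions.
class StrictDecodeDemo {
    // Decodes UTF-8 strictly: invalid input raises an exception instead of
    // being replaced with U+FFFD.
    static String decodeStrictUtf8(byte[] data) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data))
                .toString();
    }

    // Convenience check: true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] data) {
        try {
            decodeStrictUtf8(data);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "caf\u00E9".getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidUtf8(latin1)); // false
        System.out.println(isValidUtf8(utf8));   // true
    }
}
```

Failing fast like this makes the source of the bad data visible at the point of entry, instead of leaving replacement characters to surface later in the output.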

往日情怀 2024-08-20 14:41:27

Well, what do you want it to do? If you're getting these characters "all over the place" I suspect you have bad data... it should be pretty rare that you receive data which can't be represented in Unicode.

How are you getting the data to start with?
