UTF-8 CJK characters not displaying in Java

Posted on 2024-11-06 07:11:55

I've been reading up on Unicode and UTF-8 encoding for a while and I think I understand it, so hopefully this won't be a stupid question:

I have a file which contains some CJK characters, and which has been saved as UTF-8. I have various Asian language packs installed and the characters are rendered properly by other applications, so I know that much works.

In my Java app, I read the file as follows:

// Create objects
FileInputStream fis = new FileInputStream(new File("xyz.sgf"));
InputStreamReader is = new InputStreamReader(fis, Charset.forName("UTF-8"));
BufferedReader br = new BufferedReader(is);

// Read and display file contents
StringBuffer sb = new StringBuffer();
String line;
while ((line = br.readLine()) != null) {
    sb.append(line);
}
System.out.println(sb);

The output shows the CJK characters as '???'. A call to is.getEncoding() confirms that it is definitely using UTF-8. What step am I missing to make the characters appear properly? If it makes a difference, I'm looking at the output using the Eclipse console.

乜一 2024-11-13 07:11:55

System.out.println(sb);

The problem is the line above. It encodes the character data using the default system encoding and writes the resulting bytes to STDOUT. On many systems this is a lossy process: any character the default encoding cannot represent is replaced, typically with '?'.
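To make the lossy step concrete, here is a minimal sketch (not part of the original answer) that encodes a CJK string and decodes it again with the same charset. The class name LossyEncodingDemo and the helper roundTrip are made up for illustration, and ISO-8859-1 merely stands in for a default encoding that has no CJK characters. The sample text is written with \u escapes so the encoding of the source file does not matter, and the program prints booleans so the result is readable on any console.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncodingDemo {

    // Hypothetical helper: encode text to bytes and decode it again with the same charset.
    // String.getBytes substitutes the charset's replacement byte for unmappable characters.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String cjk = "\u65E5\u672C\u8A9E"; // three CJK characters

        // ISO-8859-1 cannot represent them, so each one becomes '?'.
        System.out.println(roundTrip(cjk, StandardCharsets.ISO_8859_1).equals("???")); // true

        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(cjk, StandardCharsets.UTF_8).equals(cjk)); // true
    }
}
```

The file was read correctly in the question; the damage happens only when the characters are encoded again for output.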

If you change the defaults, the encoding used by System.out and the encoding used by the console must match.
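Why the two encodings must match can be sketched as follows (an illustration, not part of the original answer). The "writer" plays the role of System.out turning characters into bytes, and the "reader" plays the role of the console turning those bytes back into characters. The class name EncodingMismatchDemo and the helper decodeAs are hypothetical, and ISO-8859-1 is only an example of a console encoding that differs from the stream's.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: encode with the writer's charset, decode with the reader's charset.
    public static String decodeAs(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        String hangul = "\uAC00\uB2E4"; // two Hangul syllables

        // Same charset on both sides: the text is unchanged.
        String matched = decodeAs(hangul, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(matched.equals(hangul)); // true

        // Different charsets: each of the six UTF-8 bytes is shown as its own character.
        String mismatched = decodeAs(hangul, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mismatched.length()); // 6
    }
}
```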

The only supported mechanism to change the default system encoding is via the operating system. (Some will advise setting the file.encoding system property, but this is not supported and may have unintended side effects.) Alternatively, you can call System.setOut to replace System.out with your own custom PrintStream that uses an explicit encoding:

PrintStream stdout = new PrintStream(System.out, autoFlush, encoding);

You can change the Eclipse console encoding via the Run configuration.
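Putting the two pieces together, a minimal sketch might look like the following. It assumes the console (for example the Eclipse console, via the Run configuration) has been set to UTF-8; the class name Utf8Stdout and the helper utf8PrintStream are hypothetical names chosen for this example. If the console uses a different encoding, the output will still be garbled.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Stdout {

    // Hypothetical helper: wrap a byte stream in an auto-flushing PrintStream that encodes as UTF-8.
    public static PrintStream utf8PrintStream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory for every Java platform, so this should not happen.
            throw new AssertionError("UTF-8 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Replace System.out so later println calls emit UTF-8 bytes.
        System.setOut(utf8PrintStream(System.out));

        // Sample text written with \u escapes so the source-file encoding does not matter.
        System.out.println("\uAC00\uB2E4 \u3053\u3093\u306B\u3061\u306F");
    }
}
```

Passing the encoding explicitly keeps the program independent of the platform default, which is the point of the advice above; the console setting still has to agree with it.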

You can find a number of posts about the subject on my blog - via my profile.


我纯我任性 2024-11-13 07:11:55

The following program prints CJK characters to the console using TextPad. To see the Korean Hangul and Japanese Hiragana I had to tell Java to change the print stream's encoding to EUC_KR and set the properties of TextPad's tool output window:

font is Arial Unicode MS
script is Hangul

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Hangul {

    public static void main(String[] args)  throws Exception {

        // Re-wrap System.out so it encodes output as EUC_KR
        // (this changes the stream's encoding; the console must be set to match)

        PrintStream out = new PrintStream(System.out, true, "EUC_KR");
        System.setOut(out);

        // Print sample to console

        String go_hello  = "가다 こんにちは";
        System.out.println(go_hello);
    }
}

Tool Output is:

가다 こんにちは
