Java JDK 18 in IntelliJ prints a question mark "?" when I try to print a Unicode character like "\u1699"

Posted on 2025-01-18 17:50:17

TL;DR: I downgraded to JDK 17 (17.0.2) and now it works.

I was watching a beginner's Java tutorial by Kody Simpson on YouTube (youtube.com/watch?v=t9LP9Nt9Nco). In that tutorial Kody prints crazy Unicode symbols like "☯Ωøᚙ", but for me it just prints "?", a question mark.

char letter = '\u1699';
System.out.println(letter);

I tried pretty much every solution on Stack Overflow, such as:

Changing File Encoding to UTF-8, although mine was using UTF-8 by default.
Putting '-Dconsole.encoding=UTF-8' and '-Dfile.encoding=UTF-8' in the Edit Custom VM options.
Messing with Region Settings in control panel.

None of it worked.

Every post was also from many years ago, such as this one, which is from 12 years ago:

unicode characters appear as question marks in IntelliJ IDEA console

I ended up deleting and re-downloading IntelliJ because I thought I had messed up some settings and wanted a fresh start. This time I set the Project SDK to an older version, Oracle OpenJDK 14.0.1, and now somehow it worked and printed the 'ᚙ' symbol.

Then I realized the problem might be the latest version of the JDK, which is version 18, so I downloaded JDK 17.0.2, and BOOM, it still works and prints the symbol 'ᚙ', so that's nice :). But when I switched back to JDK 18, it just printed "?" again.

It's also strange because I can copy and paste the ᚙ symbol into the code editor (on JDK 18):

char letter = 'ᚙ';
System.out.println(letter);

But when I press RUN and try to PRINT ... it STILL GIVES A QUESTION MARK.

I have no clue why this happens. I started learning to code 2 days ago, so I'm probably dumb, or the new version has a bug, but I never found a solution through Google or here, which is why I'm making my first ever Stack Overflow post.


煞人兵器 2025-01-25 17:50:17

I can replicate your problem: printing works correctly when running your code if compiled with JDK 17, and fails when running your code if compiled with JDK 18.

One of the changes implemented in Java 18 was JEP 400: UTF-8 by Default. The summary for that JEP stated:

Specify UTF-8 as the default charset of the standard Java APIs. With
this change, APIs that depend upon the default charset will behave
consistently across all implementations, operating systems, locales,
and configurations.

That sounds good, except one of the goals of that change was (with my emphasis added):

Standardize on UTF-8 throughout the standard Java APIs, except for
console I/O.

So I think your problem arose because you had ensured that the console's encoding in IntelliJ IDEA was UTF-8, but the PrintStream that you were using to write to that console (i.e. System.out) was not.

The Javadoc for PrintStream states (with my emphasis added):

All characters printed by a PrintStream are converted into bytes using
the given encoding or charset, or the default charset if not
specified.

Since your PrintStream was System.out, you had not specified any "encoding or charset", and were therefore using the "default charset", which was presumably not UTF-8. So to get your code to work on Java 18, you just need to ensure that your PrintStream is encoding with UTF-8. Here's some sample code to show the problem and the solution:

package pkg;

import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class Humpty {

    public static void main(String[] args) {

        char letter = 'ᚙ';
        String charset1 = System.out.charset().displayName();  // charset() requires JDK 18

        System.out.println("Writing the character " + letter + " to a PrintStream with charset " + charset1); // fails

        PrintStream ps = new PrintStream(new FileOutputStream(FileDescriptor.out), true, StandardCharsets.UTF_8);
        String charset2 = ps.charset().displayName(); // charset() requires JDK 18
        ps.println("Writing the character " + letter + " to a PrintStream with charset " + charset2); // works
    }
}

This is the output in the console when running that code:

C:\Java\jdk-18\bin\java.exe -javaagent:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\221.5080.93\lib\idea_rt.jar=64750:C:\Users\johndoe\AppData\Local\JetBrains\Toolbox\apps\IDEA-U\ch-0\221.5080.93\bin -Dfile.encoding=UTF-8 -classpath C:\Users\johndoe\IdeaProjects\HelloIntellij\out\production\HelloIntellij pkg.Humpty
Writing the character ? to a PrintStream with charset windows-1252
Writing the character ᚙ to a PrintStream with charset UTF-8

Process finished with exit code 0

Notes:

PrintStream has a new method in Java 18 named charset() which "returns the charset used in this PrintStream instance". The code above calls charset(), and shows that for my machine my "default charset" is windows-1252, not UTF-8.
I used IntelliJ IDEA 2022.1 Beta (Ultimate Edition) for testing.
In the console I used font DejaVu Sans to ensure that the character "ᚙ" could be rendered.
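
The "?" in the first output line can be reproduced without any console, because it comes from the character-to-byte conversion itself. Here is a minimal sketch (the class name EncodingDemo and the helper toHex are made up for illustration) that encodes 'ᚙ' with windows-1252, the charset reported above, and with UTF-8. It assumes the windows-1252 charset is available in the running JDK, which I believe is the case for standard builds.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
final class EncodingDemo {

    // Formats bytes as space-separated upper-case hex, e.g. "E1 9A 99".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String letter = "\u1699"; // the character 'ᚙ'

        // windows-1252 has no mapping for U+1699, so the encoder substitutes
        // its replacement byte 0x3F, which is the ASCII question mark.
        Charset cp1252 = Charset.forName("windows-1252");
        System.out.println("windows-1252: " + toHex(letter.getBytes(cp1252)));

        // UTF-8 can encode every Unicode code point; U+1699 takes three bytes.
        System.out.println("UTF-8:        " + toHex(letter.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line prints 3F and the second prints E1 9A 99, so the character is already lost before the bytes ever reach the console.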

UPDATE: To address the issue raised in the comments below by Mostafa Zeinali, the PrintStream used by System.out can be redirected to a UTF-8 PrintStream by calling System.setOut(). Here's sample code:

    String charsetOut = System.out.charset().displayName();
    if (!"UTF-8".equals(charsetOut)) {
        System.out.println("The charset for System.out is " + charsetOut + ". Changing System.out to use charset UTF-8");
        System.setOut(new PrintStream(new FileOutputStream(FileDescriptor.out), true, StandardCharsets.UTF_8));
        System.out.println("The charset for System.out is now " + System.out.charset().displayName());
    }

This is the output from that code on my Windows 10 machine:

The charset for System.out is windows-1252. Changing System.out to use charset UTF-8
The charset for System.out is now UTF-8

Note that System.out is a final variable, so you can't directly assign a new PrintStream to it. This code fails to compile with the error "Cannot assign a value to final variable 'out'":

System.out = new PrintStream(new FileOutputStream(FileDescriptor.out), true, StandardCharsets.UTF_8); // Won't compile


入画浅相思 2025-01-25 17:50:17

TL;DR: Use this on Java 18:

-Dfile.encoding="UTF-8" -Dsun.stdout.encoding="UTF-8" -Dsun.stderr.encoding="UTF-8"

From JEP 400:

There are three charset-related system properties used internally by the JDK. They remain unspecified and unsupported, but are documented here for completeness:
sun.stdout.encoding and sun.stderr.encoding: the names of the charsets used for the standard output stream (System.out) and standard error stream (System.err), and in the java.io.Console API.
sun.jnu.encoding: the name of the charset used by the implementation of java.nio.file when encoding or decoding filename paths, as opposed to file contents. On macOS its value is "UTF-8"; on other platforms it is typically the default charset.

As you can see, those two system properties "remain unspecified and unsupported". But they solved my problem. Therefore, please use them at your own risk, and DO NOT use them in a production environment. I'm running Eclipse on Windows 10, btw.
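
To check whether the flags above actually reached the JVM, it can help to print the charset-related settings from inside the program. This is only a rough sketch: the class name CharsetReport and the method collect are invented for illustration, and the list of property names is my own selection. Besides the properties quoted from JEP 400 it includes native.encoding, which as far as I know exists only on JDK 17 and later. Properties that the JVM does not define show up as "null".

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class, not part of the original answer.
final class CharsetReport {

    // Collects the charset-related settings visible to the running JVM.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("default charset", Charset.defaultCharset().name());
        for (String key : new String[] {
                "file.encoding", "native.encoding",
                "sun.stdout.encoding", "sun.stderr.encoding", "sun.jnu.encoding"}) {
            // getProperty returns null when the JVM does not define the property;
            // String.valueOf turns that into the text "null" for printing.
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running it once with and once without the three -D options should show which values change. Since sun.stdout.encoding and sun.stderr.encoding are internal, what they report without the flags may differ between JDK versions and environments.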

I think there must be a good way to set the default charset of the JVM at startup, and it is stupid that passing -Dfile.encoding="UTF-8" does not do that. As you can read in JEP 400:

If file.encoding is set to "UTF-8" (i.e., java -Dfile.encoding=UTF-8), then the default charset will be UTF-8. This no-op value is defined in order to preserve the behavior of existing command lines.

And this is exactly what it is "NOT" doing. Passing -Dfile.encoding="UTF-8" does "not" preserve the behavior of existing command lines! I think this shows that Java 18's implementation of JEP 400 is not doing what it should actually be doing, which is the root of your problem in the first place.
