I'm using Logback 1.2.11 with Java 17 on Windows 10. I'm using the following logback.xml:
<configuration>
  <property scope="context" name="COLORIZER_COLORS" value="boldred@,boldyellow@,boldcyan@,@,@" />
  <conversionRule conversionWord="colorize" converterClass="org.tuxdude.logback.extensions.LogColorizer" />
  <statusListener class="ch.qos.logback.core.status.NopStatusListener" />
  <appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
    <target>System.err</target>
    <withJansi>true</withJansi>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
      <pattern>[%colorize(%level)] %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDERR" />
  </root>
</configuration>
If in my code I use System.out.println("é") or System.err.println("é"), I see an é (U+00E9, a small letter e with acute accent) on the console as expected. However, if I log through Logback (via SLF4J), it shows a Θ character (U+0398, a Greek capital letter theta) on the screen. This happens whether I use <target>System.out</target> or <target>System.err</target> in my logback.xml file.
By default the PatternLayoutEncoder for ConsoleAppender should be using the system default encoding. (See "LogBack default charset for LayoutWrappingEncoder?" for extensive discussion.) The Windows 10 console encoding in my locale should be Windows-1252 (or in PowerShell, ISO-8859-1). The Θ character doesn't even appear in either of those charsets.
Why is Logback printing a Θ character to the standard output when it should be printing an é character? More generally, why isn't Logback using the default encoding when printing to System.out or System.err?
Comments (1)
It looks like Logback is using the wrong "default charset". The API Javadocs for System.out document its default charset (which applies to System.err as well).
On my Windows 10 Command Prompt, Charset.defaultCharset() returns windows-1252, while System.console().charset() returns IBM437. If I create a new OutputStreamWriter(System.out, System.console().charset()) and write the string "é", it produces é as expected. But sure enough, if I use new OutputStreamWriter(System.out, Charset.defaultCharset()) and write "é", it produces Θ! So that's where the Θ was coming from: it is part of the IBM437 charset! (In windows-1252, é is encoded as the single byte 0xE9, and a console that decodes with IBM437 displays that byte as Θ.)
I won't ask here why my Windows 10 Command Prompt is defaulting to IBM437 as its default charset; in the context of this issue, that's beside the point.
The root problem seems to be that Logback is retrieving the default character set erroneously. (It's a long story, but basically Logback is relying on the default charset of String.getBytes().) Ultimately Logback in LayoutWrappingEncoder is relying on the value of Charset.defaultCharset(), which doesn't match that of the console; instead it should be defaulting to System.console().charset() if it wants to match the default charset of the console.
Apparently the LayoutWrappingEncoder doesn't know whether it's writing to the console or to some other output stream that in fact uses Charset.defaultCharset(). Perhaps there needs to be some way that ch.qos.logback.core.OutputStreamAppender can expose its charset to LayoutWrappingEncoder, and ch.qos.logback.core.ConsoleAppender can override the default based on System.console().charset() instead of Charset.defaultCharset().
In any case the culprit here seems to be Logback using the wrong default charset for the console for System.out and System.err. (Anyone know how I can tell Logback to use System.console().charset() instead of Charset.defaultCharset()? I certainly don't have any way of knowing the default console charset ahead of time, so I can't hard-code it into logback.xml.)
I have filed Logback bug LOGBACK-1642.
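To make the byte-level mismatch concrete, here is a minimal sketch that reproduces the é-to-Θ substitution without Logback or a real console. The class name CharsetMismatchDemo and the helper reinterpret are hypothetical names introduced only for illustration, and the sketch assumes the JDK in use ships the IBM437 charset (standard full JDK builds do, but it is not one of the charsets every Java platform must provide).

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of Logback or the JDK.
final class CharsetMismatchDemo {

    // Encodes the text with one charset and decodes the resulting bytes with
    // another, which is what happens when a writer and a console disagree.
    static String reinterpret(String text, Charset writerCharset, Charset consoleCharset) {
        byte[] bytes = text.getBytes(writerCharset);
        return new String(bytes, consoleCharset);
    }

    public static void main(String[] args) {
        Charset windows1252 = Charset.forName("windows-1252");
        // Assumption: IBM437 is available in this JDK build.
        Charset ibm437 = Charset.forName("IBM437");

        // "\u00E9" is é; written as an escape so the source file encoding does not matter.
        String shown = reinterpret("\u00E9", windows1252, ibm437);

        // Print the code point rather than the character, so the result does
        // not depend on the encoding of the terminal running this demo.
        System.out.printf("U+%04X%n", (int) shown.charAt(0)); // prints U+0398
    }
}
```

The byte 0xE9 is é in windows-1252 but Θ in IBM437, so any writer that encodes with Charset.defaultCharset() while the console decodes with IBM437 would be expected to show the same substitution.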