Type chcp at the command prompt, and I bet you'll see "Active code page: 932".
The Windows console has the concept of code pages, a relic of pre-Unicode days, where the byte values 0-255 are mapped to different characters depending on the language. While the characters a-z, A-Z and 0-9 are consistent across code pages, lesser-used characters are mapped to characters popular in the target language.
In code page 932, byte 0x5C, which is the backslash in ASCII, is mapped to the yen character.
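To look at the byte itself, independent of how a console font draws it, here is a minimal Python sketch. It relies on Python's built-in `ascii`, `cp932` and `cp949` codecs, which are Python's own tables and may differ in detail from the Windows conversion functions; the helper name `describe_byte` is made up for illustration.

```python
import unicodedata


def describe_byte(value: int, codepage: str) -> str:
    """Decode one byte under the given code page and name the resulting character."""
    ch = bytes([value]).decode(codepage)
    return f"{codepage}: 0x{value:02X} -> U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for cp in ("ascii", "cp932", "cp949"):
        print(describe_byte(0x5C, cp))
    # ascii: 0x5C -> U+005C REVERSE SOLIDUS
    # cp932: 0x5C -> U+005C REVERSE SOLIDUS
    # cp949: 0x5C -> U+005C REVERSE SOLIDUS
```

In Python's tables the byte decodes to U+005C under all three, so what changes on a Japanese system is the glyph that gets drawn, not the byte stored in the path.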
This is a common issue. See Microsoft's note on MSDN:
Caution: Windows code page and OEM code page character sets used on Japanese-language operating systems contain the Yen symbol (¥) instead of a backslash (\). Thus, the Yen symbol is a prohibited character for NTFS and FAT file systems. When mapping Unicode to a Japanese-language code page, WideCharToMultiByte and other conversion functions map both backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to this same character. For security reasons, your applications should not typically allow the character U+00A5 in a Unicode string that might be converted for use as a FAT file name. For more information, see Security Considerations: International Features.
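One way to act on that advice is to reject U+00A5 before a Unicode string is ever converted to a legacy code page. The following is only a sketch of that idea in Python, not anything Microsoft ships; the function names `find_yen_signs` and `is_safe_for_legacy_name` are hypothetical.

```python
from typing import List

YEN_SIGN = "\u00a5"  # the Unicode yen sign, distinct from backslash U+005C


def find_yen_signs(name: str) -> List[int]:
    """Return the indexes of every U+00A5 in a candidate file name."""
    return [i for i, ch in enumerate(name) if ch == YEN_SIGN]


def is_safe_for_legacy_name(name: str) -> bool:
    """True if the name has no U+00A5 that a conversion could turn into 0x5C."""
    return not find_yen_signs(name)


if __name__ == "__main__":
    for candidate in ("report.txt", "report\u00a52024.txt"):
        print(repr(candidate), is_safe_for_legacy_name(candidate))
```

Checking on the Unicode side, before conversion, is the point: once both characters have collapsed to byte 0x5C there is no way to tell which one the user typed.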
UPDATE
Sorry for the delay; it took me a bit to recall where I had originally read about this. The best reference is Mike Kaplan's weblog entry here. His blog, michkap, is the best Microsoft blog for all things Unicode. If you deal with charsets, encoding issues and the dark corners of internationalization, it is an essential reference.
From his entry on the yen character as the backslash:
...on Japanese code page 932, 0x5c is the YEN SIGN, and on Korean code page 949, 0x5c is the WON SIGN.
Which is not to say that 0x5c does not act as a path separator -- it still does. And which is also not to say that the Unicode code points for the Yen and the Won (U+00a5 and U+20a9) do act as path separators -- because they do not.
...
In practice, after many years of code page based systems in Japan and Korea using their respective currency symbols as the path separators, it is believed customers were simply used to this appearance. And there was therefore little interest in changing that appearance (when the system settings were Japanese or Korean) to anything but those symbols.
To support this expectation, Japanese and Korean fonts, whenever the default system locale is set to Japanese or Korean, respectively, will display the currency symbol rather than the backslash when U+005c is shown.
You'll be hard pressed to find a better reference than that one, I believe.
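The distinction Kaplan draws between U+005C and U+00A5 can be illustrated with Python's `ntpath` module, which implements Windows path-splitting rules in pure Python. This is a sketch of the principle only: `ntpath` is Python's model of those rules, not the Windows file system itself, and the helper `last_component` and both sample paths are made up for the example.

```python
import ntpath


def last_component(path: str) -> str:
    """Return the final path component according to Windows splitting rules."""
    return ntpath.basename(path)


backslash_path = "C:\\temp\\file.txt"        # separators are U+005C
yen_path = "C:\u00a5temp\u00a5file.txt"      # separators replaced by U+00A5

if __name__ == "__main__":
    print(last_component(backslash_path))    # file.txt
    # U+00A5 is not treated as a separator, so nothing after the drive is split.
    print(last_component(yen_path) == "\u00a5temp\u00a5file.txt")  # True
```

The two strings may look identical in a Japanese font, but only the one built from U+005C is split into directories.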
Both the yen sign and the \ character have byte value 0x5C, just in different character sets. This is so common that people in Japan are generally aware of it and don't consider it a problem.
See the comment section of this blog post: Norman Diamond, on 27 Dec 2004 at 1:45 AM, writes "Windows paths work with a Japanese default system locale because 0x5c is the yen sign and the yen sign is the path separator." (Norman works in Japan.)