Garbage character replaces '\' at the Windows command prompt

Posted 2024-11-27 04:52:16

Comments (2)

挖鼻大婶 2024-12-04 04:52:16

Type chcp at the command prompt, and I bet you'll see Active code page: 932

The Windows console has the concept of code pages, a relic of pre-Unicode days, where the byte values 0-255 are mapped to different characters depending on the language. While the characters a-z, A-Z, and 0-9 are consistent, lesser-used characters are mapped to characters popular in the target language.

In code page 932, byte 0x5C, which is the backslash in ASCII, is mapped to the yen sign (¥).
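To check the active code page from a script rather than by typing chcp, something like the following sketch may help. The helper names `parse_chcp_output` and `console_code_page` are hypothetical, made up for this illustration; `GetConsoleOutputCP` is the real kernel32 function, reached here through ctypes and only on Windows. The chcp label text is localized, so the parser assumes only that the code page is the last number in the output.

```python
import re
import sys


def parse_chcp_output(text):
    """Return the code page number found in chcp output, or None.

    Hypothetical helper: the label is localized, so only the last run
    of digits in the text is used.
    """
    numbers = re.findall(r"\d+", text)
    return int(numbers[-1]) if numbers else None


def console_code_page():
    """Return the console output code page on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    import ctypes

    # GetConsoleOutputCP returns 0 on failure (for example, no console attached).
    return ctypes.windll.kernel32.GetConsoleOutputCP() or None


print(parse_chcp_output("Active code page: 932"))  # 932
print(console_code_page())
```

On a Japanese-locale console the second call would be expected to report 932; on non-Windows systems it returns None.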

This is a common issue. See Microsoft's note on MSDN:

Caution Windows code page and OEM code page character sets used on
Japanese-language operating systems contain the Yen symbol (¥) instead
of a backslash (\). Thus, the Yen symbol is a prohibited character for
NTFS and FAT file systems. When mapping Unicode to a Japanese-language
code page, WideCharToMultiByte and other conversion functions map both
backslash (U+005C) and the normal Unicode Yen symbol (U+00A5) to this
same character. For security reasons, your applications should not
typically allow the character U+00A5 in a Unicode string that might be
converted for use as a FAT file name. For more information, see
Security Considerations: International Features.
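As a rough illustration of the advice in that note, an application could scan a Unicode string for U+00A5 before using it as a file name. This is only a sketch: the function name `find_lookalike_separators` is hypothetical, and including the won sign (U+20A9) alongside the yen sign is my assumption, not something the MSDN note states.

```python
# Code points that can end up as byte 0x5C after conversion to a legacy code page.
# U+00A5 comes from the MSDN note; U+20A9 is an assumption added for Korean locales.
YEN_SIGN = "\u00a5"
WON_SIGN = "\u20a9"
LOOKALIKES = {YEN_SIGN, WON_SIGN}


def find_lookalike_separators(name):
    """Return (index, code point label) pairs for risky characters in name."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(name)
        if char in LOOKALIKES
    ]


candidate = "report" + YEN_SIGN + "final.txt"
hits = find_lookalike_separators(candidate)
if hits:
    print("rejecting file name, suspicious characters at:", hits)
else:
    print("file name looks fine")
```

Whether to reject such a name outright or to replace the character is a policy choice for the application; the sketch only reports positions.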

UPDATE

Sorry for the delay; it took me a bit to recall where I had originally read about this. The best reference is Mike Kaplan's weblog entry here. michkap is the best Microsoft blog for all things Unicode. If you deal with charsets, encoding issues, and the dark corners of internationalization, his blog is an essential reference.

From his entry on the yen character as the backslash:

...on Japanese code page 932, 0x5c is the YEN SIGN, and on Korean
code page 949, 0x5c is the WON SIGN.

Which is not to say that 0x5c does not act as a path separator -- it
still does. And which is also not to say that the Unicode code points
for the Yen and the Won (U+00a5 and U+20a9) do act as path separators
-- because they do not.

...

In practice, after many years of code page based systems in Japan and
Korea using their respective currency symbols as the path separators,
it is believed customers were simply used to this appearance. And
there was therefore little interest in changing that appearance (when
the system settings were Japanese or Korean) to anything but those
symbols.

To support this expectation, Japanese and Korean fonts, whenever the
default system locale is set to Japanese or Korean, respectively, will
display the currency symbol rather than the backslash when U+005c is
shown.

You'll be hard pressed to find a better reference than that one, I believe.
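Kaplan's point that byte 0x5C still acts as the path separator, while U+00A5 and U+20A9 do not, can be modelled in a few lines. This is a sketch under assumptions: it uses Python's own cp932 and cp949 codecs and the standard ntpath module as stand-ins for Windows behaviour, not Windows itself, and the helper names `decode_0x5c` and `last_component` are hypothetical.

```python
import ntpath

BACKSLASH = "\\"      # U+005C
YEN_SIGN = "\u00a5"   # U+00A5


def decode_0x5c(codec):
    """Decode the single byte 0x5C with the named Python codec."""
    return b"\x5c".decode(codec)


def last_component(path):
    """Return the final path component using Windows-style path rules."""
    return ntpath.basename(path)


# In Python's codecs, byte 0x5C decodes to U+005C in all three code pages.
for codec in ("ascii", "cp932", "cp949"):
    print(codec, decode_0x5c(codec) == BACKSLASH)

real_path = "C:" + BACKSLASH + "dir" + BACKSLASH + "file.txt"
lookalike_path = "C:" + YEN_SIGN + "dir" + YEN_SIGN + "file.txt"

# U+005C splits the path; U+00A5 is treated as an ordinary character.
print(last_component(real_path) == "file.txt")
print(last_component(lookalike_path) == "file.txt")
```

The last line prints False because ntpath finds no separator in the yen-sign version, which matches the quoted explanation: the yen sign seen in a Japanese console is a rendering of U+005C, not the separate character U+00A5.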

二智少女猫性小仙女 2024-12-04 04:52:16

Both the Yen and the \ character have byte value 0x5C, just in different character sets. This is so common that the Japanese are generally aware of this, and don't consider it a problem.

See the comment section of this blog post: Norman Diamond, at 27 Dec 2004 1:45 AM, writes "Windows paths work with a Japanese default system locale because 0x5c is the yen sign and the yen sign is the path separator." (Norman works in Japan.)
