Perl: printing Unicode strings to the Windows console
I am running into a strange problem when printing Unicode strings to the Windows console*.
Consider this text:
אני רוצה לישון
Intermediary
היא רוצה לישון
אתם, הם
Bye
Hello, world!
test
Assume it's in a file called "file.txt".
When I run "type file.txt"**, it prints out fine. But when it is printed from a Perl program, like this:
use strict;
use warnings;
use Encode;
use 5.014;
use utf8;
use autodie;
use warnings qw< FATAL utf8 >;
use open qw< :std :utf8 >;
use feature qw< unicode_strings >;
use warnings 'all';
binmode STDOUT, ':utf8'; # output should be in UTF-8
my @array = ( 'אני רוצה לישון', 'Intermediary',
    'היא רוצה לישון', 'אתם, הם', 'Bye', 'Hello, world!', 'test' );
foreach my $word (@array) {
    say $word;
}
The Unicode lines (Hebrew in this case) are printed again each time, partially broken, like this:
E:\My Documents\Technical\Perl>perl "hello unicode.pl"
אני רוצה לישון
לישון
�ן
Intermediary
היא רוצה לישון
לישון
�ן
אתם, הם
�ם
Bye
Hello, world!
test
(I save everything in UTF-8).
This is mighty strange. Any suggestions?
(It's not a "Console2" problem* - the same problem shows up on a "regular" Windows console, only there you don't see the Hebrew glyphs).
* Using "Console" (also called "Console2") - it's a nice little utility which enables working with Unicode with the Windows console - see, for example, here:
http://www.hanselman.com/blog/Console2ABetterWindowsCommandPrompt.aspx
** Note: at the console, you first have to run, of course:
chcp 65001
Comments (4)
Did you try the solution from PerlMonks?
It also uses
:unix
to avoid the console buffer. This is the code from that link:
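The code from the link is not reproduced in this copy of the thread. As a stand-in, here is a minimal sketch of the approach the answer describes: put STDOUT on the :unix layer so nothing is buffered above the OS-level handle, and encode as UTF-8. A commonly cited explanation for the repeated fragments is that, under code page 65001, the console reports a write count that Perl's buffered layer treats as a short write, so it resends the tail. The Win32::API call to SetConsoleOutputCP and the variable names are assumptions for illustration, not the linked code.

```perl
use strict;
use warnings;
use utf8;    # string literals below are UTF-8 in the source file

# Assumption: set the console output code page to UTF-8 from inside the
# script, which mirrors typing "chcp 65001" first. Skipped off Windows.
if ($^O eq 'MSWin32') {
    require Win32::API;
    my $set_output_cp =
        Win32::API->new('kernel32', 'SetConsoleOutputCP', 'N', 'N');
    $set_output_cp->Call(65001) if $set_output_cp;
}

# :unix sends output straight to the OS-level handle with no PerlIO
# buffering on top; :utf8 encodes characters as UTF-8 on the way out.
binmode(STDOUT, ':unix:utf8');

my @lines = ('אני רוצה לישון', 'Intermediary', 'Bye');
print "$_\n" for @lines;
```

Because output is unbuffered here, each print becomes its own write call, which is slower for large outputs but should be harmless for console messages.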
Guys: after studying that PerlMonks post further, it turns out that this is even neater and nicer:
replace:
use Win32::API;
and:
with:
and:
Leaving all else intact.
This is even more in the spirit of Perl conciseness and magic.
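The lines to swap in and out are missing from this copy of the post, so the exact replacement is unknown. One way to get the same effect without Win32::API, assuming the Win32::Console module is acceptable, is its OutputCP function. This is only a sketch under that assumption, not a reconstruction of the original answer.

```perl
use strict;
use warnings;
use utf8;    # string literals below are UTF-8 in the source file

# Assumption: Win32::Console is installed (it ships with the common
# Windows Perl distributions). OutputCP sets the console output code page.
if ($^O eq 'MSWin32') {
    require Win32::Console;
    Win32::Console::OutputCP(65001);
}

# Same layer setup as before: unbuffered handle, UTF-8 encoding.
binmode(STDOUT, ':unix:utf8');

print "היא רוצה לישון\n";
```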
You can also use Win32::Unicode::Console or Win32::Unicode::Native to print Unicode on the Windows console.
Also, this behaviour does not occur when using ConEmu, which also enables proper Unicode support in the Windows command console.