Perl: Printing Unicode strings to the Windows console

Posted on 2025-01-07 02:54:55

I am encountering a strange problem in printing Unicode strings to the Windows console*.

Consider this text:

אני רוצה לישון

Intermediary

היא רוצה לישון
אתם, הם
Bye
Hello, world!
test

Assume it's in a file called "file.txt".

When I run "type file.txt"**, it prints out fine. But when it's printed from a Perl program, like this:

 use strict;
 use warnings;
 use Encode;
 use 5.014;
 use utf8;
 use autodie;
 use warnings    qw< FATAL  utf8     >;
 use open        qw< :std  :utf8     >;
 use feature     qw< unicode_strings >;
 use warnings 'all';

 binmode STDOUT, ':utf8';   # output should be in UTF-8
 my $word;
 my @array = ( 'אני רוצה לישון', 'Intermediary',
    'היא רוצה לישון', 'אתם, הם', 'Bye','Hello, world!', 'test');
 foreach $word(@array) {
    say $word;
 }

The Unicode lines (Hebrew in this case) show up again each time, partially broken, like this:

E:\My Documents\Technical\Perl>perl "hello unicode.pl"
אני רוצה לישון
לישון
�ן

Intermediary
היא רוצה לישון
לישון
�ן

אתם, הם
�ם

Bye
Hello, world!
test

(I save everything as UTF-8.)

This is mighty strange. Any suggestions?

(It's not a "Console2" problem*: the same problem shows up in the regular Windows console, except that there you don't see the Hebrew glyphs.)


* Using "Console" (also called "Console2"), a nice little utility that makes it possible to work with Unicode in the Windows console. See, for example:
http://www.hanselman.com/blog/Console2ABetterWindowsCommandPrompt.aspx

** Note: at the console you first have to run, of course:

chcp 65001


Comments (4)

爱情眠于流年 2025-01-14 02:54:55

Did you try the solution from PerlMonks?

It also uses the :unix layer to avoid the console buffer.

This is the code from that link:

use Win32::API;

binmode(STDOUT, ":unix:utf8");

#Must set the console code page to UTF8
$SetConsoleOutputCP= new Win32::API( 'kernel32.dll', 'SetConsoleOutputCP', 'N','N' );
$SetConsoleOutputCP->Call(65001);

$line1="\x{2554}".("\x{2550}"x15)."\x{2557}\n";
$line2="\x{2551}".(" "x15)."\x{2551}\n";
$line3="\x{255A}".("\x{2550}"x15)."\x{255D}";
$unicode_string=$line1.$line2.$line3;

print "THIS IS THE CORRECT EXAMPLE OUTPUT IN PURE PERL: \n";
print $unicode_string;

爱殇璃 2025-01-14 02:54:55

Guys, after studying that PerlMonks post further, it turns out there is an even neater and nicer way.
Replace:
use Win32::API;

and:

$SetConsoleOutputCP= new Win32::API( 'kernel32.dll', 'SetConsoleOutputCP', 'N','N' );
$SetConsoleOutputCP->Call(65001);

with:

use Win32::Console;

and:

 Win32::Console::OutputCP(65001);

Leave everything else intact.
This is even more in the spirit of Perl's conciseness and magic.
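
For reference, here is a minimal sketch of what the question's script might look like with this replacement applied. It is an illustration rather than the code from the PerlMonks post: the $^O guard and the run-time "require" are my own additions so that the script also loads on non-Windows systems, and the word list is copied from the question.

```perl
use strict;
use warnings;
use utf8;    # the source file itself is saved as UTF-8

# Switch the console output code page to 65001 (UTF-8).
# Assumption: the guard and run-time "require" are only there so the
# script still loads on systems without Win32::Console.
if ($^O eq 'MSWin32') {
    require Win32::Console;
    Win32::Console::OutputCP(65001);
}

# Same layers as in the answer above: :unix plus :utf8 on STDOUT.
binmode(STDOUT, ':unix:utf8');

my @lines = (
    'אני רוצה לישון', 'Intermediary',
    'היא רוצה לישון', 'אתם, הם', 'Bye', 'Hello, world!', 'test',
);
print "$_\n" for @lines;
```

Because OutputCP is called from the script, this should make a separate "chcp 65001" at the prompt unnecessary, although I have only reasoned about that and not tested it on every Windows version.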

回心转意 2025-01-14 02:54:55

You can also use Win32::Unicode::Console or Win32::Unicode::Native to print Unicode to the Windows console.

请恋爱 2025-01-14 02:54:55

Also, this behaviour does not occur when using ConEmu, which likewise enables proper Unicode support in the Windows command console.
