The CHECK argument of the encode function

Posted 2024-10-18 17:26:31 · 1009 words · 2 views · 0 comments

Why does the second loop, where the CHECK argument is set, produce different output?

#!/usr/bin/env perl
use warnings;
use 5.012;
use Encode qw(encode);
my $s = 'a';

for my $encoding ( 'iso-8859-1', 'iso-8859-15', 'cp1252', 'cp850' ) {
    my $encoded = encode( $encoding, $s );
    my $c = unpack '(B8)*', $encoded;
    printf "%-12s:\t%8s\n", $encoding, $c;
}

say "-------------------";

for my $encoding ( 'iso-8859-1', 'iso-8859-15', 'cp1252', 'cp850' ) {
    my $encoded = encode( $encoding, $s, Encode::FB_WARN );
    my $c = unpack '(B8)*', $encoded;
    printf "%-12s:\t%8s\n", $encoding, $c;
}


# iso-8859-1  :   01100001
# iso-8859-15 :   01100001
# cp1252      :   01100001
# cp850       :   01100001
# -------------------
# iso-8859-1  :   01100001
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# iso-8859-15 :           
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# cp1252      :           
# Use of uninitialized value $c in printf at ./perl1.pl line 20.
# cp850       :   


Comments (3)

热鲨 2024-10-25 17:26:31

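The documentation describes this behavior (see the excerpt below): when CHECK is set, encode modifies the data argument in place and leaves only the unprocessed portion in $s. Because 'a' encodes without error, nothing is left unprocessed, so the first call in the second loop empties $s. Every later iteration then encodes an empty string, unpack has nothing to return, and $c ends up undefined.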

*CHECK* = Encode::FB_QUIET
  If *CHECK* is set to Encode::FB_QUIET, (en|de)code will immediately
  return the portion of the data that has been processed so far when an
  error occurs. The data argument will be overwritten with everything
  after that point (that is, the unprocessed part of data). This is
  handy when you have to call decode repeatedly in the case where your
  source data may contain partial multi-byte character sequences, (i.e.
  you are reading with a fixed-width buffer). Here is a sample code that
  does exactly this:

    my $buffer = ''; my $string = '';
    while(read $fh, $buffer, 256, length($buffer)){
      $string .= decode($encoding, $buffer, Encode::FB_QUIET);
      # $buffer now contains the unprocessed partial character
    }

*CHECK* = Encode::FB_WARN
  This is the same as above, except that it warns on error. Handy when
  you are debugging the mode above.
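To see the in-place modification described in the excerpt, here is a minimal sketch. The helper name source_after_encode is made up for illustration; it runs the same encode call on a fresh copy of 'a' and returns what is left in the source variable afterwards.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns the source string as it looks after
# encode() has been called on it with the given CHECK value.
sub source_after_encode {
    my ($check) = @_;
    my $s = 'a';
    # encode() receives $s by alias, so it can overwrite it in place.
    my $encoded = encode( 'iso-8859-1', $s, $check );
    return $s;
}

my $consumed = source_after_encode( Encode::FB_WARN );
my $kept     = source_after_encode( Encode::FB_WARN | Encode::LEAVE_SRC );

printf "FB_WARN:             length of source = %d\n", length $consumed;
printf "FB_WARN | LEAVE_SRC: length of source = %d\n", length $kept;
```

Going by the documentation quoted above, the first call should leave the source empty, since everything was processed, and the second should leave it untouched.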

掀纱窥君容 2024-10-25 17:26:31

When CHECK is set to Encode::FB_QUIET (or Encode::FB_WARN, which behaves the same but also warns), the data argument is overwritten:

perl -MEncode -Mutf8 -E '$s="a"; encode("utf-8", $s, Encode::FB_WARN); say $s'

我纯我任性 2024-10-25 17:26:31

You can prevent the overwriting by ORing in Encode::LEAVE_SRC:

my $encoded = encode( $encoding, $s, Encode::FB_WARN | Encode::LEAVE_SRC);
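Applied to the second loop from the question, that could look like the sketch below. The %bits hash is only there to make the results easy to inspect, and the list assignment from unpack is a small stylistic change; neither is part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Encode qw(encode);

my $s = 'a';
my %bits;    # illustrative: collects the bit string per encoding

for my $encoding ( 'iso-8859-1', 'iso-8859-15', 'cp1252', 'cp850' ) {
    # LEAVE_SRC keeps encode() from overwriting $s between iterations.
    my $encoded = encode( $encoding, $s, Encode::FB_WARN | Encode::LEAVE_SRC );

    # List assignment takes the first 8-bit group explicitly.
    my ($c) = unpack '(B8)*', $encoded;

    $bits{$encoding} = $c;
    printf "%-12s:\t%8s\n", $encoding, $c;
}
```

Another option with the same effect is to copy $s into a temporary variable inside the loop and pass the copy to encode, so that only the copy gets consumed.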
