Why does XML::LibXML keep printing errors even after I disable them?

Posted 2024-09-26 10:39:59 · 1205 characters · 6 views · 0 comments

I'm using XML::LibXML to parse a document.

The HTML file behind it has some minor errors, and the parser reports them:

http://is.gd/create.php?longurl=http://google.com:15: validity error : ID smallink already defined
nal URL was <a href="http://google.com">http://google.com</a><span id="smallink"
                                                                                ^
http://is.gd/create.php?longurl=http://google.com:15: validity error : ID smallink already defined
and use <a href="http://is.gd/fNqtL-">http://is.gd/fNqtL-</a><span id="smallink"
                                                                                ^

However, I disabled error reporting:

my $parser = XML::LibXML->new();
$parser->set_options({ recover           => 2,
                       validation        => 0,
                       suppress_errors   => 1,
                       suppress_warnings => 1,
                       pedantic_parser   => 0,
                       load_ext_dtd      => 0, });

my $doc = $parser->parse_html_file("http://is.gd/create.php?longurl=$url");

My only option for suppressing those errors is to run the script with 2>/dev/null, which I don't want to do. Could someone please help me get rid of them?


Comments (2)

寄人书 2024-10-03 10:39:59

I have no idea whether you're asking XML::LibXML correctly not to print its warnings. I'll assume you are and that this is a bug in XML::LibXML (which you should also report to the author), and only address how to suppress the warnings.

Every time a warning is about to be printed, Perl looks up the value of $SIG{__WARN__} and, if it contains a code reference, invokes that instead of printing the warning itself.

You can use that to stop the warnings you want to ignore from being printed to STDERR. However, you should be careful with this. Make sure you suppress only false positives, not all warnings; warnings are usually useful. Also, make sure to localize your use of $SIG{__WARN__} to the smallest possible scope to avoid odd side effects.

# warnings happen just as always
my $parser = ...;
$parser->set_options(...);

{ # in this scope we filter some warnings
    local $SIG{__WARN__} = sub {
        my ($warning) = @_;
        print STDERR $warning if $warning !~ /validity error/;
    };

    $parser->parse_html_file(...);
}

# more code, now the warnings are back to normal again
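
To see the mechanism in isolation, here is a minimal, self-contained sketch that does not need XML::LibXML at all. The helper name without_warnings_matching is hypothetical (it is not part of any module), and the two warn calls merely stand in for whatever the parser would emit:

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, dropping warnings that match $pattern
# and passing every other warning through to STDERR unchanged.
sub without_warnings_matching {
    my ($pattern, $code) = @_;
    my @dropped;

    # The handler is only active until this sub returns.
    local $SIG{__WARN__} = sub {
        my ($warning) = @_;
        if ($warning =~ $pattern) {
            push @dropped, $warning;   # swallowed, kept only for inspection
        }
        else {
            print STDERR $warning;     # everything else is still reported
        }
    };

    $code->();
    return @dropped;
}

my @dropped = without_warnings_matching(
    qr/validity error/,
    sub {
        # Stand-ins for messages a parser might raise via warn().
        warn "validity error : ID smallink already defined\n";
        warn "something else went wrong\n";
    },
);

print "dropped ", scalar(@dropped), " warning(s)\n";
```

Only the second message should reach STDERR. Collecting the dropped messages instead of discarding them makes it easier to check that the filter is not hiding more than intended.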

Also note that all of this assumes the warnings come from Perl space. It's quite possible that libxml2, the C library XML::LibXML uses under the hood, writes its warnings directly to stderr. $SIG{__WARN__} cannot prevent that.
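
If the messages do turn out to be written by libxml2 itself, one fallback is to redirect STDERR from inside the script for the duration of the parse, rather than on the command line. The following is only a sketch under that assumption: with_stderr_silenced is a hypothetical helper name, and it hides everything written to STDERR while the callback runs, not just validity errors.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: run $code with STDERR pointed at the null device,
# then restore the original STDERR even if $code dies.
sub with_stderr_silenced {
    my ($code) = @_;

    # Duplicate the current STDERR so it can be restored afterwards.
    open(my $saved, '>&', \*STDERR) or die "Cannot dup STDERR: $!";
    open(STDERR, '>', File::Spec->devnull) or die "Cannot redirect STDERR: $!";

    # Run the callback in list context and remember any exception.
    my @result;
    my $ok  = eval { @result = $code->(); 1 };
    my $err = $@;

    # Put the original STDERR back before reporting anything.
    open(STDERR, '>&', $saved) or die "Cannot restore STDERR: $!";
    close $saved;

    die $err unless $ok;
    return wantarray ? @result : $result[0];
}

my $answer = with_stderr_silenced(sub {
    print STDERR "this line is discarded\n";
    return 42;
});
print "callback returned $answer\n";
```

In the question's script, the callback would wrap only the parse_html_file call, so that errors from the rest of the program still show up.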

記柔刀 2024-10-03 10:39:59

A possible solution is to install a $SIG{__WARN__} handler that filters the messages or simply silences all warnings:

local $SIG{__WARN__} = sub { };  # $_[0] is the message; an empty body discards it