Specifying multiple classes in the look_down routine of HTML::Element in Perl?

Posted 2024-11-19 23:12:46

I am using HTML::TreeBuilder to parse some HTML.

Can you specify multiple classes in the 'look_down' routine?

For instance, when searching through HTML using

for ( $tree->look_down( 'class' => 'postbody' ) )

I also want to search for an additional class, 'postprofile', in the same loop.

Is there a way of doing this without having to use a new

for ( $tree->look_down( 'class' => 'postprofile' ) )

as this brings back 2 sets of results, whereas I only want one merged set?

I tried using

for ( $tree->look_down( 'class' => 'postbody||postprofile' ) )

However, this did not work.

Thank you in advance.

Comments (2)

慕巷 2024-11-26 23:12:46

Try using a pattern instead of a string, i.e.,

$tree->look_down( 'class' => qr/^(?:postbody|postprofile)$/)
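
To make the suggestion concrete, here is a minimal sketch of how the pattern might be used in one loop. The sample markup is invented for illustration and is not from the question. It also shows a possible alternative, a coderef criterion, for the assumed case where an element carries several space-separated classes (for example class="postbody first"), which the anchored pattern would not match.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup, made up for this sketch.
my $html = <<'HTML';
<html><body>
<div class="postprofile">alice</div>
<div class="postbody first">first post</div>
<div class="signature">ignored</div>
<div class="postprofile">bob</div>
<div class="postbody">second post</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Variant 1: a compiled regex is matched against the whole attribute value.
# Because of the ^...$ anchors, class="postbody first" is NOT matched here.
my @hits = $tree->look_down( class => qr/^(?:postbody|postprofile)$/ );

# Variant 2: a coderef criterion receives the element as $_[0].
# Splitting on whitespace lets it match any one of several classes.
my %wanted = map { $_ => 1 } qw(postbody postprofile);
my @hits2 = $tree->look_down(
    sub {
        my $class = $_[0]->attr('class');
        return 0 unless defined $class;
        return scalar grep { $wanted{$_} } split ' ', $class;
    }
);

# One merged list, in document order, handled by a single loop.
for my $el (@hits2) {
    printf "%s: %s\n", $el->attr('class'), $el->as_text;
}
```

Either call returns a single merged list, so one for loop is enough. Which variant fits depends on whether the real pages ever put more than one class on an element.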
⊕婉儿 2024-11-26 23:12:46

Jambo, I am not trying to be rude, but please read the manual. I added links to your question.

I am going to assume that you did not read the docs because you were unable to find them. Let's address that issue:

How to Find the Docs You Need

Online:

  • search.cpan.org is a main website used to search for CPAN modules and their documentation. Many things can be found there.

  • perldoc.perl.org has the complete shipping documentation online for several recent versions of Perl.

Command Line:

  • perldoc shows a table of contents listing different sections of documentation you can peruse.

  • perldoc -f function is a quick way to search perlfunc and see the information on only one function. This is a super handy quick reference.

  • perldoc Module::Name::Here will show you a module's documentation.

  • perldoc perlpod is a sample of reading a section of the docs, in this case the article on POD formatting.

Which thing do I read?

All this is great, but how do you know where to look? I mean, I've got this thing called "look_down" that I am using. Where are the docs?

In this case, you can see that "look_down" is always called like this: $somevar->look_down(blarg). Find where $somevar comes from. What kind of object is it? Worst case, you find that it is the result of some other call, and now you have to find the docs for THAT call and see what it returns. But the steps are the same. Recursively push on through. Eventually you'll get to my $tree = HTML::TreeBuilder->new_from_content() or something like that. Now you can read the new_from_content docs in HTML::TreeBuilder. Hey, we get an HTML::TreeBuilder object, and HTML::TreeBuilder is a subclass of HTML::Element! So we check both classes. Whoa, look_down is in HTML::Element.
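
If tracing the object by hand gets tedious, Perl can be asked directly. The following is only a sketch of one way to do it: it prints the class of the object and then walks the inheritance chain to see which package actually defines the method. The one-line HTML snippet and the variable names $method and $definer are made up for illustration.

```perl
use strict;
use warnings;
use mro;
use HTML::TreeBuilder;

# Hypothetical one-line document, just to get an object to inspect.
my $tree = HTML::TreeBuilder->new_from_content('<p class="postbody">hi</p>');

# ref() on a blessed reference gives the class the object belongs to.
print "object class: ", ref($tree), "\n";

# Walk the method resolution order and stop at the first package
# that has a sub with the name we are looking for.
my $method = 'look_down';
my $definer;
for my $class ( @{ mro::get_linear_isa( ref $tree ) } ) {
    no strict 'refs';
    if ( defined &{"${class}::${method}"} ) {
        $definer = $class;
        last;
    }
}

print "$method is defined in: ", $definer // 'nowhere found', "\n";
print "so read: perldoc $definer\n" if defined $definer;
```

The package it reports is the one whose documentation to open with perldoc.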

This is a little trickier if you have routines that are imported from other modules. Hopefully the author of your code was considerate enough to explicitly list where his routines come from:

use Some::Module qw( useful_sub  confusing_sub );

This means that useful_sub and confusing_sub come from Some::Module.

If you are unlucky, your author wrote only use Some::Module; which means you get all the default exports. Which means you need to read the docs to find out what was imported.

For maintainability's sake, you can reduce this nightmare by always specifying exactly what routines you import from a module. If you want to import NOTHING, you can specify that as: use Some::Module ();
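
As a small illustration of both import styles, here is a sketch using two core modules, List::Util and Scalar::Util, chosen only because they ship with Perl. The list of class names is made up.

```perl
use strict;
use warnings;

# Explicit import list: a reader can see at a glance where first() comes from.
use List::Util qw( first );

# Empty list: nothing is imported, so every call must be fully qualified.
use Scalar::Util ();

# Hypothetical data for the example.
my @classes = qw( signature postbody postprofile );

# first() was imported by name above.
my $hit = first { /^post/ } @classes;

# looks_like_number() was not imported, so the package name is spelled out.
my $is_num = Scalar::Util::looks_like_number('42');

print "first match: $hit\n";
print "42 looks like a number\n" if $is_num;
```

With the fully qualified call, the origin of the routine is obvious at the call site, at the cost of a little more typing.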

When looking for plain sub names, it helps to remember that they may be actual built-in functions. So don't forget to search perldoc with perldoc -f.

In closing, I hope you find this useful. R-ing TFM is an amazingly powerful technique, and learning how to find relevant docs is the hidden skill that unlocks the power. Perl has a ton of docs to wade through, and it can be intimidating when you don't know where to look.
