What is the best way to match an id against a regular expression in Hpricot?

Posted 2024-08-13 03:18:51

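With Hpricot, it is easy to see how to extract all elements with a given id or class using a CSS selector. Is it possible to extract elements from a document based on whether some attribute of those elements matches a regular expression?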

赢得她心 2024-08-20 03:18:51

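If you mean something like:

doc.search("//div[@id=/regex/]")

then I don't think it can be done. The alternative is to find all the elements and then iterate through the results, deleting those that don't match the regex.

result = doc.search("//div")
result.delete_if { |x| x.to_s !~ /regex/ }

There are lots of alternative approaches. This thread has two other suggestions: Hpricot and Regular Expression.

Note that, depending on exactly what you are trying to match, you may be able to use the "Supported, but different" syntaxes listed on the Hpricot Wiki, e.g.:

E[@foo$="bar"]

Matches an E element whose "foo" attribute value ends exactly with the string "bar".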
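
The find-everything-then-filter approach above tests the regex against each element's full markup (x.to_s). If only the id should be tested, a rough sketch along these lines might help. The helper name select_by_attr and the sample data are made up for illustration, and plain hashes stand in for parsed elements so the snippet runs without Hpricot installed; it assumes each element answers el["id"] with the attribute value or nil.

```ruby
# Hypothetical helper (not part of Hpricot): keep only the elements whose
# attribute value matches the given pattern. It assumes each element
# responds to [] with an attribute name and returns a string or nil.
def select_by_attr(elements, attr, pattern)
  # nil.to_s is "", so elements without the attribute simply fail to match.
  elements.select { |el| pattern.match?(el[attr].to_s) }
end

# Stand-in data: plain hashes instead of parsed elements.
divs = [
  { "id" => "post-1" },
  { "id" => "sidebar" },
  { "id" => "post-22" },
  {} # an element with no id attribute
]

matches = select_by_attr(divs, "id", /\Apost-\d+\z/)
puts matches.map { |el| el["id"] }.inspect
```

With a real document, the result of doc.search("//div") would presumably be passed in place of divs. Unlike delete_if, select returns a new array and leaves the original result set untouched.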
