Regular expression - grabbing a specific word inside a specific tag

Posted on 2024-09-11 01:24:44

I don't consider myself a PHP "noob", but regular expressions are still new to me.

I'm making a cURL request that returns a list of comments. Every comment has this HTML structure:

<div class="comment-text">the comment</div>

What I want is simple: using preg_match_all, I want to get the comments that contain the word "cool" inside this specific DIV tag.

What I have so far:

preg_match_all("#<div class=\"comment-text\">\bcool\b</div>#Uis", $getcommentlist, $matchescomment);

Sadly, this doesn't work, although the simpler pattern #\bcool\b#Uis does. I really want to capture the word "cool" only when it appears inside those tags.

I know I could use two regular expressions (one that gets all the comments, and another that filters each of them for the word "cool"), but I was wondering how I could do this in a single preg_match_all.

I don't think I'm far from the solution, but somehow I just can't find it. Something's definitely missing.

Thank you for your time.

鯉魚旗 2024-09-18 01:24:46

This should give you what you're looking for, and provide some flexibility if you want to change things a bit:

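// Sample input: four comment divs, two of which contain "cool".
$input = '<div class="comment-text">the comment</div><div class="comment-text">cool</div><div class="comment-text">this one is cool too</div><div class="comment-text">ool</div>';
$class = "comment-text"; // class of the divs to search
$text = "cool";          // text the div must contain
// Capture the div's text only when it contains $text; [^<]* allows other text around it.
$pattern = '#<div class="' . $class . '">([^<]*' . $text . '[^<]*)</div>#s';
preg_match_all($pattern, $input, $matches);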

Obviously, you need to assign your own input to $input. After this runs, $matches[0] holds an array of the <div>s that matched and $matches[1] holds an array of the text inside them.
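
To make the layout of $matches concrete, here is a minimal sketch that runs the same pattern on the sample input and prints the captured text. The pattern in the question only matches a div whose entire content is "cool"; the [^<]* on each side of the word is what lets other text surround it. The $count variable and the loop are additions for illustration and are not part of the original snippet.

```php
<?php
// Sample input copied from the answer, split across lines for readability.
$input = '<div class="comment-text">the comment</div>'
       . '<div class="comment-text">cool</div>'
       . '<div class="comment-text">this one is cool too</div>'
       . '<div class="comment-text">ool</div>';

$class = "comment-text";
$text  = "cool";
$pattern = '#<div class="' . $class . '">([^<]*' . $text . '[^<]*)</div>#s';

// preg_match_all returns the number of full matches it found.
$count = preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full <div>...</div> matches,
// $matches[1] holds only the captured comment text.
foreach ($matches[1] as $i => $comment) {
    echo $i, ': ', $comment, PHP_EOL;
}
```

With this input the loop prints "0: cool" and "1: this one is cool too"; the divs containing "the comment" and "ool" are skipped.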

To match a different div class or require different text inside the div, change the $class and $text values, respectively.
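
Since the question asked for the whole word "cool" (it used \b), a variant along the following lines may be closer to the intent. This is only a sketch: buildCommentPattern is a hypothetical helper name, the i modifier is an assumption about wanting case-insensitive matching, and like the pattern above it assumes the comment text contains no nested tags. preg_quote keeps any regex metacharacters in the class or word from being interpreted as pattern syntax.

```php
<?php
// Hypothetical helper (not from the original answer): builds the pattern
// with the class and word escaped and the word wrapped in \b boundaries.
function buildCommentPattern(string $class, string $word): string
{
    // Escape regex metacharacters, including the '#' delimiter.
    $class = preg_quote($class, '#');
    $word  = preg_quote($word, '#');
    // \b on both sides requires a whole word; 'i' ignores case.
    return '#<div class="' . $class . '">([^<]*\b' . $word . '\b[^<]*)</div>#i';
}

// Made-up sample input for illustration.
$input = '<div class="comment-text">cool</div>'
       . '<div class="comment-text">so uncool</div>'
       . '<div class="comment-text">Cool, thanks</div>'
       . '<div class="other">cool</div>';

preg_match_all(buildCommentPattern('comment-text', 'cool'), $input, $matches);

// $matches[1] contains the text of each comment div that has the whole word.
print_r($matches[1]);
```

Here $matches[1] ends up containing "cool" and "Cool, thanks": "so uncool" is rejected by the word boundary, and the last div is rejected because its class differs.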
