PHP preg_replace - don't match within h1 tags

Posted on 2024-08-26 21:37:37

I am using preg_replace to add a link to keywords if they are found within a long HTML string. I don't want to add a link if the keyword is found within h1 tags or strong tags.

The regex below nearly works and basically says (I think): if the keyword is not immediately wrapped by either an h1 tag or a strong tag, then replace it with the keyword that was matched, as a bolded link to Google.

$result = preg_replace('%(?!<h1>)(?!<strong>)\b(bobs widgets)\b(?!<\/strong>)(?!<\/h1>)%i','<a href="http://www.google.com"><strong>$1</strong></a>', $result, -1);

(The reason I don't want to match inside strong tags is that I am recursing through a lot of keywords, so I don't want to link an already linked keyword on subsequent passes.)

The above works fine and won't match:

<h1>bobs widgets</h1>

It will, however, match the keyword in the following text, because the h1 tags aren't immediately on either side of the keyword:

<h1>Here are bobs widgets for sale</h1>

I need to make the spaces on either side optional and have tried adding \s*, but that doesn't get me anywhere. I'd be very grateful for a push in the right direction here.
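For reference, the behaviour described in the question can be reproduced with a short script. A lookaround is zero-width and only inspects the characters at the exact position where it sits, so the trailing `(?!<\/h1>)` only helps when `</h1>` directly follows the keyword. The leading `(?!<h1>)` and `(?!<strong>)` are lookaheads placed where the keyword starts, so they can never fail there. The variable names `$tight` and `$loose` are made up for this illustration.

```php
<?php
// The pattern and replacement from the question, unchanged.
$pattern = '%(?!<h1>)(?!<strong>)\b(bobs widgets)\b(?!<\/strong>)(?!<\/h1>)%i';
$link    = '<a href="http://www.google.com"><strong>$1</strong></a>';

// The closing tag directly follows the keyword, so the trailing
// lookahead (?!<\/h1>) rejects the match and nothing is replaced.
$tight = preg_replace($pattern, $link, '<h1>bobs widgets</h1>');

// The keyword sits in the middle of the heading. No lookahead sees a tag
// at the positions it checks, so the keyword is linked inside the h1.
$loose = preg_replace($pattern, $link, '<h1>Here are bobs widgets for sale</h1>');

echo $tight, "\n";
echo $loose, "\n";
```

Adding \s* does not change this, because the text between the tag and the keyword is arbitrary words, not just whitespace.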

挽心 2024-09-02 21:37:37

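Regular expressions are the wrong tool for this job. This has been discussed many times on Stack Overflow (for example, in the most famous post on the site).

What you need is an HTML parser, such as Simple HTML DOM Parser. Do yourself a favor and use something like that from the start. Imagine what happens when you run into an opening tag to which someone has added an attribute, or when someone has closed tags improperly so that `</strong>` and another closing tag appear in a mixed-up order. Getting something like that to work with regular expressions isn't worth it, and sometimes isn't even possible.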
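A parser-based version of the keyword linking might look like the sketch below. It uses PHP's bundled DOM extension (DOMDocument and DOMXPath) in place of Simple HTML DOM Parser, which is a swap made for this illustration and not something the answer prescribes. The function name `linkKeyword`, the wrapper `div` with id `wrap`, and the sample markup are all hypothetical, and the input is assumed to be ASCII or entity-encoded.

```php
<?php
// Hypothetical helper: wrap $keyword in a bold link, but only in text
// nodes that have no <h1>, <strong>, or <a> ancestor.
function linkKeyword(string $html, string $keyword, string $url): string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally so sloppy markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be extracted again.
    $dom->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $wrap  = $xpath->query('//div[@id="wrap"]')->item(0);
    $nodes = $xpath->query(
        './/text()[not(ancestor::h1 or ancestor::strong or ancestor::a)]',
        $wrap
    );

    $pattern = '/\b(' . preg_quote($keyword, '/') . ')\b/i';
    // Copy to an array first so replacing nodes cannot disturb the iteration.
    foreach (iterator_to_array($nodes) as $text) {
        $parts = preg_split($pattern, $text->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // keyword not present in this text node
        }
        $fragment = $dom->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                // Odd indexes hold the captured keyword.
                $a = $dom->createElement('a');
                $a->setAttribute('href', $url);
                $strong = $dom->createElement('strong');
                $strong->appendChild($dom->createTextNode($part));
                $a->appendChild($strong);
                $fragment->appendChild($a);
            } elseif ($part !== '') {
                $fragment->appendChild($dom->createTextNode($part));
            }
        }
        $text->parentNode->replaceChild($fragment, $text);
    }

    // Serialize only the children of the wrapper.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html   = '<h1>Here are bobs widgets for sale</h1><p>Buy bobs widgets today</p>';
$linked = linkKeyword($html, 'bobs widgets', 'http://www.google.com');
echo $linked, "\n";
```

Because the keyword ends up inside `<a>` and `<strong>` after the first pass, running the function again on its own output should leave the existing link alone, which is what the question needs for repeated passes over many keywords. The serializer may insert line breaks between block elements, so the output is not guaranteed to be byte-identical to the input outside the linked keyword.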

久光 2024-09-02 21:37:37

... just remember that eventually this approach will lead to sadness, and you'll need to start looking for a better approach. One way is to use 'tidy' to fix up your HTML into parseable XML; PHP then offers a few XML manipulation APIs to work with the data.

Here's an answer anyway.

You can add some wildcards instead of the word boundaries. Something like this should do the trick:

([^<>]*)(bobs widgets)([^<>]*)

Then, add some more replacement markers to keep the remainder of your text in the output:

'$1<a href="http://www.google.com"><strong>$2</strong></a>$3'

Now hit save and hide behind the sofa ;)
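If the job has to stay inside preg_replace, another option is to let the regex consume whole h1 and strong elements first and throw those matches away with PCRE's `(*SKIP)(*FAIL)` verbs, so that only keywords outside them are left to match. This is a different technique from the wildcard pattern in this answer, and the sketch assumes reasonably well-formed markup in which h1 and strong elements are closed and not nested inside themselves. The function name `linkOutsideProtectedTags` is hypothetical.

```php
<?php
// Hypothetical helper: skip whole <h1>/<strong> elements, then link the
// keyword wherever it still appears.
function linkOutsideProtectedTags(string $html, string $keyword, string $url): string
{
    // First alternative: match a complete protected element, then force a
    // failure and tell the engine not to retry inside the skipped text.
    // Second alternative: the bare keyword, captured as group 2.
    $pattern = '~<(h1|strong)\b[^>]*>.*?</\1>(*SKIP)(*FAIL)'
             . '|\b(' . preg_quote($keyword, '~') . ')\b~is';

    // $url is assumed to contain no characters that need escaping in HTML
    // or in a preg_replace replacement string.
    $replacement = '<a href="' . $url . '"><strong>$2</strong></a>';

    return preg_replace($pattern, $replacement, $html);
}

$html = '<h1>Here are bobs widgets for sale</h1><p>Buy bobs widgets today</p>';
$once = linkOutsideProtectedTags($html, 'bobs widgets', 'http://www.google.com');
echo $once, "\n";

// A second pass finds the keyword only inside <strong>, so nothing changes.
$twice = linkOutsideProtectedTags($once, 'bobs widgets', 'http://www.google.com');
var_dump($twice === $once);
```

This still treats HTML as flat text: a keyword inside an attribute value, a comment, or a script block would be linked as well, which is the kind of case the parser-based advice in this thread is meant to avoid.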
