PHP regular expression to match outside of HTML tags

Posted on 2024-12-11 17:16:39

I am running preg_replace on an HTML page. My pattern is meant to wrap certain words in the HTML in a surrounding tag. However, my regular expression sometimes modifies the HTML tags themselves. For example, when I try to replace this text:

<a href="example.com" alt="yasar home page">yasar</a>

so that yasar reads <span class="selected-word">yasar</span>, my regular expression also replaces the yasar in the alt attribute of the anchor tag. The preg_replace() call I am currently using looks like this:

preg_replace("/(asf|gfd|oyws)/", '<span class=something>${1}</span>',$target);

How can I write a regular expression that doesn't match anything inside an HTML tag?

似最初 2024-12-18 17:16:39

You can use an assertion for that, since you only have to ensure that the searched words occur somewhere after a >, or before any <. The latter test is easier to accomplish because lookahead assertions can be variable length:

/(asf|foo|barr)(?=[^>]*(<|$))/

See also http://www.regular-expressions.info/lookaround.html for a nice explanation of that assertion syntax.
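
As a minimal sketch of the lookahead approach, here it is applied to the anchor from the question. The word yasar and the selected-word class come from the question; the variable names are only illustrative.

```php
<?php
// Sample input taken from the question.
$target = '<a href="example.com" alt="yasar home page">yasar</a>';

// Match the word only if, scanning forward over non-">" characters,
// we can reach a "<" or the end of the string. A word inside a tag
// runs into the closing ">" first, so the lookahead fails there.
$pattern = '/(yasar)(?=[^>]*(<|$))/';

$result = preg_replace($pattern, '<span class="selected-word">$1</span>', $target);

echo $result, PHP_EOL;
// <a href="example.com" alt="yasar home page"><span class="selected-word">yasar</span></a>
```

This is a heuristic rather than a parser: a literal < inside an attribute value, for instance, would let the lookahead succeed for a word in that attribute.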

还给你自由 2024-12-18 17:16:39

Yasar, resurrecting this question because it had another solution that wasn't mentioned.

Instead of just checking that the next angle bracket is an opening one, this solution skips over all <full tags>.

With all the usual disclaimers about using regex to parse HTML, here is the regex:

<[^>]*>(*SKIP)(*F)|word1|word2|word3

In code, it looks like this:

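$target = "word1 <a skip this word2 >word2 again</a> word3";
// Match and discard whole tags first; only then try the words.
$regex = "~<[^>]*>(*SKIP)(*F)|word1|word2|word3~";
// \0 is the whole match, i.e. the word that was found.
$repl = '<span class="">\0</span>';
$new = preg_replace($regex, $repl, $target);
// Escape the result so the inserted tags are visible in a browser.
echo htmlentities($new);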
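
To connect this back to the question, here is a small sketch of the same (*SKIP)(*F) pattern run on the anchor from the question. The word yasar and the selected-word class are borrowed from the question, and the variable names are illustrative only.

```php
<?php
// Sample input taken from the question.
$target = '<a href="example.com" alt="yasar home page">yasar</a>';

// The left alternative consumes a complete tag; (*SKIP)(*F) then forces
// a failure and tells the engine not to retry inside the consumed text.
// Only text outside of tags ever reaches the right alternative.
$regex = '~<[^>]*>(*SKIP)(*F)|yasar~';

// $0 is the whole match, equivalent to \0 in the replacement string.
$result = preg_replace($regex, '<span class="selected-word">$0</span>', $target);

echo $result, PHP_EOL;
// <a href="example.com" alt="yasar home page"><span class="selected-word">yasar</span></a>
```

The yasar inside the alt attribute is left alone because the whole opening tag is skipped before the word alternative is tried.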

断爱 2024-12-18 17:16:39

This might be the kind of thing that you're after: http://snipplr.com/view/3618/
In general, I'd advise against doing this. A better alternative is to strip out all HTML tags and rely on BBCode instead, such as:

[b]bold text[/b] [i]italic text[/i]

However, I appreciate that this might not work well with what you're trying to do.

Another option may be HTML Purifier, see: http://htmlpurifier.org/

我做我的改变 2024-12-18 17:16:39

Off the top of my head, this should work:

echo preg_replace("/<(.*)>(.*)<\/(.*)>/i","<$1><span class=\"some-class\">$2</span></$3>",$target);

But I don't know how safe this would be. I am just presenting one possibility :)
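
As a rough illustration of that caveat, the sketch below runs the same pattern on a line with two sibling elements. The input string is a made-up example, not from the question. Because each .* is greedy, the first group swallows everything up to the last opening tag, so only the last element's text gets wrapped.

```php
<?php
// Hypothetical input with two sibling elements on one line.
$target = '<b>one</b> <i>two</i>';

// Same pattern as above, written with single quotes.
$pattern = '/<(.*)>(.*)<\/(.*)>/i';
$replacement = '<$1><span class="some-class">$2</span></$3>';

$result = preg_replace($pattern, $replacement, $target);

echo $result, PHP_EOL;
// <b>one</b> <i><span class="some-class">two</span></i>
```

Note also that this pattern wraps the whole text content of an element rather than specific words.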
