Replacing tags with a regular expression

Posted on 2024-09-12 14:14:16

I'm new to regular expressions, but I'm trying to learn them. I want to remove the tags from an HTML text and keep only the inner text. Something like this:

Original: Lorem ipsum <a href="http://www.google.es">Google</a> Lorem ipsum <a href="http://www.bing.com">Bing</a>
Result:  Lorem ipsum Google Lorem ipsum Bing

I'm using this code:

$patterns = array( "/(<a href=\"[a-z0-9.:_\-\/]{1,}\">)/i", "/<\/a>/i");
$replacements = array("", "");

$text = 'Lorem ipsum <a href="http://www.google.es">Google</a> Lorem ipsum <a href="http://www.bing.com">Bing</a>';
$text = preg_replace($patterns,$replacements,$text);

It works, but I don't know whether this code is the most efficient or the most readable way to do it.

Can I improve the code in some way?

乞讨 2024-09-19 14:14:17

Don't use a regular expression for this; use a DOM parser instead.
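As a rough illustration of the DOM approach, here is a minimal sketch using PHP's bundled DOMDocument class. It assumes the dom extension is enabled and that the input is a UTF-8 HTML fragment; the function name unwrapAnchors is made up for this example and is not from the original post.

```php
<?php
// Hypothetical helper (name invented for this sketch): removes every <a>
// element but keeps whatever was inside it.
function unwrapAnchors(string $html): string
{
    $doc = new DOMDocument();

    // Silence libxml warnings about imperfect markup while parsing.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it can be found again after parsing;
    // the XML prolog is a common trick to make libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it before editing.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a) {
        // Move the children of <a> in front of it, then drop the empty <a>.
        while ($a->firstChild !== null) {
            $a->parentNode->insertBefore($a->firstChild, $a);
        }
        $a->parentNode->removeChild($a);
    }

    // Serialize only the contents of the wrapper <div>.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$text = 'Lorem ipsum <a href="http://www.google.es">Google</a> Lorem ipsum <a href="http://www.bing.com">Bing</a>';
echo unwrapAnchors($text);
// Lorem ipsum Google Lorem ipsum Bing
```

Unlike the pattern in the question, this does not depend on how the attributes inside the tag are written, and it leaves other markup in place.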

极致的悲 2024-09-19 14:14:17

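If your content only contains anchor tags, then strip_tags is probably easier to use.

Your preg_replace will not replace anything if there is a stray space between a and href, or if the tag has any other attributes.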
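To make both points concrete, here is a small sketch. The input string is invented for the example: its first link has an extra space and a target attribute, which is exactly the case described above.

```php
<?php
// Made-up input: the first link has two spaces before href and an extra attribute.
$text = 'Lorem ipsum <a  href="http://www.google.es" target="_blank">Google</a> Lorem ipsum <a href="http://www.bing.com">Bing</a>';

// The patterns from the question, unchanged.
$patterns = array("/(<a href=\"[a-z0-9.:_\-\/]{1,}\">)/i", "/<\/a>/i");
$replacements = array("", "");

// The first opening tag does not match the pattern, so it is left in the output.
$viaRegex = preg_replace($patterns, $replacements, $text);

// strip_tags() removes every tag, however its attributes are written.
$viaStripTags = strip_tags($text);

echo $viaRegex, "\n";
// Lorem ipsum <a  href="http://www.google.es" target="_blank">Google Lorem ipsum Bing
echo $viaStripTags, "\n";
// Lorem ipsum Google Lorem ipsum Bing
```

Keep in mind that strip_tags removes all tags, not only anchors, which is why it only fits when anchors are the only markup in the content.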

国际总奸 2024-09-19 14:14:17

In this case, using regex is not a good idea. Having said that:

<?php
    $text = 'Lorem ipsum <a href="http://www.google.es">Google</a> Lorem ipsum <a href="http://www.bing.com">Bing</a>';
    $text = preg_replace(
        '@\\<a\\b[^\\>]*\\>(.*?)\\<\\/a\\b[^\\>]*\\>@',
        '\\1',
        $text
    );
    echo $text;
    // Lorem ipsum Google Lorem ipsum Bing
?>

This is a very trivial regex; it's not bulletproof.