Regex to find and replace the contents of HTML comment tags

Posted 2024-07-11 22:22:04 · 453 words · 10 views · 0 comments

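I have a CMS that uses a syntax based on HTML comments to let users insert Flash video players, slideshows, and other "hard" code that users can't easily write themselves.

The syntax for one FLV movie looks like this:

I use the following code:

$find_players = preg_match("/

This works great if there is only one player; $match[1] contains the file name (which is all I need).

My regex knowledge is fading, so I can't adapt it to grab more than one match.

If there are more on the page, it breaks completely, because it matches too greedily (from the first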

巷雨优美回忆 2024-07-18 22:22:04

You probably want the regex modifier U (PCRE_UNGREEDY), which makes quantifiers ungreedy by default. Each quantifier then consumes as few characters as possible, meaning that you won't match from the beginning of the first

An abbreviated example:

<?php
$text = "blah\n<!-x=abc->blah<!-x=def->blah\n\nblah<!-x=ghi->\nblahblah" ;
$reg  = "/<!-x=(.*)->/U" ;
preg_match_all( $reg, $text, $matches ) ;
print_r( $matches ) ;

Your code then becomes:

$find_players = preg_match_all("/<!--PLAYER=(.*)-->/Ui", $html_content, $matches);
// print_r($matches[1]); // $matches[1] is an array holding every captured file name

The 's' modifier (PCRE_DOTALL) you're using probably isn't helpful, either; you're unlikely to have a filename with a linebreak in it.

EDIT: @Stevens suggests this syntax, which I agree is slightly clearer - moving the U modifier to the capturing parentheses.

$find_players = preg_match_all("/<!--PLAYER=(?U)(.*)-->/i", $html_content, $matches);
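
To make the difference concrete, here is a minimal sketch that runs the greedy and the ungreedy pattern against the same input. The sample markup and the file names intro.flv and demo.flv are invented for illustration; only the patterns come from the answer above.

```php
<?php
// Hypothetical page content with two player comments (file names are made up).
$html_content = "<p>Intro</p><!--PLAYER=intro.flv--><p>More</p><!--PLAYER=demo.flv-->";

// Greedy: (.*) runs on to the LAST "-->" in the string,
// so both comments collapse into a single match.
preg_match_all("/<!--PLAYER=(.*)-->/i", $html_content, $greedy);

// Ungreedy (U modifier): (.*) stops at the first "-->" it can reach,
// so each comment is matched separately.
preg_match_all("/<!--PLAYER=(.*)-->/Ui", $html_content, $ungreedy);

print_r($greedy[1]);   // one element spanning both comments
print_r($ungreedy[1]); // two elements: intro.flv and demo.flv
```

The greedy run yields a single capture that swallows the markup between the two comments, which is the breakage described in the question.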

空城仅有旧梦在 2024-07-18 22:22:04

When working with regular expressions, it's typically more performant to use a more specific expression rather than a "lazy dot", which can cause excessive backtracking. You can use a negative lookahead to achieve the same results without overburdening the regex engine:

// PHP's PCRE has no "g" modifier; preg_match_all() collects every occurrence instead.
$find_players = preg_match_all("/<!--PLAYER=((?:[^-]+|-(?!->))*)-->/i", $html_content, $matches);

Mind you, it's unlikely that using the lazy dot will cause noticeable problems with a simple case like this, but it's a good habit to always tell the regex engine exactly what you mean. In this case, you want to collect as many characters as possible ("greedy") without passing a comment terminator. A terminator is a dash followed by another dash and a greater-than sign. So we allow any number of characters, each of which is either not a dash or is a dash that does not start a comment terminator.
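
As a rough check of that reasoning, the sketch below applies the lookahead pattern to input whose file names contain dashes. The sample content and file names are assumptions made up for this example.

```php
<?php
// Hypothetical content; the file names are invented and deliberately contain dashes.
$html_content = "<!--PLAYER=my-first-video.flv--><p>text</p><!--PLAYER=clip--2.flv-->";

// [^-]+    consumes a run of non-dash characters.
// -(?!->)  accepts a single dash only when it is NOT followed by "->",
//          i.e. when it does not begin the "-->" terminator.
$pattern = "/<!--PLAYER=((?:[^-]+|-(?!->))*)-->/i";

$count = preg_match_all($pattern, $html_content, $matches);

echo $count, "\n";      // number of player comments found
print_r($matches[1]);   // the captured file names, dashes included
```

Dashes inside a file name are kept, and the capture still ends right before each closing "-->".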

欢烬 2024-07-18 22:22:04

$find_players = preg_match("/<!--PLAYER\=(.*?)-->/i", $html_content, $match);

(.*?) should work just fine.
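
Since the title also asks about replacing, here is one possible sketch that uses the same lazy pattern with preg_replace_callback() to swap every player comment for markup. The render_player() helper and the div it emits are hypothetical placeholders, not the CMS's real player code.

```php
<?php
// Hypothetical helper: turns a file name into placeholder player markup.
// A real CMS would emit its own embed code here.
function render_player(string $file): string
{
    return '<div class="player" data-file="' . htmlspecialchars($file, ENT_QUOTES) . '"></div>';
}

// Invented sample content with two player comments.
$html_content = "<p>A</p><!--PLAYER=intro.flv--><p>B</p><!--PLAYER=demo.flv-->";

// The lazy (.*?) stops at the first "-->", so each comment is replaced on its own.
$output = preg_replace_callback(
    "/<!--PLAYER=(.*?)-->/i",
    function (array $m): string {
        return render_player(trim($m[1]));
    },
    $html_content
);

echo $output, "\n";
```

preg_replace_callback() replaces every occurrence by default, so no separate preg_match_all() pass is needed when the goal is substitution.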
