Changing from preg_match() to preg_replace() and removing the matched content

Posted on 2024-12-08 19:53:59

I know that using regular expressions on HTML is not preferred, but I am still confused as to why this doesn't work:

I'm trying to remove the "head" from a document.
Here's the doc:

<html>
 <head>
   <!--
     a comment within the head
     -->
 </head>
 <body>
stuff in the body
 </body>
</html>

My code:

$matches = array();
$result = preg_match('/(?:<head[^>]*>)(.*?)(<\/head>)/is', $contents, $matches);
var_dump($matches);

This does not actually work.
Here's the output I see:

array(3) { [0]=> string(60) " " [1]=> string(47) " " [2]=> string(7) "" }

However, if I adjust the HTML doc to not have the comment

Comments (3)

夜未央樱花落 2024-12-15 19:53:59

Your regular expression looks fine, but that extracts the <head>; you want to remove the head. Try using preg_replace instead:

$without_head = preg_replace ('/(?:<head[^>]*>)(.*?)(<\/head>)/is', '', $contents);
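
As a minimal, self-contained sketch of that suggestion, the snippet below runs the same pattern against the document from the question. The nowdoc literal and the `$pattern` variable are illustrative additions, not part of the original post.

```php
<?php
// Sample document from the question, stored in a nowdoc so nothing is interpolated.
$contents = <<<'DOC'
<html>
 <head>
   <!--
     a comment within the head
     -->
 </head>
 <body>
stuff in the body
 </body>
</html>
DOC;

// Same pattern as above: "i" ignores case, "s" lets "." match newlines.
$pattern = '/(?:<head[^>]*>)(.*?)(<\/head>)/is';

// preg_replace() returns a new string; $contents itself is left unchanged.
$without_head = preg_replace($pattern, '', $contents);

// Escape the result so the remaining tags show up as text when viewed in a browser.
echo htmlspecialchars($without_head), PHP_EOL;
```

Note that `<head[^>]*>` would also match a tag such as `<header>`, so this pattern is best treated as a quick fix for simple documents rather than a general HTML solution.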

神妖 2024-12-15 19:53:59

Your script is working fine; the output just isn't displaying correctly because of the HTML in the dump (you can tell by the lengths in your var_dump output). Try:

$result = preg_match ('/(?:<head[^>]*>)(.*?)(<\/head>)/is', $contents, $matches); 
ob_start(); // Capture the result of var_dump
var_dump ($matches);
echo htmlentities(ob_get_clean()); // Escape HTML in the dump

Also, as has been said, you need to use preg_replace to replace the match with '' in order to actually remove the head.
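
To illustrate why the dump looked empty, here is a small sketch that compares the raw and escaped forms of the match. It assumes a shortened copy of the document from the question, and it swaps in `var_export()` with its return flag in place of the `ob_start()` / `var_dump()` pair above, purely to keep the example short.

```php
<?php
// Shortened copy of the question's document (an assumption for this sketch).
$contents = "<html>\n <head>\n   <!-- a comment within the head -->\n </head>\n"
          . " <body>\nstuff in the body\n </body>\n</html>";

preg_match('/(?:<head[^>]*>)(.*?)(<\/head>)/is', $contents, $matches);

// Passing true makes var_export() return the dump as a string instead of printing it.
$dump = var_export($matches, true);

// The raw dump contains real tags, which a browser interprets instead of displaying.
// Escaping turns them into visible text such as &lt;head&gt;.
$escaped = htmlspecialchars($dump);

echo $escaped, PHP_EOL;
```

Viewing the page source in the browser, or running the script from the command line, would show the raw matches without any escaping.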

听你说爱我 2024-12-15 19:53:59
php > $str=<<<EOS
<<< > <head>
<<< >    <!--
<<< >      a comment within the head
<<< >      -->
<<< >  </head>
<<< > EOS;
php > $r=preg_match('/(?:<head[^>]*>)(.*?)(<\/head>)/is',$str,$matches);
php > var_dump($r);
int(1)
php > var_dump($matches);
array(3) {
  [0]=>
  string(63) "<head>
   <!--
     a comment within the head
     -->
 </head>"
  [1]=>
  string(50) "
   <!--
     a comment within the head
     -->
 "
  [2]=>
  string(7) "</head>"
}

Do you mean to use preg_replace?

php > $r=preg_replace('/(?:<head[^>]*>)(.*?)(<\/head>)/is','',$str);
php > var_dump($r);
string(0) ""