Changing from preg_match() to preg_replace() and removing the matched content
I know that using regular expressions on HTML is not preferred, but I am still confused as to why this doesn't work:
I'm trying to remove the "head" from a document.
Here's the doc:
<html>
<head>
<!--
a comment within the head
-->
</head>
<body>
stuff in the body
</body>
</html>
My code:
$matches = array();
$result = preg_match('/(?:<head[^>]*>)(.*?)(<\/head>)/is', $contents, $matches);
var_dump($matches);
This does not actually work.
Here's the output I see:
array(3) { [0]=> string(60) " " [1]=> string(47) " " [2]=> string(7) "" }
However, if I adjust the HTML doc to not have the comment
Comments (3)

Your regular expression looks fine, but that extracts the <head>; you want to remove the head. Try using preg_replace instead:
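The code from this comment did not survive on this page, so what follows is only a rough sketch of the suggestion, not the commenter's original snippet. It assumes the sample document from the question is held in `$contents`. The `\b` after `head` is an addition here, so that a tag such as `<header>` is not picked up by mistake, and the capture groups are dropped because nothing is extracted.

```php
<?php
// Sample input mirroring the document in the question (assumed to be in $contents).
$contents = <<<HTML
<html>
<head>
<!--
a comment within the head
-->
</head>
<body>
stuff in the body
</body>
</html>
HTML;

// i = case-insensitive, s = let "." match newlines too.
// \b is an addition: it stops "<head" from also matching "<header ...>".
$pattern = '/<head\b[^>]*>.*?<\/head>/is';

// Replace the whole match with an empty string, i.e. delete the head block.
$stripped = preg_replace($pattern, '', $contents);

echo $stripped;
```

`preg_replace` returns the modified string and leaves `$contents` untouched, so the result has to be assigned to a variable, as done here with `$stripped`.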
instead:您的脚本工作正常,但由于转储中的 HTML 而无法正确显示(您可以通过
var_dump
输出中的长度来判断)。尝试:另外,正如已经说过的,您需要使用
preg_replace
将匹配项替换为''
才能真正删除头部。Your script is working fine, it's not displaying correctly due to the HTML in the dump (you can tell by the lengths in your
var_dump
output). Try:Also, as has been said, you need to use
preg_replace
to replace the match with''
in order to actually remove the head.您的意思是使用 preg_replace 吗?
Do you mean to use preg_replace?
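On the second comment's point about the dump: the snippet that followed "Try:" is not preserved here, so this is just one possible way to make the matches visible, not necessarily what the commenter posted. A browser treats the `<head>` tags and the `<!-- -->` comment in the dump as markup, which is why the strings look empty even though their reported lengths are not zero. The sketch escapes the dump before printing it and assumes the same `$contents` as in the question.

```php
<?php
// Same sample document as in the question, written as one string.
$contents = "<html>\n<head>\n<!--\na comment within the head\n-->\n</head>\n"
          . "<body>\nstuff in the body\n</body>\n</html>";

$matches = array();
$result = preg_match('/(?:<head[^>]*>)(.*?)(<\/head>)/is', $contents, $matches);

// print_r(..., true) returns the dump as a string instead of printing it.
$dump = print_r($matches, true);

// Escape <, > and & so the browser shows the tags as text rather than parsing them.
$escaped = htmlspecialchars($dump);

echo '<pre>' . $escaped . '</pre>';
```

Viewing the page source in the browser, or running the script from the command line, would show the unescaped dump as well.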