Checking for and removing empty tags with PHP
What is the fastest way to remove empty HTML tags from a string?

I have written something like this to detect empty anchor tags:

    $temp = strip_tags($string, "<blockquote><a>");
    $cmatch = array();
    if (preg_match_all("~<a.*><\/a>~iU", $temp, $cmatch, PREG_SET_ORDER))
    {
        foreach ($cmatch as $cm)
        {
            foreach ($cm as $t)
            {
                //echo htmlentities($t)."<br />";
                $temp = trim(str_replace($t, '', $temp));
            }
        }
    }
    // Do not output anything if only empty tags were left (problem with the div margin).
    if (!empty($temp))
    {
        echo '<div class="c" style="margin-top:20px;">';
        echo $temp;
        echo '</div>';
    }

There must be a more efficient way to do this. Would it be faster to convert the string to an HTML DOM and do the checking there?
Comments (1)
Regular expressions are not the right tool for parsing HTML.

As a non-specific answer, I highly recommend using a DOM parsing library to accomplish this. To name a few gotchas that will make regular expressions a nightmare:

- You catch <a></a> tags, but will you catch <a /> tags?
- Is this p tag empty: <p><a></a></p>? If so, will your code catch it? If it doesn't, how many passes will you need to run on the string before you're confident enough to have caught them all?
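To make the DOM suggestion a bit more concrete, here is a minimal sketch using PHP's bundled DOMDocument and DOMXPath classes. It is only one possible approach, not a benchmarked or definitive one. The function name removeEmptyElements, the wrapper id and the short list of void elements to keep (br, hr, img, input) are all assumptions made for illustration. It repeats the removal pass until nothing is left to remove, which is how it deals with the nested <p><a></a></p> case.

```php
<?php
// Hypothetical helper (not from the original post): removes every element that
// has no child elements and no non-whitespace text, repeating until none remain.
function removeEmptyElements(string $html): string
{
    if (trim($html) === '') {
        return '';
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them for sloppy markup.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known root so bare text is not re-wrapped by the
    // parser; the leading XML declaration hints that the input is UTF-8.
    $doc->loadHTML(
        '<?xml encoding="UTF-8"><div id="__fragment_root__">' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('(//div[@id="__fragment_root__"])[1]')->item(0);
    if ($root === null) {
        return $html; // Parsing did not produce the wrapper; give the input back.
    }

    // Assumption: void elements such as <br> and <img> never have content but
    // should be kept, so they are excluded from the "empty" test.
    $emptyQuery = './/*[not(*) and normalize-space(.) = ""'
        . ' and not(self::br or self::hr or self::img or self::input)]';

    do {
        // Copy the result into an array so removals cannot disturb iteration.
        $empty = iterator_to_array($xpath->query($emptyQuery, $root), false);
        foreach ($empty as $node) {
            $node->parentNode->removeChild($node);
        }
        // Removing a child may leave its parent empty, so run another pass.
    } while (count($empty) > 0);

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return trim($out);
}

// Usage, mirroring the check in the question: print the wrapper div only
// when something is left after the empty tags are gone.
$clean = removeEmptyElements('<blockquote>Quote</blockquote><p><a></a></p>');
if ($clean !== '') {
    echo '<div class="c" style="margin-top:20px;">' . $clean . '</div>';
}
```

Whether this is faster than the regex loop depends on the input and would need measuring. What it does do is let the parser handle the <a /> and nested cases listed above, rather than having the pattern account for each one.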