Regular expression for HTML tags
I'm doing the following:
<?
$text = preg_replace ("/<p>(.*?)<\/p>/", "$1<br>", "$text");
?>
So I can get rid of the <p> tags and place a line break at the end of the string (this is for the styling of the page).
This works perfectly for "<p>Something</p>".
However, with text like:
<h3>Section 1.10.32 of "de Finibus Bonorum et Malorum", written by Cicero in 45 BC</h3>
<p>"Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?"</p>
which I took from the Lorem Ipsum (lipsum.com) page, it doesn't work, and I don't have a clue why.
On a somewhat related note (I'm not sure if it's related enough to keep in the same question, but it could help with this problem): is there any function or way to automatically remove every JavaScript snippet that these tags could have in them?
For example:
<p onmouseover="alert('hello');">
Thanks for any help.
Comments (3)
Try this PHP call:
It will handle case-insensitive matches (p and P) as well as multi-line matches.
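The snippet that belonged to this answer is not preserved here. As a minimal sketch of a call with the two properties described, assuming the PCRE `i` (case-insensitive) and `s` (dot matches newline) modifiers are what was meant, something like the following would do it. The function name `paragraphsToBreaks` and the sample input are hypothetical.

```php
<?php
// Hypothetical helper name; the original answer's code is not known.
function paragraphsToBreaks(string $html): string
{
    // i: match both <p> and <P>.
    // s: let "." match newlines, so a paragraph may span several lines.
    return preg_replace('/<p>(.*?)<\/p>/is', '$1<br>', $html);
}

// Hypothetical sample: an upper-case tag pair whose body contains a newline.
echo paragraphsToBreaks("<P>first line\nsecond line</P><p>Something</p>");
```

Without the `s` modifier, `.` stops at a newline, so a paragraph that wraps onto a second line never matches; that is one plausible reason the pattern in the question fails on pasted text.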
There you go:
It also correctly handles any attribute your <p> might have (like the class in my example).
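The example this answer refers to is not preserved either. A minimal sketch of a pattern that tolerates attributes on the opening tag might look like this; the helper name `stripParagraphTags` and the `class="intro"` sample are assumptions made for illustration.

```php
<?php
// Hypothetical helper name; not taken from the original answer.
function stripParagraphTags(string $html): string
{
    // <p\b[^>]*> matches an opening <p> with or without attributes.
    // \b stops the pattern from also matching tags such as <pre> or <param>.
    // i and s: case-insensitive, and "." also matches newlines.
    return preg_replace('/<p\b[^>]*>(.*?)<\/p>/is', '$1<br>', $html);
}

echo stripParagraphTags('<p class="intro">Hello</p><pre>code</pre>');
```

This is still a regex working on markup, so it can be fooled, for instance by an attribute value that contains a `>` character.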
There are some functions already posted in the PHP documentation,
especially this one: http://php.net/manual/en/function.strip-tags.php#93567
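The linked note is not reproduced here. As a separate illustration of the second part of the question (removing inline JavaScript such as `onmouseover`), here is a minimal sketch that uses PHP's DOM extension instead of a regex. The helper name `removeEventAttributes` is hypothetical, and the sketch only drops attributes whose names start with "on"; it is not a complete HTML sanitizer (it does not touch `<script>` elements or `javascript:` URLs, for example).

```php
<?php
// Hypothetical helper: drops every attribute whose name starts with "on"
// (onclick, onmouseover, ...). Assumes the DOM extension is available.
function removeEventAttributes(string $html): string
{
    $previous = libxml_use_internal_errors(true); // keep sloppy markup from emitting warnings

    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so the parser sees a single root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('*') as $node) {
        // Copy the attribute map first; removing entries while iterating the live map can skip some.
        foreach (iterator_to_array($node->attributes) as $attr) {
            if (stripos($attr->nodeName, 'on') === 0) {
                $node->removeAttribute($attr->nodeName);
            }
        }
    }

    // Serialize only the children of the wrapper <div>.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo removeEventAttributes('<p onmouseover="alert(\'hello\');">hello</p>');
```

For untrusted input, a maintained sanitizing library is usually a safer choice than a hand-written filter like this one.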