How to selectively encode HTML?
Allow me to explain my problem with a before and after...
I have a comment system on a web community. Users can type in anything they want in a textarea, including special characters and HTML tags. In MySQL, I store the comment body exactly as typed, without any intervention. However, upon display I use HTML entities to prevent users from messing with HTML:
<?= nl2br(htmlentities($comment['body'], ENT_QUOTES, 'UTF-8')) ?>
This is working fine. However, I am now trying to enrich the comment system by automatically converting some links placed inside comments into richer objects. This is a photo forum, and users sometimes refer to other photos by pasting URLs into their comments:
http://www.jungledragon.com/image/12/eagle.html
Using regular expressions, I am replacing valid links like the one above with markup. In this case, the link would be replaced with an img tag, so that instead of a link, users see a thumbnail of that image inline in the comment.
The replacement is working fine. However, since I am using htmlentities, the replacement markup shows up as literal text rather than as a rendered image. No surprises here.
My question is, how can I selectively HTML-encode a comment body? I want these link replacements to not be escaped, but everything else should be escaped.
Comments (2)
Do the htmlentities first and the replacing afterwards.
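A minimal sketch of that ordering, in case it helps. The function name `render_comment`, the URL pattern, and the `/thumbs/<id>.jpg` thumbnail path are all assumptions made up for illustration, not taken from the question. The pattern only matches digits, letters, underscores and hyphens in the variable parts, so the captured pieces contain nothing that would still need escaping when they are placed into attributes.

```php
<?php
// Hypothetical helper: escape the whole comment first, then turn
// recognised photo URLs into thumbnail markup.
function render_comment(string $body): string
{
    // 1. Escape everything the user typed.
    $escaped = htmlentities($body, ENT_QUOTES, 'UTF-8');

    // 2. Replace photo links inside the already-escaped text.
    //    The pattern is an assumption about what a valid photo URL looks like.
    $pattern = '~http://www\.jungledragon\.com/image/(\d+)/([a-z0-9_-]+)\.html~i';
    $linked = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Hypothetical thumbnail URL scheme keyed on the photo id.
            $thumb = '/thumbs/' . $m[1] . '.jpg';
            return '<a href="' . $m[0] . '"><img src="' . $thumb
                . '" alt="' . $m[2] . '"></a>';
        },
        $escaped
    ) ?? $escaped; // fall back to the escaped text if the regex fails

    // 3. Convert newlines last, so the generated markup is left alone.
    return nl2br($linked);
}
```

In the template this would replace the original call, for example `<?= render_comment($comment['body']) ?>`. One thing to watch: the regex runs on escaped text, so a URL containing `&` appears as `&amp;` by the time it is matched, and a looser pattern would have to account for that.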
Usually, you'd use a library to sanitize the HTML instead. A few are listed here:
http://htmlpurifier.org/comparison