Best practice for outputting HTML data from a database to the browser

Posted on 2024-10-18 10:56:23

I store HTML data in a database.

The HTML is very simple and is generated by a WYSIWYG editor.

Before I store the HTML in the database, I run it through HTMLPurifier to remove anything malicious.

When I output the data back to the browser, I obviously cannot use PHP's htmlspecialchars(), because the data is HTML.

I am wondering if there are any problems with this as far as XSS attacks are concerned. Is passing the data through HTMLPurifier before saving in the database enough? Are there any things I am missing / other steps I should be taking?

Thanks (in advance) for your help.

Comments (3)

打小就很酷 2024-10-25 10:56:23

What you are doing is correct. You may also consider filtering on the way out as well, just to be sure. You mentioned you are using HTMLPurifier, which is great. Just don't ever try to implement a sanitizer on your own; there are lots of pitfalls in that approach.
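
To make those pitfalls concrete, here is a minimal JavaScript sketch. The function naiveStripScripts and the sample inputs are hypothetical, written only for illustration; they are not part of HTMLPurifier or any other library. It shows three inputs that a single regex pass over script tags leaves dangerous:

```javascript
// Hypothetical naive sanitizer (illustration only): one regex pass that
// removes <script>...</script> blocks and nothing else.
function naiveStripScripts(html) {
  return html.replace(/<script.*?>.*?<\/script>/g, '');
}

// Pitfall 1: the match is case-sensitive, so an uppercase tag survives.
const upper = naiveStripScripts('<SCRIPT>alert(1)</SCRIPT>');
console.log(upper); // <SCRIPT>alert(1)</SCRIPT>

// Pitfall 2: removing the inner blocks splices a new script tag together.
const nested = naiveStripScripts(
  '<scr<script></script>ipt>alert(1)</scr<script></script>ipt>'
);
console.log(nested); // <script>alert(1)</script>

// Pitfall 3: no script tag is needed; event-handler attributes run code too.
const handler = naiveStripScripts('<img src="x" onerror="alert(1)">');
console.log(handler); // <img src="x" onerror="alert(1)">
```

Each case can be patched individually, but the list of bypasses is long, which is why an allowlist-based library is the safer choice.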

野心澎湃 2024-10-25 10:56:23

I've never had an issue with mainstream rich-text editors.

XSS happens when people are able to embed raw HTML into your page through web forms whose input you output at a later date (so always encode user input when writing it to the screen).

This can't happen with a (good) text editor. If a user types in HTML code (e.g. < or >), the text editor will encode it anyway. The only tags it will create are its own.

提笔书几行 2024-10-25 10:56:23

There is a function, htmlspecialchars, that will encode characters into their HTML entity equivalents. For example, < becomes &lt;
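
As a rough illustration of what that encoding does, here is a JavaScript sketch that mirrors the five characters PHP's htmlspecialchars() converts when single-quote conversion is enabled. The name escapeHtml is an assumption made for this example, not a built-in function. Note that this kind of escaping suits plain text only: applied to stored HTML that is meant to render, it would display the markup as literal text.

```javascript
// Hypothetical helper (not a built-in): replaces the five characters that
// are significant in HTML with their entity equivalents.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;'
  };
  // A single pass over the input, so the "&" in a generated entity is
  // never escaped a second time.
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log(escapeHtml('<')); // &lt;
console.log(escapeHtml('<b>"Tom" & \'Jerry\'</b>'));
// &lt;b&gt;&quot;Tom&quot; &amp; &#039;Jerry&#039;&lt;/b&gt;
```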

In addition, you may want to clean out any suspicious tags. I wrote a short JS function a while ago to do this for a project (by no means all-inclusive!). You may want to take it and edit it for your needs, or base your own on it...

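    <script language="javascript" type="text/javascript">

    // Click handler: read the input, sanitize it, and write the result back.
    function Button1_onclick() {
        var text = document.getElementById("txtIn").value;
        text = wype(text);
        document.getElementById("txtOut").value = text;
    }

    // Run both passes: drop whole script blocks first, then strip listed tags.
    function wype(text) {
        text = script(text);
        text = regex(text);
        return text;
    }

    // Remove <script>...</script> blocks, ignoring case and spanning newlines.
    // The closing tag is split so it does not end this inline script element.
    function script(text) {
        var re1 = new RegExp('<script[^>]*>[\\s\\S]*?</scri' + 'pt>', 'gi');
        return text.replace(re1, '');
    }

    // Strip opening and closing tags named in a blocklist. This is not a
    // complete sanitizer: attributes such as onerror on other tags remain.
    function regex(text) {
        var tags = ["html", "body", "head", "!doctype", "script", "embed", "object", "frameset", "frame", "iframe", "meta", "link", "div", "title", "w", "m", "o", "xml"];
        for (var x = 0; x < tags.length; x++) {
            // The lookahead keeps short names such as "o" from matching <ol>.
            var re = new RegExp('<[/?]?' + tags[x] + '(?![\\w-])[^><]*>', 'gi');
            text = text.replace(re, '');
        }
        return text;
    }
    </script>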