Encoding hyperlinks - when and how?
I have a security-related question. My web application allows users to input URLs. The URL is immediately stored in the database (no sanitization at this point. Is this wrong?). I'm using Linq to SQL, so it's already parameterized. When displaying the hyperlink back to the user, I'm using a repeater. Do I need to encode the hyperlink text as well as the tooltip and href property? Or do I only have to encode the text (which is displayed)? Also, I assume URL encoding is what I need here, but do I also have to use HTML encoding?
I tried Server.UrlEncode on all three properties where the text was <script> alert("hello") </script> and it seemed to mess up the href and text. I'm guessing this means that it's not fully secured?
Edit - I should add: if I encode on output, how can I make it so that a "/" is displayed instead of "%2F"?
Thanks
Comments (3)
Do you allow arbitrary http(s) links/text?
The text (the inner HTML of the anchor tag) must be HTML-entity encoded. As for the href:
First, on input, at a minimum check that the URL really is an http or https link with only valid characters in the hostname and path (follow the RFC, but feel free to constrain further; note that punycode is used for non-ASCII domain names, so the whitelist of characters is short). This prevents javascript: URLs, username:password@hostname URLs used for phishing, ftp://, kindle://, and other schemes, use of \ in the URL (converted to / by IE, but it might confuse your reading of the domain), use of excessive blank space as in www.good%20{N times}evil.com, etc.
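The input checks described above can be sketched roughly as follows. This is a minimal illustration in Python rather than .NET, and the function name `is_acceptable_url`, the two regular expressions, and the rule that a hostname must contain at least one dot are all assumptions made for the example, not something the answer prescribes.

```python
import re
from urllib.parse import urlsplit

# Hypothetical whitelists. Punycode keeps internationalized hostnames in
# plain ASCII, so letters, digits, hyphens and dots are enough for the host.
_HOST_RE = re.compile(
    r"^(?=.{1,253}$)[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
    r"(?:\.[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)+$"
)
_PATH_QUERY_RE = re.compile(r"^[A-Za-z0-9\-._~!$&'()*+,;=:@/?%]*$")


def is_acceptable_url(raw: str) -> bool:
    """Return True only for plain http(s) URLs that pass the whitelist."""
    # Reject backslashes and any whitespace before parsing.
    if "\\" in raw or any(c.isspace() for c in raw):
        return False
    try:
        parts = urlsplit(raw)
        _ = parts.port  # accessing .port raises ValueError on a bad port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False  # blocks javascript:, ftp://, kindle://, ...
    if parts.username is not None or parts.password is not None:
        return False  # blocks user:password@host phishing URLs
    if not _HOST_RE.match(parts.hostname or ""):
        return False  # blocks %20 padding and other odd host characters
    # The fragment is not checked here; this sketch assumes it is
    # stripped before the URL is stored or rendered.
    return bool(_PATH_QUERY_RE.match(parts.path)
                and _PATH_QUERY_RE.match(parts.query))
```

For example, `is_acceptable_url("https://example.com/a/b?x=1")` passes, while `javascript:alert(1)` and `http://user:pw@example.com/` are rejected.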
If you allow params, URL-encode the individual names and values, although they affect the target (and don't HTML-entity encode them). Strip the # and anything after it, since that is not sent to the target anyway. Enclose the href in double quotes.
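One way to re-encode the parameters and drop the fragment is sketched below in Python, as an illustration only. The helper name `normalize_url` is hypothetical, and the sketch assumes the URL has already passed scheme and hostname validation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(raw: str) -> str:
    """Re-encode each query name and value and drop the #fragment."""
    parts = urlsplit(raw)
    # parse_qsl decodes the pairs; urlencode percent-encodes them again,
    # so characters such as < > " cannot survive in the query string.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode(pairs)
    # Passing an empty fragment removes "#" and everything after it.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

With this sketch, `normalize_url("https://example.com/search?q=a b&x=<y>#frag")` yields `https://example.com/search?q=a+b&x=%3Cy%3E`, which can then be placed inside a double-quoted href.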
It may be a good idea to warn users when they navigate away from your site, if applicable. Note that the target site will get the page URL as the referrer. Another option is to allow only links to whitelisted domains that you know are not harmful (apart from being responsible behavior, this prevents your site from being identified as linking to harmful sites by Netcraft, Google, etc.).
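The domain whitelist option could look something like this minimal Python sketch. The `ALLOWED_DOMAINS` set and the `is_whitelisted` helper are hypothetical, and treating subdomains of an allowed domain as allowed is an assumption you may not want.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist of domains known to be harmless.
ALLOWED_DOMAINS = {"example.com", "docs.example.org"}


def is_whitelisted(url: str) -> bool:
    """Allow a URL only if its host is a listed domain or a subdomain of one."""
    host = urlsplit(url).hostname or ""
    # Compare against the parsed hostname, never a substring of the raw URL,
    # so "example.com.evil.net" does not slip through.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)
```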
Is this wrong? Yes. Sanitize on input, not on output.
If you sanitize before saving to the database, there is no need to encode when outputting.
Rule of thumb: trust your data on all layers of your app, so sanitize early.
You should sanitize on input (this is different from escaping) straight away. This means validating that the data is a URL or, at the very least, that it contains only characters allowed in a URL. Use a regex or a URL parsing library to do this (sorry, I'm not too familiar with .NET's API).
You should encode on output, unless you want to use it as a URL in an HTML element (which you do!), in which case you shouldn't do any encoding. You'll definitely need to encode the tooltip and the text in the body of the link tag. I would think extra hard about how you sanitize the input before it's entered into the database. I suggest browsing this fantastic resource of example XSS attacks.
The reason you encode on output, not when saving to the database, is that each output medium may have different encoding/escaping rules. For example, HTML is different from JavaScript, which is different from, say, PDF, Flash, or CSS.
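To make the difference between output contexts concrete, here is a small Python sketch that runs the test string from the question through three different encoders. It illustrates the principle only; these are Python standard-library calls, not the .NET methods the question uses.

```python
import html
import json
from urllib.parse import quote

text = '<script> alert("hello") </script>'

# HTML context (link text, tooltip): escape markup characters only.
as_html = html.escape(text, quote=True)

# URL component context: percent-encode everything that is not unreserved.
# This is why URL-encoding display text garbles it and turns "/" into "%2F".
as_url_component = quote(text, safe="")

# JavaScript string literal context. Note that json.dumps does not escape
# "</script>", so this alone is not safe inside an inline <script> element.
as_js_string = json.dumps(text)

print(as_html)
print(as_url_component)
print(as_js_string)
```

HTML escaping leaves "/" untouched, so a URL shown as link text stays readable, whereas URL-encoding the same text does not.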
Also, I assume you're using prepared statements when saving to the database, to avoid SQL injection?