Output or Input filtering?
I constantly see people writing "filter your inputs", "sanitize your inputs", and "don't trust user data", but I only agree with the last one: I consider trusting any external data a bad idea, even if it comes from somewhere internal to the system.
Input filtering:
This is the most common approach I see.
Take the form POST data, or data from any other external source, and enforce some boundaries when saving it: make sure text is text, numbers are numbers, SQL is valid SQL, and HTML is valid HTML that does not contain harmful markup. Then you save the "safe" data in the database.
But when fetching data you just use the raw data from the database.
In my personal opinion, the data is never really safe.
It sounds easy (just filter everything you get from forms and URLs), but in reality it is much harder than that: data that is safe for one language might not be safe for another.
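To make the "safe for one language but not another" point concrete, here is a minimal Python sketch. The function name filter_on_input and the sample string are hypothetical; the sketch assumes an input filter that HTML-escapes data before saving it, which is only one of many possible input filters.

```python
import html


def filter_on_input(value: str) -> str:
    # Hypothetical input filter: assumes the value will only ever be
    # rendered inside an HTML page, so it HTML-escapes before saving.
    return html.escape(value)


# What would end up in the database under this scheme.
stored = filter_on_input("Tom & Jerry's <shop>")

# The stored value is fine in an HTML text context, but any non-HTML
# consumer (a plain-text email here) now shows the entities literally.
plain_text_email = "Shop name: " + stored
print(plain_text_email)
```

The stored value has been shaped for one output context, and every other consumer has to undo that or live with the mangled text.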
Output filtering:
When doing it this way, I save the raw, unaltered data, whatever it might be, into the database with prepared statements and then filter out the problematic code when accessing the data. This has its own advantages:
This adds a layer between the HTML and the server-side script, which I consider to be data access separation of sorts.
Now the data is filtered depending on the context. For example, I can present data from the database in an HTML document as plain escaped text, as HTML, or as anything else, anywhere.
The drawbacks are that you must never forget to add the filtering, which is a little harder than with input filtering, and that it uses a bit more CPU when serving data.
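A rough sketch of this flow in Python, using only the standard library. The table name, the function names, and the in-memory SQLite database are all assumptions made for illustration; the point is only that the raw value is stored through a parameterized statement and escaped at the moment it is rendered for a given context.

```python
import html
import json
import sqlite3

# In-memory database standing in for the real one (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")


def save_comment(body: str) -> int:
    # The ? placeholder keeps the value out of the SQL text, so the raw
    # string is stored unchanged and cannot alter the statement.
    cur = conn.execute("INSERT INTO comments (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def load_comment(comment_id: int) -> str:
    row = conn.execute(
        "SELECT body FROM comments WHERE id = ?", (comment_id,)
    ).fetchone()
    return row[0]


def render_as_html(comment_id: int) -> str:
    # Escape for an HTML text context at output time.
    return "<p>" + html.escape(load_comment(comment_id)) + "</p>"


def render_as_json(comment_id: int) -> str:
    # A JSON consumer gets the raw value, encoded by the JSON serializer.
    return json.dumps({"body": load_comment(comment_id)})


raw = "<b>hi</b> O'Reilly'); DROP TABLE comments;--"
cid = save_comment(raw)
print(render_as_html(cid))
print(render_as_json(cid))
```

Because the database holds the original text, adding a new output format later only means adding a new render function with the right escaping for that format.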
This does not mean that you don't need to do validation checks. You still do; it's just that you don't save filtered data. Instead, you validate it and give the user an error message if the data is somehow invalid.
So instead of going with "filter your inputs", maybe it should be "validate your inputs, filter your outputs".
So should I go with "input validation and filtering" or "input validation and output filtering"?
Comments (3)
Sounds like semantics to me. Either way, the important thing to remember is to make sure bad data doesn't get into the system.
Doing output filtering instead of input filtering is asking for SQL injection.
There is no generic "filtering" for input and output.
Validate your input, escape your output. How you do this depends on context.
Validation is about making sure input falls within sensible ranges, like the length of strings, the numericality of dollar amounts or that a record being updated is owned by the user performing the update. This is about maintaining the logical consistency of your data and preventing people from doing things like zeroing the price of a product they are purchasing or deleting records they shouldn't have access to. It has nothing to do with "filtering" or escaping specific characters in your input.
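As an illustration of validation in this sense, here is a small Python sketch. The Record type, the field names, the length limit, and the error messages are all hypothetical; it only checks ranges and ownership and never alters the input.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation

MAX_NAME_LENGTH = 80  # assumed limit, chosen only for this example


@dataclass
class Record:
    owner_id: int
    name: str
    price: Decimal


def validate_update(user_id: int, record: Record,
                    new_name: str, new_price: str) -> list[str]:
    """Return a list of error messages; an empty list means the update is valid."""
    errors = []
    # Ownership: only the owner may update the record.
    if record.owner_id != user_id:
        errors.append("You do not own this record.")
    # Length: the name must fall within a sensible range.
    if not 1 <= len(new_name) <= MAX_NAME_LENGTH:
        errors.append(f"Name must be 1 to {MAX_NAME_LENGTH} characters long.")
    # Numericality: the price must parse as a finite, positive amount.
    try:
        price = Decimal(new_price)
    except InvalidOperation:
        errors.append("Price must be a number.")
    else:
        if not price.is_finite() or price <= 0:
            errors.append("Price must be greater than zero.")
    return errors


rec = Record(owner_id=1, name="Widget", price=Decimal("9.99"))
print(validate_update(2, rec, "Gadget", "0"))
```

Invalid input is rejected with a message rather than silently rewritten, so nothing is "filtered" out of the user's data.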
Escaping is a matter of context, and only really makes sense when you're doing something with data that can be poisoned by injecting certain characters. Escape HTML characters in data you send to the browser. Escape SQL characters in data you send to the database. Escape quotes when you're writing data inside JavaScript <script> tags. Just be conscious of how the data you're dealing with is going to be interpreted by the system you're passing it to and escape accordingly.
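The three contexts mentioned above might look like this in Python. This is a sketch under assumptions: the function names and the users table are made up, and for the database case it swaps in parameter binding rather than manual escaping of SQL characters, since binding lets the driver keep the value out of the statement text.

```python
import html
import json
import sqlite3


def for_html_text(value: str) -> str:
    # HTML text context: escape &, <, > and both kinds of quotes.
    return html.escape(value)


def for_script_literal(value: str) -> str:
    # Inside a <script> block: emit a JSON string literal, then hide <, >
    # and & so the value cannot close the tag early.
    literal = json.dumps(value)
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # SQL context: bind the value instead of splicing it into the query.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

payload = '</script><script>alert("x")</script>'
print(for_html_text(payload))
print(for_script_literal(payload))
print(find_user(conn, "alice' OR '1'='1"))
```

The same string gets three different treatments because three different interpreters will read it.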
The best solution is to filter both. Doing just one makes it more likely that you miss a case, and can leave you open to other types of attacks.
If you only do input filtering, an attacker could find a way to bypass your inputs and cause a vulnerability. This could be someone with access to your database entering data manually, it could be an attacker uploading a file through FTP or some other channel that is not checked, or many other methods.
If you only do output filtering, you can leave yourself open to SQL injection and other server-side attacks.
The best method is to filter both your inputs and outputs. It may cause more load, but greatly reduces the risk of an attacker finding a vulnerability.