How do I blacklist characters in a UTF-8 string?
I have an HTML text input where users can write in a name for themselves. The name is just a user-friendly display name; it's not used to identify the user in the database or for anything on the back end.
I want to allow UTF-8 characters, so that people can input characters of their native language, whether it's Chinese or Swedish or whatever.
However, I want to blacklist certain characters, like <, >, [, ], ?, *, and so on, to stop any potential script kiddies from trying to exploit the input for an SQL injection or whatever.
I thought this would be straightforward code with lots of examples on the web, but the answer, if it's out there, is buried among examples of how to use a whitelist to validate email addresses (only English alphanumeric characters, no Asian or other language-specific characters), or, oddly enough, how to stop key presses for certain characters entirely.
I don't want to stop key presses entirely, as I think that might confuse the user in my case. Instead, if they input a blacklisted character, I'll output an error saying they can't use character X.
So, for a guy like me who sucks totally at regex, is there a straightforward way of blacklisting characters in JavaScript?
I would also go for a whitelist solution if it didn't stop users from putting in the funky characters from whatever language they're using.
Comments (2)
To start, you would want to clean the input data on both the client side and the server side. Anyone clever enough to be creating attacks will be clever enough to disable JavaScript long enough to get the data they want into your forms.
Now, as far as a JavaScript regex to prevent entry goes, there are lots of questions on Stack Overflow that talk about this.
//some technical stuff
javascript regexp remove all special characters
Does this set of regular expressions FULLY protect against cross site scripting?
//simple js example to stop entry of unwanted characters.
http://www.sitepoint.com/forums/showthread.php?t=142118
From what I've read, the consensus seems to be that you need to whitelist and not blacklist. Perhaps someone from Stack Overflow with more experience can point you towards the best way to handle your use case.
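For what it's worth, a whitelist doesn't have to be limited to English letters. A rough sketch, assuming a runtime that supports Unicode property escapes in regular expressions (ES2018 and later); the name `isAllowedName` and the exact set of allowed punctuation are only assumptions for illustration:

```javascript
// Whitelist sketch: accept letters from any script (\p{L}), combining
// marks (\p{M}), digits (\p{N}), plus space, period, apostrophe and hyphen.
// The extra punctuation is an assumption; adjust it to your needs.
const ALLOWED_NAME = /^[\p{L}\p{M}\p{N} .'-]+$/u;

// Returns true only if every character of `name` is on the whitelist
// and the name is not empty.
function isAllowedName(name) {
  return ALLOWED_NAME.test(name);
}
```

With this pattern, names such as `李小龙` or `Åsa Öberg` pass, while anything containing `<`, `>`, `[`, `]`, `?` or `*` is rejected.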
It really depends on how you're doing validation in general (I like jQuery Validate as a framework), but the basic check might look something like:
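A minimal sketch of such a check, using the characters from the question as the blacklist. The names `BLACKLIST`, `findBlacklisted` and `validateDisplayName` are made up for illustration and are not part of jQuery Validate:

```javascript
// Characters to reject. This particular set is only an example taken
// from the question; extend it as needed.
const BLACKLIST = /[<>\[\]?*]/g;

// Returns the distinct blacklisted characters found in `name`
// (an empty array if there are none).
function findBlacklisted(name) {
  const matches = name.match(BLACKLIST) || [];
  return [...new Set(matches)];
}

// Returns an error message naming the offending characters,
// or null when the name contains none of them.
function validateDisplayName(name) {
  const bad = findBlacklisted(name);
  return bad.length === 0
    ? null
    : "You can't use these characters: " + bad.join(" ");
}
```

Because the pattern only lists what is forbidden, everything else, including Chinese or Swedish characters, passes through untouched. The returned message can be shown next to the input instead of blocking key presses.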
But, as Mike Daniels noted, don't do this if your main concern is script kiddies. Anyone who wants to mess with your input form can easily circumvent JavaScript validation. Consider the JavaScript validation a nice UI feature: it gives your user helpful feedback without making them reload the page. But it's not security of any kind. You have to check and sanitize the input on the server side no matter what JavaScript you use for validation.
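The question doesn't say what the server runs, so purely as an illustration, here is how the server-side part could be sketched in JavaScript (e.g. under Node.js). The helper name `escapeHtml` is an assumption. Escaping the name when it is written into a page neutralizes markup characters without rejecting them; SQL injection is a separate matter that is normally handled with parameterized queries rather than character filtering.

```javascript
// Replacement table for the characters that are significant in HTML.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Escapes HTML-significant characters so the display name is rendered
// as plain text. All other characters (including non-ASCII) are kept.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}
```

Run on the server each time the name is output, this does not depend on anything the browser did or did not validate.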