How should I sanitize URLs so that people don't put 漢字, á, or other things in them?
How should I sanitize URLs so people don't put 漢字 or other things in them?
EDIT: I'm using Java. The URL will be generated from a question the user asks on a form. It seems StackOverflow just removes the offending characters, but it also turns an á into an a.
Is there a standard convention for doing this? Or does each developer just write their own version?
Comments (3)
The process you're describing is called slugifying. There's no fixed mechanism for doing it; every framework handles it in its own way.
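Since the asker is on Java, here is a minimal sketch of one way to slugify a title with only the standard library. It is not taken from any particular framework; the class name `SlugSketch` and the method `slugify` are hypothetical, and the rules (strip accents, lowercase, hyphenate everything else) are just one reasonable choice. It decomposes accented letters with `java.text.Normalizer` so that á becomes a, then drops whatever is not an ASCII letter or digit.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for illustration; not part of any framework.
class SlugSketch {

    static String slugify(String input) {
        // NFD splits an accented letter into base letter + combining mark,
        // e.g. U+00E1 (a with acute) becomes "a" followed by U+0301.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks, keeping the base letters.
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Lowercase, then turn every run of characters outside [a-z0-9]
        // (spaces, punctuation, CJK characters, ...) into a single hyphen.
        String hyphenated = withoutMarks.toLowerCase(Locale.ROOT)
                                        .replaceAll("[^a-z0-9]+", "-");
        // Trim hyphens left over at either end.
        return hyphenated.replaceAll("^-+|-+$", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only:
        // \u00e9 is e with acute, \u00e1 is a with acute, \u6f22\u5b57 is 漢字.
        System.out.println(slugify("Caf\u00e9 \u00e1 la carte \u6f22\u5b57")); // prints cafe-a-la-carte
    }
}
```

Note that a title made up entirely of non-Latin characters comes out as an empty string under these rules, so a real application would need a fallback for that case.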
Yes, I would sanitize/remove. Otherwise the URL will either be inconsistent or look ugly once encoded.
If you're using Java, see the URLEncoder API docs.
Be careful! If you remove elements such as odd characters, two distinct inputs could unintentionally yield the same stripped URL.
This means it will get encoded. URLs should be readable. Standards tend to be English-biased (what's the word for that? Langist? Languagist?).
I'm not sure what the convention is in other countries, but if I saw tons of encoding in a URL sent to me, I would think it was stupid or suspicious ...
Unless the link is displayed properly, encoded by the browser and decoded at the other end ... but do you want to take that risk?
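To make the "ugly when encoded" point concrete, here is a small sketch of what `java.net.URLEncoder` does with the characters from the question. The class name `EncodeSketch` and the wrapper method `encode` are made up for this example, and it assumes Java 10 or later for the `encode(String, Charset)` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical wrapper for illustration only.
class EncodeSketch {

    static String encode(String text) {
        // Percent-encodes the UTF-8 bytes of every character outside the
        // small safe set; a space becomes "+" (form encoding).
        return URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u6f22\u5b57 \u00e1" is 漢字, a space, then a with acute.
        System.out.println(encode("\u6f22\u5b57 \u00e1")); // prints %E6%BC%A2%E5%AD%97+%C3%A1
    }
}
```

Keep in mind that URLEncoder implements application/x-www-form-urlencoded encoding, which is meant for form data and query strings, so the "+" for a space may not be what you want in a path segment.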
StackOverflow seems to just remove those chars from the URL altogether :)
Thanks Mike Spross
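On the collision warning above: one possible way to keep two different titles from ending up with the same stripped URL is to track which slugs are already in use and append a counter. This is only a sketch under assumptions. The class `UniqueSlugSketch` and the method `reserve` are hypothetical, and the in-memory set stands in for whatever storage a real application would check.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; a real application would check
// its database rather than an in-memory set.
class UniqueSlugSketch {

    private final Set<String> taken = new HashSet<>();

    // Returns base if it is still free, otherwise base-2, base-3, ...
    String reserve(String base) {
        String candidate = base;
        int suffix = 2;
        // Set.add returns false when the value is already present.
        while (!taken.add(candidate)) {
            candidate = base + "-" + suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        UniqueSlugSketch slugs = new UniqueSlugSketch();
        System.out.println(slugs.reserve("cafe")); // prints cafe
        System.out.println(slugs.reserve("cafe")); // prints cafe-2
    }
}
```

Another common option is to put a unique ID in the path next to the slug, as Stack Overflow question URLs do, so the slug itself never has to be unique.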
Which language are you talking about?
In PHP I think this is the easiest and would take care of everything:
https://www.php.net/manual/en/function.urlencode.php