Why use urlencode?
I am writing a web application and learning how to urlencode html links...
All the urlencode questions here (see tag below) are "How to...?" questions.
My question is not "How?" but "Why?".
Even the wikipedia article only addresses the mechanics of it:
http://en.wikipedia.org/wiki/Urlencode
but not why I should use urlencode in my application at all.
What are the security implications of using (or rather not using) urlencode?
How can a failure to use urlencode be exploited?
What kind of bugs or failures can crop up with unencoded urls?
I'm asking because even without urlencode, a link to my application dev web site like the following works as expected: http://myapp/my%20test/ée/ràé
Why should I use urlencode?
Or another way to put it:
When should I use urlencode? In what kind of situations?
Comments (8)
Update: There is an even better explanation (imo) further above:
and
Because it is stated in the RFC:
and
The main reason is that it escapes characters so that they can safely be included in the URL of your webpage.
Suppose a user enters "&joe" in a form field and we would like to redirect to a page whose URL contains that name. With URL encoding, it would then be, for example:
If you didn't use URL encoding, you would end up with:
and that ampersand would cause all sorts of unpredictable behavior.
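A minimal Python sketch of the "&joe" case may help here. The `example.com` URL and the `name` parameter are assumptions made for illustration; they do not come from the answer above.

```python
from urllib.parse import parse_qs, quote, urlsplit

name = "&joe"  # value typed by the user into the form field

# Hypothetical redirect target; only the query string matters here.
base = "http://example.com/welcome?name="
raw_url = base + name                      # no encoding
encoded_url = base + quote(name, safe="")  # "&" becomes "%26"

print(raw_url)      # http://example.com/welcome?name=&joe
print(encoded_url)  # http://example.com/welcome?name=%26joe

# What a query-string parser recovers from each URL.
raw_params = parse_qs(urlsplit(raw_url).query, keep_blank_values=True)
encoded_params = parse_qs(urlsplit(encoded_url).query)

print(raw_params)      # {'name': [''], 'joe': ['']}
print(encoded_params)  # {'name': ['&joe']}
```

Without encoding, the parser sees an empty `name` plus a stray `joe` parameter; with encoding, the original value survives the round trip.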
There are two reasons why you should use URL encoding:
Characters that are not valid in a URL: " < > # % \ | ^ [ ] ` and spaces. For instance, whitespace is not a valid URL character, since it would be ambiguous where a URL ends in running text if it could contain whitespace.
Reserved characters: ! # $ % & ' ( ) * + , / : ; = ? @ [ ]. For instance, ? is reserved to mark the start of the query parameters, and if we do not encode a ? that appears in the path or inside a query parameter, it might break the syntax.
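The effect of an unencoded reserved character can be sketched in Python as follows. The `example.com` host, the `/faq/` prefix and the segment value are hypothetical, chosen only to show how `?` inside a path segment changes the way the URL is split.

```python
from urllib.parse import quote, urlsplit

segment = "a?b=c"  # hypothetical path segment containing reserved characters

raw = urlsplit("http://example.com/faq/" + segment)
enc = urlsplit("http://example.com/faq/" + quote(segment, safe=""))

# Unencoded: everything after "?" is treated as the query string.
print(raw.path, raw.query)  # /faq/a b=c

# Encoded: the whole segment stays in the path.
print(enc.path, repr(enc.query))  # /faq/a%3Fb%3Dc ''

# A space, which is not a valid URL character, is encoded as %20.
print(quote("my test"))  # my%20test
```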
There are RFCs that define the format of URLs, and browser and web server developers rely on them as the standard for interpreting data. If you don't comply, the results may be unpredictable.
The HTTP URL has its own specification, which states that practically all non-Latin characters (more precisely, all characters outside ASCII) need to be encoded.
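As a rough illustration, here is what the non-ASCII part of the path from the question looks like once encoded, using Python's `urllib.parse.quote` with its default UTF-8 encoding. Browsers typically perform this kind of encoding on their own before sending a request, which may be why the unencoded link appears to work; that explanation is an assumption about the asker's setup, not something stated in the thread.

```python
from urllib.parse import quote, unquote

# "ée/ràé" written with escapes so the example does not depend on file encoding.
path = "\u00e9e/r\u00e0\u00e9"

encoded = quote(path)  # "/" is left alone by default; UTF-8 bytes are percent-encoded
print(encoded)         # %C3%A9e/r%C3%A0%C3%A9

# Decoding restores the original text.
print(unquote(encoded) == path)  # True
```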
Two reasons I could think of:
You will run into problems if there are characters like & inside some parameter.
How will you distinguish between them if two of your paths are like this
and
Note that both the space and %20 are part of the URL.
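The ambiguity can be sketched in Python. The two folder names below are hypothetical stand-ins, since the paths from the answer are not shown: one contains a real space, the other contains the literal characters `%20`.

```python
from urllib.parse import quote, unquote

with_space = "my test"      # name containing a space
with_percent = "my%20test"  # name containing the literal characters "%", "2", "0"

# Put into a URL unencoded, the second name decodes to the first one.
print(unquote(with_percent) == with_space)  # True

# Encoded, the two names stay distinct and both round-trip.
enc_space = quote(with_space)
enc_percent = quote(with_percent)
print(enc_space)    # my%20test
print(enc_percent)  # my%2520test
print(unquote(enc_space) == with_space)      # True
print(unquote(enc_percent) == with_percent)  # True
```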
URL encoding is the process of converting a string into a valid URL format. Valid URL format means that the URL contains only what are termed "alpha | digit | safe | extra | escape" characters.
URL encoding is normally performed to convert data passed via HTML forms, because such data may contain special characters, such as "/", ".", "#", and so on, which could: a) have special meanings; b) not be valid characters in a URL; or c) be altered during transfer. For instance, the "#" character needs to be encoded because it has the special meaning of an HTML anchor. The space character also needs to be encoded because it is not allowed in a valid URL format. Also, some characters, such as "~", might not transport properly across the internet.
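The "#" case can be sketched in Python like this. The `example.com` URL and the `tag` parameter are assumptions for illustration only.

```python
from urllib.parse import parse_qs, quote, urlsplit

tag = "a#b"  # hypothetical form value containing "#"

raw = urlsplit("http://example.com/search?tag=" + tag)
enc = urlsplit("http://example.com/search?tag=" + quote(tag, safe=""))

# Unencoded: "#" starts the fragment (the anchor), so the value is cut short.
print(raw.query, raw.fragment)  # tag=a b

# Encoded: "#" travels as %23 and the full value is recovered.
print(enc.query)            # tag=a%23b
print(parse_qs(enc.query))  # {'tag': ['a#b']}
```

Since browsers generally do not send the fragment to the server, the unencoded version would lose everything after the "#".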
It is specified in the web standard RFC 1738.