Which encoding does the HTTP protocol use?
When a browser sends an HTTP request to a web server, what encoding is used to encode the HTTP protocol on the wire? Is it ASCII? UTF-8? UTF-16? Or does the protocol specify, in a predefined format, which encoding it uses (before any decoding takes place)?
P.S. I'm not asking about the actual payload (e.g. HTML) of the request/response. I'm asking about the request line (e.g. GET /index.html HTTP/1.1) and the headers (e.g. Host: google.com).
Comments (2)
RFC 2616 includes this:
And then pretty much everything else in the document is defined in terms of those entities (OCTET, CHAR, etc.). So you could look through the RFC to find out which parts of an HTTP request/response can include OCTETs; all other parts must be ASCII. (I'd do it myself, but it'd take a long time.)
For the request line specifically, the method name and HTTP version are going to be ASCII characters only, but it's possible that the URL itself could include non-ASCII characters. But if you look at RFC 2396, it says that.
Which I guess means that it'll consist of ASCII characters as well.
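To make the point about the request line concrete, here is a minimal Python sketch of how a client might keep it ASCII-only even when the path contains non-ASCII characters. The helper name build_request_line is hypothetical, and percent-encoding the UTF-8 bytes of the path is an assumption on my part (it is the common convention, not something RFC 2396 mandates).

```python
from urllib.parse import quote


def build_request_line(method: str, path: str, version: str = "HTTP/1.1") -> bytes:
    """Return an ASCII-only request line (hypothetical helper, for illustration)."""
    # Assumption: non-ASCII characters are percent-encoded from their UTF-8 bytes.
    escaped_path = quote(path, safe="/")
    line = f"{method} {escaped_path} {version}\r\n"
    # encode("ascii") raises UnicodeEncodeError if a non-ASCII character remains,
    # for example in the method name.
    return line.encode("ascii")


print(build_request_line("GET", "/café/index.html"))
# prints b'GET /caf%C3%A9/index.html HTTP/1.1\r\n'
```

The bytes that go on the wire contain only ASCII; the non-ASCII character survives as the escape sequence %C3%A9.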
HTTP/1.1 uses US-ASCII as the basic character set for the request line in requests, the status line in responses (except the reason phrase), and the field names, but allows any octet in the field values and the message body.
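As a rough illustration of that split, the sketch below decodes the request line and field names as US-ASCII and leaves field values as raw bytes. The function parse_head and the X-Note header are made up for this example; a real parser would also handle folding, limits, and malformed input.

```python
def parse_head(raw: bytes):
    """Split a raw HTTP request head into (request_line, [(name, value_bytes)])."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    # The request line must be US-ASCII; decoding fails loudly otherwise.
    request_line = lines[0].decode("ascii")
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        # Field names are ASCII tokens; field values are kept as opaque octets.
        headers.append((name.decode("ascii"), value.strip(b" \t")))
    return request_line, headers


# X-Note is a hypothetical header whose value carries a non-ASCII octet (0xE9).
raw = b"GET /index.html HTTP/1.1\r\nHost: google.com\r\nX-Note: caf\xe9\r\n\r\n"
request_line, headers = parse_head(raw)
print(request_line)
for name, value in headers:
    print(name, value)
```

Keeping the values as bytes avoids guessing a character set for them; the caller can decode a given field once it knows what that field is supposed to contain.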