In HTTP you can specify in a request that your client can accept specific content in responses using the Accept header, with values such as application/xml. The content type specification allows you to include parameters in the content type, such as charset=utf-8, indicating that you can accept content with a specified character set.
There is also the Accept-Charset header, which specifies the character encodings accepted by the client.
If both headers are specified and the Accept header contains content types with the charset parameter, which header should the server treat as taking precedence?
e.g.:
Accept: application/xml; q=1,
text/plain; charset=ISO-8859-1; q=0.8
Accept-Charset: UTF-8
I've sent a few example requests to various servers using Fiddler to test how they respond:
Examples
W3
Request
GET http://www.w3.org/ HTTP/1.1
Host: www.w3.org
Accept: text/html;charset=UTF-8
Accept-Charset: ISO-8859-1
Response
Content-Type: text/html; charset=utf-8
Google
Request
GET http://www.google.co.uk/ HTTP/1.1
Host: www.google.co.uk
Accept: text/html;charset=UTF-8
Accept-Charset: ISO-8859-1
Response
Content-Type: text/html; charset=ISO-8859-1
StackOverflow
Request
GET http://stackoverflow.com/ HTTP/1.1
Host: stackoverflow.com
Accept: text/html;charset=UTF-8
Accept-Charset: ISO-8859-1
Response
Content-Type: text/html; charset=utf-8
Microsoft
Request
GET http://www.microsoft.com/ HTTP/1.1
Host: www.microsoft.com
Accept: text/html;charset=UTF-8
Accept-Charset: ISO-8859-1
Response
Content-Type: text/html
There doesn't seem to be any consensus around what the expected behaviour is. I am trying to look surprised.
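For anyone who wants to repeat the probe without Fiddler, here is a minimal Python sketch using only the standard library. The helper names `build_probe` and `fetch_content_type` are made up for this illustration, and the headers are simply the ones from the requests above. Servers may well answer differently today, so no expected output is shown.

```python
import urllib.request


def build_probe(url):
    """Build (but do not send) a request carrying the two conflicting headers."""
    return urllib.request.Request(
        url,
        headers={
            "Accept": "text/html;charset=UTF-8",
            "Accept-Charset": "ISO-8859-1",
        },
    )


def fetch_content_type(url, timeout=10):
    """Send the probe and return the response's Content-Type header, if any."""
    with urllib.request.urlopen(build_probe(url), timeout=timeout) as resp:
        return resp.headers.get("Content-Type")


# Example use (performs a real network request):
# print(fetch_content_type("http://www.w3.org/"))
```

Note that urllib sends a relative path in the request line rather than the absolute URL Fiddler used, which should not affect charset negotiation.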
Comments (6)
Although you can set a media type in the Accept header, the charset parameter for that media type is not defined anywhere in RFC 2616 (though it is not forbidden either).
Therefore, if you are going to implement an HTTP 1.1 compliant server, you should first look at the Accept-Charset header, and only then search the Accept header for your own parameters.
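The lookup order suggested in this answer (Accept-Charset first, then any charset parameters in Accept) could be sketched roughly as follows. This is only an illustration under stated assumptions: `choose_charset`, `parse_header_items`, and the `supported` list are hypothetical names, the headers are assumed to arrive as a plain dict with exactly these key spellings, and the parsing is deliberately simplified rather than a full RFC-grade parser.

```python
def parse_header_items(value):
    """Split a comma-separated header value into (token, params) pairs."""
    items = []
    for part in value.split(","):
        pieces = [p.strip() for p in part.split(";")]
        if not pieces[0]:
            continue
        params = {}
        for p in pieces[1:]:
            if "=" in p:
                key, val = p.split("=", 1)
                params[key.strip().lower()] = val.strip().strip('"')
        items.append((pieces[0].lower(), params))
    return items


def quality(params):
    """Return the q-value of an item, defaulting to 1.0 (0.0 if malformed)."""
    try:
        return float(params.get("q", "1"))
    except ValueError:
        return 0.0


def choose_charset(headers, supported, default="utf-8"):
    """Pick a charset: Accept-Charset first, then charset params in Accept."""
    supported_lc = [s.lower() for s in supported]

    accept_charset = headers.get("Accept-Charset")
    if accept_charset:
        ranked = sorted(parse_header_items(accept_charset),
                        key=lambda item: quality(item[1]), reverse=True)
        for name, params in ranked:
            if quality(params) <= 0:
                continue
            if name == "*" and supported_lc:
                return supported_lc[0]
            if name in supported_lc:
                return name

    accept = headers.get("Accept")
    if accept:
        ranked = sorted(parse_header_items(accept),
                        key=lambda item: quality(item[1]), reverse=True)
        for _media_type, params in ranked:
            charset = params.get("charset", "").lower()
            if quality(params) > 0 and charset in supported_lc:
                return charset

    return default
```

With the headers from the question, a server that supports both charsets would follow Accept-Charset under this ordering and only fall back to the Accept parameter when Accept-Charset is absent or unusable.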
Read RFC 2616 Sections 14.1 and 14.2. The Accept header does not allow you to specify a charset. You have to use the Accept-Charset header instead.
Firstly, Accept headers can accept parameters; see RFC 7231 section 5.3.2.
All text/* mime-types can accept a charset parameter.
The Accept-Charset header allows a user-agent to specify the charsets it supports.
If the Accept-Charset header did not exist, a user-agent would have to specify each charset parameter for each text/* media type it accepted, e.g.
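As a rough illustration of the repetition this answer describes (not the answer's own example), the following Python sketch builds the header values both ways. The function names and the sample media types and charsets are assumptions chosen for the illustration.

```python
def accept_with_per_type_charsets(text_types, charsets):
    """Build an Accept value that repeats every charset for every text type."""
    return ", ".join(
        f"{media_type};charset={charset}"
        for media_type in text_types
        for charset in charsets
    )


def headers_with_accept_charset(text_types, charsets):
    """Build the shorter equivalent that states the charsets once."""
    return {
        "Accept": ", ".join(text_types),
        "Accept-Charset": ", ".join(charsets),
    }


types = ["text/html", "text/plain"]
charsets = ["utf-8", "iso-8859-1"]
print(accept_with_per_type_charsets(types, charsets))
print(headers_with_accept_charset(types, charsets))
```

The first form grows with the product of types and charsets, while the second grows with their sum, which is the convenience the answer attributes to Accept-Charset.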
According to the Mozilla Developer Network, you should never use the Accept-Charset header. It's obsolete.
RFC 7231 section 5.3.2 (Accept) clearly states:
So a charset parameter for each content-type is allowed. In theory a client could accept, for example, text/html only in UTF-8 and text/plain only in US-ASCII.
But it would usually make more sense to state the possible charsets in the Accept-Charset header, as that applies to all types mentioned in the Accept header.
If those headers' charsets don't overlap, the server could send status 406 Not Acceptable.
However, I wouldn't expect fancy cross-matching from a server, for various reasons. It would make the server code more complicated (and therefore more error-prone), while in practice a client would rarely send such requests. Also, nowadays I would expect everything server-side to use UTF-8 and be sent as-is, so there's nothing to negotiate.
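To make the cross-matching idea concrete, here is a small Python sketch of what such a check might look like. It is an assumption-laden illustration, not something any particular server is known to do: `negotiate` and `split_items` are hypothetical names, q-values are ignored, a missing Accept header is not handled specially, and `DEFAULT_CHARSET` is an arbitrary choice.

```python
DEFAULT_CHARSET = "utf-8"  # assumed server default for this sketch


def split_items(value):
    """Split a comma-separated header value into (token, params) pairs."""
    items = []
    for part in (value or "").split(","):
        pieces = [p.strip() for p in part.split(";")]
        if not pieces[0]:
            continue
        params = dict(
            (key.strip().lower(), val.strip().strip('"').lower())
            for key, val in (p.split("=", 1) for p in pieces[1:] if "=" in p)
        )
        items.append((pieces[0].lower(), params))
    return items


def negotiate(accept, accept_charset=None):
    """Return (status, media_type, charset), or 406 when nothing overlaps."""
    allowed = [name for name, _ in split_items(accept_charset)]
    unrestricted = not allowed or "*" in allowed

    for media_type, params in split_items(accept):
        per_type = params.get("charset")
        if per_type is None:
            # No per-type charset: Accept-Charset (if any) decides.
            charset = DEFAULT_CHARSET if unrestricted else allowed[0]
            return 200, media_type, charset
        if unrestricted or per_type in allowed:
            return 200, media_type, per_type

    # Every per-type charset conflicted with Accept-Charset.
    return 406, None, None
```

Under this sketch, the request from the question's tests (text/html with charset=UTF-8 plus Accept-Charset: ISO-8859-1) would produce 406, which is stricter than what any of the four servers tested actually did.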
I don't think it matters. The client is doing something dumb; there doesn't need to be interoperability for that :-)