As described in the other answers, there are two ways to specify the encoding of a document returned via HTTP:
as part of the Content-Type header field
encoding declaration inside the XML file (e.g. <?xml version="1.0" encoding="UTF-8"?>)
However, both of these are optional. According to the HTTP/1.1 spec (RFC 2616), the encoding of text content defaults to ISO 8859-1 if it is not specified. With an XML file, if the file is supplied with an HTTP Content-Type header that carries a charset, that is the encoding to use. Otherwise, the encoding declaration in the file applies, and if there is none, the default is UTF-8 or UTF-16 (depending on the presence of a byte order mark (BOM)).
So if you know the content is in UTF-8 or in UTF-16, check for the BOM. If a UTF-16 BOM (the bytes FE FF or FF FE) is there, it's UTF-16, otherwise UTF-8. See e.g. http://www.opentag.com/xfaq_enc.htm#enc_default for an explanation.
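The BOM check can be sketched in a few lines of Java. This is only a minimal illustration under the assumption stated above (the content is already known to be UTF-8 or UTF-16); the class name `BomSniffer` and the method `guessUtf` are made up for this example and are not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomSniffer {

    /**
     * Guesses the encoding from the first bytes of a document that is
     * assumed to be either UTF-8 or UTF-16 (hypothetical helper).
     */
    static Charset guessUtf(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // big-endian BOM
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // little-endian BOM
            }
        }
        // No UTF-16 BOM: fall back to UTF-8 (with or without its own BOM).
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        byte[] utf16 = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x3C};
        byte[] utf8 = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.UTF_8);
        System.out.println(guessUtf(utf16)); // prints UTF-16BE
        System.out.println(guessUtf(utf8));  // prints UTF-8
    }
}
```

Note that this does not look at the XML encoding declaration or the HTTP header at all, so it is only useful as the last fallback step.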
I'm assuming you're after the encoding of the representation of the resource addressed by this URL.
A resource at a given URI may have multiple representations. Thus, in general you can't know the content type and encoding of the representation in advance; you only know them once you actually get it. Using the HTTP HEAD method can give you some indication as to which content types and encodings the server is willing to offer. This will also vary depending on the headers your client sends (Accept: ...). If you want to learn more about this, look for "content negotiation".
Doing a HEAD or GET request should return a Content-Type header with the appropriate charset field. If no content-type negotiation takes place on this server (which is often the case), this will not vary.
If you're using HttpURLConnection in Java, you can see the headers using getHeaderFieldKey and getHeaderField.
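As a rough sketch of that, the following issues a HEAD request, lists the headers with getHeaderFieldKey and getHeaderField, and pulls the charset out of Content-Type. The class name `HeadCharset` and the helpers `charsetOf` and `printHeaders` are invented for this example; the URL is whatever you pass on the command line.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class HeadCharset {

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are case-insensitive.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                // The value may be a quoted string; strip the quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    /** Prints every response header of the given connection. */
    static void printHeaders(HttpURLConnection conn) {
        for (int i = 0; ; i++) {
            String key = conn.getHeaderFieldKey(i);
            String value = conn.getHeaderField(i);
            if (key == null && value == null) {
                break; // no more headers
            }
            // Index 0 is often the status line, which has no key.
            System.out.println((key == null ? "" : key + ": ") + value);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HeadCharset <url>");
            return;
        }
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(args[0]).toURL().openConnection();
        conn.setRequestMethod("HEAD"); // headers only, no body
        try {
            printHeaders(conn);
            System.out.println("charset: " + charsetOf(conn.getContentType()));
        } finally {
            conn.disconnect();
        }
    }
}
```

If `charsetOf` returns null, the server did not declare a charset and you have to fall back to the defaults or to the document itself.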
XML messages specify the encoding type.
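If you want to read that declaration in Java, one possible approach is the StAX API that ships with the JDK: `XMLStreamReader.getCharacterEncodingScheme()` returns the encoding named in the XML declaration, or null if none was declared. The class name `DeclaredEncoding` and the helper `declaredEncoding` below are made up for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class DeclaredEncoding {

    /** Returns the encoding from the XML declaration, or null if none is declared. */
    static String declaredEncoding(byte[] xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new ByteArrayInputStream(xml));
            try {
                // Valid while the reader is positioned at START_DOCUMENT.
                return reader.getCharacterEncodingScheme();
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(declaredEncoding(doc));
    }
}
```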