Submission character encoding: the _charset_ hidden field
For our web app, we have multiple HTML pages containing text areas. All of our pages are rendered with the ISO-8859-1 charset. When a page is accessed through IE 6 on a Windows machine and special characters such as "smart quotes" are pasted into a text area, some of our pages submit the form using the Windows-1252 character encoding. The other pages appear to submit using UTF-8. I've been tracking the submission character encoding with the following hidden field:
<input type="hidden" name="_charset_" />
On the pages that submit in Windows-1252, we receive a value of "windows-1252".
On the pages that submit in UTF-8, we receive a blank value.
On the backend, we are using ISO-8859-1. Ideally we would want the submission character encoding to be ISO-8859-1 as well, but I do not see an option for forcing that behavior in IE 6. Given the choice between Windows-1252 and UTF-8, I would prefer the content be submitted in Windows-1252, so that it is more likely to render correctly when the page re-renders in ISO-8859-1.
I've looked into our pages in some depth, and nothing jumps out at me as the reason why some pages submit in one character encoding and some in the other.
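To make the difference between the two submissions concrete, here is a small Python sketch (Python is only used for illustration, it is not part of our stack, and the variable names are made up) showing the bytes a left smart quote turns into under each encoding, and what an ISO-8859-1 backend would see when it decodes them:

```python
# Illustration only: how a left "smart quote" (U+201C) travels in each encoding.
LEFT_QUOTE = "\u201c"

# Windows-1252 represents it as the single byte 0x93.
cp1252_bytes = LEFT_QUOTE.encode("windows-1252")

# UTF-8 represents it as the three bytes 0xE2 0x80 0x9C.
utf8_bytes = LEFT_QUOTE.encode("utf-8")

# A backend that decodes everything as ISO-8859-1 maps each byte to one character.
seen_from_cp1252 = cp1252_bytes.decode("iso-8859-1")  # one character, U+0093
seen_from_utf8 = utf8_bytes.decode("iso-8859-1")      # three characters, starting with U+00E2

print(cp1252_bytes, len(seen_from_cp1252))
print(utf8_bytes, len(seen_from_utf8))
```

In the Windows-1252 case the backend holds one byte per character, so writing it back out unchanged preserves the original byte. In the UTF-8 case the single quote has become three unrelated characters.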
1) When IE 6 returns a blank _charset_ value, does that in fact equate to UTF-8? Does IE 6 always return a blank value when the submission character encoding is UTF-8, or only when it is unable to determine what character encoding to use?
2) What differences between the pages could result in IE 6 picking Windows-1252 on some pages and UTF-8 on others? I scanned the pages for UTF-8 characters and for any accept-charset attributes and could not find either.
Additional note: I found the information on the _charset_ hidden input at the following link.
http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
Comments (2)
MSDN states that IE only accepts "utf-8" as a value for this attribute.
The hidden field named _charset_ has special treatment by HTML5-conforming clients: the submission character encoding is selected according to the following algorithm:
So I think that if you do not receive a _charset_ form parameter at the backend, you should assume the character encoding is UTF-8.
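As a rough sketch of that fallback, assuming a URL-encoded form body and using Python purely for illustration (the decode_form helper and the FALLBACK_ENCODING name are hypothetical, not part of any framework), the backend could read _charset_ first and then decode the other fields with it:

```python
import codecs
from urllib.parse import parse_qs

# Assumption taken from the suggestion above: a missing or blank _charset_ means UTF-8.
FALLBACK_ENCODING = "utf-8"

def decode_form(body: str) -> dict:
    """Decode a URL-encoded form body using the encoding named in _charset_."""
    # First pass: ISO-8859-1 never fails to decode, and encoding names are
    # plain ASCII, so the _charset_ value can be read safely this way.
    raw = parse_qs(body, keep_blank_values=True, encoding="iso-8859-1")
    declared = raw.get("_charset_", [""])[0].strip()

    encoding = FALLBACK_ENCODING
    if declared:
        try:
            codecs.lookup(declared)  # only accept encodings Python knows
            encoding = declared
        except LookupError:
            pass  # unknown label: keep the fallback

    # Second pass: decode every field with the chosen encoding.
    return parse_qs(body, keep_blank_values=True, encoding=encoding, errors="replace")

# A Windows-1252 submission and a blank-_charset_ UTF-8 submission of the same text.
print(decode_form("_charset_=windows-1252&comment=%93hi%94")["comment"])
print(decode_form("_charset_=&comment=%E2%80%9Chi%E2%80%9D")["comment"])
```

Both calls should yield the same text with curly quotes. Whether a blank value from IE 6 really always means UTF-8 is exactly what the question asks, so treat the fallback as an assumption rather than a guarantee.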