Submit character encoding: the _charset_ hidden field

Posted 2024-09-08 01:10:34

For our web app, we have multiple HTML pages containing text areas. All of our pages are rendered with an ISO-8859-1 charset. When a page is accessed through IE6 on a Windows machine and special characters such as a "smart quote" are pasted into the text area, some of our pages submit the form using the Windows 1252 character encoding. Other pages appear to submit using the UTF-8 character encoding. I've been tracking the submit character encoding with the following hidden field:

<input type="hidden" name="_charset_" />

On the Windows 1252 submit character encoding pages, we receive a value of "windows-1252".

On the UTF-8 submit character encoding pages, we receive a blank value.

On the backend, we are using ISO-8859-1. While ideally we would want the submit character encoding to be ISO-8859-1 as well, I do not see an option for forcing that behavior on IE 6. Given the choice between Windows 1252 and UTF-8, I would prefer the content be submitted in Windows 1252 so that it is more likely to render correctly when the page re-renders in ISO-8859-1.
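
To make the byte-level difference concrete, here is a minimal Python sketch (not part of our app; the names `SMART_QUOTE` and `can_encode` are made up for illustration). It encodes a left smart quote in both encodings and shows that ISO-8859-1 itself has no code point for it:

```python
# Illustrative only: how one "smart quote" looks on the wire in each encoding.
SMART_QUOTE = "\u201c"  # LEFT DOUBLE QUOTATION MARK

cp1252_bytes = SMART_QUOTE.encode("cp1252")  # a single byte
utf8_bytes = SMART_QUOTE.encode("utf-8")     # three bytes


def can_encode(text, encoding):
    """Return True if `text` is representable in `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(cp1252_bytes)                           # b'\x93'
print(utf8_bytes)                             # b'\xe2\x80\x9c'
print(can_encode(SMART_QUOTE, "iso-8859-1"))  # False
# UTF-8 bytes misread as Windows 1252 produce the familiar mojibake:
print(utf8_bytes.decode("cp1252"))            # â€œ
```

The single Windows 1252 byte stays one character when the page is re-rendered, whereas the three UTF-8 bytes turn into three unrelated characters, which is why I would prefer the Windows 1252 submission.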

I've looked into our pages in some depth, and nothing jumps out at me as the reason why some pages submit in one character encoding and others in the other.

1) When IE 6 returns a blank charset, does that in fact equate to UTF-8? Does IE 6 always return a blank charset when the submit character encoding is UTF-8, or only when it is unable to properly determine which character encoding to use?

2) What possible differences could there be on the pages that would result in IE 6 picking Windows 1252 on some pages and UTF-8 on others? I scanned the page for UTF-8 characters and for any accept-charset attributes and could not find either.

Additional Note: I found the information on the charset hidden input at the following link.

http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html

自由范儿 2024-09-15 01:10:34

MSDN states that IE accepts only "utf-8" as the value of that attribute.

烏雲後面有陽光 2024-09-15 01:10:34

The hidden field named _charset_ is handled specially by HTML5-conforming clients:

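[...] An ASCII case-insensitive match for the name _charset_ is special: if used as the name of a Hidden control with no value attribute, then during submission the value attribute is automatically given a value consisting of the submission character encoding.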
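
On the receiving side, a backend could use the reported value to decode the raw bytes. The following is only a sketch under stated assumptions: `decode_field` and its parameters are hypothetical names, and treating a blank `_charset_` as "try UTF-8 first" is a guess based on the behavior described in the question, not documented IE 6 behavior.

```python
def decode_field(raw: bytes, charset_field: str, fallback: str = "cp1252") -> str:
    """Decode raw form bytes using the _charset_ value sent by the browser.

    Assumption: a blank or unusable _charset_ value means "try UTF-8 first,
    then fall back to Windows 1252". This is a heuristic, not a specification.
    """
    label = charset_field.strip()
    if label:
        try:
            return raw.decode(label)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown label or wrong bytes: fall through to the heuristic
    try:
        return raw.decode("utf-8")  # strict: fails on stray Windows 1252 bytes
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")


# A form that reported "windows-1252" and sent single-byte smart quotes:
print(decode_field(b"\x93hi\x94", "windows-1252"))
# A form that reported a blank value and sent UTF-8 bytes:
print(decode_field(b"\xe2\x80\x9chi\xe2\x80\x9d", ""))
```

Both calls yield the same text with curly quotes around "hi". Strict UTF-8 decoding is tried before the fallback because bytes such as 0x93 on their own are not valid UTF-8, so a failure there is a reasonable hint that the data is in a single-byte encoding.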

The submission character encoding is determined by the following algorithm:

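If the user agent is to pick an encoding for a form, it must run the following steps:
1) Let encoding be the document's character encoding.
2) If the form element has an accept-charset attribute, set encoding to the return value of running these substeps:
   1) Let input be the value of the form element's accept-charset attribute.
   2) Let candidate encoding labels be the result of splitting input on ASCII whitespace.
   3) Let candidate encodings be an empty list of character encodings.
   4) For each token in candidate encoding labels in turn (in the order in which they were found in input), get an encoding for the token and, if this does not result in failure, append the encoding to candidate encodings.
   5) If candidate encodings is empty, return UTF-8.
   6) Return the first encoding in candidate encodings.
3) Return the result of getting an output encoding from encoding.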
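
The steps above can be sketched in Python roughly as follows. This is an approximation, not a browser implementation: the function names are hypothetical, Python's `codecs.lookup` stands in for the specification's "get an encoding" step (its label table differs; for example, the Encoding Standard treats the label iso-8859-1 as windows-1252, while Python keeps them separate), and the UTF-8 default for an unrecognized document encoding is my own assumption.

```python
import codecs


def get_encoding(label):
    """Approximate "get an encoding": normalized codec name, or None on failure."""
    try:
        return codecs.lookup(label.strip()).name
    except LookupError:
        return None


def get_output_encoding(encoding):
    """Approximate "get an output encoding": UTF-16 variants become UTF-8."""
    if encoding.startswith("utf-16"):
        return "utf-8"
    return encoding


def pick_form_encoding(document_encoding, accept_charset=None):
    """Pick the submission encoding for a form, following the quoted steps."""
    # Step 1: start from the document's character encoding.
    # (Falling back to UTF-8 for an unknown label is an assumption.)
    encoding = get_encoding(document_encoding) or "utf-8"
    # Step 2: an accept-charset attribute, if present, overrides it.
    if accept_charset is not None:
        candidates = []
        for label in accept_charset.split():  # split on whitespace
            candidate = get_encoding(label)
            if candidate is not None:
                candidates.append(candidate)
        encoding = candidates[0] if candidates else "utf-8"
    # Step 3: map to an output encoding.
    return get_output_encoding(encoding)


print(pick_form_encoding("ISO-8859-1"))                        # iso8859-1
print(pick_form_encoding("ISO-8859-1", "windows-1252 utf-8"))  # cp1252
print(pick_form_encoding("ISO-8859-1", "no-such-charset"))     # utf-8
```

Note that this models what an HTML5-conforming client is supposed to do; IE 6 predates the specification, so its observed behavior may well differ.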