Chinese/Japanese characters in search boxes and forms
Why is it that when I use Firefox to enter 漢, the GET becomes:
q=%E6%BC%A2&start=0
However, when I use IE8 and type the same Chinese character, the GET is:
q=?&start=0
It turns the character into a question mark.
Mark the encoding of the page as UTF-8 and this problem will go away. Without this hint, Firefox will sometimes fail to autodetect your encoding, too. You may also have manually changed the encoding in IE once, in which case that choice becomes the new default for unmarked pages.
Put a UTF-8 charset declaration in your <HEAD>.
If your content isn't really in UTF-8, then you'll need to use an alternate method. There's an HTML attribute on FORM (accept-charset) that hints to IE that you want non-ANSI code page characters to be sent as UTF-8, but it's far nicer to just use the correct content type.
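The markup originally posted with this answer did not survive, so here is a minimal sketch of what such a declaration might look like, wrapped in a small Python helper so the HTTP header and the page agree. The page markup, the `/search` action, and the `render_page` name are all assumptions made for illustration, not the answerer's actual snippet.

```python
# Hypothetical page markup: the <meta> tag declares UTF-8 and the form
# carries accept-charset as the fallback hint mentioned above.
PAGE = """<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Search</title>
</head>
<body>
<form method="get" action="/search" accept-charset="UTF-8">
<input type="text" name="q">
<input type="hidden" name="start" value="0">
<input type="submit" value="Search">
</form>
</body>
</html>
"""


def render_page():
    """Return (headers, body) with the charset declared in both places."""
    body = PAGE.encode("utf-8")  # the bytes must really be UTF-8
    headers = {
        "Content-Type": "text/html; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return headers, body
```

Declaring the charset in the response header as well as in the `<meta>` tag is a belt-and-braces choice; either one alone is usually enough for the browser to stop guessing.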
Also, the address bar may not be the best place to look at the resulting text, as the last time I checked, it didn't reliably work with non-ACP characters. Make sure you're looking at the actual request data.
If you're talking about entering text into the browser's address bar or search box rather than a specific web page, I can't reproduce this problem on English Windows 7. Perhaps you're using a very old version of Windows and your system ANSI code page does not contain that character; Win95/Win98/WinME would certainly have that problem.
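To see how a code page that lacks the character could produce a literal question mark, here is a rough sketch using Python's codecs as a stand-in. It is only an analogy: I'm assuming the browser does something comparable to a lossy conversion, and `encode_for_code_page` is a hypothetical helper name.

```python
def encode_for_code_page(text, code_page):
    """Encode text, substituting '?' for characters the code page lacks."""
    return text.encode(code_page, errors="replace")


# cp1252 (Western European) has no slot for the character, so it is lost.
western = encode_for_code_page("漢", "cp1252")

# cp932 (Japanese) does contain it, so it survives as a two-byte sequence.
japanese = encode_for_code_page("漢", "cp932")

# UTF-8 can represent any character; here it takes three bytes.
utf8 = encode_for_code_page("漢", "utf-8")

print(western, len(japanese), utf8)
```

If the system or page encoding behaves like the first case, the query string would carry `?` instead of the percent-encoded UTF-8 bytes.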
Edited to add:
In IE 8, entering the character you specified on a page containing this content works exactly as expected for me. I've verified this with Fiddler. Whatever problem you are having is probably different than what you have described so far.
You don't actually need the accept-charset unless you are using an alternate encoding for the page itself, but I am leaving it in for illustrative purposes. For it to be actually useful, at least in earlier versions of IE (things may have changed; a colleague of mine specified the behavior back in IE5 or so), you need a hidden "_charset_" field with no value to encourage the browser to mark what charset it actually used, but that's superfluous in a UTF-8 page.
It can be either a font installation issue or a URL encoding issue.
One of the main issues I have seen when dealing with CJK characters is that the East Asian language fonts are not installed by default when the OS is installed. These characters show up properly in MS Word even without that installation.
To make sure all applications in the OS can handle CJK (Chinese, Japanese and Korean) text, it is best to install the East Asian language files.
Hopefully you have the Windows CD with you to proceed with this.
After that, IE8 should hopefully show the characters properly.
Also, if you are doing any URL encoding, make sure you always use UTF-8 as the character encoding when dealing with non-ASCII characters.
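As a small illustration of the "always use UTF-8" advice, this sketch builds and reads back the query string from the question with Python's standard library, which percent-encodes as UTF-8 by default. The `build_query` and `read_query` names are mine, chosen for the example.

```python
from urllib.parse import urlencode, parse_qs


def build_query(term, start=0):
    """Percent-encode the parameters; urlencode uses UTF-8 by default."""
    return urlencode({"q": term, "start": start})


def read_query(query_string):
    """Decode a query string back to text, again assuming UTF-8."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}


query = build_query("漢")
print(query)              # q=%E6%BC%A2&start=0
print(read_query(query))  # {'q': '漢', 'start': '0'}
```

The encoded form matches what Firefox sent in the question, and it round-trips cleanly as long as both ends agree on UTF-8.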
To begin with, IE believes that Chinese characters can be sent 'as is' in UTF-8, while Firefox thinks they need to be URL-encoded.
Have you watched the GET request on the wire? I bet that it's really a three-byte sequence and that the tool you are using to display it is reducing it to a ?.
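If you capture the request bytes (with Fiddler or similar), a quick way to check the claim is to dump them as hex. This sketch assumes the two forms a browser might send, raw UTF-8 versus percent-encoded, and shows they carry the same three bytes; the sample byte strings are made up for the example, not captured traffic.

```python
from urllib.parse import unquote_to_bytes


def hex_dump(raw):
    """Show each byte as two hex digits so nothing is hidden by a display tool."""
    return raw.hex(" ")


# Assumed forms of the same query: unescaped UTF-8 and percent-encoded.
raw_unescaped = "q=漢".encode("utf-8")
raw_escaped = b"q=%E6%BC%A2"

print(hex_dump(raw_unescaped))                           # 71 3d e6 bc a2
print(unquote_to_bytes(raw_escaped) == raw_unescaped)    # True
```

If the dump of the real request shows `e6 bc a2` after `q=`, the character went out intact and the `?` is a display artifact; if it shows a single `3f`, the character was already lost before sending.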