UTF-8 characters mangled in HTTP Basic Auth username

Posted on 2024-07-15 12:55:02

I'm trying to build a web service using Ruby on Rails. Users authenticate themselves via HTTP Basic Auth. I want to allow any valid UTF-8 characters in usernames and passwords.

The problem is that the browser is mangling characters in the Basic Auth credentials before it sends them to my service. For testing, I'm using 'カタカナカタカナカタカナカタカナカタカナカタカナカタカナカタカナ' as my username (no idea what it means - AFAIK it's some random characters our QA guy came up with - please forgive me if it is somehow offensive).

If I take that as a string and do username.unpack("h*") to convert it to hex, I get '3e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a83e28ba3e28fb3e28ba3e38a8'. That seems about right for 32 katakana characters (3 bytes/6 hex digits each).
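
As a side note on reading that dump: the "h*" directive writes the low nibble of each byte first, so the UTF-8 lead byte 0xE3 shows up as "3e". A minimal sketch, assuming Ruby 2.0 or later with a UTF-8 source file (the helper names are made up for illustration), comparing it with the more conventional "H*" output:

```ruby
# "h*" emits the low nibble of each byte first; "H*" emits the high nibble
# first, which is the conventional way to write a hex dump.
def hex_low_nibble_first(str)
  str.unpack("h*").first
end

def hex_high_nibble_first(str)
  str.unpack("H*").first
end

username = "カタカナ" * 8 # the 32-character test username from the question

puts username.length                         # number of characters
puts username.bytesize                       # number of UTF-8 bytes
puts hex_low_nibble_first(username)[0, 24]   # first four characters, "h*" style
puts hex_high_nibble_first(username)[0, 24]  # same bytes, conventional order
```

Both helpers describe the same bytes; only the digit order within each byte differs.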

If I do the same with the username that's coming in via HTTP Basic auth, I get: 'bafbbaacbafbbaacbafbbaacbafbbaacbafbbaacbafbbaacbafbbaacbafbbaac'. It's obviously much shorter. Using the Firefox Live HTTP Headers plugin, here's the actual header that's being sent:

Authorization: Basic q7+ryqu/q8qrv6vKq7+ryqu/q8qrv6vKq7+ryqu/q8o6q7+ryqu/q8qrv6vKq7+ryqu/q8qrv6vKq7+ryqu/q8o=

That looks like the 'bafbba...' string with the high and low nibbles swapped (at least when I paste it into Emacs, Base64-decode it, then switch to hexl mode). It might be a UTF-16 representation of the username, but I haven't gotten anything to display it as anything but gibberish.
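
The header can also be taken apart in Ruby rather than Emacs. The following is only a sketch (split_basic_credentials is a hypothetical helper, not a Rails API): it Base64-decodes the token copied from above and splits it at the first colon. Decoded by hand, the token appears to be 32 bytes, a colon (0x3A), and another 32 bytes, and each byte matches the low byte of the corresponding code point (U+30AB, U+30BF, U+30AB, U+30CA).

```ruby
require "base64"

# Header value copied from the question.
HEADER = "Basic q7+ryqu/q8qrv6vKq7+ryqu/q8qrv6vKq7+ryqu/q8o6q7+ryqu/q8qrv6vKq7+ryqu/q8qrv6vKq7+ryqu/q8o="

# Hypothetical helper: returns [user_bytes, password_bytes] as binary strings.
def split_basic_credentials(header)
  token = header.sub(/\ABasic\s+/i, "")
  raw = Base64.decode64(token) # binary (ASCII-8BIT) string
  raw.split(":", 2)            # only the first colon separates user and password
end

user, password = split_basic_credentials(HEADER)
puts user.unpack("H*").first # conventional hex order
puts user.unpack("h*").first # nibble-swapped, as printed in the question
puts user.bytesize
puts password.bytesize
```

One byte per character rules out both UTF-8 (three bytes each here) and UTF-16 (two bytes each).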

Rails is setting the charset in the Content-Type header to UTF-8, so the browser should be sending in that encoding. I do get the correct data for form submissions.

The problem happens in both Firefox 3.0.8 and IE 7.

So... is there some magic sauce for getting web browsers to send UTF-8 characters via HTTP Basic Auth? Am I handling things wrong on the receiving end? Does HTTP Basic Auth just not work with non-ASCII characters?

Comments (7)

风和你 2024-07-22 12:55:02

I want to allow any valid UTF-8 characters in usernames and passwords.

Abandon all hope. Basic Authentication and Unicode don't mix.

There is no standard(*) for how to encode non-ASCII characters into a Basic Authentication username:password token before base64ing it. Consequently every browser does something different:

  • Opera uses UTF-8;
  • IE uses the system's default codepage (which you have no way of knowing, other than it's never UTF-8), and silently mangles characters that don't fit into it using the Windows ‘guess a random character that looks a bit like the one you wanted or maybe just not’ secret recipe;
  • Mozilla uses only the lower byte of character codepoints, which has the effect of encoding to ISO-8859-1 and mangling the non-8859-1 characters irretrievably... except when doing XMLHttpRequests, in which case it uses UTF-8;
  • Safari and Chrome encode to ISO-8859-1, and fail to send the authorization header at all when a non-8859-1 character is used.
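
To make the differences concrete, here is a rough Ruby simulation of three of the behaviours listed above. It is a sketch based only on the descriptions in this answer, not on browser source code, and the method names are invented for illustration. IE is left out because the system codepage is unknown.

```ruby
require "base64"

# Opera-style: encode the whole "user:password" token as UTF-8.
def basic_token_utf8(user, password)
  Base64.strict_encode64("#{user}:#{password}".encode("UTF-8"))
end

# Mozilla-style as described above: keep only the low byte of each code point.
def basic_token_low_byte(user, password)
  bytes = "#{user}:#{password}".codepoints.map { |cp| cp & 0xFF }
  Base64.strict_encode64(bytes.pack("C*"))
end

# Safari/Chrome-style as described above: ISO-8859-1, or no header at all
# (modelled here as nil) when a character does not fit.
def basic_token_latin1(user, password)
  Base64.strict_encode64("#{user}:#{password}".encode("ISO-8859-1"))
rescue Encoding::UndefinedConversionError
  nil
end

name = "カタカナ" * 8
puts basic_token_utf8(name, name)
puts basic_token_low_byte(name, name)
p basic_token_latin1(name, name)
```

For characters that do fit into ISO-8859-1, the low-byte and ISO-8859-1 variants produce the same token, which is why the difference only shows up with input like the katakana test name.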

*: some people interpret the standard to say that either:

  • it should always be ISO-8859-1, since that is the default encoding for raw 8-bit characters included directly in headers;
  • it should be encoded using RFC2047 rules, somehow.

But neither of these proposals addresses what goes inside a base64-encoded auth token, and the RFC2047 reference in the HTTP spec really doesn't work at all, since all the places it might potentially be used are explicitly disallowed by the ‘atom context’ rules of RFC2047 itself, even if HTTP headers honoured the rules and extensions of the RFC822 family, which they don't.

In summary: ugh. There is little-to-no hope of this ever being fixed in the standard or in the browsers other than Opera. It's just one more factor driving people away from HTTP Basic Authentication in favour of non-standard and less-accessible cookie-based authentication schemes. Shame really.

滴情不沾 2024-07-22 12:55:02

It's a known shortcoming that Basic authentication does not provide support for non-ISO-8859-1 characters.

Some UAs are known to use UTF-8 instead (Opera comes to mind), but there's no interoperability for that either.

As far as I can tell, there's no way to fix this, except by defining a new authentication scheme that handles all of Unicode. And getting it deployed.

情话已封尘 2024-07-22 12:55:02

HTTP Digest authentication is no solution for this problem, either. It suffers from the same problem of the client being unable to tell the server what character set it's using and the server being unable to correctly assume what the client used.
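
To illustrate why the server can only guess, here is a small Ruby sketch of one possible heuristic (guess_credential_text is a hypothetical helper, and the ISO-8859-1 fallback is an assumption, not something any spec mandates): treat the bytes as UTF-8 if they are valid UTF-8, otherwise assume ISO-8859-1.

```ruby
# Hypothetical heuristic: UTF-8 if the bytes are valid UTF-8, otherwise
# ASSUME ISO-8859-1. Returns [text_as_utf8, which_guess_was_used].
def guess_credential_text(raw_bytes)
  utf8 = raw_bytes.dup.force_encoding(Encoding::UTF_8)
  return [utf8, :utf8] if utf8.valid_encoding?

  latin1 = raw_bytes.dup.force_encoding(Encoding::ISO_8859_1)
  [latin1.encode(Encoding::UTF_8), :assumed_latin1]
end

p guess_credential_text("\xE3\x82\xAB".b)     # UTF-8 bytes for カ
p guess_credential_text("\xAB\xBF\xAB\xCA".b) # low bytes of カタカナ, as a low-byte-only client would send
```

The second call cannot recover the original name: the information was already lost on the client, and the same bytes could equally have come from another 8-bit codepage.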

夏花。依旧 2024-07-22 12:55:02

Have you tested using something like curl to make sure it's not a Firefox issue? The HTTP Auth RFC is silent on ASCII vs. non-ASCII, but it does say the value passed in the header is the username and the password separated by a colon, and I can't find a colon in the string that Firefox is reporting sending.

硬不硬你别怂 2024-07-22 12:55:02

If you are coding for Windows 8.1, note that the sample in the documentation for HttpCredentialsHeaderValue is (wrongly) using UTF-16 encoding. A reasonably good fix is to switch to UTF-8 (as ISO-8859-1 is not supported by CryptographicBuffer.ConvertStringToBinary).

See http://msdn.microsoft.com/en-us/library/windows/apps/windows.web.http.headers.httpcredentialsheadervalue.aspx.

2024-07-22 12:55:02

Here is a workaround we used today to circumvent the issue of non-ASCII characters in a colleague's password:

curl -u "USERNAME:`echo -n 'PASSWORD' | iconv -f ISO-8859-1 -t UTF-8`" 'URL'

Replace USERNAME, PASSWORD and URL with your values. This example uses shell command substitution to convert the password's character encoding to UTF-8 before executing the curl command.

Note: I used a ` ... ` substitution here instead of $( ... ) because it doesn't fail if the password contains a ! character... [shells love ! characters ;-)]

Illustration of what happens with non-ASCII characters:

echo -n 'zz<zz§zz$zz-zzäzzözzüzzßzz' | iconv -f ISO-8859-1 -t UTF-8
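
For comparison, roughly the same transcoding step can be sketched in Ruby. This assumes the input bytes really are ISO-8859-1; latin1_to_utf8 is a made-up helper name, and the sample bytes below are a shortened version of the illustration string.

```ruby
# Reinterpret raw bytes as ISO-8859-1 text and transcode them to UTF-8,
# roughly what `iconv -f ISO-8859-1 -t UTF-8` does.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# "zzäzzözzüzzßzz" written out as ISO-8859-1 bytes.
latin1_password = "zz\xE4zz\xF6zz\xFCzz\xDFzz".b

converted = latin1_to_utf8(latin1_password)
puts converted
puts latin1_password.unpack("H*").first # one byte per character
puts converted.unpack("H*").first       # two bytes for each non-ASCII character
```

If the input is already UTF-8, the same conversion double-encodes it, so the step only helps when the source encoding is known.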

三生池水覆流年 2024-07-22 12:55:02

I might be completely ignorant here, but I came to this post while looking into a problem with sending a UTF-8 string as a header inside an ajax call.

I could solve my problem by Base64-encoding the string right before sending it. That means that you could use some simple JS to convert the form to Base64 right before submitting, and that way it can be converted back on the server side.

This simple tool allowed me to have UTF-8 strings sent as plain ASCII. I found it thanks to this simple sentence:

base64 (this encoding is designed to make binary data survive transport through transport layers that are not 8-bit clean). http://www.webtoolkit.info/javascript-base64.html

I hope this helps somehow. Just trying to give back a little bit to the community!
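
The round trip described here can be sketched in Ruby as well (the answer itself used a JavaScript library on the client). The helper names are hypothetical, and the sketch assumes both sides agree that the text is UTF-8 before encoding:

```ruby
require "base64"

# Sender side: UTF-8 text -> ASCII-only Base64, safe to place in a header.
def encode_header_value(text)
  Base64.strict_encode64(text.encode(Encoding::UTF_8))
end

# Receiver side: Base64 -> bytes, then label them as the agreed-upon UTF-8.
def decode_header_value(value)
  Base64.strict_decode64(value).force_encoding(Encoding::UTF_8)
end

encoded = encode_header_value("カタカナ")
puts encoded
puts encoded.ascii_only?
puts decode_header_value(encoded)
```

This works for custom headers or form fields where you control both ends; it does not change what a browser puts into the Authorization header of its built-in Basic Auth dialog.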
