How do I get Unicode characters from a URL parameter?
I need to use a GET request to send JSON to my server from a JavaScript client, so I started echoing responses back to make sure nothing is lost in translation. Normal text seems fine, but as soon as I include a Unicode character of any sort (e.g. "ç"), the character is encoded somehow (e.g. "\u00e7") and the returned value differs from the requested value. My primary concerns are that A) my Python code correctly saves to the database what the client intended to send, and B) I echo back to the client the same values that were sent (when testing).
Perhaps this means I can't use base64, or have to do something different along the way. I'm ok with that. My implementation is just an attempt at a means to an end.
Current steps (any step can be changed, if needed):
Raw JSON string which I want to send to the server:
'{"weird-chars": "°ç"}'
JavaScript Base64 encoded version of the string passed to server via GET param (on a side note, will the equals sign at the end of the encoded string cause any issues?):
http://www.myserver.com/?json=eyJ3ZWlyZC1jaGFycyI6ICLCsMOnIn0=
Python str result from b64decode of param:
'{"weird-chars": "\xc2\xb0\xc3\xa7"}'
Python dict from json.loads of decoded param:
{'weird-chars': u'\xb0\xe7'}
Python str from json.dumps of that dict (and subsequent output to the browser):
'{"weird-chars": "\u00b0\u00e7"}'
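The steps above can be reproduced with a short sketch. It assumes Python 3 (where str is already Unicode and b64decode returns bytes), so the intermediate values look slightly different from the Python 2 output in the question; the variable names are illustrative only. It suggests two things: the trailing equals sign survives query-string parsing here, and the \u00b0\u00e7 form is simply how json.dumps escapes non-ASCII characters by default, not a change in the data.

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit

url = "http://www.myserver.com/?json=eyJ3ZWlyZC1jaGFycyI6ICLCsMOnIn0="

# parse_qs splits each name=value pair on the first "=" only,
# so the Base64 padding at the end of the value is kept.
param = parse_qs(urlsplit(url).query)["json"][0]

raw_bytes = base64.b64decode(param)   # UTF-8 encoded bytes
text = raw_bytes.decode("utf-8")      # Unicode text
data = json.loads(text)               # dict holding the real characters

# Default output escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(data)
# ensure_ascii=False keeps the characters as they were sent.
literal = json.dumps(data, ensure_ascii=False)
```

If the echoed text has to match the request character for character, ensure_ascii=False (with the response encoded as UTF-8) is one option. On the client side, percent-encoding the parameter or using URL-safe Base64 is the more cautious choice, because "+" and "/" in standard Base64 can be misread inside a query string.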
Comments (2)
Everything looks fine to me.
Perhaps you should decode the JSON before attempting to use it.
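A minimal sketch of that point, assuming Python 3 and two hypothetical variables standing in for the request text and the echoed response: the two strings differ as raw text, but they decode to the same value once parsed as JSON.

```python
import json

# Literal characters, as the client wrote them
# (\u00b0 is the degree sign, \u00e7 is c with cedilla).
sent_text = '{"weird-chars": "\u00b0\u00e7"}'
# Escaped form, as json.dumps emitted it on the server
# (the doubled backslashes keep the escapes as plain text).
echoed_text = '{"weird-chars": "\\u00b0\\u00e7"}'

# Comparing the raw text reports a difference...
texts_match = sent_text == echoed_text
# ...but comparing the parsed values does not.
values_match = json.loads(sent_text) == json.loads(echoed_text)
```

A JavaScript client would see the same thing after JSON.parse, since \uXXXX escapes are part of the JSON string syntax.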
Your procedure's fine; you just need one more step: encoding from unicode to utf-8 (or any other encoding that supports the 'weird characters').
Think of decoding as what you do to go from a regular string to unicode, and encoding as what you do to get back from unicode. In other words:
You decode a str to produce a unicode string, and encode a unicode string to produce a str.
So, after that encoding step, encodedchars will contain your characters, displayed in the selected encoding (in this case, utf-8).
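As a rough illustration of the encode/decode direction, here is a sketch that assumes Python 3, where the Python 2 names map differently: bytes plays the role of the old str, and str plays the role of the old unicode. The dict literal is taken from the question; decodedchars is a hypothetical name added for the round trip.

```python
# The parsed dict from the question, holding Unicode text.
data = {"weird-chars": "\u00b0\u00e7"}

# Encoding: Unicode text -> bytes in the chosen encoding.
encodedchars = data["weird-chars"].encode("utf-8")

# Decoding: bytes -> Unicode text again.
decodedchars = encodedchars.decode("utf-8")
```

Here encodedchars holds the same four bytes (\xc2\xb0\xc3\xa7) that b64decode produced in the question, which suggests the data was never altered along the way.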