Decoding Unicode in Python and JavaScript (Django)

Posted on 2024-10-05 06:09:42

On a website, the word pluș is sent via POST to a Django view.
It arrives as plu%25C8%2599, so I took that string and tried to find a way to turn %25C8%2599 back into ș.

I tried decoding the string like this:

from urllib import unquote_plus
s = "plu%25C8%2599"
print unquote_plus(unquote_plus(s).decode('utf-8'))

The result I get is pluÈ, which actually has a length of 5, not 4.

How can I get the original string pluș back after it has been encoded?

Edit:

I managed to do it like this:

def js_unquote(quoted):
  quoted = quoted.encode('utf-8')
  quoted = unquote_plus(unquote_plus(quoted)).decode('utf-8')
  return quoted

It looks weird but works the way I needed it.
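
A likely explanation for the 5-character result, sketched in Python 3 because the snippets above are Python 2: after the first pass the string is plu%C8%99, and if the second pass turns each escaped byte into its own character, the two UTF-8 bytes of ș become È plus an invisible control character. The encoding="latin-1" argument below is only an emulation of that byte-per-character behaviour and is an assumption about what Python 2's unquote_plus does with a unicode argument.

```python
from urllib.parse import unquote_plus

s = "plu%25C8%2599"

# First pass: %25 becomes a literal '%', leaving one layer of escaping.
once = unquote_plus(s)

# Emulation (assumption): map each escaped byte to one character,
# which is what decoding the bytes as Latin-1 does.
garbled = unquote_plus(once, encoding="latin-1")

# The damage is reversible: recover the raw bytes, then decode them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(once, len(garbled), len(repaired))
```

This is also why encoding to a byte string before unquoting, as js_unquote does, avoids the problem: the escapes are expanded to raw bytes first and only then decoded as UTF-8.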

Comments (3)

我是有多爱你 2024-10-12 06:09:42

URL-decode twice, then decode as UTF-8.
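
A minimal Python 3 sketch of this advice, assuming the value really was percent-encoded twice. The function name double_unquote is hypothetical. In Python 3, urllib.parse.unquote_plus decodes the escaped bytes as UTF-8 by default, so no separate decode step is needed.

```python
from urllib.parse import unquote_plus


def double_unquote(quoted: str) -> str:
    """Undo two layers of percent-encoding (hypothetical helper)."""
    # First pass turns %25C8%2599 into %C8%99.
    once = unquote_plus(quoted)
    # Second pass expands %C8%99 to bytes and decodes them as UTF-8.
    return unquote_plus(once)


word = double_unquote("plu%25C8%2599")
print(word, len(word))
```

Applying this to a value that was encoded only once could corrupt any literal % or + characters it contains, so it is worth confirming where the second layer of encoding comes from.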

撧情箌佬 2024-10-12 06:09:42

You can't unless you know what the encoding is. Unicode itself is not an encoding. You might try BeautifulSoup or UnicodeDammit, which might help you get the result you were hoping for.

http://www.crummy.com/software/BeautifulSoup/

I hope this helps!

Also take a look at:

http://www.joelonsoftware.com/articles/Unicode.html

2024-10-12 06:09:42
unquote_plus(s).encode('your_lang_encoding')

This is how I did it. I sent a JSON POST request from an HTML form directly to a Django URI, with a payload containing Unicode characters such as "şğüöçı+", and it worked. I used the iso_8859-9 codec in the encode() call.
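
A small Python 3 sketch of this approach and its limits, under the assumption that the form data is percent-encoded UTF-8. ISO-8859-9 (Latin-5) covers the Turkish letters in the example, but it has no code point for the Romanian ș (U+0219) from the question, so this route would not work for pluș.

```python
from urllib.parse import quote_plus, unquote_plus

# The Turkish sample from the answer, written with escapes to avoid
# ambiguity: s-cedilla, g-breve, u/o-diaeresis, c-cedilla, dotless i, '+'.
turkish = "\u015f\u011f\u00fc\u00f6\u00e7\u0131+"

sent = quote_plus(turkish)          # what a form would put on the wire
received = unquote_plus(sent)       # back to a Unicode string
latin5 = received.encode("iso8859_9")  # one byte per character

# U+0219 (s with comma below) is not part of ISO-8859-9.
try:
    "plu\u0219".encode("iso8859_9")
    romanian_ok = True
except UnicodeEncodeError:
    romanian_ok = False

print(len(latin5), romanian_ok)
```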
