Python: decoding text to ASCII

Posted on 2024-12-06 07:32:05

How can I decode a Unicode string like this:

what%2527s%2bthe%2btime%252c%2bnow%253f

into ASCII like this:

what's+the+time+now


Comments (3)

遗忘曾经 2024-12-13 07:32:05

In your case, the string was encoded twice, so we need to unquote it twice to get it back:

In [1]: import urllib
In [2]: urllib.unquote(urllib.unquote("what%2527s%2bthe%2btime%252c%2bnow%253f"))
Out[2]: "what's+the+time,+now?"
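
The `urllib.unquote` call above is the Python 2 spelling. On Python 3 the same function is available as `urllib.parse.unquote`. A minimal sketch of the same double unquote follows; the variable names are illustrative, and `unquote_plus` is shown only as an option in case the `+` signs are meant to be spaces (an assumption about the data, not something the question states):

```python
from urllib.parse import unquote, unquote_plus

raw = "what%2527s%2bthe%2btime%252c%2bnow%253f"

# First pass undoes the outer layer: %25 -> %, %2b -> +
once = unquote(raw)
# Second pass decodes the escapes revealed by the first pass: %27, %2c, %3f
twice = unquote(once)

print(once)
print(twice)

# Assumption: if "+" stands for a space (form encoding), unquote_plus
# converts it as well as decoding the percent escapes.
spaced = unquote_plus(unquote_plus(raw))
print(spaced)
```

Printing the intermediate value `once` makes it easier to see that the input really carries two layers of escaping.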
错爱 2024-12-13 07:32:05

Something like this?

title = u"what%2527s%2bthe%2btime%252c%2bnow%253f"
print title.encode('ascii','ignore')

Also, take a look at this
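
As a quick check of what the snippet does (a sketch in Python 3 syntax, with illustrative variable names): encoding to ASCII only converts the string to bytes and drops non-ASCII characters. The percent escapes are themselves plain ASCII, so they appear to pass through unchanged:

```python
title = "what%2527s%2bthe%2btime%252c%2bnow%253f"

# encode() converts str -> bytes; 'ignore' drops characters that are not ASCII.
encoded = title.encode("ascii", "ignore")

# Every character here is already ASCII, so nothing is dropped or decoded.
print(encoded)
print(encoded.decode("ascii") == title)
```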

迷路的信 2024-12-13 07:32:05

You could convert the %(hex) escaped chars with something like this:

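import re

# Python 2 (unichr, print statement)
def my_decode(s):
    # Replace each %-escape (2 to 4 hex digits) with the character at that code point.
    return re.sub(r'%([0-9a-fA-F]{2,4})', lambda x: unichr(int(x.group(1), 16)), s)

s = u'what%2527s%2bthe%2btime%252c%2bnow%253f'
print my_decode(s)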

This results in the Unicode string:

u'what\u2527s+the+time\u252c+now\u253f'

Not sure how you'd know to convert \u2527 to a single quote, or to drop the \u253f and \u252c characters when converting to ASCII.
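
One possible resolution of that puzzle, offered as a sketch rather than the answerer's method: `%25` is the escape for `%` itself, so `%2527` can be read as `%25` followed by `27`, that is, a doubly escaped `%27`. Limiting the pattern to exactly two hex digits and applying it twice recovers the punctuation. The function name `decode_percent_escapes` is hypothetical, the code uses Python 3 (`chr` instead of `unichr`), and it maps each escape to a single code point, so it only suits ASCII-range escapes, not multi-byte UTF-8 sequences.

```python
import re

def decode_percent_escapes(s):
    # Hypothetical helper: decode one layer of %XX escapes (exactly two hex digits).
    return re.sub(r'%([0-9a-fA-F]{2})', lambda m: chr(int(m.group(1), 16)), s)

s = 'what%2527s%2bthe%2btime%252c%2bnow%253f'

# First pass: %25 -> %, leaving escapes such as %27 behind.
once = decode_percent_escapes(s)
# Second pass: decode the escapes exposed by the first pass.
twice = decode_percent_escapes(once)

print(once)
print(twice)
```

For real URL data, the standard library's `urllib.parse.unquote` is likely the safer choice, since it also handles UTF-8 byte sequences.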
