Simple ASCII URL encoding with Python
Look at this:
import urllib
print urllib.urlencode(dict(bla='Ã'))
The output is:
bla=%C3%BC
What I want is simple: I want the output in ASCII instead of UTF-8, so I need the output:
bla=%C3
If I try:
urllib.urlencode(dict(bla='Ã'.decode('iso-8859-1')))
it doesn't work (all my Python files are UTF-8 encoded):
'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
In production, the input arrives as unicode.
Have a look at Unicode transliteration in Python.
In your case:
This is a third-party library, which can be easily installed via:
That's not ASCII, which has no characters mapped at 0x80 or above. You're talking about ISO-8859-1, or possibly code page 1252 (the Windows encoding based on it).
Well, that depends on what encoding you've used to save the character Ã in the source, doesn't it? It sounds like your text editor has saved it as UTF-8. (That's a good thing, because locale-specific encodings like ISO-8859-1 need to go away ASAP.)
Tell Python that the source file you've saved is in UTF-8, as per PEP 263:
Or, if you don't want that hassle, use a backslash escape:
Although, either way, a modern webapp should be using UTF-8 for its input rather than ISO-8859-1/cp1252.
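As a minimal sketch of both suggestions, the snippet below carries a PEP 263 style declaration and also writes the character as a backslash escape, so the result does not depend on how the file was saved. It is written for Python 3, where urllib.parse.urlencode accepts an encoding argument; that is an assumption, since the question uses Python 2, and the variable names are illustrative.

```python
# -*- coding: utf-8 -*-
# PEP 263 declaration: tells Python the source file is saved as UTF-8.
# It must sit on line 1 or 2 of the file. Python 3 already assumes
# UTF-8 by default; Python 2 needs the declaration for non-ASCII literals.
from urllib.parse import urlencode

literal = 'Ã'        # relies on the file really being saved as UTF-8
escaped = '\u00c3'   # backslash escape: independent of the file encoding

# Percent-encode using ISO-8859-1, where U+00C3 is the single byte 0xC3.
query = urlencode({'bla': escaped}, encoding='iso-8859-1')
print(query)  # bla=%C3
```

The escape form is the more robust of the two, because it keeps working even if an editor later re-saves the file in a different encoding.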
An asciification that works pretty well is this:
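One common standard-library way to asciify text, shown here as an assumption and not necessarily this answer's own code, is to decompose characters with unicodedata.normalize and drop whatever is not ASCII. The asciify helper name is hypothetical, and the sketch targets Python 3.

```python
import unicodedata
from urllib.parse import urlencode


def asciify(text):
    """Hypothetical helper: strip accents and drop non-ASCII characters."""
    # NFKD splits 'Ã' (U+00C3) into 'A' plus a combining tilde (U+0303).
    decomposed = unicodedata.normalize('NFKD', text)
    # Encoding to ASCII with errors='ignore' discards the combining mark.
    return decomposed.encode('ascii', 'ignore').decode('ascii')


print(asciify('\u00c3'))                      # A
print(urlencode({'bla': asciify('\u00c3')}))  # bla=A
```

This is lossy: characters with no decomposition, such as 'ß' or 'ø', are dropped silently, so it suits cases where an approximate ASCII form is acceptable.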
If your input is actually UTF-8 and you want ISO-8859-1 as output (which is not ASCII), what you need is:
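A minimal sketch of that conversion, assuming Python 3 (the thread itself uses Python 2) and a hypothetical raw variable holding the UTF-8 bytes: decode to text first, then re-encode as ISO-8859-1 before percent-encoding.

```python
from urllib.parse import quote_plus, urlencode

raw = b'\xc3\x83'                    # 'Ã' as UTF-8 bytes (illustrative input)
text = raw.decode('utf-8')           # bytes -> text (one character, U+00C3)
latin1 = text.encode('iso-8859-1')   # text -> the single byte 0xC3

print(quote_plus(latin1))            # %C3
print(urlencode({'bla': latin1}))    # bla=%C3
```

Note that the encode step raises UnicodeEncodeError for characters outside ISO-8859-1, such as '€', which is one practical reason to prefer UTF-8 end to end.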
Thanks for all the solutions. All of you converged on the very same point.
I had made a mess by changing the right code to something else. I turned back to .encode('iso-8859-1') and it works.
With the unihandecode package, transliterating the character in Python prints A.