Python: converting a unicode string and saving it into a list
I need to insert a series of names (like 'Alam\xc3\xa9') into a list, and then I have to save them into a SQLite database.
I know that I can render these names correctly by typing:
print eval(repr(NAME)).decode("utf-8")
But I have to insert them into a list, so I can't use print.
Is there another way of doing this without print?
Comments (2)
Lots and lots of misconceptions here.
The string you quote is not Unicode. It is a byte string, encoded in UTF-8.
You can convert it to Unicode by decoding it, for example unicode_name = name.decode('utf-8').
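A minimal sketch of that decoding step applied to a whole list, with no print involved. It assumes Python 3, where the byte string from the question is written as a bytes literal (b'...') and decoding yields str rather than Python 2's unicode. The list raw_names and its second entry are made up for illustration.

```python
# Hypothetical input: UTF-8 encoded byte strings like the one in the question.
raw_names = [b"Alam\xc3\xa9", b"Jos\xc3\xa9"]

# Decode each byte string to text and collect the results in a list.
# Nothing is printed; decode() simply returns a new text object.
unicode_names = [raw.decode("utf-8") for raw in raw_names]

# The first name is six bytes long but only five characters long,
# because the accented letter takes two bytes in UTF-8.
byte_length = len(raw_names[0])
char_length = len(unicode_names[0])
```

The point of the sketch is only that decode() is an ordinary expression whose result can be appended to a list or stored anywhere else; print was never part of the conversion.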
When you display the value of unicode_name at the console, you will see one of two things. Just typing the name and pressing enter shows a representation of the Unicode code points, which is the same as typing print repr(unicode_name). However, doing print unicode_name prints the actual characters, i.e. behind the scenes it encodes the string to the correct encoding for your terminal and prints the result.
But this is all irrelevant, because Unicode strings can only be represented internally. As soon as you want to store one in a database, or a file, or anywhere, you need to encode it. And the most likely encoding to choose is UTF-8 - which is what it was in originally.
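The difference between the escaped representation and the encoded bytes can be sketched as follows. This again assumes Python 3, where repr() shows printable non-ASCII characters directly, so the built-in ascii() is used here to get the escaped form that Python 2's repr() would have shown. The variable names escaped and encoded are illustrative.

```python
name = b"Alam\xc3\xa9"               # original UTF-8 bytes from the question
unicode_name = name.decode("utf-8")  # text object (str in Python 3)

# ascii() returns a string with non-ASCII characters written as escapes,
# i.e. a representation of the code points rather than the characters.
escaped = ascii(unicode_name)

# Encoding turns the text back into bytes, which is what gets written
# to a file or sent to a terminal.
encoded = unicode_name.encode("utf-8")

# Decoding and re-encoding with the same codec is a lossless round trip.
round_trip_ok = (encoded == name)
```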
Using the original non-decoded version of the name, repr and print once again show the codes and the characters, respectively. So it's not that converting it to Unicode actually makes it any more "really" the correct character.
So, what to do if you want to store it in a database? Nothing. Nothing at all. SQLite accepts UTF-8 input, and stores its data in UTF-8 format on the disk. So there is absolutely no conversion needed to store the original value of name in the database.
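A rough sketch of the storage step, under assumptions that differ slightly from the thread: it uses Python 3 and the standard sqlite3 module, where text parameters are passed as str (the decoded form) and the module handles the UTF-8 encoding itself. The table name people and the in-memory database are hypothetical. The CAST to BLOB is only there to expose the bytes SQLite holds for the value.

```python
import sqlite3

# UTF-8 bytes as in the question, decoded to text and kept in a list.
raw_names = [b"Alam\xc3\xa9"]
names = [raw.decode("utf-8") for raw in raw_names]

conn = sqlite3.connect(":memory:")               # throwaway in-memory database
conn.execute("CREATE TABLE people (name TEXT)")  # hypothetical table
conn.executemany("INSERT INTO people (name) VALUES (?)",
                 [(n,) for n in names])

# Reading the column back yields text again.
stored = [row[0] for row in conn.execute("SELECT name FROM people")]

# Casting the TEXT value to BLOB returns its bytes in the database
# encoding, which is UTF-8 by default for a new database.
stored_bytes = conn.execute(
    "SELECT CAST(name AS BLOB) FROM people").fetchone()[0]
conn.close()
```

If the sketch behaves as expected, stored_bytes equals the byte string the question started with, which is consistent with the claim that SQLite keeps the data as UTF-8 on its side.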
Are you looking for something like this?