Getting UTF-8 strings from a remote database

Posted on 2024-10-09 01:33:42

My application downloads some data from a remote MySQL database. The problem is that the database stores strings as UTF-8, but the data I receive appears to be decoded as ASCII. How can I get around this?

The code:

cursor = conn.cursor()
query = """MY QUERY HERE"""
cursor.execute(query)
result = cursor.fetchall()

Comments (4)

君勿笑 2024-10-16 01:33:42

Perhaps an example is in order. Here I create a unicode string "u", encode it as UTF-8, decode that from UTF-8 back to a unicode string, try to encode it as ASCII (which raises an exception, since the extended character in this string can't be encoded as ASCII), and finally encode it as ASCII, replacing errors with "?":

Python 2.6.4 (r264:75706, Dec  7 2009, 18:43:55) 
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u = u'abc\u2020123'
>>> u
u'abc\u2020123'
>>> u.encode('utf8')
'abc\xe2\x80\xa0123'
>>> s = _
>>> s.decode('utf8')
u'abc\u2020123'
>>> u.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2020' in position 3: ordinal not in range(128)
>>> u.encode('ascii', 'replace')
'abc?123'
>>>

Presumably you're getting UTF-8 strings back from the db. You should decode these from UTF-8 to unicode strings, then probably re-encode them on output for whatever is consuming the output of your program. Typically you want a model something like:

  1. Input data -- transform from input encoding to unicode [string.decode('utf8')]
  2. Process data -- dealing only with unicode objects
  3. Output result -- transform from unicode to output encoding [string.encode('utf8')]

This gives you a clean separation of encoding/decoding and avoids spreading encoding-handling code all over your application since the core only deals with unicode.
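
The three-step model above can be sketched as follows. This is only a minimal illustration, written for Python 3, where bytes plays the role of Python 2's str and str plays the role of unicode. The function names decode_input, process, and encode_output are hypothetical and do not come from any library.

```python
def decode_input(raw, encoding="utf-8"):
    """Step 1: turn bytes from the outside world into text."""
    return raw.decode(encoding)


def process(text):
    """Step 2: work on text only (here: a trivial upper-casing)."""
    return text.upper()


def encode_output(text, encoding="utf-8"):
    """Step 3: turn text back into bytes for whatever consumes the output."""
    return text.encode(encoding)


raw = b"abc\xe2\x80\xa0123"   # UTF-8 bytes, same data as in the session above
text = decode_input(raw)       # 7 characters of text
result = encode_output(process(text))
```

Only decode_input and encode_output know about encodings; process never sees bytes, which is the separation the model is after.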

深者入戏 2024-10-16 01:33:42

You might want to try string.encode('ascii').decode('utf-8')?

娇女薄笑 2024-10-16 01:33:42

Call conn.set_character_encoding('utf8') before querying the db.
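
If the driver is MySQLdb (mysqlclient) or PyMySQL, which is an assumption since the question does not name the library, the encoding can also be requested when the connection is opened: both accept charset and use_unicode keyword arguments. A minimal sketch, where connect_utf8 is a hypothetical helper and connect stands for the driver's connect function:

```python
def connect_utf8(connect, **kwargs):
    """Call a DB-API connect function with UTF-8 text handling requested.

    `connect` is assumed to be MySQLdb.connect or pymysql.connect;
    other drivers may use different keyword names.
    """
    options = {"charset": "utf8", "use_unicode": True}
    options.update(kwargs)   # caller-supplied values override the defaults
    return connect(**options)
```

It would be used as connect_utf8(MySQLdb.connect, host=..., user=...), with your own connection parameters in place of the dots.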

残龙傲雪 2024-10-16 01:33:42

Just set Python's default encoding to UTF-8 and you don't have to worry about it anymore. I had this problem when loading data from db2/mongodb.

Just set the default encoding to utf-8 in site.py.

Have a look at http://blog.ianbicking.org/illusive-setdefaultencoding.html
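
The default encoding the interpreter is currently using can be checked with sys.getdefaultencoding(). A small sketch, assuming Python 3, where the default is already UTF-8 and sys.setdefaultencoding no longer exists. On Python 2 the same call normally reports 'ascii', which is the value the site.py change is meant to alter.

```python
import sys

# Report the interpreter-wide default text encoding.
default_encoding = sys.getdefaultencoding()
print(default_encoding)

# In Python 3, str.encode() with no argument uses UTF-8.
encoded = "abc\u2020123".encode()
```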
