Python: How do I get StringIO.writelines to accept unicode strings?

Posted 2024-08-13 09:42:13

I'm getting a

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 34: ordinal not in range(128)

on a string stored in 'a.desc' below, because it contains the '£' character. It's stored in the underlying Google App Engine datastore as a unicode string, so that part is fine. The cStringIO.StringIO.writelines function seems to be trying to encode it as ASCII:

result.writelines(['blahblah',a.desc,'blahblahblah'])

How do I instruct it to treat the encoding as unicode, if that's the correct phrasing?

App Engine runs on Python 2.5.

Comments (4)

白馒头 2024-08-20 09:42:13

You can wrap the StringIO object in a codecs.StreamReaderWriter object to automatically encode and decode unicode.

Like this:

import cStringIO, codecs
buffer = cStringIO.StringIO()
codecinfo = codecs.lookup("utf8")
wrapper = codecs.StreamReaderWriter(buffer, 
        codecinfo.streamreader, codecinfo.streamwriter)

wrapper.writelines([u"list of", u"unicode strings"])

buffer will be filled with utf-8 encoded bytes.

If I understand your case correctly, you will only need to write, so you could also do:

import cStringIO, codecs
buffer = cStringIO.StringIO()
wrapper = codecs.getwriter("utf8")(buffer)
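
For readers on Python 3, where cStringIO no longer exists, the write-only wrapper above can be sketched roughly as follows. This is a port under stated assumptions, not the Python 2.5 code from the answer: io.BytesIO stands in for cStringIO.StringIO, and desc is a made-up stand-in for a.desc.

```python
import codecs
import io

# Assumption: io.BytesIO replaces cStringIO.StringIO, which only exists in Python 2.
buffer = io.BytesIO()

# codecs.getwriter returns the StreamWriter class for the codec; wrapping the
# buffer with it encodes text to UTF-8 on every write.
wrapper = codecs.getwriter("utf-8")(buffer)

# Hypothetical stand-in for a.desc from the question (contains the pound sign).
desc = u"costs \xa35"
wrapper.writelines([u"blahblah", desc, u"blahblahblah"])

# The underlying buffer now holds UTF-8 encoded bytes.
encoded = buffer.getvalue()
```

Note that the text goes in through wrapper, while the encoded bytes are read back from buffer.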

小梨窩很甜 2024-08-20 09:42:13

StringIO documentation:

Unlike the memory files implemented by the StringIO module, those provided by [cStringIO] are not able to accept Unicode strings that cannot be encoded as plain ASCII strings.

If possible, use StringIO instead of cStringIO.
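
The limitation quoted above comes down to an ASCII encoding step. A minimal sketch of that failure, written for Python 3 and not involving cStringIO itself, might look like this (the variable names are illustrative only):

```python
# The pound sign from the error message in the question.
pound = u"\xa3"

# Encoding to ASCII fails because U+00A3 lies outside the 0-127 range.
try:
    pound.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    failing_codec = exc.encoding  # name of the codec that rejected the text

# UTF-8 can represent the same character, using two bytes.
utf8_bytes = pound.encode("utf-8")
```

This is the same class of error as in the question: any character above code point 127 cannot pass through an ASCII encoding step.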

风吹过旳痕迹 2024-08-20 09:42:13

You can also encode your strings as utf-8 manually before adding them to the StringIO:

# Build a new list; reassigning the loop variable would leave rows unchanged.
rows = [val.encode('utf-8') if isinstance(val, unicode) else val
        for val in rows]
result.writelines(rows)
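
The manual-encoding approach can be sketched for Python 3 as well, where str plays the role of unicode and io.BytesIO is assumed as the byte buffer. The helper name encode_rows and the sample rows are hypothetical, introduced only for illustration.

```python
import io


def encode_rows(rows, encoding="utf-8"):
    """Return a new list in which every text item is encoded to bytes."""
    # Items that are already bytes pass through untouched.
    return [val.encode(encoding) if isinstance(val, str) else val
            for val in rows]


# Hypothetical mixed input: byte strings plus one text string with a pound sign.
rows = [b"blahblah", u"costs \xa35", b"blahblahblah"]

result = io.BytesIO()
result.writelines(encode_rows(rows))
output = result.getvalue()
```

Returning a new list matters here: encoding inside a plain for loop and reassigning the loop variable would not change the list that is later written.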

謌踐踏愛綪 2024-08-20 09:42:13

Python 2.6 introduced the io module, so you should consider using io.StringIO(), which the documentation describes as "An in-memory stream for unicode text."

In older Python versions this is an unoptimized pure-Python implementation; in later versions it has been reimplemented in (fast) C code.
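
A minimal sketch of the io.StringIO approach, assuming an interpreter that ships the io module (so not Python 2.5) and using a made-up desc value in place of a.desc:

```python
import io

# io.StringIO stores text, so no encoding step is involved.
result = io.StringIO()

# Hypothetical stand-in for a.desc from the question.
desc = u"costs \xa35"

# On Python 2, every item must be a unicode object (hence the u prefixes);
# on Python 3, ordinary str literals are already text.
result.writelines([u"blahblah", desc, u"blahblahblah"])

text = result.getvalue()
```

The buffer returns text here, so encoding to bytes is left to whatever finally writes the result out.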
