CSV, DictWriter, unicode and utf-8

Posted 2024-09-10 16:36:06

I am having problems with DictWriter and non-ASCII characters. A short version of my problem:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import codecs
import csv

f = codecs.open("test.csv", 'w', 'utf-8')
writer = csv.DictWriter(f, ['field1'], delimiter='\t')
writer.writerow({'field1':u'å'.encode('utf-8')})
f.close()

This gives the following traceback:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    writer.writerow({'field1':u'å'.encode('utf-8')})
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/csv.py", line 124, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/codecs.py", line 638, in write
    return self.writer.write(data)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/codecs.py", line 303, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

I am a bit lost, as DictWriter ought to be able to work with UTF-8 from what I have read in the documentation.

Comments (1)

稳稳的幸福 2024-09-17 16:36:06

The object you obtain with codecs.open wants a unicode string in its write method -- that's the whole point. csv.DictWriter of course is calling that method with a utf8-encoded byte string instead, whence the exception.
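To illustrate where the message in the traceback comes from (this snippet is not from the original answer): on Python 2, when a byte string is handed to an encoder, the interpreter first decodes it implicitly with the default 'ascii' codec. The sketch below performs that decode step explicitly, so it behaves the same way on Python 2.6+ and Python 3. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# u'\xe5' is the character 'å'; encoding it yields the two bytes 0xc3 0xa5.
encoded = u'\xe5'.encode('utf-8')

try:
    # Python 2 does this step implicitly when a byte string reaches
    # the write method of a codecs.open file object.
    encoded.decode('ascii')
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)
```

The first byte, 0xc3, is outside the ASCII range, which is why the error points at position 0.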

Change f's creation to f = open("test.csv", 'wb') (taking codecs out of the picture) and things should work just fine.
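A minimal sketch of that fix follows. The helper name write_utf8_rows is hypothetical, and the sketch assumes every value is text (unicode on Python 2, str on Python 3). The Python 2 branch is what the answer describes; the Python 3 branch is an addition not covered in the thread, included so that the example also runs on a current interpreter, where the csv module writes text instead of bytes.

```python
# -*- coding: utf-8 -*-
import csv
import io
import sys


def write_utf8_rows(path, fieldnames, rows):
    """Hypothetical helper: write dict rows as tab-delimited UTF-8 CSV.

    Assumes every value in `rows` is text, not an already-encoded byte string.
    """
    if sys.version_info[0] == 2:
        # Python 2: the csv module writes byte strings, so use a plain
        # binary file (no codecs wrapper) and encode each value ourselves.
        f = open(path, 'wb')
        rows = [dict((k, v.encode('utf-8')) for k, v in row.items())
                for row in rows]
    else:
        # Python 3: the csv module writes text, so the file object does
        # the encoding; newline='' follows the csv module documentation.
        f = io.open(path, 'w', encoding='utf-8', newline='')
    try:
        writer = csv.DictWriter(f, fieldnames, delimiter='\t')
        for row in rows:
            writer.writerow(row)
    finally:
        f.close()


write_utf8_rows('test.csv', ['field1'], [{'field1': u'\xe5'}])
```

Either way, the text is encoded exactly once: by hand before it reaches csv on Python 2, or by the file object after csv on Python 3.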
