Python: saving a string to a file raises a Unicode error

Posted on 2024-12-21 19:30:38

I am extracting data from a Google spreadsheet using the Spreadsheet API in Python. I can print every row of the spreadsheet on the command line with a for loop, but some of the text contains symbols, e.g. the degree Celsius symbol (a little circle). As I print these rows on the command line, I also want to write them to a file, but I get various Unicode errors when I do this. I tried fixing them by hand, one replacement at a time, but there are too many:

current = current.replace(u'\xa0', u'')
current = current.replace(u'\u000a', u'p')
current = current.replace(u'\u201c', u'\"')
current = current.replace(u'\u201d', u'\"')
current = current.replace(u'\u2014', u'-')

What can I do so that I won't get errors? For example:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 1394: ordinal not in range(128)

current = current.replace(u'\u0446', u'u')

半透明的墙 2024-12-28 19:30:38

You want to decode it from whatever encoding it's in:

decoded_str = encoded_str.decode('utf-8')

For more information on how to deal with Unicode strings, read through http://docs.python.org/howto/unicode.html
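The error quoted in the question is a UnicodeEncodeError, which is raised when text is converted to bytes with a codec that cannot represent some character (here the default 'ascii' codec). One way to avoid it, sketched below, is to open the output file with an explicit encoding such as UTF-8, so that every character can be written unchanged. The sample rows, the temporary path and the choice of UTF-8 are assumptions for illustration, not details from the question. `io.open` is used because it accepts an `encoding` argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical rows standing in for the spreadsheet data (not from the question).
rows = [
    u"Boiling point: 100\u00b0C",                     # degree sign
    u"\u201cquoted\u201d\u00a0text \u2014 with dash",  # curly quotes, NBSP, long dash
]

# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "rows.txt")

# Text is encoded as UTF-8 on write, so no character needs to be replaced.
with io.open(path, "w", encoding="utf-8") as handle:
    for row in rows:
        handle.write(row + u"\n")

# Reading back with the same encoding recovers the original text.
with io.open(path, "r", encoding="utf-8") as handle:
    read_back = [line.rstrip(u"\n") for line in handle]
```

With this approach the manual chain of `replace` calls should no longer be necessary, provided whatever later reads the file also expects UTF-8.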

梦忆晨望 2024-12-28 19:30:38

import unicodedata
ascii_bytes = unicodedata.normalize('NFKD', current).encode('ascii', 'ignore')

I'm not sure the normalization is needed in this case. Also, the 'ignore' option means that you might lose some information, because characters that cannot be encoded are silently dropped.
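To make the trade-off concrete, here is a small sketch of what NFKD normalization followed by an ASCII encode with 'ignore' keeps and what it drops. The sample string is an assumption chosen for illustration.

```python
import unicodedata

# Hypothetical sample: an accented letter and a degree sign.
sample = u"caf\u00e9 at 25\u00b0C"

# NFKD splits u'\u00e9' into 'e' plus a combining accent (U+0301);
# the degree sign (U+00B0) has no decomposition and stays as it is.
decomposed = unicodedata.normalize("NFKD", sample)

# 'ignore' silently drops everything outside ASCII: the combining
# accent and the degree sign disappear, the base letter 'e' survives.
ascii_bytes = decomposed.encode("ascii", "ignore")
```

Here the accented letter degrades to a plain `e`, but the degree sign is lost entirely, which is the kind of information loss the answer warns about.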

久伴你 2024-12-28 19:30:38

''.join(c for c in current if ord(c) < 128)
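As a quick illustration of this filter, the sketch below applies it to a sample string (the string is an assumption, not taken from the question). Note that every non-ASCII character is removed outright rather than replaced with a close equivalent.

```python
# Hypothetical sample: an accented letter and a degree sign.
current = u"caf\u00e9 at 25\u00b0C"

# Keep only characters whose code point is in the ASCII range (0-127).
ascii_only = u"".join(c for c in current if ord(c) < 128)
```

The result can be written with the default 'ascii' codec without errors, at the cost of losing the accented letter and the degree sign.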

About the author

凉城
