How do I force PyYAML to load strings as unicode objects?
The PyYAML package loads unmarked strings as either unicode or str objects, depending on their content.
I would like to use unicode objects throughout my program (and, unfortunately, can't switch to Python 3 just yet).
Is there an easy way to force PyYAML to always load strings as unicode objects? I do not want to clutter my YAML with !!python/unicode tags.
# Encoding: UTF-8
import yaml
menu= u"""---
- spam
- eggs
- bacon
- crème brûlée
- spam
"""
print yaml.load(menu)
Output: ['spam', 'eggs', 'bacon', u'cr\xe8me br\xfbl\xe9e', 'spam']
I would like: [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam']
Comments (2)
One approach is to override PyYAML's handling of strings so that it always outputs unicode. In practice this probably gives the same result as the other answer I posted, only shorter (i.e. if you use custom handlers, you still need to make sure that strings in custom classes are converted to unicode, or pass unicode strings yourself).
With that override, the example from the question gives [u'spam', u'eggs', u'bacon', u'cr\xe8me br\xfbl\xe9e', u'spam'].
I haven't tested it with LibYAML (the C-based parser), as I couldn't compile it, so I'll leave the other answer as it was.
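The override described above might look like the following. This is a minimal sketch, not necessarily the exact code the answer used: the `UnicodeLoader` subclass and the `construct_yaml_str` function name are illustrative assumptions, while `add_constructor`, `construct_scalar`, and the `tag:yaml.org,2002:str` tag are part of PyYAML and YAML itself. Registering the constructor on a subclass avoids changing the behavior of `SafeLoader` for other code in the same process.

```python
import yaml


class UnicodeLoader(yaml.SafeLoader):
    """Hypothetical loader that returns every plain YAML string as text."""


def construct_yaml_str(loader, node):
    # construct_scalar returns the scalar's text value unchanged. On Python 2
    # the stock constructor would convert ASCII-only values to str; this
    # override skips that step, so the result is always unicode.
    return loader.construct_scalar(node)


# Replace the handler for the standard YAML string tag on this loader only.
UnicodeLoader.add_constructor(u'tag:yaml.org,2002:str', construct_yaml_str)

menu = u"""---
- spam
- eggs
- bacon
- cr\xe8me br\xfbl\xe9e
- spam
"""

items = yaml.load(menu, Loader=UnicodeLoader)
```

On Python 3 the override changes nothing in practice, since PyYAML already returns `str` (text) for every string there.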
You could also use a function that replaces str with unicode types in the decoded output of PyYAML.
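A function of that kind could be sketched as below. The name `make_str_unicode`, the `encoding` parameter, and the choice to handle only lists, tuples, and dicts are assumptions made for illustration; objects built by custom handlers would pass through unchanged, which matches the caveat about custom classes.

```python
import sys

PY2 = sys.version_info[0] == 2


def make_str_unicode(obj, encoding="utf-8"):
    """Recursively convert Python 2 str values in decoded YAML data to unicode."""
    if PY2 and isinstance(obj, str):
        # Python 2 byte string: decode it to unicode.
        return obj.decode(encoding)
    if isinstance(obj, list):
        return [make_str_unicode(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(make_str_unicode(item, encoding) for item in obj)
    if isinstance(obj, dict):
        return dict(
            (make_str_unicode(key, encoding), make_str_unicode(value, encoding))
            for key, value in obj.items()
        )
    # Numbers, None, existing unicode values and custom objects are left alone.
    return obj


converted = make_str_unicode(['spam', 'eggs', {'dish': ('bacon', 1)}])
```

It would be applied to the result of the load call, for example `make_str_unicode(yaml.load(menu))` on Python 2. On Python 3 it returns an equal structure, because strings there are already text.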