Is there a yaml library in Python that supports dumping long strings as block literals or folded blocks?
I'd like to be able to dump a dictionary containing long strings that I'd like to have in the block style for readability. For example:
foo: |
  this is a
  block literal
bar: >
  this is a
  folded block
PyYAML supports the loading of documents with this style but I can't seem to find a way to dump documents this way. Am I missing something?
The result:
For completeness, one should also have str implementations, but I'm going to be lazy :-)
pyyaml does support dumping literal or folded blocks.
Using Representer.add_representer
Defining types:
Then you can define the representers for those types.
Please note that while Gary's solution works great for unicode, you may need some more work to get strings to work right (see implementation of represent_str).
Then you can add those representers to the default dumper:
... and test it:
result:
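A minimal sketch of the steps above, assuming Python 3 (where every str is already unicode, so a single pair of types is enough) and PyYAML. The names literal_str, folded_str and change_style are illustrative and not part of PyYAML; wrapping SafeRepresenter.represent_str rather than calling represent_scalar directly follows the note about represent_str above.

```python
import yaml
from yaml.representer import SafeRepresenter


class literal_str(str):
    """Marker type (hypothetical name): dump as a literal block scalar (|)."""


class folded_str(str):
    """Marker type (hypothetical name): dump as a folded block scalar (>)."""


def change_style(style, representer):
    """Wrap an existing representer and override the style of the node it returns."""
    def new_representer(dumper, data):
        scalar = representer(dumper, data)
        scalar.style = style
        return scalar
    return new_representer


# Reuse represent_str so its handling of strings is kept; only the style changes.
represent_literal_str = change_style('|', SafeRepresenter.represent_str)
represent_folded_str = change_style('>', SafeRepresenter.represent_str)

# Register the representers on the default dumper used by yaml.dump().
yaml.add_representer(literal_str, represent_literal_str)
yaml.add_representer(folded_str, represent_folded_str)

data = {
    'foo': literal_str('this is a\nblock literal\n'),
    'bar': folded_str('this is a folded block\n'),
}
dumped = yaml.dump(data)
print(dumped)
```

Plain str values in the same dictionary are unaffected, because the representers are registered only for the two marker types.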
Using default_style
If you are interested in having all your strings follow a default style, you can also use the default_style keyword argument, e.g.:
or for folded literals:
or for double-quoted literals:
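For instance, a small sketch of the three default_style values, assuming PyYAML on Python 3. The sample dictionary is made up for illustration.

```python
import yaml

data = {'foo': 'line1\nline2\nline3'}

# Literal block style for every scalar that allows it.
literal = yaml.dump(data, default_style='|')
# Folded block style.
folded = yaml.dump(data, default_style='>')
# Double-quoted style.
quoted = yaml.dump(data, default_style='"')

for text in (literal, folded, quoted):
    print(text)
```

Because default_style applies to every scalar, the mapping keys are affected as well: with the block styles the key comes out double-quoted ("foo") while the value is emitted as a block.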
Caveats:
Here is an example of something you may not expect:
results in:
1) non-printable characters
See the YAML spec for escaped characters (Section 5.7):
If you want to preserve non-printable characters (e.g. TAB), you need to use double-quoted scalars. If you are able to dump a scalar with literal style, and there is a non-printable character (e.g. TAB) in there, your YAML dumper is non-compliant.
E.g. pyyaml detects the non-printable character \t and uses the double-quoted style even though a default style is specified:
2) leading and trailing white spaces
Another bit of useful information in the spec is:
This means that if your string does have leading or trailing white space, these would not be preserved in scalar styles other than double-quoted. As a consequence, pyyaml tries to detect what is in your scalar and may force the double-quoted style.
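Both caveats can be checked with a short sketch, assuming PyYAML on Python 3. The sample strings are invented for illustration. A literal default style is requested for everything, yet the values with a TAB or with trailing spaces fall back to double quotes.

```python
import yaml

samples = {
    'plain': 'line1\nline2\n',
    'tab': 'this has a \t tab in it',
    'trailing': 'with trailing white spaces  ',
}

# Ask for literal block style everywhere; PyYAML overrides it where the
# content could not be preserved in a block scalar.
dumped = yaml.dump(samples, default_style='|')
print(dumped)

# Loading the dump back returns the original strings unchanged.
assert yaml.safe_load(dumped) == samples
```

Only the 'plain' value is written as a literal block; the other two are emitted double-quoted, with the TAB written as the escape \t.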
This can be done relatively easily; the only "hurdle" is how to indicate which of the spaces in the string that needs to be represented as a folded scalar need to become a fold. The literal scalar has explicit newlines containing that information, but this cannot be used for folded scalars, as they can contain explicit newlines, e.g. in case there is leading whitespace, and the string also needs a newline at the end in order not to be represented with a stripping chomping indicator (>-).
which gives:
The fold_pos attribute expects a reversible iterable, representing positions of spaces indicating where to fold.
If you never have pipe characters ('|') in your strings you
could have done something like:
which also gives exactly the output you expect
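The pipe-marker idea can be sketched in plain Python, independent of any YAML library. The helper name split_fold_markers is hypothetical. It turns each '|' into a space and records its index, giving a list (a reversible iterable) of the kind fold_pos is described as expecting. Assigning that list to fold_pos on the folded scalar object is left out here, since it depends on the library in use.

```python
def split_fold_markers(marked):
    """Return (text, positions) for a string that uses '|' to mark fold points.

    Each '|' is replaced by a single space, and the index of that space is
    recorded so it can be handed over as the list of fold positions.
    """
    positions = [i for i, ch in enumerate(marked) if ch == '|']
    return marked.replace('|', ' '), positions


text, fold_positions = split_fold_markers('this is a|folded block\n')
print(repr(text))
print(fold_positions)
```

Since every marker is replaced by exactly one character, the recorded indices stay valid in the returned text.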