Parsing YAML and assuming a certain path is always a string

Posted on 2024-09-26 01:36:48 · Word count: 1287 · Views: 2 · Comments: 0

I am using the YAML parser from http://pyyaml.org and I want it to always interpret certain fields as strings, but I can't figure out how add_path_resolver() works.

For example: The parser assumes that "version" is a float:

network:
- name: apple
- name: orange
version: 2.3
site: banana

Some files have "version: 2" (which is interpreted as an int) or "version: 2.3 alpha" (which is interpreted as a str).

I want them to always be interpreted as a str.

It seems that yaml.add_path_resolver() should let me specify "when you see version:, always interpret it as a str", but it is not documented very well. My best guess is:

yaml.add_path_resolver(u'!root', ['version'], kind=str)

But that doesn't work.

Suggestions on how to get my version field to always be a string?

P.S. Here are some examples of different "version" strings and how they are interpreted:

(Pdb) import yaml
(Pdb) import pprint
(Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2\nsite: banana"))
{'network': [{'name': 'apple'}, {'name': 'orange'}],
 'site': 'banana',
 'version': 2}
(Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2.3\nsite: banana"))
{'network': [{'name': 'apple'}, {'name': 'orange'}],
 'site': 'banana',
 'version': 2.2999999999999998}
(Pdb) pprint.pprint(yaml.load("---\nnetwork:\n- name: apple\n- name: orange\nversion: 2.3 alpha\nsite: banana"))
{'network': [{'name': 'apple'}, {'name': 'orange'}],
 'site': 'banana',
 'version': '2.3 alpha'}

Comments (2)

另类 2024-10-03 01:36:48

From the current source:

 # Note: `add_path_resolver` is experimental.  The API could be changed.

It appears that it's not complete (yet?). The syntax that would work (as far as I can tell) is:

yaml.add_path_resolver(u'tag:yaml.org,2002:str', ['version'], yaml.ScalarNode)

However, it doesn't work.

It appears that the implicit type resolvers are checked first, and if one matches, then it never checks the user-defined resolvers. See resolver.py for more details (look for the function resolve).
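To see which tag the implicit resolvers pick for a plain scalar, you can call the loader's resolve() method directly. This is only a small sketch: the helper name implicit_tag is made up for illustration, and it assumes no path resolvers have been registered in the current process.

```python
import yaml

# An empty stream is enough: the loader is only used for its resolve() method.
loader = yaml.SafeLoader("")


def implicit_tag(value):
    """Return the tag PyYAML assigns to a plain (unquoted) scalar."""
    # (True, False) marks the scalar as plain, which is the case where
    # the implicit (regex-based) resolvers are consulted.
    return loader.resolve(yaml.ScalarNode, value, (True, False))


for value in ("2", "2.3", "2.3 alpha"):
    print(repr(value), "->", implicit_tag(value))
```

The first two values match the int and float patterns, so they are tagged before anything else gets a chance; only "2.3 alpha" falls through to the default string tag.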

I suggest changing your version entry to

version: !!str 2.3

This will always coerce it to a string.
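A quick check of the explicit tag, using the document from the question with only the version line changed (safe_load is used here instead of load; that choice is mine, not part of the suggestion above):

```python
import yaml

doc = """\
network:
- name: apple
- name: orange
version: !!str 2.3
site: banana
"""

data = yaml.safe_load(doc)
print(repr(data["version"]))  # '2.3'
```

The drawback is that every file has to carry the tag, so this only helps if you control the YAML files themselves.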

八巷 2024-10-03 01:36:48

By far the easiest solution for this is not to use the basic .load() (which is unsafe anyway), but to use it with Loader=BaseLoader, which loads every scalar as a string:

import yaml

yaml_str = """\
network:
- name: apple
- name: orange
version: 2.3
old: 2
site: banana
"""

data = yaml.load(yaml_str, Loader=yaml.BaseLoader)
print(data)

gives:

{'network': [{'name': 'apple'}, {'name': 'orange'}], 'version': '2.3', 'old': '2', 'site': 'banana'}
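Note that BaseLoader turns every scalar into a string, including "old" above. If only version should stay a string while other values keep their types, one possible approach is to compose the node tree first, retag the value node, and then construct the data. This is a rough sketch, not something from the PyYAML docs: the helper name load_forcing_str and its keys parameter are made up for illustration, and it only looks at keys of the top-level mapping.

```python
import yaml

STR_TAG = "tag:yaml.org,2002:str"


def load_forcing_str(text, keys=("version",)):
    """Load YAML with SafeLoader, forcing selected top-level keys to str.

    Hypothetical helper: only scalar values directly under the root
    mapping are retagged; nested keys are left alone.
    """
    loader = yaml.SafeLoader(text)
    try:
        # Compose the node tree without constructing Python objects yet.
        node = loader.get_single_node()
        if node is None:
            return None
        if isinstance(node, yaml.MappingNode):
            for key_node, value_node in node.value:
                if (isinstance(key_node, yaml.ScalarNode)
                        and key_node.value in keys
                        and isinstance(value_node, yaml.ScalarNode)):
                    # Override whatever tag the implicit resolvers chose.
                    value_node.tag = STR_TAG
        return loader.construct_document(node)
    finally:
        loader.dispose()


yaml_str = """\
network:
- name: apple
- name: orange
version: 2.3
old: 2
site: banana
"""

data = load_forcing_str(yaml_str)
print(data)
```

Here version comes back as the string '2.3' while old is still the int 2. Because the retagging happens on the composed nodes, the original text of the scalar is kept as written in the file.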