Validating a YAML document in Python
One of the benefits of XML is being able to validate a document against an XSD. YAML doesn't have this feature, so how can I validate that the YAML document I open is in the format expected by my application?
Given that JSON and YAML are pretty similar beasts, you could make use of JSON-Schema to validate a sizable subset of YAML. Here's a code snippet (you'll need PyYAML and jsonschema installed):
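A minimal sketch of this approach, assuming PyYAML and jsonschema are installed. The schema, the field names, and the helper `is_valid` are hypothetical and only serve as illustration; the schema is written in YAML here and loaded into a dict before being handed to jsonschema.

```python
import yaml
from jsonschema import validate, ValidationError

# Hypothetical schema, written in YAML and loaded into a plain dict.
SCHEMA = yaml.safe_load("""
type: object
properties:
  name: {type: string}
  port: {type: integer}
required: [name, port]
""")


def is_valid(yaml_text):
    """Return True if the YAML text parses and matches SCHEMA."""
    instance = yaml.safe_load(yaml_text)
    try:
        validate(instance=instance, schema=SCHEMA)
    except ValidationError as err:
        # err.message holds a human-readable description of the violation.
        print(f"invalid: {err.message}")
        return False
    return True


print(is_valid("name: web\nport: 8080\n"))
print(is_valid("name: web\nport: eighty\n"))
```

Since `yaml.safe_load` returns ordinary dicts, lists, and scalars, jsonschema does not need to know the data came from YAML.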
One problem with this is that if your schema spans multiple files and you use "$ref" to reference the other files, then I think those other files will need to be JSON. But there are probably ways around that. In my own project, I'm playing with specifying the schema using JSON files whilst the instances are YAML.
I find Cerberus to be very reliable, with great documentation, and straightforward to use.
Here is a basic implementation example:
my_yaml.yaml:
Defining the validation schema in schema.py:
Using PyYAML to load a yaml document:
Keep in mind that Cerberus is a format-agnostic data validation tool, which means that it can support formats other than YAML, such as JSON, XML and so on.
You can load the YAML document as a dict and use the schema library to check it:
Pydantic has not been mentioned.
From their example:
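The following is not the example from the Pydantic documentation but a minimal sketch of how it could be combined with PyYAML, assuming both packages are installed. The model `ServerConfig`, its fields, and the helper `parse_config` are hypothetical.

```python
import yaml
from pydantic import BaseModel, ValidationError


class ServerConfig(BaseModel):
    """Hypothetical model describing the expected YAML layout."""
    name: str
    port: int
    debug: bool = False


def parse_config(yaml_text):
    """Parse YAML text into a ServerConfig, raising ValidationError if it does not fit."""
    data = yaml.safe_load(yaml_text)
    return ServerConfig(**data)


print(parse_config("name: web\nport: 8080\n"))

try:
    parse_config("name: web\nport: not-a-number\n")
except ValidationError as err:
    print(f"invalid: {err}")
```

A side benefit of this approach is that the rest of the application works with typed attributes (`cfg.port`) rather than raw dict lookups.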
Try Rx, it has a Python implementation. It works on JSON and YAML.
From the Rx site:
Yes - having support for validation is vital for lots of important use cases. See e.g. YAML and the importance of Schema Validation « Stuart Gunter
As already mentioned, there is Rx, available for various languages, and Kwalify for Ruby and Java.
See also the PyYAML discussion: YAMLSchemaDiscussion.
A related effort is JSON Schema, which even had some IETF standardization activity: draft-zyp-json-schema-03 - A JSON Media Type for Describing the Structure and Meaning of JSON Documents
I worked on a similar project where I needed to validate the elements of a YAML file.
At first, I thought PyYAML tags were the best and simplest way, but I later decided to go with PyKwalify, which actually defines a schema for YAML.
PyYAML tags:
YAML has tag support, which lets us enforce basic type checks by prefixing a value with its data type, e.g. for an integer: !!int "123"
More on PyYAML: http://pyyaml.org/wiki/PyYAMLDocumentation#Tags
This is good, but if you are going to expose this to the end user, it might cause confusion.
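A small sketch of what the tag approach does in practice, assuming PyYAML is installed. The key `count` and the helper `load_or_none` are hypothetical. The explicit `!!int` tag makes the loader convert the quoted scalar to an integer, and a value that cannot be converted is rejected at load time.

```python
import yaml

# Without a tag, a quoted scalar stays a string.
untagged = yaml.safe_load('count: "123"')

# With !!int, the loader converts the scalar to an integer.
tagged = yaml.safe_load('count: !!int "123"')

print(untagged, tagged)


def load_or_none(text):
    """Return the parsed document, or None if loading fails."""
    try:
        return yaml.safe_load(text)
    except (ValueError, yaml.YAMLError) as err:
        # A scalar that cannot be converted to the tagged type ends up here.
        print(f"rejected: {err}")
        return None


print(load_or_none('count: !!int "abc"'))
```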
I did some research on how to define a schema for YAML.
PyKwalify:
There is a package called PyKwalify which serves this purpose: https://pypi.python.org/pypi/pykwalify
This package best fits my requirements.
I tried it with a small example in my local setup, and it works. Here's the sample schema file.
Valid YAML file for this schema
Thanks
These look good. The YAML parser can handle the syntax errors, and one of these libraries can validate the data structures.
(I've tried this one, it is decent, if a bit sparse.)
You can use Python's yaml library to display the message, character, line, and file of an error in the document you load.
The error message can be accessed via exc.problem.
Access exc.problem_mark to get a <yaml.error.Mark> object.
This object allows you to access attributes such as name, line, and column.
Hence you can create your own pointer to the issue:
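A minimal sketch of such a pointer, assuming PyYAML is installed. The helper `describe_yaml_error` is hypothetical. Note that `line` and `column` on the mark are zero-based, so the sketch adds 1 for display.

```python
import yaml


def describe_yaml_error(text):
    """Return a short description of the first syntax error, or None if the text parses."""
    try:
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # Only marked errors carry position information.
        mark = getattr(exc, "problem_mark", None)
        if mark is None:
            return str(exc)
        return (
            f"{exc.problem} in {mark.name}, "
            f"line {mark.line + 1}, column {mark.column + 1}"
        )
    return None


print(describe_yaml_error("a: [1, 2"))
print(describe_yaml_error("a: [1, 2]"))
```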
I wrapped some existing JSON-related Python libraries so that they can be used with yaml as well.
The resulting Python library mainly wraps ...
jsonschema - a validator for json files against json-schema files, wrapped to support validating yaml files against json-schema files in yaml format as well.
jsonpath-ng - an implementation of JSONPath for Python, wrapped to support JSONPath selection directly on yaml files.
... and is available on GitHub:
https://github.com/yaccob/ytools
It can be installed using pip:
pip install ytools
Validation example (from https://github.com/yaccob/ytools#validation):
What you don't get out of the box yet is validating against external schemas that are in yaml format as well.
ytools is not providing anything that hasn't existed before - it just makes the application of some existing solutions more flexible and more convenient.
I'm not aware of a Python solution, but there is a Ruby schema validator for YAML called kwalify. If you don't come across a Python library, you should be able to call it using subprocess.