How can I parse a YAML file from a Linux shell script?

Posted on 2024-10-17 18:41:58 · 97 words · 12 views · 0 comments

I want to provide a structured configuration file that is as easy as possible for a non-technical user to edit (unfortunately it has to be a file), so I wanted to use YAML. However, I can't find any way of parsing it from a Unix shell script.

Comments (25)

寄居者 2024-10-24 18:41:59

I used to convert YAML to JSON with Python and do my processing in jq.

python -c "import yaml; import json; from pathlib import Path; print(json.dumps(yaml.safe_load(Path('file.yml').read_text())))" | jq '.'

纵情客 2024-10-24 18:41:59

Another option is to convert the YAML to JSON, then use jq to work with the JSON representation, either to extract information from it or to edit it.

I wrote a simple bash script that contains this glue; see the Y2J project on GitHub.

鸢与 2024-10-24 18:41:59
perl -ne 'chomp; printf qq/%s="%s"\n/, split(/\s*:\s*/,$_,2)' file.yml > file.sh
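
The one-liner above splits each line on the first colon and writes shell assignments to file.sh. For flat key: value files, the same idea can be sketched in pure Bash without generating a second file. This is a minimal sketch rather than a YAML parser: the function name load_flat_yaml, the optional prefix argument and the sample keys are assumptions made for illustration, nested (indented) keys are skipped, and quoted values keep their quotes.

```shell
#!/bin/bash
# load_flat_yaml FILE [PREFIX]: set one shell variable per top-level "key: value" line.
load_flat_yaml() {
    local _line _re='^([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*:[[:space:]]*(.*)$'
    while IFS= read -r _line || [ -n "$_line" ]; do
        # Comments, blank lines, "---" and indented (nested) lines do not match and are skipped.
        [[ $_line =~ $_re ]] || continue
        # printf -v assigns to the named variable; the value is never executed.
        printf -v "${2}${BASH_REMATCH[1]}" '%s' "${BASH_REMATCH[2]}"
    done < "$1"
}

# Usage with a throwaway sample file (contents are made up for this example).
sample=$(mktemp)
printf '%s\n' '# sample config' '---' 'host: 127.0.0.1' 'user : dev' 'greeting: hello world' > "$sample"
load_flat_yaml "$sample" cfg_
rm -f "$sample"

echo "$cfg_host $cfg_user $cfg_greeting"
```

Using a prefix such as cfg_ keeps a key like PATH in the file from overwriting an existing shell variable.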
花开柳相依 2024-10-24 18:41:59

If you need a single value, you could use a tool that converts your YAML document to JSON and feeds it to jq, for example yq.

Content of sample.yaml:

---
bob:
  item1:
    cats: bananas
  item2:
    cats: apples
  thing:
    cats: oranges

Example:

$ yq -r '.bob["thing"]["cats"]' sample.yaml 
oranges
七月上 2024-10-24 18:41:59

I know this is very specific, but I think my answer could be helpful for certain users.
If you have node and npm installed on your machine, you can use js-yaml.
First, install it:

npm i -g js-yaml
# or locally
npm i js-yaml

Then, in your bash script:

#!/bin/bash
js-yaml your-yaml-file.yml

If you are also using jq, you can do something like this:

#!/bin/bash
json="$(js-yaml your-yaml-file.yml)"
aproperty="$(jq '.aproperty' <<< "$json")"
echo "$aproperty"

This works because js-yaml converts the YAML file to a JSON string. You can then use that string with any JSON parser on your Unix system.
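
One detail worth noting when capturing values this way: without options, jq prints string values JSON-encoded, including the surrounding double quotes, while jq -r (raw output) prints the bare string, which is usually what a shell variable should hold. A small sketch, assuming jq is installed; json_get is a hypothetical helper name and the JSON literal stands in for the output of js-yaml:

```shell
#!/bin/bash
# json_get JSON FILTER: print the raw (unquoted) value selected by a jq filter.
json_get() {
    jq -r "$2" <<< "$1"
}

# Hypothetical JSON, standing in for the output of: js-yaml your-yaml-file.yml
json='{"aproperty": "hello world"}'

# Only run the lookup when jq is actually available on this machine.
if command -v jq >/dev/null 2>&1; then
    aproperty="$(json_get "$json" '.aproperty')"
    echo "$aproperty"
fi
```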

紧拥背影 2024-10-24 18:41:59

A quick way to do this now (the earlier approaches didn't work for me):

sudo wget https://github.com/mikefarah/yq/releases/download/v4.4.1/yq_linux_amd64 -O /usr/bin/yq &&\
sudo chmod +x /usr/bin/yq

Example asd.yaml:

a_list:
  - key1: value1
    key2: value2
    key3: value3

Parsing the root:

user@vm:~$ yq e '.' asd.yaml
a_list:
  - key1: value1
    key2: value2
    key3: value3

Parsing key3:

user@vm:~$ yq e '.a_list[0].key3' asd.yaml
value3
爱的故事 2024-10-24 18:41:59

If you have Python 2 and PyYAML, you can use this parser I wrote called parse_yaml.py. Among the neater things it does: it lets you choose a prefix (in case you have more than one file with similar variables), and it lets you pick a single value from a YAML file.

For example if you have these yaml files:

staging.yaml:

db:
    type: sqllite
    host: 127.0.0.1
    user: dev
    password: password123

prod.yaml:

db:
    type: postgres
    host: 10.0.50.100
    user: postgres
    password: password123

You can load both without conflict.

$ eval $(python parse_yaml.py prod.yaml --prefix prod --cap)
$ eval $(python parse_yaml.py staging.yaml --prefix stg --cap)
$ echo $PROD_DB_HOST
10.0.50.100
$ echo $STG_DB_HOST
127.0.0.1

And even cherry pick the values you want.

$ prod_user=$(python parse_yaml.py prod.yaml --get db_user)
$ prod_port=$(python parse_yaml.py prod.yaml --get db_port --default 5432)
$ echo $prod_user
postgres
$ echo $prod_port
5432
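
Judging only from the session above, the variable names appear to follow a simple rule: the prefix and the key path are joined with underscores and, when --cap is given, upper-cased. The Bash sketch below illustrates that naming rule. It is inferred from the example output and is not parse_yaml.py's code; var_name is a hypothetical helper.

```shell
#!/bin/bash
# var_name PART...: join the parts with "_" and upper-case the result.
var_name() {
    local IFS=_          # "$*" joins arguments with the first character of IFS
    local joined="$*"
    printf '%s\n' "${joined^^}"   # ^^ upper-cases the whole string (Bash 4+)
}

var_name prod db host
var_name stg db host
```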
奢华的一滴泪 2024-10-24 18:41:59

You could use an equivalent of yq that is written in Go:

./go-yg -yamlFile /home/user/dev/ansible-firefox/defaults/main.yml -key firefox_version

returns:

62.0.3
书间行客 2024-10-24 18:41:59

Whenever you need a solution for "How to work with YAML/JSON/compatible data from a shell script" which works on just about every OS with Python (*nix, OSX, Windows), consider yamlpath, which provides several command-line tools for reading, writing, searching, and merging YAML, EYAML, JSON, and compatible files. Since just about every OS either comes with Python pre-installed or it is trivial to install, this makes yamlpath highly portable. Even more interesting: this project defines an intuitive path language with very powerful, command-line-friendly syntax that enables accessing one or more nodes.

For your specific question, after installing yamlpath with Python's native package manager or your OS's package manager (yamlpath is available via RPM on some OSes):

#!/bin/bash
# Read values directly from YAML (or EYAML, JSON, etc) for use in this shell script:
myShellVar=$(yaml-get --query=any.path.no[matter%how].complex source-file.yaml)

# Use the value any way you need:
echo "Retrieved ${myShellVar}"

# Perhaps change the value and write it back:
myShellVar="New Value"
yaml-set --change=/any/path/no[matter%how]/complex --value="$myShellVar" source-file.yaml

You didn't specify that the data was a simple Scalar value though, so let's up the ante. What if the result you want is an Array? Even more challenging, what if it's an Array-of-Hashes and you only want one property of each result? Suppose further that your data is actually spread out across multiple YAML files and you need all the results in a single query. That's a much more interesting question to demonstrate with. So, suppose you have these two YAML files:

File: data1.yaml

---
baubles:
  - name: Doohickey
    sku: 0-000-1
    price: 4.75
    weight: 2.7g
  - name: Doodad
    sku: 0-000-2
    price: 10.5
    weight: 5g
  - name: Oddball
    sku: 0-000-3
    price: 25.99
    weight: 25kg

File: data2.yaml

---
baubles:
  - name: Fob
    sku: 0-000-4
    price: 0.99
    weight: 18mg
  - name: Doohickey
    price: 10.5
  - name: Oddball
    sku: 0-000-3
    description: This ball is odd

How would you report only the sku of every item in inventory after applying the changes from data2.yaml to data1.yaml, all from a shell script? Try this:

#!/bin/bash
baubleSKUs=($(yaml-merge --aoh=deep data1.yaml data2.yaml | yaml-get --query=/baubles/sku -))

for sku in "${baubleSKUs[@]}"; do
    echo "Found bauble SKU:  ${sku}"
done

You get exactly what you need from only a few lines of code:

Found bauble SKU:  0-000-1
Found bauble SKU:  0-000-2
Found bauble SKU:  0-000-3
Found bauble SKU:  0-000-4

As you can see, yamlpath turns very complex problems into trivial solutions. Note that the entire query was handled as a stream; no YAML files were changed by the query and there were no temp files.
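
One caveat about the array assignment in the script above: baubleSKUs=($(...)) relies on shell word splitting, so a result containing spaces or glob characters would be split or expanded. A hedged alternative is mapfile -t, which stores one output line per array element. In the sketch below, produce_skus is a hypothetical stand-in for the yaml-merge | yaml-get pipeline so that the example runs without yamlpath installed:

```shell
#!/bin/bash
# Stand-in for: yaml-merge --aoh=deep data1.yaml data2.yaml | yaml-get --query=/baubles/sku -
produce_skus() {
    printf '%s\n' 0-000-1 0-000-2 0-000-3 0-000-4
}

# mapfile -t stores one line per element, with no word splitting or globbing.
mapfile -t baubleSKUs < <(produce_skus)

for sku in "${baubleSKUs[@]}"; do
    echo "Found bauble SKU:  ${sku}"
done
```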

I realize this is "yet another tool to solve the same question" but after reading the other answers here, yamlpath appears more portable and robust than most alternatives. It also fully understands YAML/JSON/compatible files and it does not need to convert YAML to JSON to perform requested operations. As such, comments within the original YAML file are preserved whenever you need to change data in the source YAML file. Like some alternatives, yamlpath is also portable across OSes. More importantly, yamlpath defines a query language that is extremely powerful, enabling very specialized/filtered data queries. It can even operate against results from disparate parts of the file in a single query.

If you want to get or set many values in the data at once -- including complex data like hashes/arrays/maps/lists -- yamlpath can do that. Want a value but don't know precisely where it is in the document? yamlpath can find it and give you the exact path(s). Need to merge multiple data files together, including from STDIN? yamlpath does that, too. Further, yamlpath fully comprehends YAML anchors and their aliases, always giving or changing exactly the data you expect whether it is a concrete or referenced value.

Disclaimer: I wrote and maintain yamlpath, which is based on ruamel.yaml, which is in turn based on PyYAML. As such, yamlpath is fully standards-compliant.

云归处 2024-10-24 18:41:59

Complex parsing is easiest with a library such as Python's PyYAML or YAML::Perl.

If you want to parse all the YAML values into bash values, try this script. This will handle comments as well. See example usage below:

# pparse.py

import yaml
import sys
            
def parse_yaml(yml, name=''):
    if isinstance(yml, list):
        for data in yml:
            parse_yaml(data, name)
    elif isinstance(yml, dict):
        if (len(yml) == 1) and not isinstance(yml[list(yml.keys())[0]], list):
            print(str(name+'_'+list(yml.keys())[0]+'='+str(yml[list(yml.keys())[0]]))[1:])
        else:
            for key in yml:
                parse_yaml(yml[key], name+'_'+key)

            
if __name__=="__main__":
    # Load the YAML file named on the command line and print NAME=value lines.
    with open(sys.argv[1]) as f:
        yml = yaml.safe_load(f)
    parse_yaml(yml)

test.yml

- folders:
  - temp_folder: datasets/outputs/tmp
  - keep_temp_folder: false

- MFA:
  - MFA: false
  - speaker_count: 1
  - G2P: 
    - G2P: true
    - G2P_model: models/MFA/G2P/english_g2p.zip
    - input_folder: datasets/outputs/Youtube/ljspeech/wavs
    - output_dictionary: datasets/outputs/Youtube/ljspeech/dictionary.dict
  - dictionary: datasets/outputs/Youtube/ljspeech/dictionary.dict
  - acoustic_model: models/MFA/acoustic/english.zip
  - temp_folder: datasets/outputs/tmp
  - jobs: 4
  - align:
    - config: configs/MFA/align.yaml
    - dataset: datasets/outputs/Youtube/ljspeech/wavs
    - output_folder: datasets/outputs/Youtube/ljspeech-aligned

- TTS:
  - output_folder: datasets/outputs/Youtube
  - preprocess:
    - preprocess: true
    - config: configs/TTS_preprocess.yaml # Default Config 
    - textgrid_folder: datasets/outputs/Youtube/ljspeech-aligned
    - output_duration_folder: datasets/outputs/Youtube/durations
    - sampling_rate: 44000 # Make sure sampling rate is same here as in preprocess config

Script where YAML values are needed:

yaml() {
    eval $(python pparse.py "$1")
}

yaml "test.yml"

# What python printed to bash:

folders_temp_folder=datasets/outputs/tmp
folders_keep_temp_folder=False
MFA_MFA=False
MFA_speaker_count=1
MFA_G2P_G2P=True
MFA_G2P_G2P_model=models/MFA/G2P/english_g2p.zip
MFA_G2P_input_folder=datasets/outputs/Youtube/ljspeech/wavs
MFA_G2P_output_dictionary=datasets/outputs/Youtube/ljspeech/dictionary.dict
MFA_dictionary=datasets/outputs/Youtube/ljspeech/dictionary.dict
MFA_acoustic_model=models/MFA/acoustic/english.zip
MFA_temp_folder=datasets/outputs/tmp
MFA_jobs=4
MFA_align_config=configs/MFA/align.yaml
MFA_align_dataset=datasets/outputs/Youtube/ljspeech/wavs
MFA_align_output_folder=datasets/outputs/Youtube/ljspeech-aligned
TTS_output_folder=datasets/outputs/Youtube
TTS_preprocess_preprocess=True
TTS_preprocess_config=configs/TTS_preprocess.yaml
TTS_preprocess_textgrid_folder=datasets/outputs/Youtube/ljspeech-aligned
TTS_preprocess_output_duration_folder=datasets/outputs/Youtube/durations
TTS_preprocess_sampling_rate=44000
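
Note that the YAML booleans arrive in Bash as the strings True and False (Python's spelling), not as anything Bash treats as a boolean. A small sketch of one way to test them; is_true is a hypothetical helper, and the two assignments simply repeat values from the listing above:

```shell
#!/bin/bash
# is_true VALUE: succeed for true/yes/1 in any letter case, fail otherwise.
is_true() {
    case "${1,,}" in          # ${1,,} lower-cases the argument (Bash 4+)
        true|yes|1) return 0 ;;
        *)          return 1 ;;
    esac
}

# Values as printed by pparse.py in the listing above.
MFA_MFA=False
MFA_G2P_G2P=True

if is_true "$MFA_G2P_G2P"; then echo "G2P enabled"; fi
if ! is_true "$MFA_MFA"; then echo "MFA disabled"; fi
```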

Access variables with bash:

echo "$TTS_preprocess_sampling_rate";
>>> 44000
逆光下的微笑 2024-10-24 18:41:59

If you know which tags you are interested in and the YAML structure you expect, it is not that hard to write a simple YAML parser in Bash.

In the following example, the parser reads a structured YAML file into environment variables, an array and an associative array.

Note: The complexity of this parser is tied to the structure of the YAML file. You will need a separate subroutine for each structured component of the YAML file. Highly structured YAML files might require a more sophisticated approach, e.g. a generic recursive descent parser.

The xmas.yaml file:

# Xmas YAML example
---
 # Values
 pear-tree: partridge
 turtle-doves: 2.718
 french-hens: 3

 # Array
 calling-birds:
   - huey
   - dewey
   - louie
   - fred

 # Structure
 xmas-fifth-day:
   calling-birds: four
   french-hens: 3
   golden-rings: 5
   partridges:
     count: 1
     location: "a pear tree"
   turtle-doves: two

The parser uses mapfile to read the file into memory as an array, then cycles through each tag and creates environment variables.

  • pear-tree:, turtle-doves: and french-hens: end up as simple environment variables.
  • calling-birds: becomes an array.
  • The xmas-fifth-day: structure is represented as an associative array; however, you could encode these as environment variables if you are not using Bash 4.0 or later.
  • Comments and white space are ignored.
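
The tokenising step described above (split each line into a tag and the rest of the line) can be sketched on its own. The script below does this with LINE=( ${CONFIG[...]} ), which works for this file but leaves the expansion unquoted, so a value such as * would be glob-expanded; the variant here uses read -r instead. It is a sketch under assumptions: the TAGS and VALUES arrays are illustrative names, and the input is a shortened fragment of xmas.yaml.

```shell
#!/bin/bash
# Sample input: a shortened fragment of xmas.yaml.
mapfile -t CONFIG <<'EOF'
# Xmas YAML example
---
 pear-tree: partridge
 calling-birds:
   - huey
   location: "a pear tree"
EOF

TAGS=()
VALUES=()
for line in "${CONFIG[@]}"; do
    # read splits on whitespace: the first word goes to tag, the rest of the line to value.
    read -r tag value <<< "$line"
    TAGS+=("$tag")
    VALUES+=("$value")
    printf '%s=%s\n' "$tag" "$value"
done
```
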
#!/bin/bash
# -------------------------------------------------------------------
# A simple parser for the xmas.yaml file
# -------------------------------------------------------------------
# 
# xmas.yaml tags
#  #                        - Ignored
#                           - Blank lines are ignored
#  ---                      - Initialiser for days-of-xmas 
#   pear-tree: partridge    - a string
#   turtle-doves: 2.718     - a string, no float type in Bash
#   french-hens: 3          - a number
#   calling-birds:          - an array of strings
#     - huey                - calling-birds[0]
#     - dewey
#     - louie
#     - fred
#   xmas-fifth-day:         - an associative array
#     calling-birds: four   - a string
#     french-hens: 3        - a number
#     golden-rings: 5       - a number
#     partridges:           - changes the key to partridges.xxx
#       count: 1            - a number
#       location: "a pear tree" - a string
#     turtle-doves: two     - a string
# 
# This requires the following routines
# ParseXMAS
#   parses #, ---, blank line
#   unexpected tag error
#   calls days-of-xmas
#
# days-of-xmas
#   parses pear-tree, turtle-doves, french-hens
#   calls calling-birds
#   calls xmas-fifth-day
# 
# calling-birds
#   elements of the array
#
# xmas-fifth-day
#   parses calling-birds, french-hens, golden-rings, turtle-doves
#   calls partridges
# 
# partridges
#   parses partridges.count, partridges.location
#

function ParseXMAS()
{

  # days-of-xmas
  #   parses pear-tree, turtle-doves, french-hens
  #   calls calling-birds
  #   calls xmas-fifth-day
  # 
  function days-of-xmas()
  {
    unset PearTree TurtleDoves FrenchHens

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "  days-of-xmas[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "pear-tree:" ]
      then
        declare -g PearTree=$VALUE
      elif [ "$TAG" = "turtle-doves:" ]
      then
        declare -g TurtleDoves=$VALUE
      elif [ "$TAG" = "french-hens:" ]
      then
        declare -g FrenchHens=$VALUE
      elif [ "$TAG" = "calling-birds:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        calling-birds
        continue
      elif [ "$TAG" = "xmas-fifth-day:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        xmas-fifth-day
        continue
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi

      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # calling-birds
  #   elements of the array
  function calling-birds()
  {
    unset CallingBirds

    declare -ag CallingBirds

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "    calling-birds[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "-" ]
      then
        CallingBirds[${#CallingBirds[*]}]=$VALUE
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi

      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # xmas-fifth-day
  #   parses calling-birds, french-hens, golden-rings, turtle-doves
  #   calls partridges
  # 
  function xmas-fifth-day()
  {
    unset XmasFifthDay

    declare -Ag XmasFifthDay

    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "    xmas-fifth-day[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "calling-birds:" ]
      then
        XmasFifthDay[CallingBirds]=$VALUE
      elif [ "$TAG" = "french-hens:" ]
      then
        XmasFifthDay[FrenchHens]=$VALUE
      elif [ "$TAG" = "golden-rings:" ]
      then
        XmasFifthDay[GOLDEN-RINGS]=$VALUE
      elif [ "$TAG" = "turtle-doves:" ]
      then
        XmasFifthDay[TurtleDoves]=$VALUE
      elif [ "$TAG" = "partridges:" ]
      then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        partridges
        continue
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi
 
      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  function partridges()
  {
    while [ $CURRENT_ROW -lt $ROWS ]
    do
      LINE=( ${CONFIG[${CURRENT_ROW}]} )
      TAG=${LINE[0]}
      unset LINE[0]

      VALUE="${LINE[*]}"

      echo "      partridges[${CURRENT_ROW}] ${TAG}=${VALUE}"

      if [ "$TAG" = "count:" ]
      then
        XmasFifthDay[PARTRIDGES.COUNT]=$VALUE
      elif [ "$TAG" = "location:" ]
      then
        XmasFifthDay[PARTRIDGES.LOCATION]=$VALUE
      elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
      then
        # Ignore comments and blank lines
        true
      else
        # time to bug out
        break
      fi
 
      let CURRENT_ROW=$(($CURRENT_ROW + 1))
    done
  }

  # ===================================================================
  # Load the configuration file

  mapfile CONFIG < xmas.yaml

  let ROWS=${#CONFIG[@]}
  let CURRENT_ROW=0

  # +
  # #
  #
  # ---
  # -
  while [ $CURRENT_ROW -lt $ROWS ]
  do
    LINE=( ${CONFIG[${CURRENT_ROW}]} )
    TAG=${LINE[0]}
    unset LINE[0]

    VALUE="${LINE[*]}"

    echo "[${CURRENT_ROW}] ${TAG}=${VALUE}"

    if [ "$TAG" = "---" ]
    then
        let CURRENT_ROW=$(($CURRENT_ROW + 1))
        days-of-xmas
        continue
    elif [ -z "$TAG" ] || [ "$TAG" = "#" ]
    then
        # Ignore comments and blank lines
        true
    else
        echo "Unexpected tag at line $(($CURRENT_ROW + 1)): <${TAG}>={${VALUE}}"
        break
    fi

    let CURRENT_ROW=$(($CURRENT_ROW + 1))
  done
}

echo =========================================
ParseXMAS

echo =========================================
declare -p PearTree
declare -p TurtleDoves
declare -p FrenchHens
declare -p CallingBirds
declare -p XmasFifthDay

This produces the following output:

=========================================
[0] #=Xmas YAML example
[1] ---=
  days-of-xmas[2] #=Values
  days-of-xmas[3] pear-tree:=partridge
  days-of-xmas[4] turtle-doves:=2.718
  days-of-xmas[5] french-hens:=3
  days-of-xmas[6] =
  days-of-xmas[7] #=Array
  days-of-xmas[8] calling-birds:=
    calling-birds[9] -=huey
    calling-birds[10] -=dewey
    calling-birds[11] -=louie
    calling-birds[12] -=fred
    calling-birds[13] =
    calling-birds[14] #=Structure
    calling-birds[15] xmas-fifth-day:=
  days-of-xmas[15] xmas-fifth-day:=
    xmas-fifth-day[16] calling-birds:=four
    xmas-fifth-day[17] french-hens:=3
    xmas-fifth-day[18] golden-rings:=5
    xmas-fifth-day[19] partridges:=
      partridges[20] count:=1
      partridges[21] location:="a pear tree"
      partridges[22] turtle-doves:=two
    xmas-fifth-day[22] turtle-doves:=two
=========================================
declare -- PearTree="partridge"
declare -- TurtleDoves="2.718"
declare -- FrenchHens="3"
declare -a CallingBirds=([0]="huey" [1]="dewey" [2]="louie" [3]="fred")
declare -A XmasFifthDay=([CallingBirds]="four" [PARTRIDGES.LOCATION]="\"a pear tree\"" [FrenchHens]="3" [GOLDEN-RINGS]="5" [PARTRIDGES.COUNT]="1" [TurtleDoves]="two" )
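
Once ParseXMAS has run, the results are ordinary bash variables and arrays. A minimal sketch of consuming them follows; the values are copied by hand from the declare -p output above so the snippet runs on its own, and it assumes Bash 4 or later for the associative array.

```bash
#!/usr/bin/env bash
# Values copied from the `declare -p` output above (normally set by ParseXMAS).
PearTree="partridge"
CallingBirds=([0]="huey" [1]="dewey" [2]="louie" [3]="fred")
declare -A XmasFifthDay=([CallingBirds]="four" [GOLDEN-RINGS]="5" [PARTRIDGES.COUNT]="1")

# Scalar value
echo "pear tree: ${PearTree}"

# Indexed array: iterate over the elements in order
for bird in "${CallingBirds[@]}"; do
  echo "calling bird: ${bird}"
done

# Associative array: look up individual keys
echo "golden rings: ${XmasFifthDay[GOLDEN-RINGS]}"
echo "partridges:   ${XmasFifthDay[PARTRIDGES.COUNT]}"
```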
野生奥特曼 2024-10-24 18:41:59

On RHEL, this command dumps the parsed data structure if STDIN is valid YAML, and dies with an error if it is not.

 yum install perl-YAML.noarch
 (...)
 cat someflux \
 | perl -MYAML -MData::Dumper \
        -e 'print Dumper YAML::Load(join q{}, <STDIN>)'
寄人书 2024-10-24 18:41:59

You can also consider using Grunt (The JavaScript Task Runner). It can be easily integrated with the shell, and it supports reading YAML (grunt.file.readYAML) and JSON (grunt.file.readJSON) files.

This can be achieved by creating a task in Gruntfile.js (or Gruntfile.coffee), e.g.:

module.exports = function (grunt) {

    grunt.registerTask('foo', ['load_yml']);

    grunt.registerTask('load_yml', function () {
        var data = grunt.file.readYAML('foo.yml');
        Object.keys(data).forEach(function (g) {
          // ... switch (g) { case 'my_key':
        });
    });

};

Then, from the shell, simply run grunt foo (check grunt --help for available tasks).

Furthermore, you can implement exec:foo tasks (grunt-exec) with input variables passed from your task (foo: { cmd: 'echo bar <%= foo %>' }) in order to print the output in whatever format you want, then pipe it into another command.


There is also a tool similar to Grunt, called gulp, with an additional plugin, gulp-yaml.

Install via: npm install --save-dev gulp-yaml

Sample usage:

var gulp = require('gulp');
var yaml = require('gulp-yaml');

gulp.src('./src/*.yml')
  .pipe(yaml())
  .pipe(gulp.dest('./dist/'))

gulp.src('./src/*.yml')
  .pipe(yaml({ space: 2 }))
  .pipe(gulp.dest('./dist/'))

gulp.src('./src/*.yml')
  .pipe(yaml({ safe: true }))
  .pipe(gulp.dest('./dist/'))

For more options to deal with the YAML format, check the YAML site for available projects, libraries and other resources which can help you parse that format.


Other tools:

  • Jshon

    parses, reads and creates JSON

弄潮 2024-10-24 18:41:59

Inspired by Torsten's answer:

export YML_PATH=./config.yml
yml() { # requires python modules yaml and munch installed
    python -c "import yaml; import munch; print(munch.munchify(yaml.safe_load(open('$YML_PATH'))).$1)"
}
PGPASSWORD=$(yml db.password) psql -h $(yml db.host) -p $(yml db.port) -d $(yml db.name) -U $(yml db.user) -f so_much_ugliness.sql

You can gently slap yourself if you use this (as I did) ;).

多情癖 2024-10-24 18:41:59

I know my answer is specific, but if one already has PHP and Symfony installed, it can be very handy to use Symfony's YAML parser.

For instance:

php -r "require '$SYMFONY_ROOT_PATH/vendor/autoload.php'; \
    var_dump(\Symfony\Component\Yaml\Yaml::parse(file_get_contents('$YAML_FILE_PATH')));"

Here I simply used var_dump to output the parsed array but of course you can do much more... :)

秉烛思 2024-10-24 18:41:58

Here is a bash-only parser that leverages sed and awk to parse simple yaml files:

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p"  $1 |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}

It understands files such as:

## global definitions
global:
  debug: yes
  verbose: no
  debugging:
    detailed: no
    header: "debugging started"

## output
output:
   file: "yes"

Which, when parsed using:

parse_yaml sample.yml

will output:

global_debug="yes"
global_verbose="no"
global_debugging_detailed="no"
global_debugging_header="debugging started"
output_file="yes"

It also understands YAML files generated by Ruby, which may include Ruby symbols, like:

---
:global:
  :debug: 'yes'
  :verbose: 'no'
  :debugging:
    :detailed: 'no'
    :header: debugging started
  :output: 'yes'

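and will output nearly the same as the previous example; the only difference is the last line, which becomes global_output="yes" because :output: is nested under :global: here.

Typical use within a script is: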

eval $(parse_yaml sample.yml)

parse_yaml accepts a prefix argument so that imported settings all have a common prefix (which will reduce the risk of namespace collisions).

parse_yaml sample.yml "CONF_"

yields:

CONF_global_debug="yes"
CONF_global_verbose="no"
CONF_global_debugging_detailed="no"
CONF_global_debugging_header="debugging started"
CONF_output_file="yes"

Note that previous settings in a file can be referred to by later settings:

## global definitions
global:
  debug: yes
  verbose: no
  debugging:
    detailed: no
    header: "debugging started"

## output
output:
   debug: $global_debug

Another nice usage is to first parse a defaults file and then the user settings, which works since the later settings override the earlier ones:

eval $(parse_yaml defaults.yml)
eval $(parse_yaml project.yml)
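
The defaults-then-overrides pattern can be tried end to end with two throwaway files. The sketch below repeats the parse_yaml function from above (only the function header and the quoting of the file argument are changed) and uses made-up file contents and a hypothetical CONF_ prefix. Keep in mind that eval executes whatever the generated assignments expand to, so this is only reasonable for files you trust.

```bash
#!/usr/bin/env bash
parse_yaml() {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p"  "$1" |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}

# Two sample files (contents invented for this illustration).
workdir=$(mktemp -d)
cat > "$workdir/defaults.yml" <<'EOF'
server:
  host: localhost
  port: 8080
EOF
cat > "$workdir/project.yml" <<'EOF'
server:
  port: 9090
EOF

# Later assignments win, so project.yml overrides defaults.yml.
eval "$(parse_yaml "$workdir/defaults.yml" "CONF_")"
eval "$(parse_yaml "$workdir/project.yml" "CONF_")"

echo "host=$CONF_server_host port=$CONF_server_port"
rm -rf "$workdir"
```

The host comes from the defaults file and the port from the project file, since the second eval simply reassigns CONF_server_port.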
空城仅有旧梦在 2024-10-24 18:41:58

I've written shyaml in python for YAML query needs from the shell command line.

Overview:

$ pip install shyaml      ## installation

Example YAML file (with complex features):

$ cat <<EOF > test.yaml
name: "MyName !!"
subvalue:
    how-much: 1.1
    things:
        - first
        - second
        - third
    other-things: [a, b, c]
    maintainer: "Valentin Lab"
    description: |
        Multiline description:
        Line 1
        Line 2
EOF

Basic query:

$ cat test.yaml | shyaml get-value subvalue.maintainer
Valentin Lab

More complex looping query on complex values:

$ cat test.yaml | shyaml values-0 | \
  while read -r -d $'\0' value; do
      echo "RECEIVED: '$value'"
  done
RECEIVED: '1.1'
RECEIVED: '- first
- second
- third'
RECEIVED: '2'
RECEIVED: 'Valentin Lab'
RECEIVED: 'Multiline description:
Line 1
Line 2'

A few key points:

  • all YAML types and syntax oddities are correctly handled, such as multiline values, quoted strings, inline sequences...
  • \0 padded output is available for solid multiline entry manipulation.
  • simple dotted notation to select sub-values (i.e. subvalue.maintainer is a valid key).
  • access by index is provided for sequences (i.e. subvalue.things.-1 is the last element of the subvalue.things sequence).
  • access to all sequence/struct elements in one go, for use in bash loops.
  • you can output a whole subpart of a YAML file as ... YAML, which blends well with further manipulations using shyaml.

More samples and documentation are available on the shyaml GitHub page or the shyaml PyPI page.
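
The NUL-delimited loop shown above is a general bash idiom and can be tried without shyaml installed. In this sketch a printf call stands in for the output of shyaml values-0 (the function name emit_values and the sample records are made up); IFS= and -r keep whitespace and backslashes intact, and -d '' makes read stop at each NUL byte instead of at newlines.

```bash
#!/usr/bin/env bash
# Stand-in for `shyaml values-0`: emits three NUL-terminated records,
# one of which spans several lines.
emit_values() {
  printf '%s\0' "1.1" $'- first\n- second\n- third' "Valentin Lab"
}

# Collect every record, embedded newlines included, into an array.
values=()
while IFS= read -r -d '' value; do
  values+=("$value")
done < <(emit_values)

echo "received ${#values[@]} values"
printf "RECEIVED: '%s'\n" "${values[1]}"
```

Feeding the loop through process substitution rather than a pipe keeps the values array available after the loop ends.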

居里长安 2024-10-24 18:41:58

yq is a lightweight and portable command-line YAML processor

The aim of the project is to be the jq or sed of yaml files.

(https://github.com/mikefarah/yq#readme)

As an example (stolen straight from the documentation), given a sample.yaml file of:

---
bob:
  item1:
    cats: bananas
  item2:
    cats: apples

then

yq eval '.bob.*.cats' sample.yaml

will output

- bananas
- apples
相对绾红妆 2024-10-24 18:41:58

Given that Python3 and PyYAML are quite easy dependencies to meet nowadays, the following may help:

yaml() {
    python3 -c "import yaml;print(yaml.safe_load(open('$1'))$2)"
}

VALUE=$(yaml ~/my_yaml_file.yaml "['a_key']")
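
The function above pastes its arguments straight into the Python source, so a file name or key containing a quote can break or alter the script. A possible variant, as a sketch only, passes them as command-line arguments instead; yaml_get is a made-up name, and python3 with PyYAML is assumed to be installed.

```bash
# Sketch: look up a value by walking a list of keys passed as arguments.
yaml_get() {
    # $1: YAML file; remaining arguments: keys (or list indexes) to descend through
    python3 -c '
import sys, yaml

path, *keys = sys.argv[1:]
with open(path) as fh:
    node = yaml.safe_load(fh)
for key in keys:
    # Lists are indexed by number, mappings by key name.
    node = node[int(key)] if isinstance(node, list) else node[key]
print(node)
' "$@"
}

# Example call (file and key names are placeholders):
#   VALUE=$(yaml_get ~/my_yaml_file.yaml a_key)
```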
无妨# 2024-10-24 18:41:58

My use case may or may not be quite the same as what this original post was asking, but it's definitely similar.

I need to pull in some YAML as bash variables. The YAML will never be more than one level deep.

YAML looks like so:

KEY:                value
ANOTHER_KEY:        another_value
OH_MY_SO_MANY_KEYS: yet_another_value
LAST_KEY:           last_value

The output looks like this:

KEY="value"
ANOTHER_KEY="another_value"
OH_MY_SO_MANY_KEYS="yet_another_value"
LAST_KEY="last_value"

I achieved the output with this line:

sed -e 's/:[^:\/\/]/="/g;s/$/"/g;s/ *=/=/g' file.yaml > file.sh
  • s/:[^:\/\/]/="/g finds : and replaces it with =", while ignoring :// (for URLs)
  • s/$/"/g appends " to the end of each line
  • s/ *=/=/g removes all spaces before =
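
One thing to watch with column-aligned values like the sample above: the first substitution consumes only the single character after the colon, so any further padding appears to end up inside the quotes. A possible variant that trims it is sketched below, assuming flat KEY: value lines, keys that are valid shell identifiers, and values without double quotes (yaml_to_sh is a made-up name).

```bash
yaml_to_sh() {
    # $1: flat YAML file. Prints KEY="value" lines; lines without a value are skipped.
    sed -E -n 's/^([A-Za-z_][A-Za-z0-9_]*):[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\1="\2"/p' "$1"
}

# Try it on a small sample with padded values and a URL.
tmp=$(mktemp)
printf 'KEY:                value\nSITE_URL:           http://example.com/path\n' > "$tmp"
yaml_to_sh "$tmp"
rm -f "$tmp"
```

Because the pattern is anchored on the key at the start of the line, only the first colon is treated as the separator, so URLs in values need no special handling.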
太阳公公是暖光 2024-10-24 18:41:58

Here is an extended version of Stefan Farestam's answer:

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" $1 | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){  vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

This version supports the - notation and the short notation for dictionaries and lists. The following input:

global:
  input:
    - "main.c"
    - "main.h"
  flags: [ "-O3", "-fpic" ]
  sample_input:
    -  { property1: value, property2: "value2" }
    -  { property1: "value3", property2: 'value 4' }

produces this output:

global_input_1="main.c"
global_input_2="main.h"
global_flags_1="-O3"
global_flags_2="-fpic"
global_sample_input_1_property1="value"
global_sample_input_1_property2="value2"
global_sample_input_2_property1="value3"
global_sample_input_2_property2="value 4"

As you can see, the - items are automatically numbered in order to obtain different variable names for each item. Bash has no multidimensional arrays, so this is one way to work around that. Multiple levels are supported.
To work around the problem with trailing white space mentioned by @briceburg, one should enclose the values in single or double quotes. However, there are still some limitations: expansion of the dictionaries and lists can produce wrong results when values contain commas. Also, more complex structures like values spanning multiple lines (like ssh-keys) are not (yet) supported.

A few words about the code: The first sed command expands the short notation for lists, converting [ entry, ... ] to an itemized list with the - notation. The second sed call does the same for the short form of dictionaries { key: value, ... }, converting them to the regular, simpler YAML style. The third sed call is the original one that handled normal dictionaries, now extended to handle lists with - and indentation. The awk part introduces an index for each indentation level and increases it when the variable name is empty (i.e. when processing a list). The current value of the counter is used instead of the empty vname. When going up one level, the counters are zeroed.

Edit: I have created a github repository for this.
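
Because list items become numbered variables rather than a bash array, reading them back takes a small loop. A minimal sketch, assuming the variables from the sample output above have already been set via eval (they are assigned by hand here so the snippet runs on its own); it stops at the first index whose variable is unset or empty.

```bash
# Normally produced by: eval "$(parse_yaml sample.yml)"
global_input_1="main.c"
global_input_2="main.h"

i=1
inputs=""
while :; do
    # Indirect lookup of global_input_<i>; expands to empty if unset.
    eval "item=\${global_input_$i-}"
    [ -n "$item" ] || break
    inputs="$inputs $item"
    i=$((i + 1))
done

echo "inputs:$inputs"
```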

紫竹語嫣☆ 2024-10-24 18:41:58

It's possible to pass a small script to some interpreters, like Python. An easy way to do so using Ruby and its YAML library is the following:

$ RUBY_SCRIPT="data = YAML::load(STDIN.read); puts data['a']; puts data['b']"
$ echo -e '---\na: 1234\nb: 4321' | ruby -ryaml -e "$RUBY_SCRIPT"
1234
4321

where data is a hash (or array) with the values from the YAML.

As a bonus, it'll parse Jekyll's front matter just fine.

ruby -ryaml -e "puts YAML::load(open(ARGV.first).read)['tags']" example.md
丶视觉 2024-10-24 18:41:58

Moving my answer from How to convert a json response into yaml in bash, since this seems to be the authoritative post on dealing with YAML text parsing from command line.

I would like to add details about the yq YAML implementations. Since there are two implementations of this YAML parser around, both named yq, it is hard to tell which one is in use without looking at the implementation's DSL. The two available implementations are:

  1. kislyuk/yq - The more often talked about version, which is a wrapper over jq, written in Python using the PyYAML library for YAML parsing
  2. mikefarah/yq - A Go implementation, with its own dynamic DSL using the go-yaml v3 parser.

Both are available for installation via standard package managers on almost all major distributions:

  1. kislyuk/yq - Installation instructions
  2. mikefarah/yq - Installation instructions

Each version has pros and cons relative to the other; here are a few points worth highlighting (adapted from their repo instructions):

kislyuk/yq

  1. Since the DSL is adopted completely from jq, parsing and manipulation are quite straightforward for users familiar with the latter
  2. Supports a mode to preserve YAML tags and styles, but loses comments during the conversion: since jq doesn't preserve comments, they are lost during the round-trip conversion.
  3. As part of the package, XML support is built in. An executable, xq, which transcodes XML to JSON using xmltodict and pipes it to jq, on which you can apply the same DSL to perform CRUD operations on the objects and round-trip the output back to XML.
  4. Supports in-place edit mode with -i flag (similar to sed -i)

mikefarah/yq

  1. Prone to frequent changes in the DSL (e.g. the migration from 2.x to 3.x)
  2. Rich support for anchors, styles and tags, but look out for bugs once in a while
  3. A relatively simple Path expression syntax to navigate and match yaml nodes
  4. Supports YAML->JSON, JSON->YAML formatting and pretty printing YAML (with comments)
  5. Supports in-place edit mode with -i flag (similar to sed -i)
  6. Supports coloring the output YAML with -C flag (not applicable for JSON output) and indentation of the sub elements (default at 2 spaces)
  7. Supports Shell completion for most shells - Bash, zsh (because of powerful support from spf13/cobra used to generate CLI flags)

My take on the following YAML (referenced in another answer as well) with both versions:

root_key1: this is value one
root_key2: "this is value two"

drink:
  state: liquid
  coffee:
    best_served: hot
    colour: brown
  orange_juice:
    best_served: cold
    colour: orange

food:
  state: solid
  apple_pie:
    best_served: warm

root_key_3: this is value three

Various actions to be performed with both the implementations (some frequently used operations)

  1. Modifying node value at root level - Change value of root_key2
  2. Modifying array contents, adding value - Add property to coffee
  3. Modifying array contents, deleting value - Delete property from orange_juice
  4. Printing key/value pairs with paths - For all items under food

Using kislyuk/yq

  1. yq -y '.root_key2 |= "this is a new value"' yaml
    
  2. yq -y '.drink.coffee += { time: "always"}' yaml
    
  3. yq -y 'del(.drink.orange_juice.colour)' yaml
    
  4. yq -r '.food|paths(scalars) as $p | [($p|join(".")), (getpath($p)|tojson)] | @tsv' yaml
    

Which is pretty straightforward. All you need is to transcode jq JSON output back into YAML with the -y flag.

Using mikefarah/yq

  1.  yq w yaml root_key2 "this is a new value"
    
  2.  yq w yaml drink.coffee.time "always"
    
  3.  yq d yaml drink.orange_juice.colour
    
  4.  yq r yaml --printMode pv "food.**"
    

As of today, Dec 21st 2020, yq v4 is in beta and supports much more powerful path expressions and a DSL similar to jq's. Read the transition notes - Upgrading from V3

谎言月老 2024-10-24 18:41:58

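I just wrote a parser that I called Yay! (Yaml ain't Yamlesque!), which parses Yamlesque, a small subset of YAML. So if you're looking for a 100% compliant YAML parser for Bash, this isn't it. However, to quote the OP, if you want a structured configuration file that is YAML-like and as easy as possible for a non-technical user to edit, this may be of interest.

It is inspired by the earlier answer but writes associative arrays (yes, it requires Bash 4.x) instead of basic variables. It does so in a way that allows the data to be parsed without prior knowledge of the keys, so that data-driven code can be written.

As well as the key/value array elements, each array has a keys array containing a list of key names, a children array containing the names of child arrays, and a parent key that refers to its parent.

This is an example of Yamlesque: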

root_key1: this is value one
root_key2: "this is value two"

drink:
  state: liquid
  coffee:
    best_served: hot
    colour: brown
  orange_juice:
    best_served: cold
    colour: orange

food:
  state: solid
  apple_pie:
    best_served: warm

root_key_3: this is value three

Here is an example showing how to use it:

#!/bin/bash
# An example showing how to use Yay

. /usr/lib/yay

# helper to get array value at key
value() { eval echo \${$1[$2]}; }

# print a data collection
print_collection() {
  for k in $(value $1 keys)
  do
    echo "$2$k = $(value $1 $k)"
  done

  for c in $(value $1 children)
  do
    echo -e "$2$c\n$2{"
    print_collection $c "  $2"
    echo "$2}"
  done
}

yay example
print_collection example

which outputs:

root_key1 = this is value one
root_key2 = this is value two
root_key_3 = this is value three
example_drink
{
  state = liquid
  example_coffee
  {
    best_served = hot
    colour = brown
  }
  example_orange_juice
  {
    best_served = cold
    colour = orange
  }
}
example_food
{
  state = solid
  example_apple_pie
  {
    best_served = warm
  }
}

and here is the parser:

yay_parse() {

   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}
   }

   echo "declare -g -A $prefix;"

   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
          -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |
   awk -F$fs '{
      indent       = length($1)/2;
      key          = $2;
      value        = $3;

      # No prefix or parent for the top level (indent zero)
      root_prefix  = "'$prefix'_";
      if (indent ==0 ) {
        prefix = "";          parent_key = "'$prefix'";
      } else {
        prefix = root_prefix; parent_key = keys[indent-1];
      }

      keys[indent] = key;

      # remove keys left behind if prior row was indented more than this row
      for (i in keys) {if (i > indent) {delete keys[i]}}

      if (length(value) > 0) {
         # value
         printf("%s%s[%s]=\"%s\";\n", prefix, parent_key , key, value);
         printf("%s%s[keys]+=\" %s\";\n", prefix, parent_key , key);
      } else {
         # collection
         printf("%s%s[children]+=\" %s%s\";\n", prefix, parent_key , root_prefix, key);
         printf("declare -g -A %s%s;\n", root_prefix, key);
         printf("%s%s[parent]=\"%s%s\";\n", root_prefix, key, prefix, parent_key);
      }
   }'
}

# helper to load yay data file
yay() { eval $(yay_parse "$@"); }

链接的源文件中有一些文档下面是该代码功能的简短说明。

yay_parse 函数首先定位input 文件或以退出状态 1 退出。接下来,它确定数据集前缀,可以显式指定或从文件名派生。

它将有效的 bash 命令写入其标准输出,如果执行该命令,则会定义表示输入数据文件内容的数组。其中第一个定义了顶级数组:

echo "declare -g -A $prefix;"

请注意,数组声明是关联的 (-A),这是 Bash 版本 4 的一个功能。声明也是全局的 (-g) >),因此它们可以在函数中执行,但可在全局范围内使用,如 yay 帮助程序:

yay() { eval $(yay_parse "$@"); }

输入数据最初使用 sed 进行处理。在使用 ASCII 文件分隔符 字符并删除值字段周围的所有双引号。

 local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
 sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |

两种表达方式相似;它们的不同之处仅在于第一个选择带引号的值,而第二个选择不带引号的值。

文件分隔符 (28使用 /hex 12/octal 034) 是因为,作为不可打印字符,它不太可能出现在输入数据中。

结果通过管道输送到 awk 中,awk 一次处理一行输入。它使用 FS 字符将每个字段分配给一个变量:

indent       = length($1)/2;
key          = $2;
value        = $3;

所有行都有一个缩进(可能为零)和一个键,但它们并不都有值。它计算将第一个字段(包含前导空格)的长度除以二的行的缩进级别。没有任何缩进的顶级项目处于缩进级别零。

接下来,它计算出当前项目使用的前缀。这是添加到键名称以形成数组名称的内容。顶级数组有一个 root_prefix ,它定义为数据集名称和下划线:

root_prefix  = "'$prefix'_";
if (indent ==0 ) {
  prefix = "";          parent_key = "'$prefix'";
} else {
  prefix = root_prefix; parent_key = keys[indent-1];
}

parent_key 是当前行缩进级别之上的缩进级别的键并表示当前行所属的集合。集合的键/值对将存储在一个数组中,其名称定义为 prefixparent_key 的串联。

对于顶层(缩进级别零),数据集前缀用作父键,因此它没有前缀(设置为 "")。所有其他数组都以根前缀为前缀。

接下来,当前键被插入到包含键的(awk 内部)数组中。该数组在整个 awk 会话中持续存在,因此包含先前行插入的键。使用其缩进作为数组索引将键插入数组中。

keys[indent] = key;

由于此数组包含来自前一行的键,因此缩进级别大于当前行缩进级别的任何键都将被删除:

 for (i in keys) {if (i > indent) {delete keys[i]}}

这使得键数组包含从缩进级别 0 的根到当前行的键链。它会删除前一行缩进比当前行更深时留下的过时键。

最后一部分输出 bash 命令:不带值的输入行启动新的缩进级别(YAML 术语中的集合),带值的输入行添加一个键到当前集合。

集合的名称是当前行的 prefixparent_key 的串联。

当键具有值时,具有该值的键将被分配给当前集合,如下所示:

printf("%s%s[%s]=\"%s\";\n", prefix, parent_key , key, value);
printf("%s%s[keys]+=\" %s\";\n", prefix, parent_key , key);

第一个语句输出将值分配给以该键命名的关联数组元素的命令,第二个语句输出添加该值的命令集合的空格分隔 keys 列表的键:

<current_collection>[<key>]="<value>";
<current_collection>[keys]+=" <key>";

当键没有值时,将像这样启动一个新集合:

printf("%s%s[children]+=\" %s%s\";\n", prefix, parent_key , root_prefix, key);
printf("declare -g -A %s%s;\n", root_prefix, key);

第一个语句输出将新集合添加到当前集合的命令以空格分隔的 children 列表,第二个输出命令为新集合声明一个新的关联数组:

<current_collection>[children]+=" <new_collection>"
declare -g -A <new_collection>;

yay_parse 的所有输出都可以通过以下方式解析为 bash 命令: bash evalsource 内置命令。

I just wrote a parser that I called Yay! (Yaml ain't Yamlesque!) which parses Yamlesque, a small subset of YAML. So, if you're looking for a 100% compliant YAML parser for Bash then this isn't it. However, to quote the OP, if you want a structured configuration file which is as easy as possible for a non-technical user to edit that is YAML-like, this may be of interest.

It's inspired by the earlier answer but writes associative arrays (yes, it requires Bash 4.x) instead of basic variables. It does so in a way that allows the data to be parsed without prior knowledge of the keys, so that data-driven code can be written.

As well as the key/value elements, each array has a keys element holding a space-delimited list of key names, a children element holding the names of its child arrays, and a parent element that names its parent array.

This is an example of Yamlesque:

root_key1: this is value one
root_key2: "this is value two"

drink:
  state: liquid
  coffee:
    best_served: hot
    colour: brown
  orange_juice:
    best_served: cold
    colour: orange

food:
  state: solid
  apple_pie:
    best_served: warm

root_key_3: this is value three

Here is an example showing how to use it:

#!/bin/bash
# An example showing how to use Yay

. /usr/lib/yay

# helper to get array value at key
value() { eval echo \${$1[$2]}; }

# print a data collection
print_collection() {
  for k in $(value $1 keys)
  do
    echo "$2$k = $(value $1 $k)"
  done

  for c in $(value $1 children)
  do
    echo -e "$2$c\n$2{"
    print_collection $c "  $2"
    echo "$2}"
  done
}

yay example
print_collection example

which outputs:

root_key1 = this is value one
root_key2 = this is value two
root_key_3 = this is value three
example_drink
{
  state = liquid
  example_coffee
  {
    best_served = hot
    colour = brown
  }
  example_orange_juice
  {
    best_served = cold
    colour = orange
  }
}
example_food
{
  state = solid
  example_apple_pie
  {
    best_served = warm
  }
}

And here is the parser:

yay_parse() {

   # find input file
   for f in "$1" "$1.yay" "$1.yml"
   do
     [[ -f "$f" ]] && input="$f" && break
   done
   [[ -z "$input" ]] && exit 1

   # use given dataset prefix or imply from file name
   [[ -n "$2" ]] && local prefix="$2" || {
     local prefix=$(basename "$input"); prefix=${prefix%.*}
   }

   echo "declare -g -A $prefix;"

   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
          -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |
   awk -F$fs '{
      indent       = length($1)/2;
      key          = $2;
      value        = $3;

      # No prefix or parent for the top level (indent zero)
      root_prefix  = "'$prefix'_";
      if (indent ==0 ) {
        prefix = "";          parent_key = "'$prefix'";
      } else {
        prefix = root_prefix; parent_key = keys[indent-1];
      }

      keys[indent] = key;

      # remove keys left behind if prior row was indented more than this row
      for (i in keys) {if (i > indent) {delete keys[i]}}

      if (length(value) > 0) {
         # value
         printf("%s%s[%s]=\"%s\";\n", prefix, parent_key , key, value);
         printf("%s%s[keys]+=\" %s\";\n", prefix, parent_key , key);
      } else {
         # collection
         printf("%s%s[children]+=\" %s%s\";\n", prefix, parent_key , root_prefix, key);
         printf("declare -g -A %s%s;\n", root_prefix, key);
         printf("%s%s[parent]=\"%s%s\";\n", root_prefix, key, prefix, parent_key);
      }
   }'
}

# helper to load yay data file
yay() { eval $(yay_parse "$@"); }

There is some documentation in the linked source file and below is a short explanation of what the code does.

The yay_parse function first locates the input file or exits with an exit status of 1. Next, it determines the dataset prefix, either explicitly specified or derived from the file name.
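
As a rough illustration of the prefix step, the derivation can be tried on its own. The helper name derive_prefix and the sample paths are hypothetical; only the basename call and the ${prefix%.*} expansion are taken from the parser.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick a dataset prefix the way yay_parse does,
# preferring an explicit prefix and falling back to the file name.
derive_prefix() {
  local input=$1 prefix=$2
  if [[ -z "$prefix" ]]; then
    prefix=$(basename "$input")   # strip the directory part
    prefix=${prefix%.*}           # strip the last extension, if any
  fi
  echo "$prefix"
}

derive_prefix /etc/myapp/example.yml        # prints: example
derive_prefix /etc/myapp/example.yml conf   # prints: conf
```

Because the prefix becomes a bash array name, a file name that is not a valid identifier (for example one containing a hyphen) would presumably need an explicit prefix as the second argument.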

It writes valid bash commands to its standard output that, if executed, define arrays representing the contents of the input data file. The first of these defines the top-level array:

echo "declare -g -A $prefix;"

Note that the array declarations are associative (-A), which is a feature of Bash version 4. The declarations are also global (-g, available from Bash 4.2), so they can be executed inside a function, such as the yay helper, yet remain available in the global scope:

yay() { eval $(yay_parse "$@"); }
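
The effect of -g can be seen in isolation with a small sketch. The function and variable names here are made up for the demonstration and are not part of Yay.

```bash
#!/usr/bin/env bash
# declare inside a function is local by default; -g makes it global (Bash 4.2+).
make_local()  { declare -A scoped=([k]=v); }
make_global() { declare -g -A shared=([k]=v); }

make_local
make_global

echo "scoped: ${scoped[k]-unset}"   # prints: scoped: unset
echo "shared: ${shared[k]-unset}"   # prints: shared: v
```

Without -g, the arrays declared while eval runs inside yay would vanish as soon as the function returned.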

The input data is initially processed with sed. It drops lines that don't match the Yamlesque format specification before delimiting the valid Yamlesque fields with an ASCII File Separator character and removing any double-quotes surrounding the value field.

 local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
 sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$input" |

The two expressions are similar; they differ only in that the first one picks out quoted values whereas the second one picks out unquoted ones.

The File Separator (decimal 28, hex 1C, octal 034) is used because, as a non-printable character, it is unlikely to be in the input data.
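
To make the sed stage easier to inspect, the sketch below runs the same two expressions and then swaps the File Separator for a visible | character. The wrapper name split_fields and the sample lines are assumptions for the demonstration.

```bash
#!/usr/bin/env bash
# Run the parser's two sed expressions on stdin, then make the
# File Separator (octal 034) visible as "|".
s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(printf '\034')
split_fields() {
  sed -n -e "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
         -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" |
  tr '\034' '|'
}

printf '%s\n' 'root_key2: "this is value two"' \
              '  state: liquid' \
              'drink:' \
              '# a comment' | split_fields
# prints:
# |root_key2|this is value two
#   |state|liquid
# |drink|
```

Each output line has three fields: the leading whitespace, the key, and the value (empty for a collection). The comment line has no colon, so neither expression matches and it is dropped.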

The result is piped into awk which processes its input one line at a time. It uses the FS character to assign each field to a variable:

indent       = length($1)/2;
key          = $2;
value        = $3;

All lines have an indent (possibly zero) and a key, but they don't all have a value. It computes an indent level for the line by dividing the length of the first field, which contains the leading whitespace, by two. The top-level items without any indent are at indent level zero.
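
The same calculation can be sketched in plain bash. The helper indent_level is hypothetical and assumes indentation uses spaces only, two per level.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring "indent = length($1)/2" from the awk stage.
# Note: bash integer division rounds down, whereas awk would yield 1.5
# for a line indented by three spaces.
indent_level() {
  local line=$1
  local lead="${line%%[! ]*}"    # leading run of spaces
  echo $(( ${#lead} / 2 ))
}

indent_level 'food:'                   # prints: 0
indent_level '  state: solid'          # prints: 1
indent_level '    best_served: warm'   # prints: 2
```

This suggests keeping the data file to a strict two-space indent, since an odd number of spaces gives awk a fractional level with no matching parent.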

Next, it works out what prefix to use for the current item. This is what gets added to a key name to make an array name. There's a root_prefix for the top-level array which is defined as the data set name and an underscore:

root_prefix  = "'$prefix'_";
if (indent ==0 ) {
  prefix = "";          parent_key = "'$prefix'";
} else {
  prefix = root_prefix; parent_key = keys[indent-1];
}

The parent_key is the key at the indent level above the current line's indent level and represents the collection that the current line is part of. The collection's key/value pairs will be stored in an array with its name defined as the concatenation of the prefix and parent_key.

For the top level (indent level zero) the data set prefix is used as the parent key so it has no prefix (it's set to ""). All other arrays are prefixed with the root prefix.

Next, the current key is inserted into an (awk-internal) array containing the keys. This array persists throughout the whole awk session and therefore contains keys inserted by prior lines. The key is inserted into the array using its indent as the array index.

keys[indent] = key;

Because this array contains keys from previous lines, any keys with an indent level greater than the current line's indent level are removed:

 for (i in keys) {if (i > indent) {delete keys[i]}}

This leaves the keys array containing the key-chain from the root at indent level 0 to the current line. It removes stale keys that remain when the prior line was indented deeper than the current line.
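
The pruning is easier to follow with the chain printed out. This sketch feeds pre-computed "indent key" pairs to awk, which is a simplification of the real input; the function name key_chain and the dotted output format are assumptions.

```bash
#!/usr/bin/env bash
# Print the key chain (root to current key) for each "indent key" pair,
# pruning deeper keys left over from earlier lines as the parser does.
key_chain() {
  awk '{
    indent = $1; key = $2
    keys[indent] = key
    for (i in keys) if (i + 0 > indent) delete keys[i]
    chain = keys[0]
    for (n = 1; n <= indent; n++) chain = chain "." keys[n]
    print chain
  }'
}

printf '%s\n' '0 drink' '1 coffee' '2 colour' '1 orange_juice' '0 food' | key_chain
# prints:
# drink
# drink.coffee
# drink.coffee.colour
# drink.orange_juice
# food
```

When orange_juice arrives at level 1, the stale colour entry at level 2 is deleted, so it cannot be mistaken for part of the new chain.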

The final section outputs the bash commands: an input line without a value starts a new indent level (a collection in YAML parlance) and an input line with a value adds a key to the current collection.

The collection's name is the concatenation of the current line's prefix and parent_key.

When a key has a value, a key with that value is assigned to the current collection like this:

printf("%s%s[%s]=\"%s\";\n", prefix, parent_key , key, value);
printf("%s%s[keys]+=\" %s\";\n", prefix, parent_key , key);

The first statement outputs the command to assign the value to an associative array element named after the key and the second one outputs the command to add the key to the collection's space-delimited keys list:

<current_collection>[<key>]="<value>";
<current_collection>[keys]+=" <key>";

When a key doesn't have a value, a new collection is started like this:

printf("%s%s[children]+=\" %s%s\";\n", prefix, parent_key , root_prefix, key);
printf("declare -g -A %s%s;\n", root_prefix, key);

The first statement outputs the command to add the new collection to the current collection's space-delimited children list and the second one outputs the command to declare a new associative array for the new collection:

<current_collection>[children]+=" <new_collection>";
declare -g -A <new_collection>;

All of the output from yay_parse can be parsed as bash commands by the bash eval or source built-in commands.
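
To see what evaluating that output amounts to, here is a small sketch with commands written by hand in the shape yay_parse emits, for a tiny data set assumed to be named "example". The load_example function stands in for the yay helper.

```bash
#!/usr/bin/env bash
# Hand-written commands in the shape yay_parse emits (requires Bash 4.2+).
generated='declare -g -A example;
example[root_key1]="this is value one";
example[keys]+=" root_key1";
example[children]+=" example_drink";
declare -g -A example_drink;
example_drink[parent]="example";
example_drink[state]="liquid";
example_drink[keys]+=" state";'

load_example() { eval "$generated"; }   # same idea as the yay helper
load_example

for k in ${example[keys]}; do
  echo "$k = ${example[$k]}"            # prints: root_key1 = this is value one
done
echo "children:${example[children]}"    # prints: children: example_drink
echo "${example_drink[state]}"          # prints: liquid
```

Since the values end up inside double quotes in code that is passed to eval, a data file containing characters such as $ or backticks could execute commands, so this approach seems best kept to trusted configuration files.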

你又不是我 2024-10-24 18:41:58

Hard to say because it depends on what you want the parser to extract from your YAML document. For simple cases, you might be able to use grep, cut, awk etc. For more complex parsing you would need to use a full-blown parsing library such as Python's PyYAML or YAML::Perl.
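
For the simple case, a minimal sketch with sed might look like this. The helper name yaml_top_level and the sample file contents are hypothetical, and the approach only reads unindented "key: value" lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of a top-level "key: value" line.
# Nested keys, lists, multi-line values, quoting and trailing comments
# are not handled; use a real YAML library for those.
yaml_top_level() {
  local key=$1 file=$2
  sed -n "s/^${key}:[[:space:]]*//p" "$file" | head -n 1
}

cfg=$(mktemp)
printf '%s\n' 'name: demo' 'port: 8080' 'nested:' '  port: 9090' > "$cfg"
yaml_top_level port "$cfg"    # prints: 8080
rm -f "$cfg"
```

The anchored pattern skips the indented port under nested, which is about as far as line-oriented tools can safely go.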
