jq variable substitution works in the shell but not in a script

Posted on 2025-01-12 22:45:15

The following command works fine in an interactive shell, but fails when executed from a script. What am I missing?

jsonSelectWords='select(.words!=6) | select(.words!=1173) | select(.words!=1) | select(.words!=8) | select(.words!=9) | select(.words!=27)'

cat file.json | jq "$jsonSelectWords"

In the script, the select statement is built dynamically, so I cannot provide it as a literal string.

input=file.json

local jsonSelectWords="'"
for word in "${wordDupArray[@]}"
do
    jsonSelectWords+="select(.words!=$word) | "
done
jsonSelectWords="${jsonSelectWords::-3}"
jsonSelectWords+="'"


cat $input | jq "$jsonSelectWords"

Executing the last line gives the following error:

jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Unix shell quoting issues?) at <top-level>, line 1:

'select(.words!=6) | select(.words!=1173) | select(.words!=1) | select(.words!=8) | select(.words!=9) | select(.words!=27)'

jq: 1 compile error

Any hints? I tried different variations, including wrapping the whole statement in $(cat $input | jq "$jsonSelectWords").

I have also tried the following:
cat $input | jq --args JSW "$jsonSelectWords" '$JSW'
(with the single quotes removed from the initial string, with '[$JSW]', and so on). This just outputs the content of jsonSelectWords.

The following lines are a sample of the content of file.json (that is, $input).

{"timestamp":"2022-03-09T12:30:23.329630917+01:00","scheme":"http","port":"80","path":"/","body-sha256":"0bfc0bdeb920ce4701f130e6e6a33c8aaf558fae44c7479cc1629930cb0f4535","header-sha256":"d9522b92bb09e71b719804f522f0b3b49aa77974c8d79e644fb45a7b3327f73e","a":["81.91.86.14"],"url":"http://01.akce.omv.com:80","input":"01.akce.omv.com","location":"https://01.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":95,"status-code":301,"response-time":"194.004475ms","failed":false,"lines":3,"words":6}
{"timestamp":"2022-03-09T12:30:23.355007661+01:00","scheme":"http","port":"80","path":"/","body-sha256":"d6285599bd6f2851fc17e0244ad212a58d8d539231f804f81b5b98289197afa0","header-sha256":"96884ec058c78d0ea282a2d51be4ce0f5c7bc05d8fe3e8dd8f6fb73dd4fa2cd6","a":["81.91.86.14","40.90.4.7","64.4.48.7","2603:1061::7","2620:1ec:8ec::7"],"url":"http://09-mail2.akce.omv.com:80","input":"09-mail2.akce.omv.com","location":"https://09-mail2.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":101,"status-code":301,"response-time":"233.377898ms","failed":false,"lines":3,"words":6}
{"timestamp":"2022-03-09T12:30:23.450849812+01:00","scheme":"http","port":"80","path":"/","body-sha256":"c186820e328bf631a2943f77e52e9e8319ddfefade6d308a2a22ef996176bbe6","header-sha256":"61e4f3139518b49cac86b77a4f9f06da98d53f2eb12dbff574b5a0ea66327478","a":["81.91.86.14"],"url":"http://09-server2.akce.omv.com:80","input":"09-server2.akce.omv.com","location":"https://09-server2.akce.omv.com/","webserver":"openresty","content-type":"text/html","method":"GET","host":"81.91.86.14","content-length":103,"status-code":301,"response-time":"268.856986ms","failed":false,"lines":3,"words":6}

Solution

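# Runs inside a function (hence "local"). The filter is built WITHOUT literal quote characters.
local jsonSelectWords=""
for word in "${wordDupArray[@]}"
do
    jsonSelectWords+="select(.words!=$word) | "
done
jsonSelectWords="${jsonSelectWords::-3}"   # strip the trailing " | "

# The double quotes keep the filter together as a single argument to jq.
jq "$jsonSelectWords" "$input"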
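
Why dropping the quotes helps can be seen without jq at all. The following is a minimal sketch (the variable names `typed` and `built` are made up for illustration): when the assignment is typed with single quotes, bash removes them as syntax, whereas quote characters appended to a string are ordinary data and reach jq as part of the filter, which is what the INVALID_CHARACTER error above points at.

```bash
# Typed literally: the shell consumes the single quotes as quoting syntax.
typed='select(.words!=6)'

# Built piece by piece: the quote characters become part of the value.
built="'"
built+="select(.words!=6)"
built+="'"

printf 'typed: %s\n' "$typed"
printf 'built: %s\n' "$built"
```

The second line of output starts and ends with a single quote, and that is the text jq would be asked to compile.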

Comments (1)

迷途知返 2025-01-19 22:45:15

Programmatically producing the code that gets executed (here, a bash script that builds and runs a jq filter) is generally considered less readable (the jq logic ends up fragmented), more error-prone (which is what brought you here), and a security risk in principle (in a complex chain of dependencies you may not have full control over what is ultimately executed).

Therefore, try to restructure your approach so that the only thing that varies is the input data, while the code is written to react to varying data but is itself a literal, unchanging string.

Given your sample (I presume the code above is just a small snippet relevant to the actual question, so take this as a guide rather than a general solution), you are trying to reduce an input stream of JSON objects from file.json by comparing the numeric value of their words field to a list of numbers stored in the bash array wordDupArray. More specifically, given a single input object, you want to pass it through to the output if .words holds a number that is not present in the list provided, and drop it if the number is present in the list. Let's implement that.

If jq is given a stream of objects, it will process them one by one, so breaking down the input stream to a single object already happens automatically. For the comparison part, jq needs to be given the list of numbers from the bash array. As jq is a JSON processor, it'd be best to provide the list as a JSON array, thus the task at hand is to convert the bash array into a JSON array.

There are many ways to accomplish this. As the array contains only numbers, you can take advantage of the fact that a number by itself is already a valid JSON document. One approach is to have another jq call take the stream of numbers and output them as a JSON-encoded array using the --slurp (or -s) option. Back in bash, store that output in a variable and provide it to the actual jq call using the --argjson option, which lets you access that JSON array as a variable inside jq.

wordDupArray=(6 1173 1 8 9 27)                   # dummy init of your bash array
jsonarray="$(jq -sc <<< "${wordDupArray[@]}")"   # will contain "[6,1173,1,8,9,27]"
jq --argjson list "$jsonarray" ' … jq filter using the array in $list … ' file.json
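
If a second jq call is not wanted, an integers-only array can also be joined in bash itself. This is a sketch under that assumption, not part of the approach above; the helper name `to_json_number_array` is hypothetical, and it rejects anything that is not a plain integer so that no arbitrary text ends up in the JSON.

```bash
wordDupArray=(6 1173 1 8 9 27)

# Hypothetical helper: prints its arguments as a JSON array of integers.
to_json_number_array() {
    local n
    for n in "$@"; do
        # Accept only JSON-style integers (no leading zeros, optional minus sign).
        [[ $n =~ ^-?(0|[1-9][0-9]*)$ ]] || {
            printf 'not an integer: %s\n' "$n" >&2
            return 1
        }
    done
    local IFS=,              # "$*" joins the arguments with the first character of IFS
    printf '[%s]\n' "$*"
}

jsonarray="$(to_json_number_array "${wordDupArray[@]}")"
printf '%s\n' "$jsonarray"
```

The resulting string can be passed to jq with --argjson list "$jsonarray" exactly like the output of the jq -sc call.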

For the sake of variation, another way is to use the --slurpfile option, which by itself already combines a stream of JSON documents into a JSON array, and similarly lets you access that array through a variable. The major difference is that it requires the documents to be provided as a file rather than as a JSON-encoded string. This can be mimicked by using Process Substitution in bash:

wordDupArray=(6 1173 1 8 9 27)                   # dummy init of your bash array
jq --slurpfile list <(cat <<< "${wordDupArray[@]}") ' … using $list … ' file.json
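
To see why process substitution satisfies an option that expects a file name, it can help to look at what bash actually passes. In this small sketch (the helper `show_path` is made up for illustration), `<(...)` is replaced by a path that yields the command's output when read; the exact path depends on the system.

```bash
wordDupArray=(6 1173 1 8 9 27)

# Hypothetical helper: prints the argument it received instead of opening it.
show_path() { printf '%s\n' "$1"; }

# What a program sees as its argument: a path such as /dev/fd/63 (system-dependent).
path="$(show_path <(cat <<< "${wordDupArray[@]}"))"

# What a program gets when it reads that path: the numbers, separated by spaces.
content="$(cat <(cat <<< "${wordDupArray[@]}"))"

printf 'path:    %s\n' "$path"
printf 'content: %s\n' "$content"
```

Each whitespace-separated number is a JSON document of its own, so --slurpfile collects them into one array.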

For the main task, filtering the input objects from file.json according to a match in the $list array, you can check for inequality just as you did before, but now against the array's items $list[]. The all function then checks whether the condition holds for every item (if all comparisons are true, none of the items matched).

jq --slurpfile list <(cat <<< "${wordDupArray[@]}") \
  'select([.words != $list[]] | all)' file.json

Again, for the sake of variation, you could also use the IN function which returns whether or not a given value appears in a given stream (not to be confused with the in function which is for checking keys in objects), and the not function to select the cases where a match could not be found.

jq --slurpfile list <(cat <<< "${wordDupArray[@]}") \
  'select(.words | IN($list[]) | not)' file.json
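
As a quick check that both filters behave the same, here is a self-contained sketch that uses a made-up three-object sample instead of file.json and passes the list with --argjson. It assumes a jq version that provides IN (jq 1.6 or later, as far as I know); the sample URLs are placeholders.

```bash
# Hypothetical sample input; only the "words" field matters for the filter.
sample='{"url":"http://a.example","words":6}
{"url":"http://b.example","words":42}
{"url":"http://c.example","words":1173}'

wordDupArray=(6 1173 1 8 9 27)
jsonarray="$(jq -sc . <<< "${wordDupArray[@]}")"

# The filter text is a constant string; only the data in $list varies.
with_all="$(jq -c --argjson list "$jsonarray" \
    'select([.words != $list[]] | all)' <<< "$sample")"
with_in="$(jq -c --argjson list "$jsonarray" \
    'select(.words | IN($list[]) | not)' <<< "$sample")"

printf 'all: %s\n' "$with_all"
printf 'IN:  %s\n' "$with_in"
```

Only the object with words 42 survives in both cases, because 6 and 1173 are in the list. With an empty list, both variants let every object through.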

All in all, these solutions are more stable and robust because the code is constant and self-contained. They are also easier to understand, since contiguous code is easier to follow. And if something does fail, you can expect more helpful error messages than the generic "compile error", which is even harder to trace when the code actually executed changes from run to run and is therefore unknown.
