使用 JQ - 连接 2 个结构不同但具有公共密钥的 json 文件
我有 2 个文件想要基于公用密钥加入。
第一个文件是一个数组,其中:
{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "2","tool_bf_id": "4567", "dvc_ssr": "aa-2222"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}
第二个文件也是一个数组,是所有不同“dvc_ssr”的完整列表,其中:
{"host": "hostId1","dvc_ssr": "aa-1111"}
{"host": "hostId2","dvc_ssr": "aa-2222"}
我试图通过使用 dvc_ssr 键将第二个文件中的信息添加到第一个文件中来加入这两个文件价值。
我期待这样的事情:
{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "2",,"tool_bf_id": "4567", "dvc_ssr": "aa-2222","host": "hostId2"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}
/group_by/add 的想法。
经过一些研究,我发现了使用flatten group_by(.dvc_ssr) | group_by(.dvc_ssr) | map(reduce .[] as $x ({}; . * $x))
或
[.[1] + . [0]| group_by(.dvc_ssr) [] | group_by(.dvc_ssr) [] | add]
但这不起作用。 问题是通过对它们进行分组,我将“null”以及具有相同“dvc_ssr”的那些分组。最后我失去了一些记录。
我能够使用 jq -s (--slurp) 连接文件并尝试对数组
jq -s '[.[0] + .[1] | group_by(.dvc_ssr) []]' file1.json file2.json
然后我可以使用以下方法从第二个文件中删除未使用的记录: |map(select(if ._time!=null then .else empty end))| .[]
这个想法实际上是像 SQL 中那样进行 JOIN,其中“dvc_ssr”是相同的。
I have 2 files that I would like to join based a common key.
The first file is an array with:
{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "2","tool_bf_id": "4567", "dvc_ssr": "aa-2222"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}
The second file is also an array, a complete list of all the different "dvc_ssr" with:
{"host": "hostId1","dvc_ssr": "aa-1111"}
{"host": "hostId2","dvc_ssr": "aa-2222"}
I am trying to join both files by adding information from the second file into the first one using the dvc_ssr key value.
I am expecting something like this:
{"_time": "2022-02-20T23","csp_name": "1","tool_bf_id": "1234", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "2",,"tool_bf_id": "4567", "dvc_ssr": "aa-2222","host": "hostId2"}
{"_time": "2022-02-20T23","csp_name": "3","tool_bf_id": "1357", "dvc_ssr": "null"}
{"_time": "2022-02-20T23","csp_name": "4","tool_bf_id": "2468", "dvc_ssr": "aa-1111","host": "hostId1"}
{"_time": "2022-02-20T23","csp_name": "5","tool_bf_id": "1246", "dvc_ssr": "null"}
After some research I found ideas using flatten/group_by/add ..
flatten | group_by(.dvc_ssr) | map(reduce .[] as $x ({}; . * $x))
or
[.[1] + . [0] | group_by(.dvc_ssr) [] | add]
But this is not working out.
The issue is by grouping them, I am grouping the "null" and also those which have the same "dvc_ssr". At then end I am losing some records.
I was able to join the files with jq -s (--slurp) and try to group array
jq -s '[.[0] + .[1] | group_by(.dvc_ssr) []]' file1.json file2.json
and I can remove then the not used record from the second file, by using:|map(select(if ._time!=null then . else empty end))| .[]
The idea is really to do a JOIN like in SQL where "dvc_ssr" are identical.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
使用
JOIN
的所有参数,您可能会根据您的应用程序上下文来删除其中一些参数。例如,仅包含两个参数的 JOIN 接受流作为输入,并允许您在
JOIN
外部处理连接。相同的示例(注意
-s
而不是-n
,以及JOIN(…)[]
而不是JOIN(…)):
Using all parameters of
JOIN
, you'd doDepending on your application context, you could probably cut off some of them. For instance, JOIN with just two parameters accepts the stream as input and lets you process the joining outside of
JOIN
.Same example (notice
-s
instead of-n
, andJOIN(…)[]
instead ofJOIN(…)
):