Stream and filter a large JSON file with jq, then save the output as CSV
I have a very large JSON file that I would like to stream (using --stream) and filter with jq, then save as a CSV.
This is the sample data with two objects:
[{"_id":"1","time":"2021-07-22","body":["text1"],"region":[{"percentage":"0.1","region":"region1"},{"percentage":"0.9","region":"region2"}],"reach":{"lower_bound":"100","upper_bound":"200"},"languages":["de"]},
{"_id":"2","time":"2021-07-23","body":["text2"],"region":[{"percentage":"0.3","region":"region1"},{"percentage":"0.7","region":"region2"}],"reach":{"lower_bound":"10","upper_bound":"20"},"languages":["en"]}]
I want to filter on the "languages" field in the jq stream so that I only retain objects where languages == ["de"], then save the result as a new CSV file named largefile.csv, such that the new CSV file looks like the following:
_id,time,body,percentage_region1,percentage_region2,reach_lower_bound,reach_upper_bound,languages
"1","2021-07-22","text1","0.1","0.9","100","200","de"
I have the following code so far but it doesn’t seem to work:
cat largefile.json -r | jq -cn --stream 'fromstream(1|truncate_stream(inputs | select(.))) | with_entries(select(.value.languages==[“de”])) | @csv
Any help would be much appreciated!
Comments (1)
There are several separate tasks involved here, and some are underspecified, but hopefully the following will help you through the thicket:
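As one possible illustration (a sketch under stated assumptions, not necessarily the code this answer originally contained), the pipeline could look roughly like the following. It assumes the top level of largefile.json is an array of objects shaped like the sample, that "body" and "languages" can be flattened by joining their elements with ";", and that the region names are exactly "region1" and "region2". The helper pct is a hypothetical name introduced here for illustration only.

```shell
#!/bin/sh
# Sample data from the question, written to a small file for demonstration.
cat > largefile.json <<'EOF'
[{"_id":"1","time":"2021-07-22","body":["text1"],"region":[{"percentage":"0.1","region":"region1"},{"percentage":"0.9","region":"region2"}],"reach":{"lower_bound":"100","upper_bound":"200"},"languages":["de"]},
{"_id":"2","time":"2021-07-23","body":["text2"],"region":[{"percentage":"0.3","region":"region1"},{"percentage":"0.7","region":"region2"}],"reach":{"lower_bound":"10","upper_bound":"20"},"languages":["en"]}]
EOF

# Write the header row first (column names taken from the desired output).
echo '_id,time,body,percentage_region1,percentage_region2,reach_lower_bound,reach_upper_bound,languages' > largefile.csv

# -n       : do not read input implicitly; `inputs` pulls the stream events
# --stream : parse the file incrementally as [path, leaf] events
# -r       : print the @csv strings raw, without JSON-quoting them again
jq -rn --stream '
  # Hypothetical helper: percentage for the named region, or null if absent.
  def pct($name):
    first(.region[] | select(.region == $name) | .percentage) // null;

  # Strip the outer array level and rebuild one object at a time.
  fromstream(1 | truncate_stream(inputs))
  # Keep only objects whose languages array is exactly ["de"].
  | select(.languages == ["de"])
  # Flatten each object into one row (assumption: join arrays with ";").
  | [ ._id, .time, (.body | join(";")),
      pct("region1"), pct("region2"),
      .reach.lower_bound, .reach.upper_bound,
      (.languages | join(";")) ]
  | @csv
' largefile.json >> largefile.csv

cat largefile.csv
```

Compared with the attempt in the question, this passes the file name to jq directly instead of piping through cat, uses straight quotes inside the filter, closes the single-quoted program, and applies select to each reconstructed object rather than using with_entries. Only one top-level object is held in memory at a time, which is the point of combining --stream with fromstream and truncate_stream.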