Parsing JSON with Unix tools
I'm trying to parse JSON returned from a curl request, like so:
curl 'http://twitter.com/users/username.json' |
sed -e 's/[{}]/''/g' |
awk -v k="text" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}'
The above splits the JSON into fields, for example:
% ...
"geo_enabled":false
"friends_count":245
"profile_text_color":"000000"
"status":"in_reply_to_screen_name":null
"source":"web"
"truncated":false
"text":"My status"
"favorited":false
% ...
How do I print a specific field (denoted by the -v k=text)?
There are a number of tools specifically designed for manipulating JSON from the command line, and they will be a lot easier and more reliable than doing it with Awk, such as jq:
You can also do this with tools that are likely already installed on your system, like Python using the json module, and so avoid any extra dependencies, while still having the benefit of a proper JSON parser. The following assumes you want to use UTF-8, which the original JSON should be encoded in and which most modern terminals use as well:
Python 3:
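A minimal sketch of the Python 3 variant, assuming the JSON arrives as UTF-8 bytes. The sample document is a made-up stand-in rather than a real API response, and `extract_field` is a hypothetical helper name used only for illustration:

```python
import json

def extract_field(raw_bytes, key):
    """Decode UTF-8 JSON bytes and return the value stored under `key`."""
    document = json.loads(raw_bytes.decode("utf-8"))
    return document[key]

# Made-up stand-in for an API response; not real data.
sample = '{"login": "example", "name": "Example User"}'.encode("utf-8")
print(extract_field(sample, "name"))
```

In a pipeline the same idea would read from standard input instead, for example with `json.load(sys.stdin)`.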
Python 2:
Frequently Asked Questions
Why not a pure shell solution?
The standard POSIX/Single Unix Specification shell is a very limited language which doesn't contain facilities for representing sequences (list or arrays) or associative arrays (also known as hash tables, maps, dicts, or objects in some other languages). This makes representing the result of parsing JSON somewhat tricky in portable shell scripts. There are somewhat hacky ways to do it, but many of them can break if keys or values contain certain special characters.
Bash 4 and later, zsh, and ksh have support for arrays and associative arrays, but these shells are not universally available (macOS stopped updating Bash at Bash 3, due to a change from GPLv2 to GPLv3, while many Linux systems don't have zsh installed out of the box). It's possible that you could write a script that would work in either Bash 4 or zsh, one of which is available on most macOS, Linux, and BSD systems these days, but it would be tough to write a shebang line that worked for such a polyglot script.
Finally, writing a full-fledged JSON parser in shell would be a significant enough dependency that you might as well just use an existing dependency like jq or Python instead. It's not going to be a one-liner, or even a small five-line snippet, to do a good implementation.
Why not use awk, sed, or grep?
It is possible to use these tools to do some quick extraction from JSON with a known shape and formatted in a known way, such as one key per line. There are several examples of suggestions for this in other answers.
However, these tools are designed for line based or record based formats; they are not designed for recursive parsing of matched delimiters with possible escape characters.
So these quick and dirty solutions using awk/sed/grep are likely to be fragile, and break if some aspect of the input format changes, such as collapsing whitespace, or adding additional levels of nesting to the JSON objects, or an escaped quote within a string. A solution that is robust enough to handle all JSON input without breaking will also be fairly large and complex, and so not too much different than adding another dependency on jq or Python.
I have had to deal with large amounts of customer data being deleted due to poor input parsing in a shell script before, so I never recommend quick and dirty methods that may be fragile in this way. If you're doing some one-off processing, see the other answers for suggestions, but I still highly recommend just using an existing tested JSON parser.
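To make the fragility concrete, here is a small sketch comparing a comma-splitting approach (similar in spirit to the pipeline in the question) with a real parser. The input string is a hypothetical example chosen so that the value contains a comma and escaped quotes:

```python
import json

# Hypothetical input: the value of "text" contains a comma and escaped quotes.
raw = '{"text": "Hello, \\"world\\"", "favorited": false}'

def naive_fields(s):
    """Mimic the quick approach: strip braces, then split on commas."""
    return s.replace("{", "").replace("}", "").split(",")

def parsed_text(s):
    """Use a real parser, which understands quoting and escapes."""
    return json.loads(s)["text"]

print(naive_fields(raw))  # the "text" value is cut at its embedded comma
print(parsed_text(raw))
```

The naive split yields three pieces for a two-field object, while the parser returns the intact string.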
Historical notes
This answer originally recommended jsawk, which should still work, but is a little more cumbersome to use than jq, and depends on a standalone JavaScript interpreter being installed, which is less common than a Python interpreter, so the above answers are probably preferable:
This answer also originally used the Twitter API from the question, but that API no longer works, making it hard to copy the examples to test out, and the new Twitter API requires API keys, so I've switched to using the GitHub API, which can be used easily without API keys. The first answer for the original question would be:
To quickly extract the values for a particular key, I personally like to use "grep -o", which only returns the regex's match. For example, to get the "text" field from tweets, something like:
This regex is more robust than you might think; for example, it deals fine with strings having embedded commas and escaped quotes inside them. I think with a little more work you could make one that is actually guaranteed to extract the value, if it's atomic. (If it has nesting, then a regex can't do it of course.)
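As an illustration only, here is a Python sketch of this kind of extraction. The pattern is an assumption, not necessarily the one used in this answer: it matches the key, then a string body made of backslash escapes or characters other than a quote or backslash, which is why embedded commas and escaped quotes do not end the match early.

```python
import re

# Assumed pattern: "text" key, colon, then a JSON string whose body is any
# mix of backslash escapes and non-quote, non-backslash characters.
TEXT_FIELD = re.compile(r'"text"\s*:\s*"((?:\\.|[^"\\])*)"')

def grep_text(line):
    """Return the raw (still escaped) value of the first "text" field, or None."""
    match = TEXT_FIELD.search(line)
    return match.group(1) if match else None

sample = r'{"truncated":false,"text":"He said \"hi\", then left","favorited":false}'
print(grep_text(sample))
```

The returned value keeps its original escaping, and nested values are out of reach for this approach.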
And to further clean (albeit keeping the string's original escaping) you can use something like | perl -pe 's/"text"://; s/^"//; s/",$//'. (I did this for this analysis.)
To all the haters who insist you should use a real JSON parser -- yes, that is essential for correctness, but grep -o is orders of magnitude faster than the Python standard json library, at least when doing this for tweets (which are ~2 KB each). I'm not sure if this is just because json is slow (I should compare to yajl sometime); but in principle, a regex should be faster since it's finite state and much more optimizable, instead of a parser that has to support recursion, and in this case, spends lots of CPU building trees for structures you don't care about. (If someone wrote a finite state transducer that did proper (depth-limited) JSON parsing, that would be fantastic! In the meantime we have "grep -o".)
To write maintainable code, I always use a real parsing library. I haven't tried jsawk, but if it works well, that would address point #1.
One last, wackier, solution: I wrote a script that uses Python json and extracts the keys you want into tab-separated columns; then I pipe through a wrapper around awk that allows named access to columns. In here: the json2tsv and tsvawk scripts. So for this example it would be:
This approach doesn't address #2, is more inefficient than a single Python script, and it's a little brittle: it forces normalization of newlines and tabs in string values, to play nice with awk's field/record-delimited view of the world. But it does let you stay on the command line, with more correctness than grep -o.
On the basis that some of the recommendations here (especially in the comments) suggested the use of Python, I was disappointed not to find an example.
So, here's a one-liner to get a single value from some JSON data. It assumes that you are piping the data in (from somewhere) and so should be useful in a scripting context.
Following martinr's and Boecko's lead:
That will give you an extremely grep-friendly output. Very convenient:
You could just download the jq binary for your platform and run it (chmod +x jq):
It extracts the "name" attribute from the JSON object. The jq homepage says it is like sed for JSON data.
Using Node.js
If the system has Node.js installed, it's possible to use the -p (print) and -e (evaluate script) flags with JSON.parse to pull out any value that is needed.
A simple example using the JSON string { "foo": "bar" } and pulling out the value of "foo":
Output:
Because we have access to cat and other utilities, we can use this for files:
Output:
Or any other format such as a URL that contains JSON:
Output:
Use Python's JSON support instead of using AWK!
Something like this:
macOS v12.3 (Monterey) removed /usr/bin/python, so we must use /usr/bin/python3 for macOS v12.3 and later.
You've asked how to shoot yourself in the foot and I'm here to provide the ammo:
You could use tr -d '{}' instead of sed. But leaving them out completely seems to have the desired effect as well.
If you want to strip off the outer quotes, pipe the result of the above through sed 's/\(^"\|"$\)//g'
I think others have sounded sufficient alarm. I'll be standing by with a cell phone to call an ambulance. Fire when ready.
Using Bash with Python
Create a Bash function in your .bashrc file:
Then
Output:
Here is the same function, but with error checking.
Here the $# -ne 1 test rejects anything other than exactly one argument, and the -t 0 test detects a terminal on standard input, so the function can insist that its input is piped in.
The nice thing about this implementation is that you can access nested JSON values and get JSON content in return! =)
Example:
Output:
If you want to be really fancy, you could pretty print the data:
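As a rough Python sketch of the behaviour described in this answer (nested access, with JSON returned as the result, optionally pretty-printed). `get_json_val` is a hypothetical helper that takes a list of keys and indexes, which is an assumption and not the argument syntax of the original Bash function:

```python
import json

def get_json_val(text, path, pretty=False):
    """Walk `path` (keys and list indexes) into the parsed JSON and
    return the selected value re-serialized as JSON text."""
    value = json.loads(text)
    for step in path:
        value = value[step]
    return json.dumps(value, indent=4 if pretty else None)

# Made-up sample document for illustration.
doc = '{"foo": {"bar": "baz", "a": [1, 2, 3]}}'
print(get_json_val(doc, ["foo", "a", 1]))
print(get_json_val(doc, ["foo"], pretty=True))
```

Because the result is serialized again with `json.dumps`, a nested object comes back as valid JSON that can be piped into another call.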
Update (2020)
My biggest issue with external tools (e.g., Python) was that you have to deal with package managers and dependencies to install them.
However, now that we have jq as a standalone, static tool that's easy to install cross-platform via GitHub Releases and Webi (webinstall.dev/jq), I'd recommend that:
Mac, Linux:
Windows 10:
Cheat Sheet: https://webinstall.dev/jq
Original (2011)
TickTick is a JSON parser written in bash (less than 250 lines of code).
Here's the author's snippet from his article, Imagine a world where Bash supports JSON:
This is using standard Unix tools available on most distributions. It also works well with backslashes (\) and quotes (").
Warning: This doesn't come close to the power of jq and will only work with very simple JSON objects. It's an attempt to answer to the original question and in situations where you can't install additional tools.
Parsing JSON with PHP CLI
It is arguably off-topic, but since precedence reigns, this question remains incomplete without a mention of our trusty and faithful PHP, am I right?
It is using the same example JSON, but let’s assign it to a variable to reduce obscurity.
Now for PHP goodness, it is using file_get_contents and the php://stdin stream wrapper.
Or as pointed out using fgets and the already opened stream at CLI constant STDIN.
If someone just wants to extract values from simple JSON objects without the need for nested structures, it is possible to use regular expressions without even leaving Bash.
Here is a function I defined using bash regular expressions based on the JSON standard:
Caveats: objects and arrays are not supported as values, but all other value types defined in the standard are supported. Also, a pair will be matched no matter how deep in the JSON document it is as long as it has exactly the same key name.
Using the OP's example:
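A rough Python sketch of the same idea (not the author's Bash function), applied to a few fields from the question. The three patterns are assumptions modelled on the scalar value types in the JSON grammar, and `json_scalar` is a hypothetical name:

```python
import re

# Assumed patterns for the scalar value types in the JSON grammar.
STRING = r'"(?:\\.|[^"\\])*"'
NUMBER = r'-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?'
LITERAL = r'true|false|null'

def json_scalar(text, key):
    """Return the raw text of the first scalar value paired with `key`,
    at any nesting depth, or None if there is no such pair."""
    pattern = r'"%s"\s*:\s*(%s|%s|%s)' % (re.escape(key), STRING, NUMBER, LITERAL)
    match = re.search(pattern, text)
    return match.group(1) if match else None

# A few fields from the question's output, reassembled into one object.
sample = '{"geo_enabled":false,"friends_count":245,"status":{"source":"web","text":"My status"}}'
for key in ("text", "friends_count", "geo_enabled"):
    print(key, "=", json_scalar(sample, key))
```

As the caveats say, a key whose value is an object (such as "status" here) yields no match, and the first pair with a matching key name wins regardless of depth.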
Unfortunately the top voted answer that uses grep returns the full match, which didn't work in my scenario, but if you know the JSON format will remain constant you can use lookbehind and lookahead to extract just the desired values.
Version which uses Ruby and http://flori.github.com/json/
Or more concisely:
This is yet another Bash and Python hybrid answer. I posted it because I wanted to process more complex JSON output while reducing the complexity of my Bash application. I want to crack open the following JSON object from http://www.arcgis.com/sharing/rest/info?f=json in Bash:
In the following example, I created my own implementation of jq and unquote leveraging Python. You'll note that once we import the Python object from json to a Python dictionary we can use Python syntax to navigate the dictionary. To navigate the above, the syntax is:
data
data[ "authInfo" ]
data[ "authInfo" ][ "tokenServicesUrl" ]
By using magic in Bash, we omit data and only supply the Python text to the right of data, i.e.
jq
jq '[ "authInfo" ]'
jq '[ "authInfo" ][ "tokenServicesUrl" ]'
Note, with no parameters, jq acts as a JSON prettifier. With parameters, we can use Python syntax to extract anything we want from the dictionary including navigating subdictionaries and array elements.
Here are the Bash Python hybrid functions:
Here's a sample usage of the Bash Python functions:
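A minimal Python sketch of the navigation described in this answer, assuming a made-up stand-in document that only mirrors the two key names mentioned (the URL value is a placeholder, not the real response). `jq_like` is a hypothetical name and is not the author's function:

```python
import json

# Made-up stand-in with the same key names as discussed above;
# the URL value is a placeholder, not the real response.
raw = '{"authInfo": {"tokenServicesUrl": "https://example.invalid/generateToken"}}'

def jq_like(text, suffix=""):
    """Evaluate `data` + suffix, e.g. suffix '[ "authInfo" ]'.
    eval() runs arbitrary code, so use this with trusted suffixes only."""
    data = json.loads(text)
    result = eval("data" + suffix, {"__builtins__": {}}, {"data": data})
    return json.dumps(result, indent=2)

print(jq_like(raw))  # no suffix: acts as a prettifier
print(jq_like(raw, '[ "authInfo" ][ "tokenServicesUrl" ]'))
```

Appending the suffix to `data` is what lets the caller write plain Python indexing syntax, at the cost of evaluating caller-supplied text.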
There is an easier way to get a property from a JSON string. Using a package.json file as an example, try this:
We're using process.env, because this gets the file's contents into Node.js as a string without any risk of malicious contents escaping their quoting and being parsed as code.
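The same idea carries over to other interpreters. As a sketch in Python, assuming the shell has placed the file's contents in an environment variable named `pkg` (a hypothetical name), the data is read as a plain string and never interpolated into source code:

```python
import json
import os

def property_from_env(var_name, prop):
    """Parse the JSON text held in environment variable `var_name`
    and return the value of property `prop`."""
    return json.loads(os.environ[var_name])[prop]

# Simulate what the shell would do when it exports the file contents;
# the content below is a made-up package.json.
os.environ["pkg"] = '{"name": "demo", "version": "1.0.0"}'
print(property_from_env("pkg", "version"))
```

Whatever the variable contains, it only ever reaches `json.loads` as data, so quoting tricks in the file cannot turn into executed code.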
Now that PowerShell is cross platform, I thought I'd throw its way out there, since I find it to be fairly intuitive and extremely simple.
ConvertFrom-Json converts the JSON into a PowerShell custom object, so you can easily work with the properties from that point forward. If you only wanted the 'id' property for example, you'd just do this:
If you wanted to invoke the whole thing from within Bash, then you'd have to call it like this:
Of course, there's a pure PowerShell way to do it without curl, which would be:
Finally, there's also ConvertTo-Json which converts a custom object to JSON just as easily. Here's an example:
Which would produce nice JSON like this:
Admittedly, using a Windows shell on Unix is somewhat sacrilegious, but PowerShell is really good at some things, and parsing JSON and XML are a couple of them. This is the GitHub page for the cross platform version: PowerShell
I cannot use any of the answers here. Neither jq, shell arrays, declare, grep -P, lookbehind, lookahead, Python, Perl, Ruby, nor even Bash is available.
The remaining answers simply do not work well. JavaScript sounded familiar, but the tin says Nescafe - so it is a no go, too :) Even if available, for my simple needs they would be overkill and slow.
Yet, it is extremely important for me to get many variables from the JSON formatted reply of my modem. I am doing it in Bourne shell (sh) with a very trimmed down BusyBox on my routers! There aren't any problems using AWK alone: just set delimiters and read the data. For a single variable, that is all!
Remember I don't have any arrays? I had to assign the AWK-parsed data to the 11 variables which I need in a shell script. Wherever I looked, that was said to be an impossible mission. No problem with that, either.
My solution is simple. This code will:
parse the .json file from the question (actually, I have borrowed a working data sample from the most upvoted answer) and pick out the quoted data, plus
create shell variables from within awk, assigning freely named shell variable names.
eval $( curl -s 'https://api.github.com/users/lambda' |
awk ' BEGIN { FS="\""; RS="," };
{
    if ($2 == "login") { print "Login=\""$4"\"" }
    if ($2 == "name") { print "Name=\""$4"\"" }
    if ($2 == "updated_at") { print "Updated=\""$4"\"" }
}' )
echo "$Login, $Name, $Updated"
There aren't any problems with blanks within. In my use, the same command parses a long single line output. As eval is used, this solution is suited for trusted data only.
It is simple to adapt it to pick up unquoted data. For a huge number of variables, a marginal speed gain can be achieved using else if. Lack of arrays obviously means no multiple records without extra fiddling. But where arrays are available, adapting this solution is a simple task.
@maikel's sed answer almost works (but I can not comment on it). For my nicely formatted data - it works. Not so much with the example used here (missing quotes throw it off). It is complicated and difficult to modify. Plus, I do not like having to make 11 calls to extract 11 variables. Why? I timed 100 loops extracting 9 variables: the sed function took 48.99 seconds and my solution took 0.91 second! Not fair? Doing just a single extraction of 9 variables: 0.51 vs. 0.02 second.
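To show what the comma record separator and double-quote field separator do, here is a rough Python emulation of the awk script above. It is a sketch only: the sample reply is made up, `awk_like_assignments` is a hypothetical name, and the use of `shlex.quote` is an addition not present in the original, included because the output is meant for eval.

```python
import shlex

def awk_like_assignments(text, wanted):
    """Emulate awk with a comma record separator and a double-quote field
    separator: field 2 is treated as the key and field 4 as the value.
    `wanted` maps JSON keys to shell variable names."""
    lines = []
    for record in text.split(","):
        fields = record.split('"')
        # awk fields are 1-based: $2 -> fields[1], $4 -> fields[3]
        if len(fields) >= 4 and fields[1] in wanted:
            lines.append("%s=%s" % (wanted[fields[1]], shlex.quote(fields[3])))
    return lines

# Made-up sample shaped like the API reply used above.
sample = '{"login":"lambda","id":1,"name":"Some Name","updated_at":"2020-01-01T00:00:00Z"}'
print("\n".join(awk_like_assignments(sample, {"login": "Login", "name": "Name"})))
```

Unquoted values such as the numeric "id" produce fewer than four fields and are skipped, which matches the remark that picking up unquoted data needs an adaptation.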
Someone who also has XML files, might want to look at my Xidel. It is a command-line interface, dependency-free JSONiq processor. (I.e., it also supports XQuery for XML or JSON processing.)
The example in the question would be:
Or with my own, nonstandard extension syntax:
You can try something like this -
One interesting tool that hasn't been covered in the existing answers is gron, written in Go, which has a tagline that says Make JSON greppable!, which is exactly what it does.
So essentially gron breaks down your JSON into discrete assignments, each showing the absolute 'path' to a value. Its primary advantage over other tools like jq is that it allows searching for a value without knowing how deeply the record is nested, and without breaking the original JSON structure.
E.g., I want to search for the 'twitter_username' field from the following link, I just do:
As simple as that. Note how gron -u (short for ungron) reconstructs the JSON back from the search path. The need for fgrep is just to filter your search to the paths needed and not let the search expression be evaluated as a regex, but as a fixed string (which is essentially grep -F).
Another example, searching for a string to see where in the nested structure the record is:
It also supports streaming JSON with its -s command line flag, where you can continuously gron the input stream for a matching record. Also, gron has zero runtime dependencies. You can download a binary for Linux, Mac, Windows or FreeBSD and run it.
More usage examples and tips can be found at the official GitHub page, under Advanced Usage.
As for why one might use gron over other JSON parsing tools, see the author's note on the project page: Why shouldn't I just use jq?
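To illustrate the idea of flattening JSON into greppable path assignments, here is a small Python sketch. It is not gron itself and only loosely imitates its output style; the document is made up, and keys containing special characters are not handled.

```python
import json

def flatten(value, path="json"):
    """Yield one 'path = value;' line per node, loosely in the style of gron."""
    if isinstance(value, dict):
        yield "%s = {};" % path
        for key, child in value.items():
            yield from flatten(child, "%s.%s" % (path, key))
    elif isinstance(value, list):
        yield "%s = [];" % path
        for index, child in enumerate(value):
            yield from flatten(child, "%s[%d]" % (path, index))
    else:
        yield "%s = %s;" % (path, json.dumps(value))

# Made-up document; the key name matches the field searched for above.
doc = json.loads('{"user": {"twitter_username": null, "tags": ["a", "b"]}}')
lines = list(flatten(doc))
print("\n".join(line for line in lines if "twitter_username" in line))
```

Each line carries its full path, so a plain fixed-string filter finds the field without knowing how deeply it is nested.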
You can use jshon:
Here's one way you can do it with AWK:
There is also a very simple, but powerful, JSON CLI processing tool, fx.
Examples
Use an anonymous function:
Output:
If you don't pass an anonymous function (param => ...), the code will be automatically transformed into an anonymous function, and you can access the JSON through the this keyword:
Or just use dot syntax too:
Output:
You can pass any number of anonymous functions for reducing JSON:
Output:
You can update existing JSON using spread operator:
Output:
Just plain JavaScript. There isn't any need to learn new syntax.
Later versions of fx have an interactive mode!
Here is a good reference. In this case:
Parsing JSON is painful in a shell script. With a more appropriate language, create a tool that extracts JSON attributes in a way consistent with shell scripting conventions. You can use your new tool to solve the immediate shell scripting problem and then add it to your kit for future situations.
For example, consider a tool jsonlookup such that if I say jsonlookup access token id it will return the attribute id defined within the attribute token defined within the attribute access from standard input, which is presumably JSON data. If the attribute doesn't exist, the tool returns nothing (exit status 1). If the parsing fails, exit status 2 and a message to standard error. If the lookup succeeds, the tool prints the attribute's value.
Having created a Unix tool for the precise purpose of extracting JSON values you can easily use it in shell scripts:
Any language will do for the implementation of jsonlookup. Here is a fairly concise Python version:
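As one possible sketch of such a tool (an illustration of the exit-status convention described above, not the author's code), the core lookup can be written as a function that returns the status and the text to print:

```python
import json
import sys

def lookup(text, path):
    """Return (exit_status, output): 0 and the value on success, 1 and no
    output if an attribute is missing, 2 and a message on a parse error."""
    try:
        value = json.loads(text)
    except ValueError as error:
        return 2, "jsonlookup: parse error: %s" % error
    for name in path:
        if not isinstance(value, dict) or name not in value:
            return 1, ""
        value = value[name]
    return 0, value if isinstance(value, str) else json.dumps(value)

def main(argv):
    """Read JSON from standard input and look up the attribute path in argv[1:]."""
    status, output = lookup(sys.stdin.read(), argv[1:])
    if output:
        # Error messages go to standard error, values to standard output.
        print(output, file=sys.stderr if status == 2 else sys.stdout)
    return status

# In a script file, finish with: sys.exit(main(sys.argv))
```

Keeping the lookup separate from the I/O makes the three outcomes easy to check without a shell.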
A two-liner which uses Python. It works particularly well if you're writing a single .sh file and you don't want to depend on another .py file. It also leverages the usage of pipe |. echo "{\"field\": \"value\"}" can be replaced by anything printing a JSON file to standard output.
If you have the PHP interpreter installed:
For example:
We have a resource that provides JSON content with countries' ISO codes: http://country.io/iso3.json and we can easily see it in a shell with curl:
But that is not very convenient or readable. It is better to parse the JSON content and see a readable structure:
This code will print something like:
If you have nested arrays, this output will look much better...
I needed something in Bash that was short and would run without dependencies beyond vanilla Linux LSB and Mac OS for both Python 2.7 & 3 and handle errors, e.g. would report JSON parse errors and missing property errors without spewing Python exceptions: