How do I replace text using sed or awk?
I have the following JSON file:
{ "last_modified": {
"type": "/type/datetime",
"value": "2008-04-01T03:28:50.625462" },
"type": { "key": "/type/author" },
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
"key": "/authors/OL2108538A",
"revision": 1 }
The name value has a double quote and I only want to replace this double quote with a single quote (not any other double quote). How can I do it?
If you want to replace all occurrences of a single character, you can also use the tr command, which is simpler than sed or awk. Note that both quotes have to be escaped. If you are replacing characters other than quotes, you can write them as they are.
Edit: Note that after the question was edited this answer is no longer valid: it replaces all double quotes, not only the one inside the name property.
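The command from this answer did not survive on this page, so here is a minimal sketch of what a tr call of this kind could look like. The function name dequote_all is hypothetical and only there for illustration.

```shell
# Replace every double quote on stdin with a single quote.
# Both quote characters are backslash-escaped so the shell hands them to tr literally.
dequote_all() {
  tr \" \'
}

printf '%s\n' '"name": "Puerto Rico"s Economy."' | dequote_all
# prints: 'name': 'Puerto Rico's Economy.'
```

As the edit note says, this touches every double quote in the input, including the ones that delimit the JSON strings.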
I think it would be better to use sed for this.
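The sed command itself is missing here. One possible sketch, under the assumption that the stray quote always sits directly between two alphanumeric characters (as in Rico"s) and that the "name" pair is on its own line, is shown next. The function name fix_name_quote is hypothetical.

```shell
# Only lines containing "name": are touched, and only a double quote that sits
# directly between two alphanumeric characters is turned into a single quote.
fix_name_quote() {
  sed "/\"name\":/ s/\([[:alnum:]]\)\"\([[:alnum:]]\)/\1'\2/g"
}

printf '%s\n' '"name": "Base of Puerto Rico"s Economy.",' | fix_name_quote
# prints: "name": "Base of Puerto Rico's Economy.",
```

The delimiting quotes survive because each of them has a space, colon, period, or comma on at least one side.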
After adding some other weird error cases to your input, I used a Perl program that corrects unescaped internal double quotes using the heuristic that a string's actual closing quote is followed by optional whitespace and either a colon, comma, semicolon, or curly brace.
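The Perl program is not reproduced on this page. The same heuristic can be sketched in awk instead (this is a swapped-in tool, not the answer's code). The function name fix_inner_quotes is hypothetical, and treating end of line as a valid terminator is an added assumption, not part of the stated heuristic.

```shell
fix_inner_quotes() {
  awk '
  {
    out = ""; in_str = 0; n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\"") {
        if (!in_str) {
          in_str = 1      # opening quote of a string
        } else if (substr($0, i + 1) ~ /^[ \t]*([,:;}]|$)/) {
          in_str = 0      # real closing quote (end of line is an added assumption)
        } else {
          c = "\047"      # stray inner quote: emit a single quote instead
        }
      }
      out = out c
    }
    print out
  }'
}

printf '%s\n' '"name": "Base of Puerto Rico"s Economy.",' | fix_inner_quotes
# prints: "name": "Base of Puerto Rico's Economy.",
```

Being a heuristic, it will misjudge a stray quote that happens to be followed by a comma or colon inside the text.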
If you mean just the double quote in 'Rico"s', you can match that exact text and replace it.
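The command is missing from this answer. A minimal sketch of a literal match, assuming sed is the tool and using a hypothetical function name fix_rico, could be:

```shell
# Replace the literal text Rico"s with Rico's and leave everything else alone.
fix_rico() {
  sed "s/Rico\"s/Rico's/g"
}

printf '%s\n' '"name": "Base of Puerto Rico"s Economy.",' | fix_rico
# prints: "name": "Base of Puerto Rico's Economy.",
```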
This assumes your data is exactly as you showed and that the extra double quotes appear only in the name value field.
Update:
I made the script slightly more robust (handling ', ' inside fields).
Put this script in a file (say dequote.awk) and run it with awk -f dequote.awk input.json > output.json.
Update 2:
Okay, so your input is extremely difficult to process. The only other thing I can think of is this:
Explanation: I try to chop the line into three parts. In part 2 I replace all double quotes with single quotes. Then I glue the three parts back together and print them.
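The script and the list of the three parts are missing here. One plausible reading, offered only as a sketch, is: part 1 runs up to and including the opening quote of the name value, part 2 is the value text, and part 3 is the closing quote plus whatever follows it. The function name dequote_name is hypothetical, and the sketch assumes the "name" pair is the last thing on its line.

```shell
dequote_name() {
  awk '
  match($0, /"name":[ \t]*"/) {
    head = substr($0, 1, RSTART + RLENGTH - 1)   # part 1: through the opening quote
    rest = substr($0, RSTART + RLENGTH)
    # The closing quote is the one followed only by an optional , or } before end of line.
    if (match(rest, /"[ \t]*[,}]?[ \t]*$/)) {
      body = substr(rest, 1, RSTART - 1)         # part 2: the value text
      tail = substr(rest, RSTART)                # part 3: closing quote and the rest
      gsub(/"/, "\047", body)                    # \047 is the single quote character
      $0 = head body tail
    }
  }
  { print }'
}

printf '%s\n' '"name": "Base of Puerto Rico"s Economy.",' | dequote_name
# prints: "name": "Base of Puerto Rico's Economy.",
```

With the file names used in the answer, this would be run as dequote_name < input.json > output.json.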
If it is just the quotes around "name", then you can use sed from the command line or in a bash script.
Tested, works.
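The sed command is missing from this answer. Reading it literally (swapping the double quotes around the key name for single quotes), a sketch could look like the following. The function name requote_name_key is hypothetical.

```shell
# Replace the double quotes around the key "name" with single quotes.
requote_name_key() {
  sed "s/\"name\"/'name'/g"
}

printf '%s\n' '"name": "Some Author",' | requote_name_key
# prints: 'name': "Some Author",
```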