How can I replace text using sed or awk?

Posted 2024-09-12 18:47:47

I have the following json file:

 { "last_modified": {
         "type": "/type/datetime", 
         "value": "2008-04-01T03:28:50.625462" }, 
     "type": { "key": "/type/author" }, 
     "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.", 
     "key": "/authors/OL2108538A", 
     "revision": 1 }

The name value has a double quote and I only want to replace this double quote with a single quote (not any other double quote). How can I do it?

柠栀 2024-09-19 18:47:47

If you want to replace all occurrences of a single character, you can also use the tr command, which is simpler than sed or awk:

   cat myfile.txt | tr \" \'

Notice that both quotes are escaped. For characters other than quotes, no escaping is needed and you just write:

   cat myfile.txt | tr a A

Edit: Note that after the question was edited this answer is no longer valid: it replaces all double quotes, not only the one inside the Name property.

伴我心暖 2024-09-19 18:47:47

I think sed would be a better fit, as shown below:

sed "s/\"/'/g" yourfile

痴骨ら 2024-09-19 18:47:47

Adding some other weird error cases to your input

{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
  "type": {"key": "/type/author"},
  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
  "key": "/authors/OL2108538A",
  "revision": 1,
  "has \" escaped quote": 1,
  "has \" escaped quotes \"": 1,
  "has multiple " internal " quotes": 1,
}

this Perl program that corrects unescaped internal double-quotes using the heuristic that a string's actual closing quote is followed by optional whitespace and either a colon, comma, semicolon, or curly brace

#! /usr/bin/perl -p

s<"(.+?)"(\s*[:,;}])> {
  my($text,$terminator) = ($1,$2);
  $text =~ s/(?<!\\)"/'/g;  # " oh, the irony!
  qq["$text"] . $terminator;
}eg;

produces the following output:

$ ./fixdqs input.json
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
  "type": {"key": "/type/author"},
  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
  "key": "/authors/OL2108538A",
  "revision": 1,
  "has \" escaped quote": 1,
  "has \" escaped quotes \"": 1,
  "has multiple ' internal ' quotes": 1,
}

Delta from input to output:

$ diff -ub input.json <(./fixdqs input.json)
--- input.json
+++ /dev/fd/63
@@ -1,9 +1,9 @@
 { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
   "type": {"key": "/type/author"},
-  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
+  "name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
   "key": "/authors/OL2108538A",
   "revision": 1,
   "has \" escaped quote": 1,
   "has \" escaped quotes \"": 1,
-  "has multiple " internal " quotes": 1,
+  "has multiple ' internal ' quotes": 1,
 }
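
For readers who do not use Perl, the same heuristic can be sketched in Python. This is only an illustrative port under the answer's stated assumption (a string's real closing quote is followed by optional whitespace and one of `:`, `,`, `;`, `}`); the function name `fix_inner_quotes` is hypothetical and not part of the original answer.

```python
import re

# Assumption carried over from the Perl answer: the real closing quote of a
# string is followed by optional whitespace and one of : , ; }
STRING_RE = re.compile(r'"(.+?)"(\s*[:,;}])')


def fix_inner_quotes(line: str) -> str:
    """Replace unescaped double quotes inside a quoted string with single quotes."""
    def repair(match):
        text, terminator = match.group(1), match.group(2)
        # The negative lookbehind leaves already-escaped quotes (\") untouched.
        text = re.sub(r'(?<!\\)"', "'", text)
        return '"' + text + '"' + terminator

    return STRING_RE.sub(repair, line)


if __name__ == "__main__":
    print(fix_inner_quotes('  "has multiple " internal " quotes": 1,'))
```

Like the Perl version, this is a heuristic rather than a parser: a stray quote that happens to be followed by a comma or colon inside a value would be treated as the end of the string.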

_蜘蛛 2024-09-19 18:47:47

If you mean just the double quote in 'Rico"s', you can use:

sed "s/Rico\"s/Rico's/"

as in:

pax> echo '{"name": "National Res...rto Rico"s Economy.", "key": "blah"}' \
     | sed "s/Rico\"s/Rico's/"
{"name": "National Res...rto Rico's Economy.", "key": "blah"}

笑梦风尘 2024-09-19 18:47:47

Assuming your data is exactly like you showed and the extra double quotes only appear in the name value field:

Update:

I made the script slightly more robust (handling ', ' inside fields).

BEGIN {
    q = "\""
    FS = OFS = q ", " q
}
{
    split($1, arr, ": " q)
    gsub(q, "'", arr[2])
    print arr[1] ": " q arr[2], $2, $3
}

Put this script in a file (say dequote.awk) and run the script with
awk -f dequote.awk input.json > output.json.

Update 2:

Okay, so your input is extremely difficult to process. The only other thing I can think of is this:

{
    start = match($0, "\"name\": ") + 8
    stop = match($0, "\", \"key\": ")
    if (start == 8 || stop == 0) {
        print
        next
    }
    pre = substr($0, 1, start)
    post = substr($0, stop)
    name = substr($0, start + 1, stop - start - 1)
    gsub("\"", "'", name)
    print pre name post
}

Explanation: I try to chop the line in three parts:

1. everything up to and including the opening double quote of the "name" value field;
2. the "name" value field minus its surrounding double quotes;
3. the closing double quote and the rest of the line.

In part 2 I replace all double quotes by single quotes. Then I glue the three parts back together and print them.
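
The three-part split described above can also be sketched in Python. This is a minimal illustration, assuming (as the awk script does) that the "name" and "key" fields sit on the same line and that "key" directly follows "name"; the function name `dequote_name` and the marker variables are hypothetical.

```python
def dequote_name(line: str) -> str:
    """Replace double quotes inside the "name" value with single quotes."""
    open_marker = '"name": "'    # ends with the opening quote of the value
    close_marker = '", "key": '  # starts with the closing quote of the value

    start = line.find(open_marker)
    if start == -1:
        return line  # no "name" field on this line: leave it unchanged
    value_start = start + len(open_marker)

    stop = line.find(close_marker, value_start)
    if stop == -1:
        return line  # no "key" field after the name: leave it unchanged

    pre = line[:value_start]       # part 1: up to and including the opening quote
    name = line[value_start:stop]  # part 2: the value without its outer quotes
    post = line[stop:]             # part 3: closing quote and the rest of the line
    return pre + name.replace('"', "'") + post


if __name__ == "__main__":
    sample = '{"name": "Puerto Rico"s Economy.", "key": "/authors/OL2108538A"}'
    print(dequote_name(sample))
```

Lines that do not contain both markers pass through unchanged, which mirrors the early `print; next` branch of the awk script.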

君勿笑 2024-09-19 18:47:47

awk '{for(i=1;i<=NF;i++) if($i~/name/) { gsub("\042","\047",$(i+1)) }   }1' file

暮色兮凉城 2024-09-19 18:47:47

If you mean just the quotes around "name", then you can use sed from the command line or in a bash script:

    sed -i 's/ "name"/ '\'name\''/g' filename.json

Tested, works.
