Inserting a newline in sed (Mac OS X)

Posted 2024-11-09 13:44:26

How do I insert a newline in the replacement part of sed?

This code isn't working:

sed "s/\(1234\)/\n\1/g" input.txt > output.txt

where input.txt is:

test1234foo123bar1234

and output.txt should be:

test
1234foo123bar
1234

but instead I get this:

testn1234foo123barn1234

NOTE:

This question is specifically about the Mac OS X version of "sed", and the community has noted that it behaves differently than, say, Linux versions.
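
To see which behavior a given sed has before choosing a workaround, a small probe along these lines may help. This is only a sketch: the function name sed_newline_support is made up for illustration, and it assumes a POSIX sh with printf, wc, and tr available.

```shell
# Hypothetical helper: report how the local sed treats \n in a replacement.
sed_newline_support() {
    # Replace "a" with \n in the one-line input "ab" and count output lines.
    # Two lines means \n became a real newline; one line means a literal "n".
    lines=$(printf 'ab\n' | sed 's/a/\n/' | wc -l | tr -d ' ')
    if [ "$lines" -gt 1 ]; then
        printf '%s\n' 'expands'
    else
        printf '%s\n' 'literal'
    fi
}

sed_newline_support
```

A result of "literal" matches the behavior described in the question; "expands" means the original command should already work.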

Comments (9)

怎樣才叫好 2024-11-16 13:44:26

Your sed version apparently does not support \n in the RHS (right-hand side of the substitution). You should read THE SED FAQ maintained by Eric Pement to choose one of the possible solutions. I suggest first trying to insert a literal newline character.

Below is the relevant quote from it.


4.1. How do I insert a newline into the RHS of a substitution?

Several versions of sed permit \n to be typed directly into the RHS, which is then converted to a newline on output: ssed, gsed302a+, gsed103 (with the -x switch), sed15+, sedmod, and UnixDOS sed. The easiest solution is to use one of these versions.

For other versions of sed, try one of the following:

(a) If typing the sed script from a Bourne shell, use one backslash \ if the script uses 'single quotes' or two backslashes \\ if the script requires "double quotes". In the example below, note that the leading > on the 2nd line is generated by the shell to prompt the user for more input. The user types in slash, single-quote, and then ENTER to terminate the command:

 [sh-prompt]$ echo twolines | sed 's/two/& new\
 >/'
 two new
 lines
 [bash-prompt]$

(b) Use a script file with one backslash \ in the script, immediately followed by a newline. This will embed a newline into the "replace" portion. Example:

 sed -f newline.sed files

 # newline.sed
 s/twolines/two new\
 lines/g

Some versions of sed may not need the trailing backslash. If so, remove it.
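
Applied to the input from the question, option (b) could be sketched as follows. The temporary file path and the variable name script are assumptions chosen for illustration; the quoted here-document delimiter ('EOF') keeps the shell from touching the backslashes.

```shell
# Write the sed script to a temporary file (the path template is arbitrary).
script=$(mktemp "${TMPDIR:-/tmp}/newline.XXXXXX")

# The backslash at the end of the first script line escapes the newline,
# making it part of the replacement text.
cat > "$script" <<'EOF'
s/\(1234\)/\
\1/g
EOF

printf '%s\n' test1234foo123bar1234 | sed -f "$script"
# Remove the file when finished: rm -f "$script"
```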

(c) Insert an unused character and pipe the output through tr:

 echo twolines | sed 's/two/& new=/' | tr "=" "\n"   # produces
 two new
 lines
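
For the input from the question, option (c) might look like the sketch below. The function name break_before_1234 is hypothetical, and '=' is assumed never to occur in the data; any otherwise unused character would do.

```shell
# Hypothetical helper: mark each match with '=', then let tr turn the
# marker into a newline. Assumes '=' is absent from the input.
break_before_1234() {
    sed 's/1234/=&/g' | tr '=' '\n'
}

printf '%s\n' test1234foo123bar1234 | break_before_1234
```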

(d) Use the G command:

G appends a newline, plus the contents of the hold space to the end of the pattern space. If the hold space is empty, a newline is appended anyway. The newline is stored in the pattern space as \n where it can be addressed by grouping \(...\) and moved in the RHS. Thus, to change the "twolines" example used earlier, the following script will work:

 sed '/twolines/{G;s/\(two\)\(lines\)\(\n\)/\1\3\2/;}'

(e) Inserting full lines, not breaking lines up:

If one is not changing lines but only inserting complete lines before or after a pattern, the procedure is much easier. Use the i (insert) or a (append) command, making the alterations by an external script. To insert This line is new BEFORE each line matching a regex:

 /RE/i This line is new               # HHsed, sedmod, gsed 3.02a
 /RE/{x;s/$/This line is new/;G;}     # other seds

The two examples above are intended as "one-line" commands entered from the console. If using a sed script, i\ immediately followed by a literal newline will work on all versions of sed. Furthermore, the command s/$/This line is new/ will only work if the hold space is already empty (which it is by default).

To append This line is new AFTER each line matching a regex:

 /RE/a This line is new               # HHsed, sedmod, gsed 3.02a
 /RE/{G;s/$/This line is new/;}       # other seds

To append 2 blank lines after each line matching a regex:

 /RE/{G;G;}                    # assumes the hold space is empty

To replace each line matching a regex with 5 blank lines:

 /RE/{s/.*//;G;G;G;G;}         # assumes the hold space is empty

(f) Use the y/// command if possible:

On some Unix versions of sed (not GNU sed!), though the s/// command won't accept \n in the RHS, the y/// command does. If your Unix sed supports it, a newline after aaa can be inserted this way (which is not portable to GNU sed or other seds):

 s/aaa/&~/; y/~/\n/;    # assuming no other '~' is on the line!
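
As a sketch of option (f) on the input from the question: the function name break_with_y is made up, '~' is assumed to be absent from the data, and the local sed is assumed to accept \n in y///. POSIX describes that behavior, but as the FAQ warns, it is worth verifying on the sed at hand.

```shell
# Hypothetical helper: mark each match with '~' via s///, then translate
# the marker to a newline via y///. Assumes '~' is absent from the input.
break_with_y() {
    sed 's/1234/~&/g; y/~/\n/'
}

printf '%s\n' test1234foo123bar1234 | break_with_y
```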
月光色 2024-11-16 13:44:26

Here's a single-line solution that works with any POSIX-compatible sed (including the FreeBSD version on macOS), assuming your shell is bash, ksh, or zsh:

sed 's/\(1234\)/\'$'\n''\1/g' <<<'test1234foo123bar1234'

(Note that you could use a single ANSI C-quoted string as the entire sed script, sed $'...' <<<, but that would necessitate \-escaping all \ instances (doubling them), which is quite cumbersome and hinders readability, as evidenced by @tovk's answer.)

  • $'\n' represents a newline and is an instance of ANSI C quoting, which allows you to create strings with control-character escape sequences.
  • The above splices the ANSI C-quoted string into the sed script as follows:
    • The script is simply broken into 2 single-quoted strings, with the ANSI C-quoted string stuck between the two halves.
    • 's/\(1234\)/\' is the 1st half. Note that it ends in \, so as to escape the newline that will be inserted as the next character. (This escaping is necessary to mark the newline as part of the replacement string rather than having it interpreted as the end of the command.)
    • $'\n' is the ANSI C-quoted representation of a newline character, which the shell expands to an actual newline before passing the script to sed.
    • '\1/g' is the 2nd half.

Note that this solution works analogously for other control characters, such as $'\t' to represent a tab character.

柠栀 2024-11-16 13:44:26

I could convince the Solaris version of sed to work this way (in bash):

echo test1234foo123bar1234 | sed 's/\(1234\)/\
\1/g'

(you have to put the line break directly after the backslash).

In csh I had to put one more backslash:

echo test1234foo123bar1234 | sed 's/\(1234\)/\\
\1/g'

The GNU version of sed simply worked using \n:

echo test1234foo123bar1234 | sed 's/\(1234\)/\n\1/g'
甜嗑 2024-11-16 13:44:26

Perl provides a richer "extended" regex syntax which is useful here:

perl -p -e 's/(?=1234)/\n/g'

This means "substitute a newline for each zero-width position that is immediately followed by the pattern 1234" ((?=1234) is a lookahead). It avoids having to capture and repeat part of the expression with backreferences.

紫竹語嫣☆ 2024-11-16 13:44:26

Get a GNU sed.

$ brew install gnu-sed

Then your command will work as expected:

$ gsed "s/\(1234\)/\n\1/g" input.txt
test
1234foo123bar
1234

N.B.: GNU sed is also available through MacPorts.

叶落知秋 2024-11-16 13:44:26

Unfortunately, for me, sed does not expand \n in the replacement string; it prints a literal n instead:

$ echo test1234foo123bar1234 | sed "s/\(1234\)/\n\1/g"
testn1234foo123barn1234

If that happens for you as well, an alternative is to use:

$ echo test1234foo123bar1234 | sed "s/\(1234\)/\\`echo -e '\n\r'`\1/g"

This should work anywhere and will produce:

test
1234foo123bar
1234

For your example with an input.txt file as input and output.txt as output, use:

$ sed "s/\(1234\)/\\`echo -e '\n\r'`\1/g" input.txt > output.txt
南烟 2024-11-16 13:44:26

Try this:

$ echo test1234foo123bar1234 | sed "s/\(1234\)/\n\1/g"
test
1234foo123bar
1234

From the GNU sed documentation:

g
    Apply the replacement to all matches to the regexp, not just the first. 
猥琐帝 2024-11-16 13:44:26

You may also use the $'string' feature of Bash:

man bash | less -p "\\$'"

printf '%s' 'test1234foo123bar1234' | sed $'s/\\(1234\\)/\\\n\\1/g'
帝王念 2024-11-16 13:44:26

The newline in the middle of the command can feel a bit clumsy:

$ echo abc | sed 's/b/\
/'
a
c

Here are two solutions to this problem which I think should be quite portable
(should work for any POSIX-compliant sh, printf, and sed):

Solution 1:

Remember to escape any \ and % characters for printf here:

$ echo abc | sed "$(printf 's/b/\\\n/')"
a
c

To avoid the need for escaping \ and % characters for printf:

$ echo abc | sed "$(printf '%s\n%s' 's/b/\' '/')"
a
c

Solution 2:

Make a variable containing a newline like this:

newline="$(printf '\nx')"; newline="${newline%x}"

Or like this:

newline='
'

Then use it like this:

$ echo abc | sed "s/b/\\${newline}/"
a
c
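
Solution 2 can be wrapped in a small reusable function, sketched below for the input from the original question. The names nl and break_before are made up for illustration, and the function assumes its argument is a basic regular expression containing no unescaped "/".

```shell
# A variable holding exactly one newline character.
nl='
'

# Hypothetical helper: insert a newline before every match of the basic
# regular expression given as $1 (which must not contain an unescaped "/").
break_before() {
    # Inside double quotes, \\ yields one backslash for sed; the backslash
    # placed before ${nl} escapes the newline within the replacement.
    sed "s/\\($1\\)/\\${nl}\\1/g"
}

printf '%s\n' test1234foo123bar1234 | break_before 1234
```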