Inserting a newline in sed (Mac OS X)
How do I insert a newline in the replacement part of sed?
This code isn't working:
sed "s/\(1234\)/\n\1/g" input.txt > output.txt
where input.txt is:
test1234foo123bar1234
and output.txt should be:
test
1234foo123bar
1234
but instead I get this:
testn1234foo123barn1234
NOTE:
This question is specifically about the Mac OS X version of "sed", and the community has noted that it behaves differently than, say, Linux versions.
Your sed version apparently does not support `\n` in the RHS (right-hand side of the substitution). You should read THE SED FAQ maintained by Eric Pement to choose one of the possible solutions. I suggest first trying to insert a literal newline character. Below is the quote from it.
4.1. How do I insert a newline into the RHS of a substitution?
Several versions of sed permit `\n` to be typed directly into the RHS, which is then converted to a newline on output: ssed, gsed302a+, gsed103 (with the `-x` switch), sed15+, sedmod, and UnixDOS sed. The easiest solution is to use one of these versions.
For other versions of sed, try one of the following:
(a) If typing the sed script from a Bourne shell, use one backslash `\` if the script uses 'single quotes' or two backslashes `\\` if the script requires "double quotes". In the example below, note that the leading `>` on the 2nd line is generated by the shell to prompt the user for more input. The user types in slash, single-quote, and then ENTER to terminate the command:
(b) Use a script file with one backslash `\` in the script, immediately followed by a newline. This will embed a newline into the "replace" portion. Example:
Some versions of sed may not need the trailing backslash. If so, remove it.
(c) Insert an unused character and pipe the output through tr:
(d) Use the `G` command:
G appends a newline, plus the contents of the hold space, to the end of the pattern space. If the hold space is empty, a newline is appended anyway. The newline is stored in the pattern space as `\n`, where it can be addressed by grouping `\(...\)` and moved in the RHS. Thus, to change the "twolines" example used earlier, the following script will work:
(e) Inserting full lines, not breaking lines up:
If one is not changing lines but only inserting complete lines before or after a pattern, the procedure is much easier. Use the `i` (insert) or `a` (append) command, making the alterations by an external script. To insert `This line is new` BEFORE each line matching a regex:
The two examples above are intended as "one-line" commands entered from the console. If using a sed script, `i\` immediately followed by a literal newline will work on all versions of sed. Furthermore, the command `s/$/This line is new/` will only work if the hold space is already empty (which it is by default).
To append `This line is new` AFTER each line matching a regex:
To append 2 blank lines after each line matching a regex:
To replace each line matching a regex with 5 blank lines:
(f) Use the `y///` command if possible:
On some Unix versions of sed (not GNU sed!), though the `s///` command won't accept `\n` in the RHS, the `y///` command does. If your Unix sed supports it, a newline after `aaa` can be inserted this way (which is not portable to GNU sed or other seds):
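The FAQ's own examples are not reproduced in the quote above, so here is a rough sketch of options (a), (c), (d) and (e). These are not the FAQ's verbatim commands. The `=` placeholder, the sample lines `a`/`RE`/`b`, and the `out_*` variable names are assumptions made for illustration only.

```shell
# (a)/(b): a backslash followed by a literal newline inside the replacement.
out_a=$(echo twolines | sed 's/two/& new\
/')

# (c): insert a placeholder that cannot occur in the data ("=" is an
# assumption here), then turn it into a newline with tr.
out_c=$(echo twolines | sed 's/two/& new=/' | tr '=' '\n')

# (d): G appends a newline to the pattern space; the three groups then
# move that newline between "two" and "lines".
out_d=$(echo twolines | sed '/twolines/{G;s/\(two\)\(lines\)\(\n\)/\1\3\2/;}')

# (e): insert a complete line before each line matching a regex
# (RE is a stand-in for a real pattern).
out_e=$(printf 'a\nRE\nb\n' | sed '/RE/i\
This line is new')

printf '%s\n---\n' "$out_a" "$out_c" "$out_d" "$out_e"
```

Option (a) is the form that carries over most directly to the `1234` example in the question.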
Here's a single-line solution that works with any POSIX-compatible `sed` (including the FreeBSD version on macOS), assuming your shell is `bash` or `ksh` or `zsh`:
Note that you could use a single ANSI C-quoted string as the entire `sed` script, `sed $'...' <<<`, but that would necessitate `\`-escaping all `\` instances (doubling them), which is quite cumbersome and hinders readability, as evidenced by @tovk's answer.
`$'\n'` represents a newline and is an instance of ANSI C quoting, which allows you to create strings with control-character escape sequences.
The newline is spliced into the `sed` script as follows:
`'s/\(1234\)/\'` is the 1st half - note that it ends in `\`, so as to escape the newline that will be inserted as the next char. (This escaping is necessary to mark the newline as part of the replacement string rather than being interpreted as the end of the command.)
`$'\n'` is the ANSI C-quoted representation of a newline character, which the shell expands to an actual newline before passing the script to `sed`.
`'\1/g'` is the 2nd half.
Note that this solution works analogously for other control characters, such as `$'\t'` to represent a tab character.
Background info:
The POSIX `sed` specification: http://man.cx/sed
BSD `sed` (also used on macOS) stays close to this spec, while GNU `sed` offers many extensions.
A summary of the differences between GNU `sed` and BSD `sed` can be found at https://stackoverflow.com/a/24276470/45375
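The command itself did not survive in this copy of the answer. A sketch assembled from the three pieces described above (first half, `$'\n'`, second half) might look like this. It assumes `bash`, `ksh` or `zsh`, since `$'...'` is not understood by every POSIX `sh`. The `out` variable is only there for illustration.

```shell
# Three adjacent shell words are concatenated into one sed script:
#   's/\(1234\)/\'   first half, ending in a backslash
#   $'\n'            an actual newline, produced by ANSI C quoting
#   '\1/g'           second half
out=$(printf '%s\n' 'test1234foo123bar1234' |
  sed 's/\(1234\)/\'$'\n''\1/g')
printf '%s\n' "$out"
```

With the files from the question, the same script would be run as `sed 's/\(1234\)/\'$'\n''\1/g' input.txt > output.txt`.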
The Solaris version of `sed` I could convince to work this way (in `bash`):
(you have to put the line break directly after the backslash).
In `csh` I had to put one more backslash:
The GNU version of `sed` simply worked using `\n`:
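A minimal sketch of the `bash` form described in this answer, applied to the sample input from the question (the exact commands of the original answer are not shown here, so treat this as an approximation):

```shell
# The line break must come directly after the backslash in the replacement.
out=$(printf '%s\n' 'test1234foo123bar1234' | sed 's/\(1234\)/\
\1/g')
printf '%s\n' "$out"
```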
Perl provides a richer "extended" regex syntax which is useful here:
The substitution means "substitute a newline for the zero-width match just before each occurrence of the pattern 1234". This avoids having to capture and repeat part of the expression with backreferences.
Get GNU sed.
Then your command will work as expected:
N.B.: you can also get GNU sed through MacPorts.
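As a hedged sketch of this route: it assumes Homebrew is the package manager, whose `gnu-sed` formula installs the binary as `gsed`. The fallback check and the `SED`/`out` variable names are illustrative assumptions, not part of the original answer.

```shell
# One way to get GNU sed on macOS (assumes Homebrew):
#   brew install gnu-sed        # installs the command as "gsed"

if command -v gsed >/dev/null 2>&1; then
  SED=gsed
elif sed --version >/dev/null 2>&1; then
  SED=sed   # the default sed already answers --version, i.e. it is GNU sed
else
  echo 'no GNU sed found' >&2
  SED=
fi

if [ -n "$SED" ]; then
  # With GNU sed, \n in the replacement is turned into a newline.
  out=$(printf '%s\n' 'test1234foo123bar1234' | "$SED" 's/\(1234\)/\n\1/g')
  printf '%s\n' "$out"
fi
```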
Unfortunately, for me, `sed` seems to ignore `\n`s in the replacement string.
If that happens for you as well, an alternative is to use:
This should work anywhere and will produce:
For your example with an `input.txt` file as input and `output.txt` as output, use:
Try this:
From the GNU sed documentation
You may also use the `$'string'` feature of Bash:
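A sketch of what such a command can look like when the whole `sed` script is one ANSI C-quoted string. It assumes `bash` (or another shell with `$'...'`), and every backslash meant for `sed` has to be doubled:

```shell
# Inside $'...': \\ becomes one backslash and \n becomes a real newline,
# so sed receives  s/\(1234\)/\<newline>\1/g
out=$(printf '%s\n' 'test1234foo123bar1234' | sed $'s/\\(1234\\)/\\\n\\1/g')
printf '%s\n' "$out"
```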
The newline in the middle of the command can feel a bit clumsy:
Here are two solutions to this problem which I think should be quite portable (should work for any POSIX-compliant `sh`, `printf`, and `sed`):
Solution 1:
Remember to escape any `\` and `%` characters for `printf` here:
To avoid the need for escaping `\` and `%` characters for `printf`:
Solution 2:
Make a variable containing a newline like this:
Or like this:
Then use it like this:
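The commands for both solutions are missing from this copy, so the following is a reconstruction under assumptions rather than the answer's original text. The variable names `in`, `newline`, `out1a`, `out1b` and `out2` are made up for illustration.

```shell
in='test1234foo123bar1234'

# Solution 1: let printf build the script. Every backslash meant for sed
# is doubled, and \n becomes the real newline after the escaping backslash.
out1a=$(printf '%s\n' "$in" | sed "$(printf 's/\\(1234\\)/\\\n\\1/g')")

# Solution 1 without escaping the script parts: they are passed as %s
# arguments, and only the format string contains the backslash + newline.
out1b=$(printf '%s\n' "$in" | sed "$(printf '%s\\\n%s' 's/\(1234\)/' '\1/g')")

# Solution 2: keep a newline in a variable. The trailing x protects the
# newline from being stripped by the command substitution.
newline="$(printf '\nx')"; newline="${newline%x}"
out2=$(printf '%s\n' "$in" | sed "s/\(1234\)/\\${newline}\1/g")

printf '%s\n---\n' "$out1a" "$out1b" "$out2"
```

The variable in Solution 2 could also be assigned with a literal line break between single quotes, which avoids the `printf` call at the cost of putting a newline in the middle of the assignment.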