Excluding the first and last lines from a sed /START/,/END/ range
Consider the input:
=sec1=
some-line
some-other-line

foo
bar=baz

=sec2=
c=baz
If I wish to process only =sec1=, I can, for example, comment out the section with:
sed -e '/=sec1=/,/=[a-z0-9]*=/s:^:#:' < input
... well, almost.
This also comments out the "=sec1=" and "=sec2=" lines themselves, and the result will be something like:
#=sec1=
#some-line
#some-other-line
#
#foo
#bar=baz
#
#=sec2=
c=baz
My question is: What is the easiest way to exclude the start and end lines from a /START/,/END/ range in sed?
I know that in many cases refining the "s:::" clause can solve this specific case, but I am after a generic solution here.
In "Sed - An Introduction and Tutorial" Bruce Barnett writes: "I will show you later how to restrict a command up to, but not including the line containing the specified pattern.", but I was not able to find where he actually shows this.
In "USEFUL ONE-LINE SCRIPTS FOR SED", compiled by Eric Pement, I could find only the inclusive example:
# print section of file between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p' # case sensitive
This should do the trick:
This matches between sec1 and sec2 inclusively and then just skips the first and last line with the b command. This leaves the desired lines between sec1 and sec2 (exclusive), and the s command adds the comment sign.
Unfortunately, you do need to repeat the regexps for matching the delimiters. As far as I know there's no better way to do this. At least you can keep the regexps clean, even though they're used twice.
This is adapted from the SED FAQ: How do I address all the lines between RE1 and RE2, excluding the lines themselves?
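The command itself did not survive in this copy of the answer. A minimal sketch consistent with the description above, assuming literal =sec1= and =sec2= delimiters and the question's sample input (the sample_input helper is hypothetical and only there for illustration), might look like this:

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# Select the inclusive range, then branch past the two delimiter lines:
# "b" with no label jumps to the end of the script, so "s" never sees them.
result=$(sample_input | sed -e '/=sec1=/,/=sec2=/{
/=sec1=/b
/=sec2=/b
s/^/#/
}')

printf '%s\n' "$result"
```

Writing the script across several lines avoids relying on how a particular sed parses a closing brace directly after b.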
If you're not interested in lines outside of the range, but just want the non-inclusive variant of the Iowa/Montana example from the question (which is what brought me here), you can write the "except for the first and last matching lines" clause easily enough with a second sed:
sed -n '/PATTERN1/,/PATTERN2/p' < input | sed '1d;$d'
Personally, I find this slightly clearer (albeit slower on large files) than the equivalent
sed -n '1,/PATTERN1/d;/PATTERN2/q;p' < input
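One caveat with the single-process form: a regex used as the end of a range is only tested from the line after the one where the range starts, so if PATTERN1 matches on line 1 (as =sec1= does in the question's input), 1,/PATTERN1/d deletes everything. The sketch below compares the pipeline with a variant that uses the 0,/regex/ address, which is a GNU sed extension. The sample_input helper and the literal patterns are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# Two-process form: print the inclusive range, then drop its first and last line.
piped=$(sample_input | sed -n '/=sec1=/,/=sec2=/p' | sed '1d;$d')

# Single-process form as written above: the range 1,/=sec1=/ starts on line 1
# and only looks for =sec1= from line 2 on, so here every line is deleted.
single_as_written=$(sample_input | sed -n '1,/=sec1=/d;/=sec2=/q;p')

# GNU sed only: 0,/regex/ lets the range end on line 1 itself.
gnu_single=$(sample_input | sed -n '0,/=sec1=/d;/=sec2=/q;p')

printf 'piped:\n%s\n' "$piped"
printf 'single (1,/re/): [%s]\n' "$single_as_written"
printf 'single (0,/re/):\n%s\n' "$gnu_single"
```

Both forms also assume a single START/END pair: the pipeline trims only the first and last line of everything printed, and the q command stops at the first PATTERN2.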
Another way would be
/begin/n -> skip over the line that has the "begin" pattern
/end/ !p -> print all lines that don't have the "end" pattern
Taken from Bruce Barnett's sed tutorial: http://www.grymoire.com/Unix/Sed.html#toc-uh-35a
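The two lines above are only the body of the command; the enclosing invocation is missing from this copy. A sketch of how the pieces plausibly fit together, assuming the -n option, a /begin/,/end/ range wrapped around the two commands, and the question's delimiters standing in for begin and end (sample_input is a hypothetical helper):

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# With -n nothing is printed automatically. On the begin line, "n" drops it
# and loads the next line; every line in the range that is not the end line
# is then printed explicitly.
result=$(sample_input | sed -n '/=sec1=/,/=sec2=/{
/=sec1=/n
/=sec2=/!p
}')

printf '%s\n' "$result"
```

Because n pulls in the next line from inside the block, an end line that directly follows the begin line is never tested as the end of the range, so this form is best suited to sections with at least one line of content.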
You could also use awk.
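The awk command from this answer is not preserved here. One common idiom that fits the question, offered as a sketch rather than as the answerer's own code, keeps a flag that is switched off before the closing marker is printed and switched on after the opening marker has been printed (the in_sec variable and the sample_input helper are hypothetical names):

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# Rule order does the work: the flag is cleared before the closing marker is
# printed and set only after the opening marker has been printed.
result=$(sample_input | awk '
  /^=sec2=$/ { in_sec = 0 }      # closing marker: leave the section first
  in_sec     { $0 = "#" $0 }     # inside the section: prepend the comment sign
             { print }           # print every line, changed or not
  /^=sec1=$/ { in_sec = 1 }      # opening marker: section starts on the next line
')

printf '%s\n' "$result"
```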
I've used:
This will search all the lines between the patterns, then print everything not containing the patterns
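The command this answer refers to is missing from this copy. A sketch that matches the description, assuming -n, nested negated addresses, and the question's delimiters (sample_input is a hypothetical helper), could be:

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# Within the inclusive range, print only lines that match neither delimiter.
result=$(sample_input | sed -n '/=sec1=/,/=sec2=/{
/=sec1=/!{
/=sec2=/!p
}
}')

printf '%s\n' "$result"
```

Like the other range-based forms, this repeats each delimiter regex twice.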
You don't have to repeat any regular expression(s) to make this work.
The output will be:
Taking apart the sed script, we have:
/^=sec1=$/
  a regular expression address matching the opening section marker
{:0;n;/^=\w*=$/!{s/^/#/;b0}}
  a command block, as follows:
  :0
    a label to return to later
  n
    print the current line, and read the next line
  /^=\w*=$/!
    a regular expression address matching any line that isn't a section marker
  {s/^/#/;b0}
    a command block, as follows:
    s/^/#/
      prepend a # to the line
    b0
      branch to label 0
The inner loop between :0 and b0 continues looping until it encounters any line that is a section marker (or the end of the file).
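Assembled from the pieces listed above, the whole command can be sketched as follows. Two things here are assumptions rather than part of the answer: the script is written one command per line so that it does not depend on GNU-style one-line label parsing, and the POSIX class [[:alnum:]_] stands in for \w, which is a GNU extension. The sample_input helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emits the question's sample input, blank lines included.
sample_input() {
  printf '%s\n' '=sec1=' 'some-line' 'some-other-line' '' \
    'foo' 'bar=baz' '' '=sec2=' 'c=baz'
}

# On the opening marker, loop: print the current line, read the next one,
# and comment it out unless it is itself a section marker.
result=$(sample_input | sed '/^=sec1=$/{
:0
n
/^=[[:alnum:]_]*=$/!{
s/^/#/
b0
}
}')

printf '%s\n' "$result"
```

Only the lines strictly between the two markers gain a leading #; the markers and everything after the closing one pass through unchanged.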