bash: filter out consecutive lines from a text file

Posted on 2024-08-24 19:47:53

I want to delete every instance of a paragraph from many files. By "paragraph" I mean a fixed sequence of lines.

For example:

my first line
my second line
my third line
the fourth
5th and last

The problem is that I only want to delete these lines when they appear together as a group. For example, if

my first line

appears alone, I don't want to delete it.

Comments (3)

旧伤慢歌 2024-08-31 19:47:53

@OP, I see you accepted the answer in which the lines of your paragraph are hard-coded, so I assume those paragraphs are always the same? If that's true, you can use grep. Store the paragraph you want to get rid of in a file, e.g. "filter", then use grep's -f and -v options to do the job:

grep -v -f filter file
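
One caveat with this approach: grep tests each line on its own, so any line listed in the filter file is dropped wherever it occurs, including a lone "my first line". The sketch below illustrates that behaviour. The -F and -x flags are additions not present in the answer (they make grep treat the filter lines as fixed strings that must match a whole line), and the function name and demo data are hypothetical.

```shell
# Remove every line of a file that exactly equals some line of the filter file.
#   -v  invert the match (print non-matching lines)
#   -x  a pattern must match the whole line
#   -F  patterns are fixed strings, not regular expressions
#   -f  read the patterns from a file, one per line
filter_lines() {
    # $1 = filter file, $2 = file to clean
    grep -v -x -F -f "$1" "$2"
}

# Hypothetical demo data in a temporary directory
demo_dir=$(mktemp -d)
printf 'my first line\nmy second line\n' > "$demo_dir/filter"
printf 'my first line\nmy second line\nhey\nmy first line\n' > "$demo_dir/file"

# Prints only "hey": the lone "my first line" at the end is dropped too,
# because grep has no notion of the lines forming a group.
filter_lines "$demo_dir/filter" "$demo_dir/file"
```

If lone occurrences must survive, a tool that sees several lines at once is needed.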
我为君王 2024-08-31 19:47:53

If you are able to use Perl, you can do it in one line like this:

perl -0777 -pe 's/my first line\nmy second line\nmy third line\nthe fourth\n5th and last\n//g' paragraph_file

The explanation is in perlrun:

The special value 00 will cause Perl to slurp files in paragraph mode. The value 0777 will cause Perl to slurp files whole because there is no legal byte with that value.

Sample input:

my first line
my second line
my third line
the fourth
5th and last
hey
my first line
my second line
my third line
the fourth
5th and last

hello
my first line

Output:

$ perl -0777 -pe 's/my first line\nmy second line\nmy third line\nthe fourth\n5th and last\n//g' paragraph_file
hey

hello
my first line
待天淡蓝洁白时 2024-08-31 19:47:53

You can do it with sed:

sed '$!N; /^\(.*\)\n\1$/!P; D' file_to_filter
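
For reference, this one-liner is the well-known sed idiom for collapsing duplicate adjacent lines (much like uniq), so it handles repeated identical lines rather than a fixed multi-line group. A small sketch of its behaviour follows, with a hypothetical wrapper name and demo input:

```shell
# Collapse runs of identical adjacent lines, similar to uniq.
#   $!N               append the next input line (except on the last line)
#   /^\(.*\)\n\1$/!P  print the first line unless both lines are identical
#   D                 delete the first line and restart with the remainder
dedupe_adjacent() {
    sed '$!N; /^\(.*\)\n\1$/!P; D' "$@"
}

# Prints a, b, a (one per line): only the adjacent duplicate "a" is removed.
printf 'a\na\nb\na\n' | dedupe_adjacent
```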