How do I extract lines of text from a file?

Posted on 2024-07-08 08:26:11

I have a directory full of files and I need to pull the headers and footers off of them. They are all variable length so using head or tail isn't going to work. Each file does have a line I can search for, but I don't want to include the line in the results.

The header usually ends with a line like

*** Start (more text here)

and the footer begins with a line like

*** Finish (more text here)

I want the file names to stay the same, so I need to overwrite the originals, or write to a different directory and I'll overwrite them myself.

Oh yeah, it's on a Linux server of course, so I have Perl, sed, awk, grep, etc.

一身仙ぐ女味 2024-07-15 08:26:12

Try the flip-flop operator, "..".

# flip-flop.pl
use strict;
use warnings;

my $start  = qr/^\*\*\* Start/;
my $finish = qr/^\*\*\* Finish/;

while ( <> ) {
    if ( /$start/ .. /$finish/ ) {
        next  if /$start/ or /$finish/;
        print $_;
    }
}

You can then use Perl's -i switch to update your file(s) in place, like so:

 $ perl -i'copy_*' flip-flop.pl data.txt

This changes data.txt but first saves a copy as "copy_data.txt".

初心 2024-07-15 08:26:12

GNU coreutils are your friend...

csplit inputfile '%^\*\*\* Start%+1' '/^\*\*\* Finish/' '%%' '{*}'

This produces your desired file as xx00. The patterns are quoted so that the shell passes the backslashes and {*} through unchanged. You can change the output name through the options --prefix, --suffix-format, and --digits; see the manual for details. Since csplit is designed to produce a number of files, it is not possible to produce a file without a suffix, so you will have to do the overwriting manually or through a script:

csplit "$1" '%^\*\*\* Start%+1' '/^\*\*\* Finish/' '%%' '{*}'
mv -f xx00 "$1"

Add loops as you desire.

咽泪装欢 2024-07-15 08:26:12

To strip the header (print everything after the Start line):

awk 'd; /^\*\*\* Start/ {d = 1}' yourFileHere

To strip the footer (print everything before the Finish line):

awk '/^\*\*\* Finish/ {d = 1} !d' yourFileHere

To get only the lines between the header and the footer, as you want:

awk '/^\*\*\* Start/ {d = 1; next} /^\*\*\* Finish/ {d = 0; next} d' yourFileHere

There's one more way, using csplit:

csplit yourFileHere /Start/ /Finish/

This writes files named xxNN, where NN is a running number: xx00 holds the header, xx01 runs from the Start line up to (but not including) the Finish line, and xx02 holds the rest. Also take a look at the csplit man page.
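
To run the header-to-footer awk filter over a whole directory and write the results to a different directory under the same file names, a loop along these lines may work. This is only a sketch: the function name strip_markers and its two directory arguments are made up for illustration, and it assumes the marker lines begin with "*** Start" and "*** Finish" as in the question.

```shell
#!/bin/sh
# strip_markers SRC_DIR DEST_DIR  (hypothetical helper name)
# For every regular file in SRC_DIR, write the lines strictly between the
# "*** Start" line and the "*** Finish" line to DEST_DIR under the same name.
strip_markers() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue    # skip subdirectories and an unmatched glob
        awk '/^\*\*\* Start/  {d = 1; next}
             /^\*\*\* Finish/ {d = 0; next}
             d' "$f" > "$dest/$(basename "$f")"
    done
}
```

Called as strip_markers ./reports ./stripped, it leaves the originals untouched, so the output can be checked before copying it back over them.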

另类 2024-07-15 08:26:12

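Maybe? Keep only the Start-to-Finish range by negating the delete:

$ sed -i '/^\*\*\* Start/,/^\*\*\* Finish/!d' *

Note that this keeps the Start and Finish lines themselves. To remove those lines as well, delete from the first line through Start and from Finish through the last line:

$ sed -i -e '1,/^\*\*\* Start/d' -e '/^\*\*\* Finish/,$d' *

The -i switch depends on the sed you have: GNU sed accepts it as shown, while BSD sed requires an argument to -i.
And, I wrote this from memory, so test it on copies before running it across the whole directory.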
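
Since sed -i overwrites files with no undo, it may be worth previewing the result without -i first. The sketch below deletes from the first line through the Start line and from the Finish line through the end, printing to standard output only. The function name preview_body is hypothetical, and the sketch assumes the Start marker is not on the very first line of the file (sed only looks for the end of a 1,/regex/ range from line 2 onward).

```shell
#!/bin/sh
# preview_body FILE  (hypothetical helper name)
# Print FILE without its header and footer, leaving FILE itself untouched.
preview_body() {
    sed -e '1,/^\*\*\* Start/d' -e '/^\*\*\* Finish/,$d' "$1"
}
```

For example, preview_body data.txt | less shows what an in-place edit would leave behind.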

冰雪之触 2024-07-15 08:26:12

A quick Perl hack, not tested. I am not fluent enough in sed or awk to get this effect with them, but I would be interested in how that would be done.

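#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;

my $Filename = shift;
# Tie the file's lines to an array; changes to the array are written back to the file.
tie my @File, 'Tie::File', $Filename or die "could not access $Filename.\n";
# Drop lines from the top through the Start line, then from the bottom through the Finish line.
# The @File checks stop each loop once the file is empty, e.g. when a marker is missing.
while (@File and shift(@File) !~ /^\*\*\* Start/)  {}
while (@File and pop(@File)   !~ /^\*\*\* Finish/) {}
untie @File;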

活雷疯 2024-07-15 08:26:12

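Some of the examples in perlfaq5, "How do I change, delete, or insert a line in a file, or append to the beginning of a file?", may help. You'll have to adapt them to your situation. Also, Leon's flip-flop operator answer is the idiomatic way to do this in Perl, although you don't have to modify the file in place to use it.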

故事还在继续 2024-07-15 08:26:12

A Perl solution that overwrites the original file.

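#!/usr/bin/perl -ni
# The range operator returns a sequence number: 1 on the Start line,
# and a number with "E0" appended (e.g. "5E0") on the Finish line.
if(my $num = /^\*\*\* Start/ .. /^\*\*\* Finish/) {
    # Skip the first line of the range and the "E0"-suffixed last line.
    print if $num != 1 and $num + 0 eq $num;
}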
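
One caveat with any in-place approach: a file that has no Start line produces no output at all, so overwriting it would empty it. A quick check before overwriting might look like the sketch below. The function name has_both_markers is hypothetical, and the check only confirms that both marker lines exist, not that they appear in the right order.

```shell
#!/bin/sh
# has_both_markers FILE  (hypothetical helper name)
# Succeed only if FILE contains a "*** Start" line and a "*** Finish" line.
has_both_markers() {
    grep -q '^\*\*\* Start' "$1" && grep -q '^\*\*\* Finish' "$1"
}
```

In a loop, something like has_both_markers "$f" || continue would skip the files that should not be touched.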
