How do I extract lines of text from a file?
I have a directory full of files, and I need to strip the headers and footers from them. The headers and footers vary in length, so head or tail won't work. Each file does have a line I can search for, but I don't want that line included in the results.
The line at the start is usually
*** Start (more text here)
and the one at the end is
*** Finish (more text here)
I want the file names to stay the same, so I need to either overwrite the originals or write to a different directory, and I'll overwrite them myself.
Oh yeah, it's on a Linux server of course, so I have Perl, sed, awk, grep, etc.
Comments (7)
Try the flip-flop operator, "..".
You can then use Perl's -i switch to update your file(s) in place, like so:
...which changes data.txt but first saves a copy as "copy_data.txt".
GNU coreutils are your friend...
This produces your desired file as xx00. You can change this behaviour through the options --prefix, --suffix-format, and --digits, but see the manual for yourself. Since csplit is designed to produce a number of files, it is not possible to produce a file without a suffix, so you will have to do the overwriting manually or through a script:
Add loops as you desire.
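The csplit invocation itself is missing here, so the following is only a guess at its shape, assuming the "*** Start" / "*** Finish" markers from the question. The sample file and scratch directory are invented for the demo.

```shell
#!/bin/sh
# Scratch directory and sample file are hypothetical, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
    'header line' \
    '*** Start (more text here)' \
    'body 1' \
    'body 2' \
    '*** Finish (more text here)' \
    'footer line' > data.txt

# %REGEXP%+1 skips everything up to and including the Start line;
# /REGEXP/ then splits just before the Finish line.
# xx00 receives the body, xx01 the Finish line plus the footer.
csplit -s data.txt '%^\*\*\* Start%+1' '/^\*\*\* Finish/'

# Overwrite the original by hand and discard the leftover piece.
mv xx00 data.txt
rm xx01

cat data.txt
```

To process a whole directory, wrap the csplit, mv, and rm steps in a for loop over the files.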
To get the header:
To get the footer:
To get the file from header to footer as you want:
There's one more way, using the csplit command. You should try something like:
Then examine the files named 'xxNN', where NN is a running number. Also take a look at the csplit manpage.
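The three commands from this answer are missing on this page. As a rough stand-in (not necessarily what the author used), the same three pieces can be pulled out with sed. This sketch assumes "header" means everything before the Start line and "footer" means everything after the Finish line; the sample file is made up.

```shell
#!/bin/sh
# Scratch directory and sample file are hypothetical, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
    'header line' \
    '*** Start (more text here)' \
    'body 1' \
    'body 2' \
    '*** Finish (more text here)' \
    'footer line' > data.txt

# Header: print each line until the Start marker, then quit without printing it.
header=$(sed -n -e '/^\*\*\* Start/q' -e p data.txt)

# Footer: delete everything from line 1 through the Finish marker.
footer=$(sed '1,/^\*\*\* Finish/d' data.txt)

# Start-to-Finish section, marker lines included.
middle=$(sed -n '/^\*\*\* Start/,/^\*\*\* Finish/p' data.txt)

printf 'HEADER:\n%s\nFOOTER:\n%s\nMIDDLE:\n%s\n' "$header" "$footer" "$middle"
```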
Maybe? Start to Finish with not-delete.
Or, though I'm less sure of it, this should remove the Start and Finish lines as well, if it works:
d! may depend on the build of sed you have; I'm not sure. And I wrote all of that entirely from (probably poor) memory.
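The sed commands themselves are missing here. For what it's worth, in standard sed the negation goes before the command (!d, not d!), so a sketch of the "not-delete" idea might look like the following. It writes to a separate output directory, which the asker said was acceptable; the in/ and out/ directories and the sample file are hypothetical.

```shell
#!/bin/sh
# Scratch directories and sample file are hypothetical, for illustration only.
workdir=$(mktemp -d)
mkdir "$workdir/in" "$workdir/out"
printf '%s\n' \
    'header line' \
    '*** Start (more text here)' \
    'body 1' \
    'body 2' \
    '*** Finish (more text here)' \
    'footer line' > "$workdir/in/a.txt"

# First expression: delete every line NOT in the Start..Finish range.
# Second and third: delete the two marker lines themselves.
for f in "$workdir"/in/*; do
    sed -e '/^\*\*\* Start/,/^\*\*\* Finish/!d' \
        -e '/^\*\*\* Start/d' \
        -e '/^\*\*\* Finish/d' "$f" > "$workdir/out/$(basename "$f")"
done

cat "$workdir/out/a.txt"
```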
A quick Perl hack, not tested. I am not fluent enough in sed or awk to get this effect with them, but I would be interested in how that would be done.
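On the question of how this might look in awk: one possible sketch, not taken from the thread, is to keep a flag that is switched on after the Start line and off at the Finish line. The flag name, the .tmp suffix, and the sample files are all made up, and the loop overwrites each original through a temporary file.

```shell
#!/bin/sh
# Scratch directory and sample files are hypothetical, for illustration only.
workdir=$(mktemp -d)
for name in a b; do
    printf '%s\n' \
        "header of $name" \
        '*** Start (more text here)' \
        "body of $name" \
        '*** Finish (more text here)' \
        "footer of $name" > "$workdir/$name.txt"
done

# The rule order matters: the flag is cleared before the print rule sees
# the Finish line and set only after the print rule has seen the Start
# line, so neither marker line is printed.
for f in "$workdir"/*.txt; do
    awk '
        /^\*\*\* Finish/ { inside = 0 }
        inside
        /^\*\*\* Start/  { inside = 1 }
    ' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$workdir"/*.txt
```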
Some of the examples in perlfaq5: How do I change, delete, or insert a line in a file, or append to the beginning of a file? may help. You'll have to adapt them to your situation. Also, Leon's flip-flop operator answer is the idiomatic way to do this in Perl, although you don't have to modify the file in place to use it.
A Perl solution that overwrites the original file.