How to use sed/awk to remove a block of text (a pattern) from files
I have imported thousands of text files, and they contain a piece of text that I would like to remove.
It is not a fixed block of text but a pattern:
<!--
# Translator(s):
#
# username1 <email1>
# username2 <email2>
# usernameN <emailN>
#
-->
When the block appears, it lists one or more users together with their email addresses.
I have another small awk program that accomplishes the task in very few lines of code. It can be used to remove patterns of text from a file, and both the start and the stop regexp can be set.
Save the code in 'remove_email.awk' and run it with:
awk -f remove_email.awk yourfile
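The program this answer refers to is not shown here. As a rough sketch only, a script of this kind could look like the one below. The variable names `start` and `stop`, their default values, and the `skipping` flag are assumptions made for illustration, not the answer's actual code.

```shell
# Write a hypothetical remove_email.awk; the start/stop defaults are assumptions.
cat > remove_email.awk <<'EOF'
BEGIN {
    # Override with: awk -v start=REGEX -v stop=REGEX -f remove_email.awk file
    if (start == "") start = "^<!--$"
    if (stop  == "") stop  = "^-->$"
}
$0 ~ start { skipping = 1 }   # block begins: stop printing
!skipping  { print }          # outside a block: pass the line through
$0 ~ stop  { skipping = 0 }   # block ends: resume with the next line
EOF
# Usage: awk -f remove_email.awk yourfile
```

In this simple form every block delimited by the two regexps is removed, whether or not it is a translator block, so it is only safe when the files contain no other comments of that shape.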
This sed solution might work:
An alternative (perhaps a better solution?):
This gathers up the lines that start with <!-- and end with -->, then pattern matches on the collection: the second line is # Translator(s):, the third line is #, the fourth and possibly further lines follow the form # username <email address>, the penultimate line is #, and the last line is -->. If a match is made, the entire collection is deleted; otherwise it is printed as normal.
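The gather-then-match idea described above can be sketched in sed roughly as follows. This is not the answer's original command, only an illustration under some assumptions: the delimiters stand alone on their lines, and the function name `strip_translator_blocks` is made up for the example.

```shell
# Sketch: buffer each <!-- ... --> comment in the hold space, then drop it only
# if the whole buffer has the translator-block shape. Reads files or stdin.
strip_translator_blocks() {
    sed '
/^<!--$/,/^-->$/{
    # Opening line: start a fresh buffer and print nothing yet.
    /^<!--$/{
        h
        d
    }
    # Every later line of the comment is appended to the buffer.
    H
    # Not the closing line yet: keep collecting.
    /^-->$/!d
    # Closing line: bring the whole buffer into the pattern space.
    x
    # Delete it if it matches the full pattern; otherwise it is printed.
    /^<!--\n# Translator(s):\n#\(\n# [^<]*<[^>]*>\)\{1,\}\n#\n-->$/d
}
' "$@"
}
```

Comments that do not match the full pattern are printed unchanged. A comment that is never closed before the end of the file stays in the hold space and is lost, so this sketch assumes well-formed input.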
For this task you need look-ahead, which is normally done with a parser.
Another solution, though not a very efficient one, would be:
HTH Chris
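The look-ahead mentioned in this answer can be approximated in awk by holding back the opening line until the next line shows what kind of comment it is. The sketch below is an illustration of that idea, not the code from this answer. The function name and the `held`, `pending` and `skipping` variables are hypothetical.

```shell
# Sketch: one-line look-ahead. A "<!--" line is held until the following line
# tells us whether it opens a translator block.
strip_with_lookahead() {
    awk '
    pending {
        pending = 0
        # The line after the opener decides: translator header means skip.
        if ($0 ~ /^# Translator\(s\):/) { skipping = 1; next }
        # Ordinary comment: release the held opener, then handle this line.
        print held
    }
    # Inside a translator block: swallow lines up to and including "-->".
    skipping { if ($0 ~ /^-->$/) skipping = 0; next }
    # Possible block start: hold the line instead of printing it.
    /^<!--$/ { held = $0; pending = 1; next }
    { print }
    # An opener on the very last line is still printed.
    END { if (pending) print held }
    ' "$@"
}
```

This checks only the header line, not the user lines, so it is less strict than matching the whole block.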
Here is my solution, if I understood your problem correctly. Save the following to a file called remove_blocks.awk:
Assume that your text file is data.txt (or many files, for that matter):
The above command will print out everything in the text file, minus the blocks that contain user emails.
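The script and command from this answer are not shown here, so the following is only a guess at what such a remove_blocks.awk might contain. It buffers each comment and discards it when it holds both the Translator(s) header and at least one user line. The variable names and the helper `remove_blocks_in_place` are assumptions for the example.

```shell
cat > remove_blocks.awk <<'EOF'
# Hypothetical remove_blocks.awk: buffer each comment, drop translator blocks.
/^<!--$/ && !inblock {
    inblock = 1; buf = $0; has_header = 0; has_user = 0
    next
}
inblock {
    buf = buf "\n" $0
    if ($0 ~ /^# Translator\(s\):/) has_header = 1
    if ($0 ~ /^# .*<[^<>]+>$/)      has_user = 1
    if ($0 ~ /^-->$/) {
        # Keep ordinary comments; discard only blocks with header and user line.
        if (!(has_header && has_user)) print buf
        inblock = 0
    }
    next
}
{ print }
END { if (inblock) print buf }   # unterminated comment: emit it unchanged
EOF

# Hypothetical helper: rewrite each named file in place via a temporary file.
# It expects remove_blocks.awk in the current directory.
remove_blocks_in_place() {
    for f in "$@"; do
        awk -f remove_blocks.awk "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
# Usage: awk -f remove_blocks.awk data.txt
#        remove_blocks_in_place *.txt
```

Running awk once per file, as the helper does, keeps the output of thousands of files from being concatenated and stops an unterminated comment in one file from affecting the next.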