How to delete part of a file using awk
I'm writing a shell script, which at some point has to take a file, search for a particular word in it and delete the whole text that comes after this word (including the word itself) - awk is the right tool I suppose, but I don't really know much about programming in it.
Could anyone help me?
I suppose 'awk' is one tool for the job, though I think 'sed' is simpler for this particular operation. The specification is a bit vague. The simple version is:
For that, I'd use 'sed':
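Reading the simple version as "drop the first line that contains the word, plus every line after it", a sed sketch might look like the following. The word `word` and the sample text are placeholders made up for illustration, not taken from the question.

```shell
# Sample input (hypothetical).
input='alpha
beta word gamma
delta'

# Delete every line from the first match of /word/ through the last line ($).
result=$(printf '%s\n' "$input" | sed '/word/,$d')
printf '%s\n' "$result"
```

With this input only `alpha` survives, because the whole matching line is removed along with everything after it.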
The more complex version is:
I'd probably still use 'sed':
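Taking the more complex version to mean "keep the text before the word on the first matching line, and drop everything after that", one possible sed command is sketched here. Again, `word` and the sample text are placeholders.

```shell
# Sample input (hypothetical).
input='alpha
beta word gamma
delta word'

# -n turns off automatic printing. For line 1 through the first line matching
# /word/, strip from the word to the end of the line, then print explicitly.
result=$(printf '%s\n' "$input" | sed -n '1,/word/{s/word.*//;p;}')
printf '%s\n' "$result"
```

One caveat with this form: if the word appears on line 1 itself, the `1,/word/` range does not end there, because the end pattern is only checked from line 2 onward. GNU sed accepts `0,/word/` for that case, but that is an extension.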
This inverts the logic. It prints nothing by default, but for line 1 through the first line containing the word it does a substitution (which changes nothing until it reaches the line containing the word) and then prints the line.
Can it be done in 'awk'? Not completely trivially because 'awk' autosplits input lines into words, and because you have to use functions to do substitutions.
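A minimal awk sketch of the same behaviour, assuming the placeholder word `word` and a flag variable named `found`. It uses the `sub()` function for the substitution.

```shell
# Sample input (hypothetical).
input='alpha
beta word gamma
delta word'

result=$(printf '%s\n' "$input" | awk '
/word/ {
    if (!found) {
        sub(/word.*/, "")   # cut from the word to the end of the line
        print
    }
    found = 1               # remember that the word has been seen
}
!found { print }            # lines before the first match pass through
')
printf '%s\n' "$result"
```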
(Edited: changed 'delete' to 'found' since 'delete' is a reserved word in 'awk'.)
In all these examples, the truncated version of the input file is written to standard output. To modify the file in situ, you either need to use Perl or Python or a similar language, or you capture the output in a temporary file which you copy over the original once the command has completed. (If you try 'script file > file', the shell truncates the file before the script reads it, so you process an empty file.)
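The temporary-file approach could be sketched as follows. This assumes `mktemp` is available (it is common but not required by POSIX), and the file created here is only a stand-in for the real one.

```shell
# Stand-in for the real file (hypothetical contents).
file=$(mktemp)
printf '%s\n' 'alpha' 'beta word gamma' 'delta' > "$file"

# Write the truncated text to a temporary file first, then copy it over
# the original only if sed succeeded.
tmp=$(mktemp)
if sed -n '1,/word/{s/word.*//;p;}' "$file" > "$tmp"; then
    cat "$tmp" > "$file"
fi
rm -f "$tmp"

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```

Copying with `cat` rather than moving the temporary file keeps the original file's permissions and ownership.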
There are various early exit optimizations that could be applied to the sed and awk scripts, such as:
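As an illustration of early exit, both tools can stop reading as soon as the word has been handled. These are sketches with the placeholder word `word`, not necessarily the variants the answer had in mind.

```shell
# Sample input (hypothetical).
input='alpha
beta word gamma
delta word'

# sed: on the first matching line, cut from the word onward, then quit.
# With automatic printing on, q prints the edited line before exiting.
sed_result=$(printf '%s\n' "$input" | sed '/word/{s/word.*//;q;}')

# awk: same idea, using exit to stop after the first matching line.
awk_result=$(printf '%s\n' "$input" |
    awk '/word/ { sub(/word.*/, ""); print; exit } { print }')

printf '%s\n' "$sed_result" "$awk_result"
```

Unlike the `1,/word/` range, the sed form here also behaves as intended when the word is on the first line.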
And, if you assume the use of the GNU versions of awk or sed, there are various non-standard extensions that can help with in-situ modification of the file.
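For example, assuming GNU sed, the `-i` option edits the file in place. The sketch below will not work unchanged with BSD or macOS sed, which expects a backup suffix argument after `-i`. The file is a stand-in created with `mktemp`.

```shell
# Stand-in for the real file (hypothetical contents).
file=$(mktemp)
printf '%s\n' 'alpha' 'beta word gamma' 'delta' > "$file"

# GNU sed only: -i rewrites the file in place instead of printing to stdout.
sed -i -n '1,/word/{s/word.*//;p;}' "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -f "$file"
```

Recent versions of GNU awk offer something comparable through `gawk -i inplace`.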
I'm assuming your input is something like this:
and you want the output to be cut off at the word 'vel', like so:
In that case, your awk script would be:
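A sketch of what such a script could look like, run against made-up sample text. Both the sample input and the exact script are assumptions for illustration.

```shell
# Sample input (hypothetical).
input='Lorem ipsum dolor sit amet,
consectetur adipiscing elit. Sed vel lectus
donec odio urna, tempus molestie.'

result=$(printf '%s\n' "$input" | awk '
/vel/ {
    # Print only the part of the line before the word, then stop reading.
    print substr($0, 1, index($0, "vel") - 1)
    exit
}
{ print }
')
printf '%s\n' "$result"
```

The output ends with `Sed ` on the second line, and the third line is never printed.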
The word you want to cut off at needs to replace both instances of the word 'vel' in the script. You can safely put the entire script on one line, too.
I'm not sure how to do it with awk, but you could do it with sed:
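A per-line substitution along these lines might be what is meant. The sample text is invented for illustration.

```shell
# Sample input (hypothetical).
input='keep this the-word-to-find drop this
no match here
another the-word-to-find line'

# On every line, remove the word and everything after it.
result=$(printf '%s\n' "$input" | sed 's/the-word-to-find.*//')
printf '%s\n' "$result"
```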
This will delete everything from 'the-word-to-find' to the end of the line, on every line that contains 'the-word-to-find'. If you want to delete the rest of the file upon the first occurrence of 'the-word-to-find', you could do:
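One way to stop at the first occurrence is to quit as soon as a matching line is seen. Note that this sketch drops the matching line entirely, which is an assumption about the intended behaviour. To keep the text before the word, a substitution would have to run before quitting.

```shell
# Sample input (hypothetical).
input='first line
second line
third the-word-to-find tail
fourth line'

# -n disables automatic printing; quit on the first match, otherwise print.
result=$(printf '%s\n' "$input" | sed -n '/the-word-to-find/q;p')
printf '%s\n' "$result"
```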
This awk one-liner should do the trick:
{ sub(/ word.*/, ""); print }
For every line, if the line contains a pattern that starts with the word (preceded by a space) and runs to the end of the line, replace that pattern with the empty string, then print the updated line.
[ I figured the question could be read either way (the rest of the text on that line, or the rest of the text in the file). To skip the rest of the file, one could use: { skip = gsub(/ word.*/, ""); print ; if (skip) exit } ]
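To see how the two readings differ, both one-liners can be run on the same made-up input.

```shell
# Sample input (hypothetical).
input='one two word three
four five
six word seven'

# Per-line reading: trim every line that contains " word".
per_line=$(printf '%s\n' "$input" | awk '{ sub(/ word.*/, ""); print }')

# Whole-file reading: trim the first matching line, then stop.
rest=$(printf '%s\n' "$input" |
    awk '{ skip = gsub(/ word.*/, ""); print; if (skip) exit }')

printf '%s\n---\n%s\n' "$per_line" "$rest"
```

Because the pattern begins with a space, neither variant matches a line where the word is the very first thing on the line.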
To delete part of a line with sed, e.g.:
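Inside a shell script the word usually lives in a variable, so a sketch might look like this. The variable name `word` and the sample text are hypothetical.

```shell
word='vel'               # hypothetical word to cut at
input='Sed vel lectus'   # hypothetical sample line

# Double quotes let the shell expand $word inside the sed script. The value
# is treated as a regular expression, so characters such as . or / in the
# word would need escaping first.
result=$(printf '%s\n' "$input" | sed "s/${word}.*//")
printf '%s\n' "$result"
```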