Blacklist-based grep: without procedural code?
It's a well-known task, simple to describe:
Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.
A common application is filtering compiler warnings from a build log while ignoring warnings on files that are not yours. Here foo.txt is the warnings file (itself filtered from the build log), and excluded_filenames.txt is the blacklist of file names, one per line.
I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.
But I feel that I should be really close with xargs, and just can't see the last step.
I know that if excluded_filenames.txt has only 1 file name in it, then
grep -v "`cat excluded_filenames.txt`" foo.txt
will do it.
And I know that I can get the filenames one per line with
xargs -L1 -a excluded_filenames.txt
So how do I combine those two into a single solution, without explicit loops in a procedural language?
Looking for the simple and elegant solution.
Comments (1)
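You should use the -f option, which reads the patterns from a file, one per line. You could also use -F (or fgrep, which is the same as grep -F), which is more directly the answer to what you asked: it treats the pattern as a list of fixed strings separated by newlines, any of which may match. Both options are described in man grep.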
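As a minimal sketch of how these options might be combined for the task in the question: the sample warning lines and blacklist entries below are made up for illustration, and only the file names foo.txt, excluded_filenames.txt and foo_filtered.txt come from the question. The file foo_filtered_alt.txt is a hypothetical name used to compare the two variants.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical warnings file (sample data, not from the original post).
cat > foo.txt <<'EOF'
src/mine.c:10: warning: unused variable 'x'
third_party/zlib.c:55: warning: implicit declaration
src/mine.c:42: warning: comparison of integer expressions
vendor/json.h:7: warning: shadowed declaration
EOF

# Hypothetical blacklist, one exclusion string per line.
cat > excluded_filenames.txt <<'EOF'
third_party/zlib.c
vendor/json.h
EOF

# -v  keep only the lines that match none of the patterns
# -F  treat each pattern as a literal string, not a regular expression
# -f  read the patterns from a file, one per line
grep -v -F -f excluded_filenames.txt foo.txt > foo_filtered.txt

# Same filter, with the patterns passed as one newline-separated argument.
grep -v -F "$(cat excluded_filenames.txt)" foo.txt > foo_filtered_alt.txt

cat foo_filtered.txt
```

With this sample data, only the two src/mine.c lines remain in foo_filtered.txt. Using -F alongside -f keeps the dots in the file names from acting as regex wildcards. One caveat: a blank line in the blacklist matches every input line, so with -v the output would be empty.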