Blacklist-based grep: without procedural code?

Posted on 2024-12-09 11:04:17

It's a well-known task, simple to describe:

Given a text file foo.txt, and a blacklist file of exclusion strings, one per line, produce foo_filtered.txt that has only the lines of foo.txt that do not contain any exclusion string.

A common application is filtering compiler warnings out of a build log while ignoring warnings on files that are not yours. Here foo.txt is the warnings file (itself filtered from the build log), and excluded_filenames.txt is the blacklist file, with one file name per line.

I know how it's done in procedural languages like Perl or AWK, and I've even done it with combinations of Linux commands such as cut, comm, and sort.

But I feel that I should be really close with xargs, and just can't see the last step.

I know that if excluded_filenames.txt has only 1 file name in it, then

grep -v "`cat excluded_filenames.txt`" foo.txt

will do it.

And I know that I can get the filenames one per line with

xargs -L1 -a excluded_filenames.txt

So how do I combine those two into a single solution, without explicit loops in a procedural language?

Looking for the simple and elegant solution.

北城挽邺 2024-12-16 11:04:17

You should use the -f option, which reads the patterns from a file, one per line:

grep -vf excluded_filenames.txt foo.txt

You could also use -F (the same as the older fgrep command), which is more directly the answer to what you asked:

grep -vF "`cat excluded_filenames.txt`" foo.txt

From man grep:

-f FILE, --file=FILE
          Obtain patterns from FILE, one per line.  The empty file contains zero patterns, and therefore matches nothing.

-F, --fixed-strings
          Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
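
As a rough sketch of how the two options behave together, the snippet below builds a tiny warnings file and blacklist and filters one against the other. The file contents are made up for illustration; only the file names come from the question. Combining -F with -f is a suggestion rather than part of the original answer: file names contain dots, which a plain -f would treat as regular-expression wildcards.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical warnings file: one compiler warning per line.
cat > foo.txt <<'EOF'
src/mine.c:10: warning: unused variable 'x'
third_party/zlib.c:42: warning: implicit declaration of function 'f'
vendor/lib.h:7: warning: unused parameter 'p'
vendor/libxh.c:3: warning: unused variable 'y'
EOF

# Hypothetical blacklist: one file name per line, no blank lines.
cat > excluded_filenames.txt <<'EOF'
third_party/zlib.c
vendor/lib.h
EOF

# -v inverts the match, -F takes patterns as fixed strings,
# -f reads the patterns from the named file.
grep -vFf excluded_filenames.txt foo.txt > foo_filtered.txt

cat foo_filtered.txt
```

Here foo_filtered.txt keeps the src/mine.c and vendor/libxh.c lines. Without -F, the pattern vendor/lib.h would also match vendor/libxh.c, because the dot matches any character, and that line would be dropped too. One caveat, at least with GNU grep: a blank line in excluded_filenames.txt is an empty pattern that matches every line, so with -v the output would be empty.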