grep for multiple strings in a file on different lines (i.e. the whole file, not a line-based search)?
I want to grep for files containing the words Dansk, Svenska or Norsk on any line, with a usable return code (I really only need to know whether the strings are contained; my one-liner goes a little further than this).
I have many files with lines in them like this:
Disc Title: unknown
Title: 01, Length: 01:33:37.000 Chapters: 33, Cells: 31, Audio streams: 04, Subpictures: 20
Subtitle: 01, Language: ar - Arabic, Content: Undefined, Stream id: 0x20,
Subtitle: 02, Language: bg - Bulgarian, Content: Undefined, Stream id: 0x21,
Subtitle: 03, Language: cs - Czech, Content: Undefined, Stream id: 0x22,
Subtitle: 04, Language: da - Dansk, Content: Undefined, Stream id: 0x23,
Subtitle: 05, Language: de - Deutsch, Content: Undefined, Stream id: 0x24,
(...)
Here is the pseudocode of what I want:
for all files in directory;
if file contains "Dansk" AND "Norsk" AND "Svenska"
then echo the filename
end
What is the best way to do this? Can it be done on one line?
Comments (17)
You can use:
If you also want to search hidden files:
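A minimal sketch of how such a chain can look, assuming GNU grep and GNU xargs (for `-r`). The function names `files_with_all3` and `files_with_all3_hidden` are hypothetical. Each `grep -l` prints only the names of matching files, and `xargs` hands that list to the next `grep` as arguments:

```shell
# Print names of files in the current directory that contain all three words.
# Assumes file names without whitespace, because xargs splits on it.
files_with_all3() {
    grep -ls -- Dansk * \
        | xargs -r grep -l -- Norsk \
        | xargs -r grep -l -- Svenska
}

# Same search, but also looking at hidden files (names starting with a dot).
# -s silences the error grep would print if the dot-glob matches nothing.
files_with_all3_hidden() {
    grep -ls -- Dansk * .[!.]* \
        | xargs -r grep -l -- Norsk \
        | xargs -r grep -l -- Svenska
}
```

The exit status of the pipeline comes from the last `xargs`, so this form is better suited to listing files than to testing a return code.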
Yet another way using just bash and grep:
For a single file 'test.txt':
This will print test.txt if and only if the file contains all three words (in any combination). The first two greps don't print anything (-q), and the last one only prints the file name if the other two have passed.
If you want to do it for every file in the directory:
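The approach described above can be sketched as follows. This is an illustration under the assumption that the three words are fixed; the function name `all3_in_dir` is hypothetical. For a single file, the body of the loop is the whole command.

```shell
# Print every regular file in a directory (default: current directory)
# that contains Dansk, Norsk and Svenska, each possibly on a different line.
all3_in_dir() {
    local dir="${1:-.}" f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # -q: quiet, only the exit status matters; -l: print the file name.
        grep -q -- Dansk "$f" && grep -q -- Norsk "$f" && grep -l -- Svenska "$f"
    done
}
```

Because every file name is quoted, names containing spaces are handled correctly.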
-i makes the search case-insensitive
-r makes the file search recursive through folders
-l prints only the names of the files in which the word was found, so that list can be piped on
cat - causes the next grep to look through the list of files passed to it
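A sketch of how these flags can be combined, assuming GNU grep. The function name `all3_recursive` is hypothetical, and the exact command of the answer may have differed. The command substitution `$(cat -)` reads the file list from the pipe and turns it into arguments for the next grep:

```shell
# Recursively list files containing all three words, ignoring case.
# Assumes file names without whitespace or glob characters, because the
# unquoted $(cat -) is split into words by the shell.
all3_recursive() {
    grep -irl -- dansk "${1:-.}" \
        | grep -il -- norsk $(cat -) \
        | grep -il -- svenska $(cat -)
}
```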
You can do this really easily with ack:
-l: return a list of files
-x: take the files from STDIN (the previous search) and only search those files
And you can just keep piping until you get just the files you want.
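A sketch of such a pipeline, assuming ack is installed and supports `-x` as described above. The function name `ack_all3` is hypothetical. The starting directory is passed explicitly so that the first ack searches files instead of filtering its standard input:

```shell
# List files under the current directory that contain all three words.
ack_all3() {
    ack -l Dansk . | ack -x -l Norsk | ack -x -l Svenska
}
```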
How to grep for multiple strings in a file on different lines (use the pipe symbol):
Notes:
With plain grep (basic regular expressions) you have to escape the pipe like this: \| to search for Dansk, Norsk and Svenska, whether you use single or double quotes; this is a GNU extension. With grep -E the pipe is written unescaped.
Assumes that one line has only one language.
Walkthrough: http://www.cyberciti.biz/faq/howto-use-grep-command-in-linux-unix/
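A minimal sketch of the pipe-symbol search, using `grep -E` so that the pipe needs no escaping. The function name `any_of3` is hypothetical. Note that alternation lists files containing at least one of the words, not necessarily all three:

```shell
# List the given files that contain Dansk, Norsk or Svenska on any line.
any_of3() {
    grep -l -E -- 'Dansk|Norsk|Svenska' "$@"
}
```

With GNU grep the same pattern can be written for basic regular expressions as `'Dansk\|Norsk\|Svenska'`.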
This is a blending of glenn jackman's and kurumi's answers which allows an arbitrary number of regexes instead of an arbitrary number of fixed words or a fixed set of regexes.
Run it like this:
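A minimal sketch of the idea, not the script from this answer: a shell function wrapping awk that accepts any number of regexes. The name `has_all` and its argument order (file first, then regexes) are assumptions made for illustration.

```shell
# has_all FILE REGEX...
# Exit status 0 if every REGEX matches at least one line of FILE, else 1.
has_all() {
    local file="$1"
    shift
    awk '
        BEGIN {
            # ARGV[1] is the file; everything after it is a regex.
            # Blank those entries so awk does not open them as input files.
            for (i = 2; i < ARGC; i++) { re[i] = ARGV[i]; ARGV[i] = "" }
            need = ARGC - 2
        }
        {
            for (i in re)
                if (!(i in seen) && $0 ~ re[i]) { seen[i] = 1; need-- }
        }
        need == 0 { exit }              # everything found, stop reading
        END { exit (need == 0 ? 0 : 1) }
    ' "$file" "$@"
}

# Possible usage: print every file in the directory matching all regexes.
#   for f in *; do has_all "$f" Dansk Norsk Svenska && printf '%s\n' "$f"; done
```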
You can then catch the return value with the shell.
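A sketch of what catching the return value can look like, assuming an awk program that sets one flag per word and reports the result through its exit status. The function name `has_all3` is hypothetical.

```shell
# Exit status 0 if the file contains Dansk, Norsk and Svenska, else 1.
has_all3() {
    awk '/Dansk/ {d = 1} /Norsk/ {n = 1} /Svenska/ {s = 1}
         END { exit !(d && n && s) }' "$1"
}

# Possible usage in a script:
#   if has_all3 test.txt; then echo "test.txt has all three"; fi
```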
If you have Ruby (1.9+):
This searches multiple words in multiple files:
Here's what worked well for me:
If I just wanted to find .sh files with these three, then I could have used:
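A minimal sketch of restricting the search to .sh files, assuming plain find and grep instead of whatever tools this answer originally used. The function name `sh_files_with_all3` is hypothetical. Each `-exec ... \;` acts as a test, so a file only reaches the final `grep -l` if the earlier greps succeeded:

```shell
# Recursively list *.sh files that contain all three words.
sh_files_with_all3() {
    find "${1:-.}" -type f -name '*.sh' \
        -exec grep -q -- Dansk {} \; \
        -exec grep -q -- Norsk {} \; \
        -exec grep -l -- Svenska {} \;
}
```

Since find passes each name as a single argument, file names with spaces are not a problem here.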
Simply:
see this post for more info
If you have git installed
The --no-index option searches files in the current directory that are not managed by Git, so this command works in any directory, whether or not it is a Git repository.
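A sketch of such a command, assuming git is installed. The function name `git_all3` is hypothetical. `--all-match` restricts the result to files that have lines matching every one of the `-e` patterns:

```shell
# List files under the current directory that contain all three words.
git_all3() {
    git grep --no-index -l --all-match -e Dansk -e Norsk -e Svenska
}
```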
Expanding on @kurumi's awk answer, here's a bash function:
Usage:
I did that in two steps. Make a list of csv files in one file
With the help of the comments on this page I made two scriptless steps to get what I needed. Just type into the terminal:
It did exactly what I needed: it printed the names of the files containing all three words.
Also mind symbols like
`' "
If you only need two search terms, arguably the most readable approach is to run each search and intersect the results:
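One way to sketch the intersection, assuming that each search lists a file at most once: run both searches, sort the combined list, and keep only the names that appear twice. The function name `files_with_both` is hypothetical, and `sort | uniq -d` is used here in place of whatever tool the answer used.

```shell
# files_with_both WORD1 WORD2 FILE...
# Print the files that contain both words.
files_with_both() {
    local w1="$1" w2="$2"
    shift 2
    # A name printed by both greps shows up twice; uniq -d keeps duplicates.
    { grep -l -- "$w1" "$@"; grep -l -- "$w2" "$@"; } | sort | uniq -d
}
```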
I had this problem today, and all the one-liners here failed for me because the file names contained spaces.
This is what I came up with that worked:
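A sketch of one space-safe variant, not necessarily the command from this answer, assuming a grep that supports `--null` and an xargs that supports `-0` and `-r` (GNU versions do). The function name `all3_nul` is hypothetical. File names are separated by NUL bytes instead of whitespace, so spaces in names survive the pipe:

```shell
# Recursively list files containing all three words; safe for odd file names.
all3_nul() {
    grep -rl --null -- Dansk "${1:-.}" \
        | xargs -0 -r grep -l --null -- Norsk \
        | xargs -0 -r grep -l -- Svenska
}
```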
A simple one-liner in bash for an arbitrary list LIST and the file my_file.txt can be:
Replacing eval with echo reveals that the following command is evaluated:
To search piped input for multiple strings, where the maximum length of the input can be predicted, grep context is helpful:
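A minimal sketch of the context trick, assuming the caller knows an upper bound on the number of input lines. The function name `stdin_has_all3` is hypothetical. With a context size at least as large as the input, each `grep -C` passes the entire input through if its word occurs and passes nothing otherwise:

```shell
# stdin_has_all3 MAX_LINES
# Exit status 0 if standard input contains Dansk, Norsk and Svenska.
stdin_has_all3() {
    local max_lines="$1"
    grep -C "$max_lines" -- Dansk \
        | grep -C "$max_lines" -- Norsk \
        | grep -q -- Svenska
}

# Possible usage:
#   some_command | stdin_has_all3 10000 && echo "all three found"
```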