grep for multiple strings on different lines of a file (i.e. a whole-file search rather than a line-based one)?

Posted 2024-10-14 01:49:10

I want to grep for files containing the words Dansk, Svenska and Norsk, each on any line, with a usable return code (I really only need to know whether the strings are present; my one-liner goes a little further than this).

I have many files with lines in them like this:

Disc Title: unknown
Title: 01, Length: 01:33:37.000 Chapters: 33, Cells: 31, Audio streams: 04, Subpictures: 20
        Subtitle: 01, Language: ar - Arabic, Content: Undefined, Stream id: 0x20, 
        Subtitle: 02, Language: bg - Bulgarian, Content: Undefined, Stream id: 0x21, 
        Subtitle: 03, Language: cs - Czech, Content: Undefined, Stream id: 0x22, 
        Subtitle: 04, Language: da - Dansk, Content: Undefined, Stream id: 0x23, 
        Subtitle: 05, Language: de - Deutsch, Content: Undefined, Stream id: 0x24, 
(...)

Here is the pseudocode of what I want:

for all files in directory
 if file contains "Dansk" AND "Norsk" AND "Svenska"
 then echo the filename
end

What is the best way to do this? Can it be done on one line?

但可醉心 2024-10-21 01:49:10

You can use:

grep -l Dansk * | xargs grep -l Norsk | xargs grep -l Svenska

If you also want to search hidden files:

grep -l Dansk .* | xargs grep -l Norsk | xargs grep -l Svenska
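
Filenames that contain spaces break the plain xargs pipeline above. A minimal sketch of a null-delimited variant, assuming GNU grep and GNU xargs (-Z, -0 and -r are extensions); the function name find_all_three is only illustrative:

```shell
# Sketch: list files in the current directory that contain all three words.
# Assumes GNU grep (-Z) and GNU xargs (-0, -r); find_all_three is a made-up name.
find_all_three() {
    # -l prints matching filenames, -Z terminates each name with a NUL byte,
    # so names with spaces or newlines survive the trip through xargs -0.
    # 2>/dev/null hides the "Is a directory" messages for subdirectories.
    grep -lZ -- Dansk * 2>/dev/null \
        | xargs -0 -r grep -lZ -- Norsk \
        | xargs -0 -r grep -l -- Svenska
}
```

Each stage only searches the files that survived the previous one, and -r stops xargs from running grep at all when the list is empty.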
紫瑟鸿黎 2024-10-21 01:49:10

Yet another way using just bash and grep:

For a single file 'test.txt':

  grep -q Dansk test.txt && grep -q Norsk test.txt && grep -l Svenska test.txt

This prints test.txt if and only if the file contains all three words (in any order). The first two greps print nothing (-q), and the last one prints the filename only if the other two have succeeded.

If you want to do it for every file in the directory:

   for f in *; do grep -q Dansk "$f" && grep -q Norsk "$f" && grep -l Svenska "$f"; done
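
Since the question asks for a usable return code, the chained greps can be wrapped in a small function. This is only a sketch: the name contains_all is hypothetical, and it assumes the words should be matched as fixed strings (-F) rather than regular expressions.

```shell
# contains_all FILE WORD... : exit status 0 only if FILE contains every WORD.
# Hypothetical helper; -F treats each WORD as a fixed string, not a regex.
contains_all() {
    file=$1
    shift
    for word in "$@"; do
        # -q: print nothing; stop at the first word that is missing.
        grep -qF -- "$word" "$file" || return 1
    done
    return 0
}

# Possible usage (left commented out so that sourcing this prints nothing):
# for f in *; do contains_all "$f" Dansk Norsk Svenska && printf '%s\n' "$f"; done
```

Unlike the one-liner, this extends to any number of words without editing the command.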
人疚 2024-10-21 01:49:10
grep -irl word1 * | grep -il word2 `cat -` | grep -il word3 `cat -`
  • -i makes the search case-insensitive
  • -r makes the search recurse through folders
  • -l prints only the names of the files that contain a match, which is what gets piped on
  • cat - reads that file list from standard input, so the next grep searches only those files.
煮酒 2024-10-21 01:49:10

You can do this really easily with ack:

ack -l 'cats' | ack -xl 'dogs'
  • -l: return a list of files
  • -x: take the files from STDIN (the previous search) and only search those files

And you can just keep piping until you get just the files you want.

甩你一脸翔 2024-10-21 01:49:10

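How to grep for multiple strings in a file on different lines (use the pipe symbol):

for file in *;do 
   test $(grep -E 'Dansk|Norsk|Svenska' "$file" | wc -l) -ge 3 && echo "$file"
done

Notes:

  1. The pipe means alternation only with -E (extended regular expressions). Without -E you have to escape it like this: \| to search for Dansk, Norsk and Svenska; that escape is a GNU grep extension. The choice of single or double quotes does not change this.

  2. This counts matching lines, so it assumes that each language appears on exactly one line and that no line contains more than one language. A file with three Dansk lines and nothing else would also pass.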

Walkthrough: http://www.cyberciti.biz/faq/howto-use-grep-command-in-linux-unix/
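
The line-count assumption can be avoided by counting distinct matched words instead of matching lines. A rough sketch, assuming a grep that supports -o (GNU and BSD grep both do; it is not in POSIX); has_all_three is a made-up name:

```shell
# Sketch: succeed only if the file contains all three words at least once.
# -o prints each match on its own line; sort -u collapses repeats, so the
# count is the number of distinct words found, not the number of lines.
has_all_three() {
    # The command substitution is left unquoted on purpose: some wc
    # implementations pad the number with spaces, and word splitting strips them.
    [ $(grep -oE 'Dansk|Norsk|Svenska' -- "$1" | sort -u | wc -l) -eq 3 ]
}

# Possible usage (commented out so that sourcing this prints nothing):
# for file in *; do has_all_three "$file" && echo "$file"; done
```

The words are still matched as substrings, so Danske would count as Dansk; adding -w restricts the match to whole words.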

懵少女 2024-10-21 01:49:10

This is a blending of glenn jackman's and kurumi's answers which allows an arbitrary number of regexes instead of an arbitrary number of fixed words or a fixed set of regexes.

#!/usr/bin/awk -f
# by Dennis Williamson - 2011-01-25

BEGIN {
    for (i=ARGC-2; i>=1; i--) {
        patterns[ARGV[i]] = 0;
        delete ARGV[i];
    }
}

{
    for (p in patterns)
        if ($0 ~ p)
            matches[p] = 1
            # print    # the matching line could be printed
}

END {
    for (p in patterns) {
        if (matches[p] != 1)
            exit 1
    }
}

Run it like this:

./multigrep.awk Dansk Norsk Svenska 'Language: .. - A.*c' dvdfile.dat
ま昔日黯然 2024-10-21 01:49:10
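awk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ exit !(a && b && c) }' file

The exit status is 0 only when all three words were found, so you can test it from the shell.

If you have Ruby (1.9+), this prints the file's contents when all three words are present:

ruby -0777 -ne 'print if /Dansk/ and /Norsk/ and /Svenska/' file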
全部不再 2024-10-21 01:49:10

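This searches for any of several words in multiple files (it matches lines containing abc or xyz, so on its own it does not require every word to be present):

grep -E 'abc|xyz' file1 file2 ..filen 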
晨光如昨 2024-10-21 01:49:10

Here's what worked well for me:

find . -path '*/.svn' -prune -o -type f -exec gawk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ if (a && b && c) print FILENAME }' {} \;
./path/to/file1.sh
./another/path/to/file2.txt
./blah/foo.php

If I only wanted .sh files containing all three words, I could have used:

find . -path '*/.svn' -prune -o -type f -name "*.sh" -exec gawk '/Dansk/{a=1}/Norsk/{b=1}/Svenska/{c=1}END{ if (a && b && c) print FILENAME }' {} \;
./path/to/file1.sh
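
The commands above start one gawk process per file. As a sketch of an alternative, the flags can be reset at the first line of each file (FNR == 1), which lets a single POSIX awk handle many files through -exec ... {} +. The function name list_all_three and its optional directory argument are assumptions for illustration, and the .svn pruning is left out for brevity.

```shell
# Sketch: print every regular file under DIR (default .) that contains
# all three words, using one awk process for many files.
list_all_three() {
    find "${1:-.}" -type f -exec awk '
        # First line of a new file: report the previous file if it had
        # all three words, then reset the flags for the new file.
        FNR == 1  { if (a && b && c) print prev; a = b = c = 0; prev = FILENAME }
        /Dansk/   { a = 1 }
        /Norsk/   { b = 1 }
        /Svenska/ { c = 1 }
        # The last file has no following "first line", so check it here.
        END       { if (a && b && c) print prev }
    ' {} +
}
```

The FNR == 1 rule has to come first so that the reset happens before the first line of the new file is tested against the patterns.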
乖乖 2024-10-21 01:49:10

Simply (note that this matches lines containing any one of the words, not files containing all of them):

grep 'word1\|word2\|word3' *

see this post for more info

乖乖 2024-10-21 01:49:10

If you have git installed:

git grep -l --all-match --no-index -e Dansk -e Norsk -e Svenska

--no-index searches the files in the current directory even when it is not managed by Git, so this command works in any directory, whether or not it is a Git repository.

幸福丶如此 2024-10-21 01:49:10

Expanding on @kurumi's awk answer, here's a bash function:

all_word_search() {
    gawk '
        BEGIN {
            for (i=ARGC-2; i>=1; i--) {
                search_terms[ARGV[i]] = 0;
                ARGV[i] = ARGV[i+1];
                delete ARGV[i+1];
            }
        }
        {
            for (i=1;i<=NF; i++) 
                if ($i in search_terms) 
                    search_terms[$i] = 1
        }
        END {
            for (word in search_terms) 
                if (search_terms[word] == 0) 
                    exit 1
        }
    ' "$@"
    return $?
}

Usage:

if all_word_search Dansk Norsk Svenska filename; then
    echo "all words found"
else
    echo "not all words found"
fi
灰色世界里的红玫瑰 2024-10-21 01:49:10

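I did this in two steps: first write the list of CSV files to a file, then grep through that list.
With the help of the comments on this page, I got what I needed without writing a script. Just type into the terminal:

$ find /csv/file/dir -name '*.csv' > csv_list.txt
$ grep -l Svenska `cat csv_list.txt` | xargs grep -l Norsk | xargs grep -l Dansk

This prints the names of the files containing all three words. Each grep -l narrows the list for the next one; chaining grep -q calls over the whole list with && would only check that each word occurs in some file, not that all three occur in the same file.

Also mind quoting characters such as the backtick, the single quote and the double quote.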

悲欢浪云 2024-10-21 01:49:10

If you only need two search terms, arguably the most readable approach is to run each search and intersect the results:

 comm -12 <(grep -rl word1 . | sort) <(grep -rl word2 . | sort)
顾铮苏瑾 2024-10-21 01:49:10

I had this problem today, and all the one-liners here failed for me because the file names contained spaces.

This is what I came up with that worked:

grep -ril <WORD1> | sed 's/.*/"&"/' | xargs grep -il <WORD2>
挽心 2024-10-21 01:49:10

A simple bash one-liner that checks the file my_file.txt against an arbitrary word list LIST:

LIST="Dansk Norsk Svenska"
EVAL=$(echo "$LIST" | sed 's/[^ ]* */grep -q & my_file.txt \&\& /g'); eval "$EVAL echo yes || echo no"

Replacing eval with echo shows the command that is evaluated:

grep -q Dansk  my_file.txt && grep -q Norsk  my_file.txt && grep -q Svenska my_file.txt &&  echo yes || echo no
孤云独去闲 2024-10-21 01:49:10

To search piped input for multiple strings, where the maximum length of the input can be predicted, grep context is helpful:

content_generator | grep -C 1000 Dansk | grep -C 1000 Norsk | grep Svenska
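
When the input length cannot be predicted, one alternative is to let awk read the whole stream and report the result through its exit status. A minimal sketch; the function name stdin_has_all is hypothetical, and content_generator is the same placeholder as above:

```shell
# Sketch: read standard input and exit 0 only if all three words occur
# somewhere in it, on any lines and in any order.
stdin_has_all() {
    awk '
        /Dansk/   { a = 1 }
        /Norsk/   { b = 1 }
        /Svenska/ { c = 1 }
        # exit status 0 (success) only when every flag was set
        END       { exit !(a && b && c) }
    '
}

# Possible usage (content_generator stands for any command producing output):
# content_generator | stdin_has_all && echo "all three found"
```

This needs no context window, at the cost of printing nothing: only the exit status tells you whether the words were found.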