In bash, how can I find a pattern in one file that doesn't match any line of another file?

Posted 2025-02-02 11:39:20

How can I find a pattern in one file that doesn't match any line of another file?

I'm aware that grep has a -f option, so instead of feeding grep a pattern, I can feed it a file of patterns.

(a.a is my main file)

user@system:~/test# cat a.a
Were Alexander-ZBn1gozZoEM.mp4
Will Ate-vP-2ahd8pHY.mp4

(p.p is my file of patterns)

user@system:~/test# cat p.p
ZBn1gozZoEM
0maL4cQ8zuU
vP-2ahd8pHY

So the command might be something like

somekindofgrep p.p a.a

but it should print 0maL4cQ8zuU, which is the pattern in the file of patterns, p.p, that doesn't match anything in the file a.a.

I am not sure what command to do.

$ grep -f p.p a.a
Were Alexander-ZBn1gozZoEM.mp4
Will Ate-vP-2ahd8pHY.mp4
$

I know that if there were an additional line in a.a not matched by any pattern in p.p, then grep -f p.p a.a wouldn't show it. And if I ran grep -v -f p.p a.a, it would show only that line of a.a, the one not matched by anything in p.p.

But I'm interested in finding which pattern in p.p (my file of patterns) doesn't match anything in a.a!

I looked at "Make grep print missing queries", but the asker there wants everything from both files. Also, one of the answers there mentions -v, but I can't see that applying to my case, because -v shows the lines of a file that don't match any pattern. So having or not having -v won't help me, because I'm looking for a pattern that doesn't match any line of a file.

Comments (4)

放肆 2025-02-09 11:39:20

I suggest an awk script that scans a.a once:

script.awk

FNR==NR{wordsArr[$0] = 1; next} # read patterns list from 1st file into array wordsArr
{ # for each line in 2nd file
  for (i in wordsArr){ # iterate over all patterns in array
    if ($0 ~ i) delete wordsArr[i]; # if pattern is matched to current line remove the pattern from array
  }
}
END {for (i in wordsArr) print "Unmatched: " i} # print all patterns left in wordsArr

Running script.awk:

awk -f script.awk p.p a.a

Testing:

p.p

aa
bb
cc
dd
ee

a.a

ddd
eee
ggg
fff
aaa

Output:

awk -f script.awk p.p a.a
Unmatched: bb
Unmatched: cc
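
The script above treats each line of p.p as a regular expression (`$0 ~ i`). If the patterns are meant as literal strings, a variant using awk's `index()` avoids surprises from regexp metacharacters. This is only a minimal sketch under that assumption; the function name `unmatched_literal` is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: print every line of the patterns file ($1) that is
# NOT a literal substring of any line of the main file ($2).
# Assumes the patterns file is non-empty (the FNR==NR idiom needs that).
unmatched_literal() {
    awk '
        FNR == NR { pats[$0] = 1; next }      # 1st file: remember each pattern
        {
            for (p in pats)                   # 2nd file: test remaining patterns
                if (index($0, p))             # literal substring test, no regexp
                    delete pats[p]
        }
        END { for (p in pats) print p }       # patterns that never matched
    ' "$1" "$2"
}

# Usage with the files from the question:
#   unmatched_literal p.p a.a
```

As in the original script, patterns are dropped from the array as soon as they match, so later lines of the main file are checked against fewer patterns. The output order of the `END` loop is unspecified in awk.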
无语# 2025-02-09 11:39:20

Homemade script:

#!/bin/bash

if [[ $# -eq 2 ]]
then
    patterns="$1"
    mainfile="$2"

    if [[ ! -f "$patterns" ]]
    then
        echo "ERROR: file $patterns does not exist."
        exit 1
    fi
    if [[ ! -f "$mainfile" ]]
    then
        echo "ERROR: file $mainfile does not exist."
        exit 1
    fi
else
    echo "Usage: $0 <PATTERNS FILE> <MAIN FILE>"
    exit 1
fi

while IFS= read -r pattern
do
    if ! grep -q -e "$pattern" "$mainfile"   # test grep's exit status directly; [[ ]] cannot run a command
    then
        echo "$pattern"
    fi
done < "$patterns"

Like user1934428 suggested, this script loops on the patterns in file p.p and prints out any pattern that is not found in file a.a.
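
The loop above starts one grep per pattern. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to run grep once with `-o -F -f`, collect the patterns that did occur, and subtract them from the full pattern list with `comm`. The function name `unmatched_fixed` is hypothetical, and the sketch assumes a grep that supports `-o` (GNU and BSD grep do) and patterns that are literal strings.

```shell
# Hypothetical helper: print the lines of the patterns file ($1) that do not
# occur as a literal substring anywhere in the main file ($2).
unmatched_fixed() {
    tmp=$(mktemp) || return 1
    # -o prints only the matched text, -F treats patterns as fixed strings,
    # -f reads the patterns from a file; the result is the set of patterns found.
    grep -oFf "$1" -- "$2" | LC_ALL=C sort -u > "$tmp"
    # comm -23 keeps the lines that appear only in the first input,
    # i.e. patterns that are in the pattern list but were never found.
    LC_ALL=C sort -u "$1" | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}

# Usage with the files from the question:
#   unmatched_fixed p.p a.a
```

One caveat: `grep -o` reports non-overlapping matches, so a pattern that only ever appears inside a longer pattern's match may be listed as unmatched. When patterns can overlap like that, the per-pattern loop is the safer choice.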

朮生 2025-02-09 11:39:20

Here's a possible solution based on one possible interpretation of what it is you're trying to do (a full-string match on the lines in p.p against the substrings between the first - and the last . in the lines in a.a):

$ awk '
    NR==FNR {
        sub(/[^-]*-/,"")
        sub(/\.[^.]*$/,"")
        file1[$0]
        next
    }
    !($0 in file1)
' a.a p.p
0maL4cQ8zuU

The above will work robustly, portably, and efficiently using any awk in any shell on every Unix box. It'll run orders of magnitude faster than the current shell loop answer, faster than the existing awk answer or the xargs answer, and will work no matter which characters exist in either file, regexp metachars included, and whether or not the search strings from p.p exist as substrings or in other contexts in a.a. It also will have zero security concerns no matter what is in the input files.

幸福丶如此 2025-02-09 11:39:20
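# Run grep once per pattern in p.p and print the pattern depending on the result.
# Each pattern is passed to sh as "$1" instead of being spliced into the command text,
# and -I {} replaces the deprecated GNU xargs option -i.
# Print the patterns that DO match some line of a.a:
xargs -I {} sh -c 'grep -q -e "$1" a.a && printf "%s\n" "$1"' sh {} < p.p
# Print the patterns that match NO line of a.a (what the question asks for):
xargs -I {} sh -c 'grep -q -e "$1" a.a || printf "%s\n" "$1"' sh {} < p.p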