Count occurrences of a pattern in a file (even on the same line)

Posted on 2024-09-03 03:20:36

When searching for number of occurrences of a string in a file, I generally use:

grep pattern file | wc -l

However, this only finds one occurrence per line, because of the way grep works. How can I search for the number of times a string appears in a file, regardless of whether they are on the same or different lines?

Also, what if I'm searching for a regex pattern, not a simple string? How can I count those, or, even better, print each match on a new line?

Comments (5)

标点 2024-09-10 03:20:36

To count all occurrences, use -o. Try this:

echo afoobarfoobar | grep -o foo | wc -l

And man grep of course (:

Update

Some suggest using just grep -co foo instead of grep -o foo | wc -l.

Don't.

This shortcut won't work in all cases. The man page says:

-c print a count of matching lines

The difference between these approaches is illustrated below:

1.

$ echo afoobarfoobar | grep -oc foo
1

As soon as a match is found in the line (a{foo}barfoobar), searching stops. Only one line was checked and it matched, so the output is 1. In effect -o is ignored here, and you could just use grep -c instead.

2.

$ echo afoobarfoobar | grep -o foo
foo
foo

$ echo afoobarfoobar | grep -o foo | wc -l
2

Two matches are found in the line (a{foo}bar{foo}bar) because we explicitly asked to find every occurrence (-o). Every occurrence is printed on a separate line, and wc -l just counts the number of lines in the output.
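
For the regex half of the question, the same pipeline can be wrapped in a small function. This is only a sketch: count_matches is a hypothetical helper name, not a standard tool, and it assumes a grep that supports -o (GNU and BSD grep do; POSIX does not require it). The -E flag selects extended regular expressions, and -h suppresses file name prefixes when several files are given.

```shell
# count_matches PATTERN [FILE...]
# Hypothetical helper: counts every match of an extended regex,
# including several matches on the same line. Reads stdin if no FILE is given.
count_matches() {
    pattern=$1
    shift
    # -o prints each match on its own line; wc -l counts those lines.
    # tr strips the padding that some wc implementations add.
    grep -ohE -- "$pattern" "$@" | wc -l | tr -d ' '
}

# Two matches on the first line, none on the second, one on the third.
count_matches 'fo+' <<'EOF'
afoobarfoobar
nothing here
fooo
EOF
# prints 3
```

Dropping the wc -l stage gives the other thing the question asks for: each match printed on its own line.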

混吃等死 2024-09-10 03:20:36

Try this, which counts the matching lines grouped by their fourth colon-separated field:

grep "string to search for" FileNameToSearch | cut -d ":" -f 4 | sort -n | uniq -c

Sample:

grep "SMTP connect from unknown" maillog | cut -d ":" -f 4 | sort -n | uniq -c
  6  SMTP connect from unknown [188.190.118.90]
 54  SMTP connect from unknown [62.193.131.114]
  3  SMTP connect from unknown [91.222.51.253]
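
The pipeline above groups whole matching lines. A variant that tallies each distinct matched string, including repeats on the same line, could look like the sketch below. The name tally_matches is hypothetical, and the sketch assumes a grep with -o support (GNU or BSD).

```shell
# tally_matches PATTERN [FILE...]
# Hypothetical helper: prints each distinct matched string with its count,
# most frequent first. Reads stdin if no FILE is given.
tally_matches() {
    pattern=$1
    shift
    # grep -o emits one match per line; sort groups identical matches
    # so that uniq -c can count them; the final sort orders by count.
    grep -ohE -- "$pattern" "$@" | sort | uniq -c | sort -rn
}

# a1 occurs three times and b2 twice across the two input lines.
printf 'a1 b2 a1\nb2 a1\n' | tally_matches '[ab][0-9]'
```

The counts are left-padded by uniq -c, and the amount of padding differs between implementations.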

も让我眼熟你 2024-09-10 03:20:36

Ripgrep, a fast alternative to grep, introduced the --count-matches flag in version 0.9; it counts every match rather than every matching line (I'm using the example above to stay consistent):

> echo afoobarfoobar | rg --count foo
1
> echo afoobarfoobar | rg --count-matches foo
2

As the OP asked, ripgrep accepts regex patterns as well (the pattern is a regex by default; --regexp <PATTERN> passes it explicitly).
It also prints each matching line on a separate line:

> echo -e "line1foo\nline2afoobarfoobar" | rg foo
line1foo
line2afoobarfoobar

猫性小仙女 2024-09-10 03:20:36

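A belated post:
Use the search regex as the record separator (RS) in awk. A regex RS is an extension supported by gawk and mawk; POSIX awk does not guarantee it.
This lets the regex span \n-delimited lines (if you need that). Each match ends a record, so the number of matches is NR-1 as long as some text (even a newline) follows the last match: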

printf 'X \n moo X\n XX\n' | 
   awk -vRS='X[^X]*X' 'END{print (NR<2?0:NR-1)}'
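
If matches never need to span lines, a different awk technique avoids the regex RS extension: gsub returns the number of substitutions it made, so replacing each match with itself ("&") counts matches without changing the text. This is a sketch, not the method above; awk_count is a hypothetical name, and the pattern is assumed not to match the empty string.

```shell
# awk_count PATTERN [FILE...]
# Hypothetical helper: counts non-overlapping matches of an awk (extended)
# regex, line by line. Reads stdin if no FILE is given.
# Note: awk -v interprets backslash escapes in the assigned value.
awk_count() {
    pattern=$1
    shift
    # gsub replaces every match with itself and returns how many it replaced.
    # n + 0 forces a numeric 0 when there was no input or no match.
    awk -v pat="$pattern" '{ n += gsub(pat, "&") } END { print n + 0 }' "$@"
}

# Same input as above, counting single X characters: 1 + 1 + 2.
printf 'X \n moo X\n XX\n' | awk_count 'X'
# prints 4
```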

一场春暖 2024-09-10 03:20:36

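Abuse grep's color output and count how many color escape sequences it prints (this relies on GNU grep honoring GREP_COLOR, which newer releases treat as obsolescent in favor of GREP_COLORS):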

echo -e "a\nb  b b\nc\ndef\nb e brb\nr" \
| GREP_COLOR="033" grep --color=always  b \
| perl -e 'undef $/; $_=<>; s/\n//g; s/\x1b\x5b\x30\x33\x33/\n/g; print $_' \
| wc -l