Count the number of occurrences of a pattern in a file (even on the same line)
When searching for the number of occurrences of a string in a file, I generally use:
grep pattern file | wc -l
However, this only finds one occurrence per line, because of the way grep works. How can I search for the number of times a string appears in a file, regardless of whether they are on the same or different lines?
Also, what if I'm searching for a regex pattern, not a simple string? How can I count those, or, even better, print each match on a new line?
To count all occurrences, use -o. Try this:

And man grep, of course (:

Update

Some suggest using just grep -co foo instead of grep -o foo | wc -l. Don't.

This shortcut won't work in all cases. According to the man page, -c prints a count of matching lines, not of matches.

The difference between these approaches is illustrated below:

1. As soon as a match is found in the line (a{foo}barfoobar), the searching stops. Only one line was checked and it matched, so the output is 1. Actually, -o is ignored here and you could just use grep -c instead.

2. Two matches are found in the line (a{foo}bar{foo}bar) because we explicitly asked to find every occurrence (-o). Every occurrence is printed on a separate line, and wc -l just counts the number of lines in the output.
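The commands from this answer did not survive on this page, so here is a minimal sketch of the two approaches being compared. It assumes a grep that supports -o (GNU and BSD grep both do). The function names count_occurrences and count_matching_lines are made up for illustration and are not part of grep.

```shell
# Hypothetical helper names; only grep, wc and printf are real commands here.

# Count every occurrence: -o prints each match on its own line,
# and wc -l counts those lines.
count_occurrences() {
    grep -o -- "$1" | wc -l
}

# Count matching lines only: -c reports lines, so adding -o changes nothing.
count_matching_lines() {
    grep -c -- "$1"
}

printf 'afoobarfoobar\n' | count_occurrences 'foo'      # two occurrences
printf 'afoobarfoobar\n' | count_matching_lines 'foo'   # one matching line
printf 'afoobarfoobar\n' | grep -co 'foo'               # still one: -c wins over -o

# For a regex, -E with -o prints each match on a new line.
printf 'a1b22 c333\n' | grep -oE '[0-9]+'
```

The last command should also cover the "print each match on a new line" part of the question, since -o emits one match per output line.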
Try this:
Sample:
Ripgrep, which is a fast alternative to grep, introduced the --count-matches flag in version 0.9, which allows counting each match (I'm using the above example to stay consistent):

As asked by the OP, ripgrep allows for regex patterns as well (--regexp <PATTERN>). It can also print each (line) match on a separate line:
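The ripgrep commands are missing from this page, so the following is only a sketch of how the flags named above could be used. It assumes ripgrep 0.9 or later is installed as rg. The helper name count_with_rg is hypothetical.

```shell
# Assumes ripgrep (rg) 0.9 or later; count_with_rg is a hypothetical helper name.
count_with_rg() {
    # --count-matches counts every match rather than every matching line
    rg --count-matches --regexp "$1"
}

if command -v rg >/dev/null 2>&1; then
    printf 'afoobarfoobar\n' | count_with_rg 'foo'        # counts both matches
    printf 'afoobarfoobar\n' | rg --only-matching 'foo'   # one match per output line
else
    echo 'ripgrep (rg) not found; skipping' >&2
fi
```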
A belated post:

Use the search regex pattern as a Record Separator (RS) in awk. This allows your regex to span \n-delimited lines (if you need it).
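The original awk command is not shown on this page. A rough sketch of the record-separator idea follows. It assumes an awk that accepts a regex as RS (gawk and mawk do; POSIX leaves a multi-character RS unspecified). The helper name count_spanning is hypothetical.

```shell
# Assumes gawk or mawk (regex RS is not guaranteed by POSIX).
# count_spanning is a hypothetical helper name.
count_spanning() {
    # Each match ends one record, so N matches give N+1 records,
    # provided the input does not end exactly on a match.
    awk -v RS="$1" 'END { print (NR ? NR - 1 : 0) }'
}

printf 'afoobarfoobar\n' | count_spanning 'foo'

# A pattern that crosses a line break; -v turns \n into a real newline.
printf 'one foo\nbar two foo\nbar\n' | count_spanning 'foo\nbar'
```

The NR - 1 arithmetic is the fragile part: if the input ends exactly on a match with nothing after it, the count can come out one too low, so it is worth checking against a known input first.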
Hack grep's color function, and count how many color tags it prints out:
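The command for this trick is also missing here. One possible reconstruction is sketched below, assuming GNU grep, whose --color=always wraps each match in SGR escape sequences. The helper name count_by_color is hypothetical, and the color is pinned through GREP_COLORS so the opening tag is predictable.

```shell
# Assumes GNU grep; count_by_color is a hypothetical helper name.
count_by_color() {
    esc=$(printf '\033')   # a literal ESC character
    # Force the match color to 01;31 so every match starts with ESC[01;31m,
    # then count how many of those opening tags appear in the output.
    GREP_COLORS='mt=01;31' grep --color=always -- "$1" |
        grep -o "${esc}\[01;31m" |
        wc -l
}

printf 'afoobarfoobar\n' | count_by_color 'foo'
```

This will miscount if the input already contains that escape sequence, so grep -o piped to wc -l is probably the safer choice for anything beyond a quick check.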