Matching lines that contain a pattern n times in the same line

Posted on 2024-11-06 22:54:07

I have a file and I need to filter lines that have (or don't have) N occurrences of a pattern.
For example, if my pattern is the letter o and I want to match lines where the letter o occurs exactly 4 times, the expression should match the first of the following example lines but not the others:

foo foo  
foo  
foo foo foo

I thought I could do it with a regex in vim, sed, awk, or any other tool.
I've googled and haven't found anyone who has done something similar.
I will probably have to write a script or something similar to parse each line.
Has anyone done something like this?

Thanks

哥，最终变帅啦 2024-11-13 22:54:07

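You can use a regex like the one below:

^(?=(.*o){4})(?!(.*o){5,}).*

Regexr - http://regexr.com?2toro

The leading ^ matters: without it, a search can start partway through a longer line and match a suffix that contains exactly four occurrences. The pattern relies on lookaheads, so it needs an engine that supports them, such as Perl or PCRE; POSIX sed and awk regexes do not.

This should work with any pattern you want. For instance, to find lines with exactly four foos, use:

^(?=(.*foo){4})(?!(.*foo){5,}).*

Regexr - http://regexr.com?2tosa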
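
To check how this kind of lookahead pattern behaves, here is a small sketch using Python's `re` module. Python is an assumption here (the answer does not name an engine), and the names `ANCHORED` and `UNANCHORED` are only illustrative. It applies the pattern to the question's sample lines with a leading `^`, then runs the same pattern without the anchor on the six-o line.

```python
import re

# The lookahead pattern with a start-of-line anchor:
# at least four "o", but not five or more.
ANCHORED = re.compile(r"^(?=(.*o){4})(?!(.*o){5,}).*")
# The same pattern without the anchor, as a plain search would use it.
UNANCHORED = re.compile(r"(?=(.*o){4})(?!(.*o){5,}).*")

lines = ["foo foo", "foo", "foo foo foo"]

# Only the first sample line has exactly four "o" characters.
exact = [line for line in lines if ANCHORED.search(line)]
print(exact)  # ['foo foo']

# Unanchored, the search starts partway into the six-"o" line
# and matches a suffix that contains exactly four "o".
m = UNANCHORED.search("foo foo foo")
print(repr(m.group(0)))  # ' foo foo'
```

With a line-oriented tool that prints any line containing a match, the unanchored form would therefore also print `foo foo foo`.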


っ左 2024-11-13 22:54:07

A Perl one-liner:

perl -ne 'print if(tr/o/o/ == 4)' foo_file
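
In Perl, `tr/o/o/` returns the number of characters it transliterated, so the one-liner counts single characters rather than multi-character substrings. A rough Python equivalent, as a sketch only (the helper name `lines_with_char_count` is hypothetical), could look like this:

```python
def lines_with_char_count(lines, char, n):
    """Return the lines in which `char` occurs exactly `n` times."""
    # For a single character, str.count is simply the number of
    # times that character appears in the line.
    return [line for line in lines if line.count(char) == n]

sample = ["foo foo", "foo", "foo foo foo"]
print(lines_with_char_count(sample, "o", 4))  # ['foo foo']
```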


失与倦＂ 2024-11-13 22:54:07

perl -lnwe '@c=$_=~/o/g;if(scalar(@c)==4){print $_}' file_to_parse


迷乱花海 2024-11-13 22:54:07

In awk...

awk '{ if (gsub(/o/, "o") == 4) print }' # lines that matched
awk '{ if (gsub(/o/, "o") != 4) print }' # lines that didn't

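If you're going to do this repeatedly with different patterns and match counts, you can pass both in as variables. Note that gsub treats its first argument as a regular expression even when it is a string, so using "&" (the matched text) as the replacement keeps each line unchanged whatever the pattern is:

awk -v pattern=o -v matches=4 '{ if (gsub(pattern, "&") == matches) print }'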
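
The same parameterised idea can be sketched in Python, assuming the pattern is meant as a regular expression. The function name `filter_by_match_count` and its `invert` flag are hypothetical, chosen here to mirror the "matched" and "didn't match" variants above.

```python
import re

def filter_by_match_count(lines, pattern, matches, invert=False):
    """Yield lines where the regex `pattern` matches exactly `matches` times.

    With invert=True, yield the lines that do NOT have exactly that many matches.
    """
    regex = re.compile(pattern)
    for line in lines:
        # finditer yields non-overlapping matches, scanning left to right.
        count = sum(1 for _ in regex.finditer(line))
        if (count == matches) != invert:
            yield line

sample = ["foo foo", "foo", "foo foo foo"]
print(list(filter_by_match_count(sample, "o", 4)))               # ['foo foo']
print(list(filter_by_match_count(sample, "o", 4, invert=True)))  # ['foo', 'foo foo foo']
print(list(filter_by_match_count(sample, "foo", 3)))             # ['foo foo foo']
```

If the pattern should be taken literally rather than as a regex, passing it through `re.escape` first avoids surprises with metacharacters.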


浊酒尽余欢 2024-11-13 22:54:07

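If you want to write code, you can build a DFA-based string matcher, or I would suggest looking at the Shift-Or string matching algorithm, which is easy to write. You then load the string into whatever data structure the algorithm requires. Read http://en.wikipedia.org/wiki/Shift_Or_Algorithm for details on the Shift-Or string matching algorithm.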
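
As a rough illustration of that suggestion, here is a minimal sketch of Shift-Or counting in Python. The language and the function name `shift_or_count` are assumptions, not something the answer specifies. It handles a literal pattern only and counts overlapping occurrences, which can differ from the non-overlapping counts that regex-based approaches give.

```python
def shift_or_count(text, pattern):
    """Count occurrences (overlapping allowed) of a literal pattern in text."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    all_ones = (1 << m) - 1
    # masks[c] has bit i cleared exactly when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)
    state = all_ones          # bit i == 0 means pattern[:i+1] ends here
    accept = 1 << (m - 1)     # bit that signals a full match
    count = 0
    for ch in text:
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        if state & accept == 0:
            count += 1
    return count

sample = ["foo foo", "foo", "foo foo foo"]
print([line for line in sample if shift_or_count(line, "o") == 4])  # ['foo foo']
print(shift_or_count("ooo", "oo"))  # 2, because overlapping matches are counted
```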
