Need regular expressions to match special cases
I'm desperately searching for regular expressions that match these scenarios:
1) Match alternating chars
I have a string like "This is my foobababababaf string" and I want to match "babababa".
The only thing I know is the length of the fragment to search for. I don't know what chars/digits they might be, but they are alternating.
I really have no clue where to start :(
2) Match combined groups
I have a string like "This is my foobaafoobaaaooo string" and I want to match "aaaooo". As in 1), I don't know what chars/digits they might be. I only know that they will appear in two groups.
I experimented with (.)\1\1\1(.)\1\1\1 and things like this...
I think something like this is what you want.
For alternating characters:
    (?=(.)(?!\1)(.))(?:\1\2){2,}
\0 will be the entire alternating sequence; \1 and \2 are the two (distinct) alternating characters.
For runs of N and M characters, possibly separated by other characters (replace N and M with numbers here):
    (?=(.))\1{N}.*?(?=(?!\1)(.))\2{M}
\0 will be the entire match, including the infix. \1 is the character repeated (at least) N times, and \2 is the character repeated (at least) M times.
The answer then gave a test harness in Java and its output, which are not preserved here.
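The following is not the answer's own harness, only a minimal sketch of how the alternating-character regex might be exercised with java.util.regex. The class name AlternatingDemo and the helper findAlternating are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative, not from the original answer.
final class AlternatingDemo {
    // The lookahead captures two distinct characters; the pair must then repeat at least twice.
    static final Pattern ALTERNATING = Pattern.compile("(?=(.)(?!\\1)(.))(?:\\1\\2){2,}");

    // Returns every non-overlapping alternating sequence that Matcher.find() reports.
    static List<String> findAlternating(String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = ALTERNATING.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        Matcher matcher = ALTERNATING.matcher("This is my foobababababaf string");
        while (matcher.find()) {
            // group() is \0, group(1) and group(2) are the two alternating characters.
            System.out.println(matcher.group() + " (\\1=" + matcher.group(1)
                    + ", \\2=" + matcher.group(2) + ")");
        }
    }
}
```

Because the quantifier is {2,} and greedy, the match covers the whole alternating stretch rather than a fixed length. If the fragment length is known, as in the question, the quantifier could presumably be replaced with an exact count.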
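Explanation
(?=(.)(?!\1)(.))(?:\1\2){2,} has two parts:
1) (?=(.)(?!\1)(.)) establishes \1 and \2 using lookahead. The nested negative lookahead (?!\1) ensures that \1 != \2, and capturing inside a lookahead lets \0 hold the entire match (instead of just the "tail" end).
2) (?:\1\2){2,} matches the \1\2 sequence, which must repeat at least twice.
(?=(.))\1{N}.*?(?=(?!\1)(.))\2{M} has three parts:
1) (?=(.))\1{N} captures \1 in a lookahead and then matches it N times. Because the capturing lookahead consumes nothing, the repetition count is N instead of N-1.
2) .*? allows an infix to separate the two runs; it is reluctant, to keep the infix as short as possible.
3) (?=(?!\1)(.))\2{M} works like the first part, and its negative lookahead (?!\1) ensures that \1 != \2.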
The run regex will also match longer runs, e.g. run(2,2) finds a match in "xxxyyy". Also, it does not allow overlapping matches. That is, there is only one run(2,3) in "xx11yyy222".
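To make the two observations above concrete, here is a small sketch in Java. It assumes that run(N, M) simply means the run regex with numbers substituted for N and M. RunDemo and findRuns are hypothetical names, and this is not the original test harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; RunDemo, run and findRuns are illustrative names.
final class RunDemo {
    // Substitutes concrete numbers for N and M in the run regex.
    static Pattern run(int n, int m) {
        return Pattern.compile("(?=(.))\\1{" + n + "}.*?(?=(?!\\1)(.))\\2{" + m + "}");
    }

    // Collects the successive, non-overlapping matches that Matcher.find() reports.
    static List<String> findRuns(int n, int m, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = run(n, m).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findRuns(3, 3, "foobaafoobaaaooo")); // the string from the question
        System.out.println(findRuns(2, 2, "xxxyyy"));           // runs longer than N and M
        System.out.println(findRuns(2, 3, "xx11yyy222"));       // overlapping candidates
    }
}
```

In the "xxxyyy" case the reported match is xxxyy rather than the full string: \1{2} takes exactly two x characters, the reluctant infix absorbs the third, and \2{2} stops after two y characters. In "xx11yyy222" the single match is xx11yyy, after which the remaining 222 cannot start a new match.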
Assuming that you use Perl/PCRE: (.{2})\1+ or ((.)(?!\2)(.))\1+. The second regex prevents matching things like oooo.
UPD: Then 2. will be ((.)\2{N}).*?((?!\2)(.)\4{M}). Remove (?!\2) if you want to get matches like oooaoooo. Replace N and M with n-1 and m-1, where n and m are the run lengths, since (.) already matches the first character of each run.
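The answer assumes Perl/PCRE, but these patterns use only backreferences and negative lookahead, which java.util.regex also supports. The sketch below tries them in Java under that assumption. BackrefDemo, findAll and runs are hypothetical names, and runs takes the run lengths n and m and performs the n-1 / m-1 substitution itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; all names are illustrative.
final class BackrefDemo {
    // Any two characters repeated: also accepts a single repeated character such as "oooo".
    static final Pattern ANY_PAIR = Pattern.compile("(.{2})\\1+");
    // Two different characters repeated: (?!\2) rejects pairs like "oo".
    static final Pattern DISTINCT_PAIR = Pattern.compile("((.)(?!\\2)(.))\\1+");

    // Two runs of lengths n and m; (.) consumes one character, so the counts are n-1 and m-1.
    static Pattern runs(int n, int m) {
        return Pattern.compile(
                "((.)\\2{" + (n - 1) + "}).*?((?!\\2)(.)\\4{" + (m - 1) + "})");
    }

    // Collects every non-overlapping match of the pattern in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll(ANY_PAIR, "oooo"));
        System.out.println(findAll(DISTINCT_PAIR, "oooo"));
        System.out.println(findAll(runs(3, 3), "foobaafoobaaaooo"));
    }
}
```

Unlike the lookahead-based version, this variant captures each run in its own group (1 and 3), with the repeated characters in groups 2 and 4.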
Well, this works for the first one...
Examples in JavaScript