Regex help: my regex pattern matches invalid strings
The text string I want to validate consists of what I call "segments". A single segment might look like this:

[A-Z,S,3]

So far I have managed to build this regex pattern:

(?:\[(?<segment>[^,\]\[}' ]+?,[S|D],\d{1})\])+?

It works, but it returns matches even when the whole text string contains invalid text. I guess I need to use ^ and $ somewhere in my pattern, but I can't figure out how.

I would like my pattern to produce the following results:

[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]  =>  OK (two segments)
[A-Z,S,3]aaaa[A-Za-z0-9åäöÅÄÖ,D,4]  =>  No match
crap[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]  =>  No match
[A-Z,S,3][]  =>  No match
[A-Z,S,3][klm,D,4][0-9,S,1]  =>  OK (three segments)
Comments (2)
Use ^ to anchor the start and $ to anchor the end. E.g.:

^(abc)*$

This matches zero or more repetitions of the group ("abc" in this example), and the match must start at the start of the input string and end at the end of it.

Applied to your pattern:

^(?:\[(?<segment>[^,\]\[}' ]+?,[S|D],\d{1})\])+$

Using an ungreedy +? doesn't matter, as you require it to match until the end anyway. However, your regex has a few issues. This seems more like what you want:

^(?:\[[^,]+,[SD],\d\])+$

[^,]+, will match any sequence of non-commas followed by a comma, and in fact you should probably add ] to this negated character class.

[S|D] is a character class of three characters (S, | and D), as | doesn't mean alternation inside a character class ((S|D) would mean the same as [SD], though).

{1} is the default for any atom; you don't need to specify it.

Pseudocode (run it at codepad.org):

The big difference here is that the expression matches only the complete [...] part, but it is applied in succession, so each match must start where the last one ended (or end at the end of the string).
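
The points above about anchoring and the [S|D] class can be checked with a small sketch. It is written in Python, not the asker's regex flavor, so the named group is spelled (?P<segment>...) instead of (?<segment>...). The names ORIGINAL, ANCHORED, SAMPLES and is_valid are illustrative, and the anchored pattern already includes the suggested ] in the negated class.

```python
import re

# The asker's pattern, with Python's (?P<name>...) named-group spelling.
ORIGINAL = re.compile(r"(?:\[(?P<segment>[^,\]\[}' ]+?,[S|D],\d{1})\])+?")

# Anchored and simplified; "]" is added to the negated class as suggested.
ANCHORED = re.compile(r"^(?:\[[^,\]]+,[SD],\d\])+$")

# The five test strings from the question.
SAMPLES = [
    "[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]",
    "[A-Z,S,3]aaaa[A-Za-z0-9åäöÅÄÖ,D,4]",
    "crap[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]",
    "[A-Z,S,3][]",
    "[A-Z,S,3][klm,D,4][0-9,S,1]",
]


def is_valid(text: str) -> bool:
    """True only if the whole string is one or more well-formed segments."""
    return ANCHORED.search(text) is not None


for sample in SAMPLES:
    # The unanchored pattern finds a segment somewhere in every sample,
    # which is why invalid strings still "match".
    unanchored = ORIGINAL.search(sample) is not None
    print(f"{sample!r}: unanchored={unanchored}, anchored={is_valid(sample)}")

# [S|D] is a three-character class, so a literal "|" is accepted too.
pipe_accepted = re.fullmatch(r"\[[^,\]]+,[S|D],\d\]", "[A-Z,|,3]") is not None
print("[S|D] accepts '|':", pipe_accepted)
```

Only the anchored pattern distinguishes the valid strings from the invalid ones; the unanchored one succeeds as soon as it finds a single segment anywhere in the input.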
You want something like this:
Here is an example of how you could use this regular expression in C#:
Output:
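
As a rough illustration of validating a string and pulling out its segments in one pass, here is a minimal sketch. It is in Python rather than C# and is not the code from this answer. The function name parse_segments and the group names chars, flag and digit are assumptions made for the example; the thread does not say what the three fields mean. In .NET the named group would be written (?<name>...), and the repeated matches of a group are available through Group.Captures.

```python
import re

# One segment, with each field in its own named group (Python syntax).
SEGMENT = re.compile(r"\[(?P<chars>[^,\]]+),(?P<flag>[SD]),(?P<digit>\d)\]")


def parse_segments(text: str):
    """Return a list of (chars, flag, digit) tuples, or None if text is invalid.

    The segment pattern is applied in succession: every match has to start
    exactly where the previous one ended, and the last one has to end at the
    end of the string.
    """
    segments = []
    pos = 0
    for match in SEGMENT.finditer(text):
        if match.start() != pos:
            return None  # junk before or between segments
        segments.append((match["chars"], match["flag"], int(match["digit"])))
        pos = match.end()
    if pos != len(text) or not segments:
        return None  # trailing junk, or no segment at all
    return segments


print(parse_segments("[A-Z,S,3][klm,D,4][0-9,S,1]"))
print(parse_segments("[A-Z,S,3]aaaa[klm,D,4]"))
```

Checking positions by hand stands in for the ^ and $ anchors, which makes it possible to collect every segment's fields while still rejecting strings that contain anything other than segments.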