Grouped regex to match line that *sometimes* begins with whitespace?

Posted on 2024-10-13 08:25:58

RegEx flavor: wxRegEx.

I am trying to create a "grouped" regex that matches a string that sometimes begins with a whitespace. When it doesn't begin with a whitespace, it begins with the target group (second parenthesized expression in the following sample). It is a relatively simple line made of a few predictable tokens and one portion of arbitrary text, e.g.

"good: Sed ut perspiciatis unde omnis iste natus error "
"better: Sit voluptatem accusantium doloremque laudantium "
"best: Nemo enim ipsam voluptatem quia voluptas "
" ok: Sit voluptatem accusantium doloremque laudantium "

Note: The quoted characters are not part of my input. By introducing the quotes in my posting I am trying to make the boundaries of each line/string clearer.

The regex that I came up with to match the above in a "grouped" manner (i.e. that I can address each group separately for further processing) is:

(^\s*)(good|better|best|ok)(: )(.*)( $)

Note: \s is wxRegEx's class-shorthand escape for [[:space:]].

The problem is that this regex works only when the line actually begins with a space. Why? Doesn't the '*' right after '\s' mean "0 or more occurrences of \s"?

I know I am missing something fundamental here, but what is it?

Comments (4)

反话 2024-10-20 08:25:58

Have you tried this with (^ *) instead of (^\s*)? Is it possible you're wrong about the \s syntax? I don't know wxRegEx myself.

半世晨晓 2024-10-20 08:25:58

I'm not familiar with wxRegEx, but if it is PCRE, I think you may want (^\s*)?(good|...

The '?' modifies the entire zero-or-more capture to make it zero-or-one.

離殇 2024-10-20 08:25:58

That's weird.. you are right that * should match 0 or more occurrences... Does moving the caret (^) outside the group make any difference?
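
One way to sanity-check the caret placement is to try both forms in another engine. The sketch below uses Python's `re` module, which is not wxRegEx, so it only shows what a typical backtracking engine does with the pattern; the names `LINES`, `INSIDE`, `OUTSIDE` and `groups` are made up for this illustration.

```python
import re

# Sample lines from the question, without the surrounding quotes.
LINES = [
    "good: Sed ut perspiciatis unde omnis iste natus error ",
    "better: Sit voluptatem accusantium doloremque laudantium ",
    "best: Nemo enim ipsam voluptatem quia voluptas ",
    " ok: Sit voluptatem accusantium doloremque laudantium ",
]

# Caret inside the first group (as posted) and outside it (as suggested).
INSIDE = re.compile(r"(^\s*)(good|better|best|ok)(: )(.*)( $)")
OUTSIDE = re.compile(r"^(\s*)(good|better|best|ok)(: )(.*)( $)")


def groups(pattern, line):
    """Return the tuple of captured groups, or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


for line in LINES:
    # Print whether both placements agree, followed by the captured groups.
    print(groups(INSIDE, line) == groups(OUTSIDE, line), groups(OUTSIDE, line))
```

In Python the two placements capture the same groups for all four lines, and the first group is simply empty when there is no leading whitespace. If wxRegEx behaves differently, the cause is more likely the flavor or the flags in use than the position of the caret.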

小瓶盖 2024-10-20 08:25:58

I see no obvious error in your regex. Your interpretation of the * is also correct, of course. Do you maybe have some actual spaces in your expression? The space ( like -> <- ) has no special meaning in regex and the engine will try to match it. If your first capturing group looked like (^ \s*) this would have the effect you describe.
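
The stray-space hypothesis is easy to reproduce. This is a minimal sketch in Python's `re` module (an assumption, since the question is about wxRegEx); `INTENDED`, `STRAY_SPACE` and `matching_keywords` are hypothetical names used only here.

```python
import re

# Sample lines from the question, without the surrounding quotes.
LINES = [
    "good: Sed ut perspiciatis unde omnis iste natus error ",
    "better: Sit voluptatem accusantium doloremque laudantium ",
    "best: Nemo enim ipsam voluptatem quia voluptas ",
    " ok: Sit voluptatem accusantium doloremque laudantium ",
]

# The pattern as posted, and a variant with an accidental literal space after ^.
INTENDED = re.compile(r"(^\s*)(good|better|best|ok)(: )(.*)( $)")
STRAY_SPACE = re.compile(r"(^ \s*)(good|better|best|ok)(: )(.*)( $)")


def matching_keywords(pattern):
    """Return the keyword (group 2) of every sample line the pattern matches."""
    return [m.group(2) for m in map(pattern.match, LINES) if m]


print(matching_keywords(INTENDED))     # all four lines match
print(matching_keywords(STRAY_SPACE))  # only the line with a leading space matches
```

With the extra literal space, the first group requires at least one space, so only the " ok: ..." line matches, which is the symptom described in the question.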
