AS3 RegExp to match words that contain boundary-type characters

Posted 2024-07-18 04:48:23

I want to match a list of words, which is easy enough when those words are truly words. For example, /\b (pop|push) \b/gsx, when run against the string

pop gave the door a push but it popped back

will match the words pop and push but not popped.

I need similar functionality for words that contain characters that would normally count as word boundaries. So I need /\b (reverse!|push) \b/gsx, when run against the string

push reverse! reverse!push

to match only reverse! and push but not reverse!push. Obviously this regex isn't going to do that, so what do I need to use instead of \b to make my regex smart enough to handle these funky requirements?
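To make the failure concrete, here is a minimal sketch of what the two \b patterns actually match. It is written in Python rather than AS3, on the assumption that \b and \w behave the same way in both engines for this ASCII input; the variable names are illustrative only.

```python
import re

# Plain words: \b on both sides works as expected.
words = re.compile(r"\b(pop|push)\b")
word_hits = words.findall("pop gave the door a push but it popped back")

# A term ending in "!": \b needs a word character on exactly one side,
# so it fails between "!" and a space but succeeds between "!" and "p".
naive = re.compile(r"\b(reverse!|push)\b")
text = "push reverse! reverse!push"
naive_hits = [(m.group(1), m.start()) for m in naive.finditer(text)]

print(word_hits)
print(naive_hits)
```

In this sketch the standalone reverse! at offset 5 is missed, while both halves of reverse!push (offsets 14 and 22) are matched, which is the opposite of what is wanted.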

Comments (3)

稳稳的幸福 2024-07-25 04:48:23

At the end of a word, \b means "the previous character was a word character, and the next character (if there is a next character) is not a word character". You want to drop the first condition because there might be a non-word character at the end of the "word". That leaves you with a negative lookahead:

/\b (reverse!|push) (?!\w)/gx

I'm pretty sure AS3 regexes support lookahead.
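A quick way to check the lookahead version is to run it against the sample string from the question. This is a sketch in Python, not AS3, assuming \b, \w and (?!...) behave the same in both engines for this input; the names are illustrative.

```python
import re

text = "push reverse! reverse!push"

# Leading \b is kept; the trailing \b becomes "not followed by a word character".
lookahead = re.compile(r"\b(reverse!|push)(?!\w)")
lookahead_hits = [(m.group(1), m.start()) for m in lookahead.finditer(text)]

print(lookahead_hits)
```

Here the standalone reverse! is now found and the reverse! inside reverse!push is rejected. The trailing push of reverse!push still matches, though, because the leading \b holds between "!" and "p"; excluding it would need a stricter condition on the left side as well.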

氛圍 2024-07-25 04:48:23

Your first problem is that you need three (possibly four) cases in your alternation, not two.

  • /\breverse!(?:\s|$)/ reverse! by itself
  • /\bpush\b/ push by itself
  • /\breverse!push\b/ the two together
  • /\bpushreverse!(?:\s|$)/ the possible fourth case

Your second problem is that a \b won't match between a "!" and a following space or the end of the string, because "!" is not a \w. Here is what Perl 5 has to say about \b; you may want to consult your docs to see if they agree:

A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W". (Within character classes "\b" represents backspace rather than a word boundary, just as it normally does in any double-quoted string.)

So, the regex that you need is something like

/ \b ( reverse!push | reverse! | push ) (?: \s | \b | $ )+ /gx;

I left out the /s because there are no periods in this regex, so treating the string as a single line makes no difference. If /s doesn't mean "treat as a single line" in your engine, you should probably add it back. Also, you should read up on how your engine handles alternation. I know that in Perl 5, to get the right behaviour, you must arrange the items this way (otherwise reverse! would always win over reverse!push).
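The point about alternation order can be seen in isolation. The sketch below uses Python's re module, which, like Perl 5, tries alternatives left to right and keeps the first one that succeeds; whether AS3 does the same is an assumption to verify against its documentation.

```python
import re

text = "reverse!push"

# Shorter alternative listed first: it wins, and "push" is matched separately.
short_first = re.compile(r"reverse!|reverse!push|push")
# Longer alternative listed first: the whole token is matched as one unit.
long_first = re.compile(r"reverse!push|reverse!|push")

short_first_hits = short_first.findall(text)
long_first_hits = long_first.findall(text)

print(short_first_hits)
print(long_first_hits)
```

With the shorter term first, the engine never gets the chance to try reverse!push, so listing longer terms before their prefixes is the safer ordering.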

潦草背影 2024-07-25 04:48:23

You can replace \b with something similar but less strict:

/(?<=\s|^)(reverse!|push)(?=\s|$)/g

This way the limiting factor of \b (that it can only match directly before or after an actual \w word character) is removed.

Now white space or the start/end of the string serves as a valid separator, and the inner expression can easily be built at run time, for example from a list of search terms.
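Building the inner expression from a list might look like the following sketch. It is Python, not AS3, and build_term_regex is a hypothetical helper name. Two details are adaptations rather than part of the answer: each term is escaped so characters such as "!" or "." are taken literally, and the lookbehind is spelled (?:(?<=\s)|^) because Python's re module only accepts fixed-width lookbehind.

```python
import re

def build_term_regex(terms):
    """Compile a regex matching any term delimited by whitespace or string ends."""
    # Longest terms first, so a term is tried before any shorter term that is its prefix.
    ordered = sorted(terms, key=len, reverse=True)
    inner = "|".join(re.escape(term) for term in ordered)
    # Left side: preceded by whitespace, or at the start of the string.
    # Right side: followed by whitespace, or at the end of the string.
    return re.compile(r"(?:(?<=\s)|^)(" + inner + r")(?=\s|$)")

pattern = build_term_regex(["reverse!", "push"])
hits = [(m.group(1), m.start()) for m in pattern.finditer("push reverse! reverse!push")]

print(hits)
```

On the sample string this finds only the standalone push and reverse!, and nothing inside reverse!push, since neither half of that token has whitespace or a string end on both sides.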
