How do I represent epsilon in a regular expression?

Posted on 2024-09-19 20:08:32

The text book teaches us to write regular expressions using the epsilon (ε) symbol, but how can I translate that symbol directly to code without having to completely rework my regular expression?

For instance, how would I write this regex, which should match all lowercase strings that begin or end with "a" (or both)?

Not 100% sure this is correct but...

((a|epsilon)[a-z]*a) | (a[a-z]*(a|epsilon))

So some strings that should match include:

a //single "a" starts or ends with "a"

aa //starts and ends with "a"

ab //starts with "a"

ba //ends with "a"

aba //starts and ends with "a"

aaaaaaaa //starts and ends with "a"

abbbbbbb //starts with "a"

bbbbbbba //ends with "a"

abbbbbba //starts and ends with "a"

asdfhgdu //starts with "a"

onoineca //ends with "a"

ahnrtyna //starts and ends with "a"

I only want to exchange epsilon for the correct symbol; I do not want to modify any other part of the expression. Also, to be clear, I am not actually checking for the epsilon symbol itself: I want a choice between a character and nothing (well, not nothing... epsilon).

Does such a symbol exist?

Is what I want possible?

Comments (1)

旧时光的容颜 2024-09-26 20:08:39

Just omit the ε, since it denotes the empty string:

([1-9]|)[0-9]*

There’s also a shortcut for this particular case:

([1-9]?)[0-9]*

The ? means zero or one occurrences of the preceding token.
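
Applied to the expression from the question, both spellings can be checked side by side. The following is only a minimal sketch, assuming Python's `re` module (the question does not name a language); the names `EMPTY_ALT`, `OPTIONAL`, and `matches` are made up for illustration. The spaces around the `|` in the question are dropped, because a space inside a pattern would be matched literally. Other regex dialects may treat an empty alternative differently, so `?` is usually the safer spelling.

```python
import re

# The question's expression with each "epsilon" replaced by an empty alternative.
EMPTY_ALT = re.compile(r"((a|)[a-z]*a)|(a[a-z]*(a|))")

# The same expression written with the `?` shortcut.
OPTIONAL = re.compile(r"(a?[a-z]*a)|(a[a-z]*a?)")


def matches(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


# Examples taken from the question, plus a few strings that should be rejected.
should_match = ["a", "aa", "ab", "ba", "aba", "abbbbbbb",
                "bbbbbbba", "asdfhgdu", "onoineca", "ahnrtyna"]
should_not_match = ["", "b", "bcd", "Ab"]

for s in should_match + should_not_match:
    # Both columns should agree for every string.
    print(repr(s), matches(EMPTY_ALT, s), matches(OPTIONAL, s))
```

`fullmatch` is used so that the pattern has to cover the entire string; with `search`, a string such as "bab" would be accepted because of the "a" in the middle.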
