How do I represent epsilon in a regular expression?
The textbook teaches us to write regular expressions using the epsilon (ε) symbol, but how can I translate that symbol directly to code without having to completely rework my regular expression?
For instance, how would I write this regex, which should match all lowercase strings that either begin or end in a (or both)?
I'm not 100% sure this is correct, but...
((a|epsilon)[a-z]*a) | (a[a-z]*(a|epsilon))
So some strings that should match include:
a //single "a" starts or ends with "a"
aa //starts and ends with "a"
ab //starts with "a"
ba //ends with "a"
aba //starts and ends with "a"
aaaaaaaa //starts and ends with "a"
abbbbbbb //starts with "a"
bbbbbbba //ends with "a"
abbbbbba //starts and ends with "a"
asdfhgdu //starts with "a"
onoineca //ends with "a"
ahnrtyna //starts and ends with "a"
I only want to exchange epsilon for the correct symbol; I do not want to modify any other part of the expression. Also, to be clear, I am not actually checking for the epsilon symbol itself: I want a choice between a character and nothing (well, not nothing... epsilon).
Does such a symbol exist?
Is what I want possible?
Comments (1)
Just omit the ε, since it denotes the empty string:
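The answer's original snippet did not survive on this page, so the following is only a sketch of what omitting ε could look like, assuming Python's re module (the question does not name a language). The names PATTERN and matches are made up for illustration, and the spaces around the top-level | are dropped because a regex engine would treat them as literal characters.

```python
import re

# Epsilon becomes an empty alternative: (a|) matches "a" or the empty string.
# Everything else in the asker's expression is left unchanged.
PATTERN = re.compile(r"((a|)[a-z]*a)|(a[a-z]*(a|))")


def matches(s: str) -> bool:
    """Return True if the whole string s is accepted by PATTERN."""
    return PATTERN.fullmatch(s) is not None


if __name__ == "__main__":
    # A few of the sample strings from the question, plus one that should fail.
    for s in ["a", "aa", "ab", "ba", "aba", "asdfhgdu", "onoineca", "bcd"]:
        print(s, matches(s))
```

fullmatch is used rather than search so that the pattern has to cover the entire string, which is how a textbook regular expression is normally read.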
There’s also a shortcut for this particular case:
The ? means zero or one occurrences of the preceding token.
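The shortcut snippet is also missing from this page. As a hedged sketch, again assuming Python's re module, the optional "a" can be written as a? in place of the empty alternative (a|). The names EMPTY_ALT, OPTIONAL and same_result are hypothetical and only serve to compare the two spellings.

```python
import re

# Epsilon written as an empty alternative.
EMPTY_ALT = re.compile(r"((a|)[a-z]*a)|(a[a-z]*(a|))")
# The same expression with the ? quantifier: a? means zero or one "a".
OPTIONAL = re.compile(r"(a?[a-z]*a)|(a[a-z]*a?)")


def same_result(s: str) -> bool:
    """Return True if both spellings agree on whether s matches in full."""
    return bool(EMPTY_ALT.fullmatch(s)) == bool(OPTIONAL.fullmatch(s))


if __name__ == "__main__":
    for s in ["a", "ab", "ba", "abbbbbba", "bcd", ""]:
        print(repr(s), bool(OPTIONAL.fullmatch(s)), same_result(s))
```

Both forms accept the same strings here; the ? version is just shorter to read.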