How do I neatly match "x" and "[x]" with a regular expression without repetition?

Posted 2024-11-17 09:08:08

I'm writing a Perl regex to match both the strings x bla and [x] bla. One alternative is /(?:x|\[x\]) bla/. This isn't desirable, because in the real world, x is more complicated, so I want to avoid repeating it.

The best solution so far is putting x in a variable and pre-compiling the regex:

my $x = 'x';
my $re = qr/(?:$x|\[$x\]) bla/o;

Is there a neater solution? In this case, readability is more important than performance.

Comments (5)

守护在此方 2024-11-24 09:08:08

It's possible, but not all that clean. You can use the fact that conditional subpatterns support tests such as (?(N)) to check that the Nth capturing subpattern successfully matched. So you can use an expression such as /(\[)?X(?(1)\])/ to match '[X]' or 'X'.
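As a minimal sketch of the conditional-subpattern idea applied to the question's " bla" suffix: here `\d+` is only an assumed stand-in for the more complicated x, and the `\A`/`\z` anchors and the helper name `matches_bla` are additions for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Assumption: \d+ stands in for the asker's more complicated "x".
my $x = qr/\d+/;

my $re = qr/
    \A
    (\[)?        # group 1: optional opening bracket
    $x           # the complicated part, written once
    (?(1) \] )   # closing bracket required only if group 1 matched
    [ ]bla       # literal space, then "bla"
    \z
/x;

# Hypothetical helper so the pattern can be exercised from other code.
sub matches_bla { return $_[0] =~ $re ? 1 : 0 }

for my $s ('42 bla', '[42] bla', '[42 bla', '42] bla') {
    printf "%-10s %s\n", $s, matches_bla($s) ? 'match' : 'no match';
}
```

With the anchors in place, only the first two strings should match. The `/x` flag lets each piece carry a comment, which may help given that readability matters more than performance here.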

嗫嚅 2024-11-24 09:08:08

You can pre-compile $x as well. This also makes errors a little more obvious if $x is really ?(+[*{ or something else that a regex compiler will completely freak out on.

my $x = qr/x/;
my $re = qr/(?:$x|\[$x\]) bla/o;
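
To illustrate the point about errors surfacing earlier, here is a rough sketch in which a pattern string is compiled on its own before being embedded in the larger regex. The helper name `compile_x` is hypothetical, and wrapping the compilation in `eval` is just one way to catch the failure.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a pattern string by itself. Returns the
# qr object, or undef (with the compiler's message left in $@) when the
# string is not a valid regex.
sub compile_x {
    my ($src) = @_;
    my $compiled = eval { qr/$src/ };
    return $compiled;
}

# The broken example from the answer fails here, before it is ever
# embedded in the larger pattern.
my $bad = compile_x('?(+[*{');
print "rejected: $@" unless defined $bad;

# A valid x compiles once and is then reused in both alternatives.
my $x  = compile_x('x') or die "unexpected failure: $@";
my $re = qr/(?:$x|\[$x\]) bla/;
print "ok\n" if '[x] bla' =~ $re;
```

Because the failure is reported against the small pattern alone, the message points at x itself rather than at the combined expression.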

半衾梦 2024-11-24 09:08:08

There isn't a neater solution really, because this is where we leave the domain of regular languages and start requiring a more complex automaton with some kind of memory. (Backrefs would do it, except that the backref expands to a literal match against a preceding part of the string, not to “this, but only if that was matched”.)

Sometimes, it's possible to instead use a two step process, replacing a complex X with a single character known to not be present in the source text (control characters can be suitable for that) so allowing a simpler second-stage match. Not always possible though; depends on what you're matching.
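
The two-step idea could look roughly like this. It is only a sketch under stated assumptions: `\d+(?:\.\d+)?` stands in for the complex X, the control character `\x01` is assumed never to occur in the input, and the helper name `matches_bla` is made up for the example.

```perl
use strict;
use warnings;

# Assumptions: this pattern stands in for the complex X, and the control
# character \x01 never appears in the source text.
my $x      = qr/\d+(?:\.\d+)?/;
my $marker = "\x01";

sub matches_bla {
    my ($text) = @_;

    # Stage 1: collapse every occurrence of X into the one-character marker.
    (my $reduced = $text) =~ s/$x/$marker/g;

    # Stage 2: X is now a single character, so writing it twice is cheap.
    return $reduced =~ /^(?:$marker|\[$marker\]) bla$/ ? 1 : 0;
}

for my $s ('3.14 bla', '[3.14] bla', '[3.14 bla') {
    printf "%-12s %s\n", $s, matches_bla($s) ? 'match' : 'no match';
}
```

The complicated pattern appears only in the first stage. The cost is an extra pass over the text and the need to guarantee that the marker really is absent from the input.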

撩人痒 2024-11-24 09:08:08

You can write something like (\[)?x(??{ defined $1 ? "]" : "" }) but you probably shouldn't.

可可 2024-11-24 09:08:08

I tested the /(\[)?X(?(1)\])/ solution (which garnered a score of 7), and it also matched [X and X], which are incorrect. The original poster's /(?:$x|\[$x\]) bla/ actually works, requiring either matched brackets or none.
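
The behaviour described here can be reproduced with a short check. The anchored variant is an added assumption for comparison, not something proposed in the thread, and the helper names are hypothetical.

```perl
use strict;
use warnings;

my $loose  = qr/(\[)?X(?(1)\])/;     # the conditional pattern, unanchored
my $strict = qr/^(\[)?X(?(1)\])$/;   # same pattern with anchors added

# Hypothetical helpers wrapping each pattern.
sub loose_match  { return $_[0] =~ $loose  ? 1 : 0 }
sub strict_match { return $_[0] =~ $strict ? 1 : 0 }

for my $s ('X', '[X]', '[X', 'X]') {
    printf "%-4s loose=%d strict=%d\n", $s, loose_match($s), strict_match($s);
}
```

Without anchors, the regex engine is free to match just the inner `X` of `[X` or `X]`, which is why those strings succeed. The same caveat applies to any unanchored pattern: `/(?:x|\[x\]) bla/` also matches `[x bla`, because the substring `x bla` satisfies it.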
