Match order in PCRE
How can I set the order in which things are matched in a PCRE regular expression?
I have a dynamic, user-supplied regular expression that is used to extract two values from a string and store them in two strings. However, there are cases where the two values appear in the string in reverse order, so the first (\w+) or whatever needs to be stored in the second string.
Comments (2)
You can extract the strings by name using named groups, and then get the values by looking up each name in the match result.
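As a rough illustration of looking values up by name instead of by position, here is a minimal Python sketch. Python's re module is not PCRE, but PCRE also accepts the (?P<name>...) syntax used here. The group names first and second, the helper extract_pair, and the sample patterns are assumptions made for this example, not part of the original answer.

```python
import re

def extract_pair(pattern, text):
    """Return (first, second) using group names, not group positions."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # The user-supplied pattern decides where each name appears in the text.
    return m.group("first"), m.group("second")

# Hypothetical user-supplied patterns: same names, opposite order.
normal = r"(?P<first>\w+)=(?P<second>\w+)"
swapped = r"(?P<second>\w+)=(?P<first>\w+)"

print(extract_pair(normal, "key=value"))
print(extract_pair(swapped, "value=key"))
```

Because the calling code only asks for the names, the person writing the pattern can place the two groups in whichever order the input requires.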
If you're matching both parts with the same subpattern (like \w+), you're out of luck. But if the subpatterns are distinctively different, you have a few options, none of them very pretty. One of them is a regex that uses a conditional construct to match the src and type attributes of an HTML script element in either order.
(DISCLAIMER: Such a regex makes many unrealistic assumptions, chief among them that both attributes will be present and that they'll be adjacent to each other. I'm only using it to illustrate the technique.)
If the src attribute appears first, the src and type values will be captured in the first and second groups respectively. Otherwise, they'll appear in the fourth and third groups respectively. Named groups would make it easier to keep track of things, especially if you could use the same name in more than one place like you can in .NET regexes. Unfortunately, by default PCRE requires every named group to have a unique name; duplicate names are accepted only when the PCRE_DUPNAMES option (or an inline (?J) setting) is enabled. That's a very nice feature.
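The conditional technique described above can be sketched as follows. This is not the answer's original pattern, only a reconstruction under assumptions that produces the same group layout (src and type in groups 1 and 2, or in groups 4 and 3). It uses Python's re module as a stand-in, since it shares the (?(1)yes|no) conditional syntax with PCRE. The names SCRIPT_ATTRS and script_attrs are hypothetical, and the same unrealistic assumptions apply: both attributes are present, adjacent, and double-quoted.

```python
import re

SCRIPT_ATTRS = re.compile(
    r'<script\s+'
    r'(?:src="([^"]*)")?'               # group 1: src, if it comes first
    r'(?(1)\s+type="([^"]*)"'           # group 2: type, when group 1 matched
    r'|type="([^"]*)"\s+src="([^"]*)")' # groups 3 and 4: type, then src
)

def script_attrs(tag):
    """Return (src, type) regardless of the attribute order."""
    m = SCRIPT_ATTRS.search(tag)
    if m is None:
        return None
    if m.group(1) is not None:
        # src appeared first: values are in groups 1 and 2.
        return m.group(1), m.group(2)
    # type appeared first: src is in group 4, type in group 3.
    return m.group(4), m.group(3)

print(script_attrs('<script src="app.js" type="text/javascript"></script>'))
print(script_attrs('<script type="module" src="main.js"></script>'))
```

The caller still has to check which set of groups participated in the match, which is the bookkeeping that reusable group names would remove.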