Perl - Regex - Position of first nonmatching character
I want to find the position in a string where a regular expression stops matching.
Simple example:
my $x = 'abcdefghijklmnopqrstuvwxyz';
$x =~ /gho/;
This example should give me the position of the character 'h', because 'h' matches and 'o' is the first nonmatching character.
I thought of using pos or $- but they are not set on an unsuccessful match.
Another solution would be to iteratively shorten the regex pattern until it matches, but that's very ugly and doesn't work on complex patterns.
EDIT:
Okay, for the linguists: I'm sorry for my awful explanation.
To clarify my situation: if you think of a regular expression as a finite automaton, there is a point where the testing stops because a character doesn't fit. This point is what I'm searching for.
Using iterative parentheses (as mentioned by eugene y) is a nice idea, but it doesn't work with quantifiers and I had to edit the pattern.
Are there other ideas?
Comments (5)
What you are proposing is difficult but doable.
If I can paraphrase what I understand, you want to find out how far a failing match got before it failed. In order to do this, you need to be able to parse a regex.
The best regex parser is probably Perl itself, run with the -Mre=debug command line switch. You can shell out to that Perl command line with your regex and parse what it prints (the debug trace is written to stderr, not stdout). Look for the `
Here is a matching regex:
You will need to build a parser that can handle the output of the Perl re debugger. The left-hand and right-hand angle brackets show the distance into the string as the regex engine is trying to match.
This is not an easy project, by the way...
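A rough sketch of the shell-out step, under stated assumptions: the helper names regex_debug_trace and furthest_offset are made up for illustration, and the trace format is not a stable interface. The second helper assumes that execution lines start with a string offset followed by angle-bracketed text, which may differ between Perl versions, and the optimizer can reject a match before any such line is printed.

```perl
use strict;
use warnings;
use IPC::Open3;

# Run a child perl with -Mre=debug and return every line it prints.
# The third open3 argument is undef, so the child's STDERR (where the
# debug trace is written) arrives on the same handle as its STDOUT.
sub regex_debug_trace {
    my ($string, $pattern) = @_;
    my $code = 'my ($s, $p) = @ARGV; $s =~ /$p/;';
    my $pid  = open3(my $in, my $out, undef,
                     $^X, '-Mre=debug', '-e', $code, '--', $string, $pattern);
    close $in;
    my @lines = <$out>;
    waitpid($pid, 0);
    return @lines;
}

# ASSUMPTION: execution lines look like "  6 <abcdef> <ghijkl> | ...",
# where the leading number is the offset the engine is currently trying.
# Returns the largest such offset, or undef if no line was recognised.
sub furthest_offset {
    my @offsets = map { /^\s*(\d+)\s+</ ? $1 : () } @_;
    my ($max) = sort { $b <=> $a } @offsets;
    return $max;
}

my @trace = regex_debug_trace('abcdefghijklmnopqrstuvwxyz', 'g[h]o');
my $far   = furthest_offset(@trace);
print defined $far ? "furthest offset tried: $far\n"
                   : "no execution lines recognised\n";
```

Passing the string and pattern through @ARGV avoids quoting them into the -e code, and the list form of open3 avoids the shell altogether.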
You can get the matching part, and use the index function to find its position:
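A minimal sketch of the index idea, assuming the pattern /gho/ is rewritten by hand with nested optional groups so that a partial match still succeeds. The helper name partial_match is hypothetical.

```perl
use strict;
use warnings;

# Returns (matched text, offset of its last character), or an empty list.
sub partial_match {
    my ($string) = @_;
    # /gho/ rewritten by hand: after 'g', each further piece is optional.
    return unless $string =~ /(g(?:h(?:o)?)?)/;
    my $matched = $1;
    my $start   = index($string, $matched);
    return ($matched, $start + length($matched) - 1);
}

my $x = 'abcdefghijklmnopqrstuvwxyz';
my ($matched, $last) = partial_match($x);
print "matched '$matched', last matching character at offset $last\n";
# prints: matched 'gh', last matching character at offset 7
```

As the question notes, this requires editing the pattern and does not carry over cleanly to patterns with quantifiers.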
This seems to work. Basically the idea is to split the regex into its constituent parts and try them sequentially, returning the last matching position. The fixed strings need to be split up, but the character classes and quantifiers can be kept together.
In theory this should work, but it may need tweaking.
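A minimal sketch of that approach, assuming the pattern has already been split into pieces by hand (splitting an arbitrary regex automatically is the hard part and is not attempted here). The function name last_matching_pos is hypothetical.

```perl
use strict;
use warnings;

# Try ever longer prefixes of the pattern and remember where the last
# successful one ended. Returns the offset of the last matched
# character, or undef if not even the first piece matches.
sub last_matching_pos {
    my ($string, @pieces) = @_;
    my $prefix = '';
    my $last;
    for my $piece (@pieces) {
        $prefix .= $piece;
        last unless $string =~ /$prefix/;
        $last = $+[0] - 1;    # $+[0] is the offset just past the match
    }
    return $last;
}

my $x   = 'abcdefghijklmnopqrstuvwxyz';
my $pos = last_matching_pos($x, qw(g h o));
print "last matching character '", substr($x, $pos, 1), "' at offset $pos\n";
# prints: last matching character 'h' at offset 7
```

A piece that can match the empty string, such as one ending in *, would need extra care, since the match can then end before any character is consumed.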
How about:
output:
I think that's exactly what the pos function is for. NOTE: pos only works if you use the /g flag.
Gives the following output
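As a sketch of how pos behaves here (the helper name end_of_match is made up for illustration): after a successful scalar-context match with /g, pos holds the offset just past the match; after a failed one it is reset to undef, which is the limitation the question describes.

```perl
use strict;
use warnings;

# Returns the offset just past a successful /g match, or undef on failure.
sub end_of_match {
    my ($string, $re) = @_;
    return $string =~ /$re/g ? pos($string) : undef;
}

my $x   = 'abcdefghijklmnopqrstuvwxyz';
my $end = end_of_match($x, qr/gh/);
print "match ended at offset $end, last matched character at ", $end - 1, "\n";
# prints: match ended at offset 8, last matched character at 7

print defined(end_of_match($x, qr/gho/)) ? "matched\n" : "no match, pos is undef\n";
# prints: no match, pos is undef
```

So pos reports where a successful match ended, but on its own it says nothing about how far a failing match got.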