RegExp interface is unclear
Something very strange.
var body = "Received: from ([195.000.000.0])\r\nReceived: from ([77.000.000.000]) by (6.0.000.000)";
var lastMath = "";
var subExp = "[\\[\\(](\\d+\\.\\d+\\.\\d+\\.\\d+)[\\]\\)]";
var re = new RegExp("Received\\: from.*?" + subExp + ".*", "mg");
var re1 = new RegExp(subExp, "mg");
while (ares = re.exec(body))
{
    print(ares[0]);
    while (ares1 = re1.exec(ares[0]))
    {
        if (!IsLocalIP(ares1[1]))
        {
            print(ares1[1]);
            lastMath = ares1[1];
            break;
        }
    }
}
print(lastMath);
It outputs:
Received: from ([195.000.000.0])
195.000.000.0
Received: from ([77.000.000.000]) by (6.0.000.000)
6.0.000.000
6.0.000.000
But I think it should be:
Received: from ([195.000.000.0])
195.000.000.0
Received: from ([77.000.000.000]) by (6.0.000.000)
77.000.000.000
77.000.000.000
Because "77.000.000.000" obviously comes first. If I comment out the "break", the output is correct.
What's wrong with my code?
Note that regex capture groups in JavaScript (and most languages) do not behave in a very obvious way with the * or + operators: when a group is repeated, you get the last group that matches and that's it. I'm not sure where this behavior is specified, but it can vary from language to language.
edit: What does IsLocalIP() do?
OK, I think the problem has to do with exec's statefulness (which may be why I don't use it; I use String.match()). If you're going to do this, you need to manually reset the regex's lastIndex property to 0 before reusing it on a new string; otherwise exec resumes from wherever the previous match ended.
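The statefulness can be seen in isolation with a small sketch. The helper name firstIp, the sample line, and the use of console.log (the question uses JSDB's print) are assumptions made for illustration, not code from the original answer:

```javascript
// One global regex shared across calls; the "g" flag makes exec() honor lastIndex.
var ipRe = /[\[(](\d+\.\d+\.\d+\.\d+)[\])]/g;

// Hypothetical helper: looks like it should return the first address every time.
function firstIp(text) {
    var m = ipRe.exec(text); // resumes at ipRe.lastIndex, not at 0
    return m ? m[1] : null;
}

var line = "from ([77.000.000.000]) by (6.0.000.000)";

// Same function, same input, three calls in a row.
var results = [firstIp(line), firstIp(line), firstIp(line)];
console.log(results);
```

The first call returns the first address, the second call continues past it and returns the second address, and the third call finds nothing, returns null, and resets lastIndex to 0.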
You'll note that the same function gets three different results on three calls, which implies statefulness is mucking things up for some bizarre reason (is JavaScript caching/interning the regex somehow? I'm using JSDB, which uses SpiderMonkey, Firefox's JavaScript engine).
So if I change the code to reset the inner regex's lastIndex to 0 before each inner loop, I get the expected behavior.
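A minimal sketch of that change applied to the code from the question follows. IsLocalIP is never shown in the question, so the stub below is a hypothetical stand-in that treats every address as non-local; the found array, the corrected name lastMatch, and console.log in place of print are likewise assumptions for illustration:

```javascript
// Hypothetical stand-in: the real IsLocalIP is not shown in the question.
function IsLocalIP(ip) {
    return false;
}

var body = "Received: from ([195.000.000.0])\r\nReceived: from ([77.000.000.000]) by (6.0.000.000)";
var subExp = "[\\[\\(](\\d+\\.\\d+\\.\\d+\\.\\d+)[\\]\\)]";
var re = new RegExp("Received\\: from.*?" + subExp + ".*", "mg");
var re1 = new RegExp(subExp, "mg");
var found = [];      // first non-local address of each Received line
var lastMatch = "";
var ares, ares1;

while ((ares = re.exec(body))) {
    // Breaking out of the inner loop leaves re1.lastIndex pointing into the
    // previous line, so reset it before scanning the next one.
    re1.lastIndex = 0;
    while ((ares1 = re1.exec(ares[0]))) {
        if (!IsLocalIP(ares1[1])) {
            found.push(ares1[1]);
            lastMatch = ares1[1];
            break;
        }
    }
}
console.log(found, lastMatch);
```

Without the break, the inner loop always runs until exec returns null, which resets lastIndex on its own; that is why commenting out the break appeared to fix the output in the question.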