Negating an alternation in a regular expression
I can use alternation in a regular expression to match any occurrence of "cat" or "dog", like so:
(cat|dog)
Is it possible to NEGATE this alternation, and match anything that is NOT "cat" or "dog"?
If so, how?
For example:
Let's say I'm trying to match END OF SENTENCE in English, in an approximate way.
To wit:
(\.)(\s+[A-Z][^.]|\s*?$)
With the following paragraph:
The quick brown fox jumps over the lazy dog. Once upon a time Dr. Sanches, Mr. Parsons and Gov. Mason went to the store. Hello World.
I incorrectly find "end of sentence" at Dr., Mr., and Gov.
(I'm testing using http://regexpal.com/ in case you want to see what I'm seeing with the above example)
Since this is incorrect, I would like to say something like:
!(Dr\.|Mr\.|Gov\.)(\.)(\s+[A-Z][^.]|\s*?$)
Of course, this isn't working, which is why I seek help.
I also tried !/(Dr.|Mr.|Gov.)/, and !~ which were no help whatsoever.
How can I avoid matches for "Dr.", "Mr.", "Gov.", etc.?
Thanks in advance.
Comments (4)
It is not possible with the regex alone. You would normally do this using negative lookbehind, (?<!…), but JavaScript's regex flavor did not support it until ECMAScript 2018. Without lookbehind, you will have to filter the matches after the fact to discard those you don't want.
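A minimal sketch of the filter-afterwards approach, written in JavaScript since the question was tested on regexpal. It runs the question's original pattern, then discards any candidate whose period directly follows a known abbreviation. The names `ABBREVIATIONS` and `findSentenceEnds` are made up for illustration, and the abbreviation list is an assumption that would need extending for real text.

```javascript
// Hypothetical list of abbreviations whose trailing period is not a sentence end.
const ABBREVIATIONS = ["Dr", "Mr", "Gov"];

// Returns the index of every period that the question's regex accepts
// and that does not directly follow one of the abbreviations above.
function findSentenceEnds(text) {
  const candidate = /(\.)(\s+[A-Z][^.]|\s*?$)/g; // the regex from the question
  const ends = [];
  let m;
  while ((m = candidate.exec(text)) !== null) {
    const before = text.slice(0, m.index); // everything up to the period
    const followsAbbreviation = ABBREVIATIONS.some((abbr) =>
      new RegExp("\\b" + abbr + "$").test(before)
    );
    if (!followsAbbreviation) {
      ends.push(m.index);
    }
    // Resume right after the period so the characters consumed by the
    // second group cannot hide a later candidate.
    candidate.lastIndex = m.index + 1;
  }
  return ends;
}

const paragraph =
  "The quick brown fox jumps over the lazy dog. Once upon a time Dr. Sanches, " +
  "Mr. Parsons and Gov. Mason went to the store. Hello World.";

// Logs the positions of the periods after "dog", "store" and "World".
console.log(findSentenceEnds(paragraph));
```

Keeping the abbreviation check outside the pattern means the regex itself stays exactly as in the question; only the post-processing changes.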
In languages like Perl and awk, there's the !~ operator:
$string !~ /(cat|dog)/
In ActionScript, you can just use the NOT operator, !, to negate a match. See here for reference. Also here for a regex flavor comparison.
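As a rough illustration of negating a match with the NOT operator, here is the same idea in JavaScript (chosen because the question was tested on regexpal; the ActionScript spelling may differ). The function name `lacksCatOrDog` is invented for this sketch. Note that this only tells you whether a whole string contains no "cat" or "dog"; it does not locate anything inside the string.

```javascript
// No global flag, so test() keeps no state between calls.
const catOrDog = /(cat|dog)/;

// True when the string contains neither "cat" nor "dog".
function lacksCatOrDog(s) {
  return !catOrDog.test(s);
}

console.log(lacksCatOrDog("a lazy dog"));  // false
console.log(lacksCatOrDog("a quick fox")); // true
```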
You can do this:
EDIT: You should have included the programming language in your question. It's ActionScript, right? I'm not an ActionScript coder, but AFAIK it's done like this:
(?!NotThisStuff) is what you want, otherwise known as a negative lookahead group.
Unfortunately, it will not work as you intend. /(?!Dr\.)(\.)/ will still return the periods that belong to "Dr. Sanches" because of the second grouping. The Regex parser will say to itself, "Yep, this '.' isn't 'Dr.'" /((?!Dr).)/ won't work either, though I believe it should.
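The behaviour described above can be checked with a short JavaScript sketch. A lookahead is evaluated at the position where the match starts, so at the period the text ahead is ". Sanches" rather than "Dr." and the assertion passes. In the second pattern the dot is unescaped, so it matches any character at a position where "Dr" does not start. The third pattern is an addition for contrast, not something from the answer: it uses negative lookbehind and assumes an engine that implements ECMAScript 2018.

```javascript
const sample = "Dr. Sanches";

// Negative lookahead: checked at the period itself, where "Dr." is not ahead.
const lookahead = /(?!Dr\.)(\.)/.exec(sample);
console.log(lookahead.index); // 2, the period after "Dr"

// Unescaped dot: skips index 0 (where "Dr" starts) and matches the next character.
const anyChar = /((?!Dr).)/.exec(sample);
console.log(anyChar[0]); // "r"

// Assumption: lookbehind is available (ECMAScript 2018 or later).
// The period is rejected because "Dr" sits immediately before it.
const lookbehind = /(?<!Dr|Mr|Gov)(\.)/.exec(sample);
console.log(lookbehind); // null
```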
And what's more, you'll end up looking through all the sentence "ends" anyway. ActionScript doesn't have a "match all," only a match first. You have to set the global flag (add g to the end of your regex) and call exec until the result is null.
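The exec loop might look roughly like this. It is written in JavaScript, whose `RegExp.exec` advances through the text under the global flag in the way the answer describes; whether ActionScript matches it in every detail is an assumption. The helper name `allMatches` is hypothetical.

```javascript
// Collects every match of `pattern` in `text` by calling exec() until it returns null.
function allMatches(pattern, text) {
  // Copy the pattern with the global flag so exec() remembers where it stopped.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    results.push({ index: m.index, text: m[0] });
    if (m[0] === "") {
      re.lastIndex++; // step past an empty match to avoid looping forever
    }
  }
  return results;
}

// Finds "cat" at 0, "dog" at 5 and the "cat" in "catalog" at 10.
console.log(allMatches(/(cat|dog)/, "cat, dog, catalog"));
```

Without the global flag, exec would restart from the beginning of the string on every call and the loop would never terminate, which is why the helper adds the flag itself.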