Confused by a positive lookahead regex
I'm building this regex with a positive lookahead in it. Basically, it must select all text in the line up to the last period that precedes a ":" and add a "|" to the end to delimit it. Some sample text is below. I am testing this in gskinner and EditPad Pro, which apparently has full grep regex support, so if I could get the answers in that form I'd appreciate it.
The regex below works to a degree, but I am unsure if it is correct. It also fails if the text contains brackets.
Finally, I would like to add another ignore rule, like the one that ignores but includes "Co." in the selection. This second rule would ignore but include periods that have a single capital letter before them. Sample text is below too. Thanks for all the help.
^(?:[^|]+\|){3}(.*?)[^(?:Co)]\.(?=[^:]*?\:)
121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.
122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. & Toole, B.L. (Editors) Guide to the regex functionality in php. Timmy, Tommy& Stewie, Quohog. * Produced for Family Guy in Quohog.
Comments (2)
I don't think I understand what you want to do, but this part
[^(?:Co)]
is definitely not correct. The square brackets create a character class, and the leading ^ makes it a negated class. At this position it therefore matches any single character other than "(", "?", ":", "C", "o" or ")". It does not mean "anything except the string Co".
Update:
I don't think it's possible. How should I distinguish between "L. Co." or something similar and the end of a sentence?
But I found another error in your regex. The last part
(?=[^:]*?\:)
should be
(?=[^.]*?\:)
if you want to match the last dot before the ":". With your expression it will match on the first dot. See it here on Regexr.
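The two points in this answer can be checked with a short script. This is only a sketch using Python's `re` module rather than EditPad Pro or Regexr, and the variable names (`neg_class`, `loose`, `strict`) are made up for illustration. It also suggests why the original pattern fails on bracketed text: ")" is one of the characters the negated class refuses to match.

```python
import re

# A negated character class excludes single characters, not the string "Co".
neg_class = re.compile(r"[^(?:Co)]")
excluded = [ch for ch in "(?:Co)x." if neg_class.fullmatch(ch) is None]
print(excluded)  # the characters the class refuses to match

# Because ")" is excluded, the class cannot match the ")" in front of a period.
bracket_hit = re.search(r"[^(?:Co)]\.", "(Note on the regex).")
print(bracket_hit)

line = "121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631."

# Original lookahead: any period with a colon somewhere after it qualifies.
loose = [m.start() for m in re.finditer(r"\.(?=[^:]*:)", line)]
# Suggested lookahead: no other period may sit between the period and the colon.
strict = [m.start() for m in re.finditer(r"\.(?=[^.]*:)", line)]

print(len(loose), len(strict))
print(line[: strict[0] + 1])
```

With `[^:]` several periods qualify, so a lazy match stops at the first one. With `[^.]` only the period closest to the colon qualifies.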
This seems to do what you want.
It quite simply matches all text up to the last full stop that occurs before the colon.
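The pattern this answer refers to is not shown above, so the following is not that pattern. It is a minimal sketch of one way to get the described behaviour in Python, assuming each record is a single line and that "the colon" means the first colon in the line. The names `LAST_PERIOD_BEFORE_COLON` and `delimit` are hypothetical.

```python
import re

# Hypothetical pattern: lazily take text from the start of the line up to a
# period that is followed by a colon with no other period in between.
LAST_PERIOD_BEFORE_COLON = re.compile(r"^(.*?\.)(?=[^.]*:)")


def delimit(line: str) -> str:
    """Insert "|" after the last period before the first colon, if there is one."""
    return LAST_PERIOD_BEFORE_COLON.sub(r"\1|", line, count=1)


samples = [
    "121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.",
    "122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. "
    "& Toole, B.L. (Editors) Guide to the regex functionality in php.",
]

for sample in samples:
    print(delimit(sample))
```

This sketch does not attempt the extra rules from the question (skipping "Co." or a period after a single capital letter). As the other answer points out, those cases may not be reliably distinguishable from the end of a sentence.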