Confusing positive lookahead regex

Posted on 2024-12-07 08:07:43

I'm building this regex with a positive lookahead in it. It must select all text in the line up to the last period that precedes a ":" and add a "|" after it as a delimiter. Some sample text is below. I am testing this in gskinner and EditPad Pro, which apparently have full grep regex support, so I'd appreciate answers in that form.

The regex below works to a degree, but I am unsure whether it is correct. It also fails if the text contains brackets.

Finally, I would like to add another ignore rule, like the one that ignores but includes "Co." in the selection. This second rule would ignore, but still include, periods that have a single capital letter before them. Sample text for that is below too. Thanks for all the help.

^(?:[^|]+\|){3}(.*?)[^(?:Co)]\.(?=[^:]*?\:)

121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.
122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. & Toole, B.L. (Editors) Guide to the regex functionality in php. Timmy, Tommy& Stewie, Quohog. * Produced for Family Guy in Quohog.


Comments (2)

℡Ms空城旧梦 2024-12-14 08:07:43

I don't think I understand what you want to do. But this part [^(?:Co)] is definitely not correct.

With the square brackets you are creating a character class, and because of the leading ^ it is a negated class. At that position it matches any single character other than "(", "?", ":", "C", "o" or ")". It does not mean "not the string Co".
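A quick way to check this is to test the fragment on its own. The sketch below uses Python's `re` module (an assumption; the asker is working in gskinner and EditPad Pro, whose flavors may differ in other respects). The negative lookbehind `(?<!Co)\.` at the end is only one possible way to say "a period not preceded by Co", not something proposed in the thread.

```python
import re

# The asker's fragment: a negated character class, not a negated group.
char_class = re.compile(r"[^(?:Co)]")

# It rejects each of the six listed characters individually...
rejected = [ch for ch in "(?:Co)" if char_class.fullmatch(ch) is None]

# ...and accepts any other single character.
accepts_x = char_class.fullmatch("x") is not None

# Illustrative alternative (an assumption, not from the thread):
# a negative lookbehind that skips periods directly preceded by "Co".
not_after_co = re.compile(r"(?<!Co)\.")
positions = [m.start() for m in not_after_co.finditer("Acme Co. Ltd.")]

print(rejected)   # the six characters from the class
print(accepts_x)
print(positions)  # index of the period after "Ltd" only
```

The lookbehind version asserts something about the text before the period without consuming it, which is what the character class was presumably meant to do.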

Update:

I don't think it's possible. How should I distinguish between "L. Co." or something similar and the end of a sentence?

But I found another error in your regex. The last part, (?=[^:]*?\:), should be (?=[^.]*?\:) if you want to match the last dot before the ":". With your expression it will match on the first dot.
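The difference between the two lookaheads can be seen by applying each to a bare `\.` on the first sample line. This is a minimal sketch in Python's `re` module (an assumed flavor); the variable names are made up for illustration.

```python
import re

text = "121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631."

# Original lookahead: any period that has a colon somewhere after it.
any_dot_before_colon = [m.start() for m in re.finditer(r"\.(?=[^:]*?:)", text)]

# Suggested lookahead: a period with no further period between it and a colon.
last_dot_before_colon = [m.start() for m in re.finditer(r"\.(?=[^.]*?:)", text)]

print(len(any_dot_before_colon))                # several candidate periods
print(text[: last_dot_before_colon[0] + 1])     # text up to the single remaining one
```

With `[^:]` every period before the colon qualifies, so a lazy `(.*?)` in front of it stops at the earliest one. With `[^.]` only the period closest to the colon qualifies.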

See it here on Regexr

如梦 2024-12-14 08:07:43

This seems to do what you want.

(.*\.)(?=[^:]*?:)

It quite simply matches all text up to the last full stop that occurs before the colon.
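As a rough check, the pattern can be run against the sample lines from the question. The sketch below assumes Python's `re` module; `delimit` is a hypothetical helper name, and the second sample is shortened from the original.

```python
import re

# Pattern from the answer above: the greedy .* backtracks to the last
# period that still has a colon somewhere after it.
PATTERN = re.compile(r"(.*\.)(?=[^:]*?:)")


def delimit(line: str) -> str:
    """Insert '|' after the last period that precedes a colon (hypothetical helper)."""
    return PATTERN.sub(r"\1|", line, count=1)


samples = [
    "121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.",
    # Shortened version of the second sample line from the question.
    "122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. & Toole, B.L. (Editors)",
]

for line in samples:
    print(delimit(line))
```

Because `.*` is greedy, a line containing several colons is split at the last period before the last colon, which may or may not be what is wanted. Lines without a colon are left unchanged.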
