Checklist for regex anchors

Posted on 2024-10-26 23:59:12

  1. ^ is said to match the beginning of a line, but it does not match right after a "\n", "\r" or "\r\n". It matches the beginning of a string, though. In what sense does it match the beginning of a line, and how is it different from \A?

  2. $ is said to match the end of a line, but it does not match right before a "\n", "\r" or "\r\n". It matches the end of a string, though. In what sense does it match the end of a line, and how is it different from \z?

  3. \Z, unlike \z, matches right before a "\n" if that is at the end of the string. It seems to me that \A and \z are a naturally paired concept, and \Z is rather an odd one. Why are \Z and \z defined as they are, and not the other way around? And when would you want to use \Z?

Can you illustrate the above using examples?
If differences among languages/standards matter, it would be helpful to list them.

Comment by 我要还你自由, 2024-11-02 23:59:12

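The difference is that the behavior of the ^ and $ anchors can be changed by a modifier, while \A and \z always behave the same way.

With multiline mode on, the ^ and $ anchors assert the beginning and end of each line.

With multiline mode off, the ^ and $ anchors assert the beginning and end of the string. In most flavors (Perl, PCRE, .NET, Java, Python), $ in this mode also matches just before a single "\n" at the very end of the string, so it behaves like \Z; in JavaScript it matches only at the very end.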
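The on/off behavior described above can be sketched in Python, which is an arbitrary choice of flavor for illustration and not one the answer names. The sample string and variable names are made up for this example; re.MULTILINE is Python's spelling of the multiline option.

```python
import re

text = "first\nsecond\nthird"

# Multiline off (the default): ^ and $ anchor to the whole string.
start_off = re.findall(r"^\w+", text)
end_off = re.findall(r"\w+$", text)

# Multiline on: ^ and $ anchor to every line in the string.
start_on = re.findall(r"^\w+", text, re.MULTILINE)
end_on = re.findall(r"\w+$", text, re.MULTILINE)

print(start_off)  # ['first']
print(end_off)    # ['third']
print(start_on)   # ['first', 'second', 'third']
print(end_on)     # ['first', 'second', 'third']
```

With the flag off, only the first and last words touch a string boundary; with it on, every word does, because each one sits at a line boundary.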


Most regex implementations have a multiline mode.

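In Perl and JavaScript, it is enabled with the m modifier, e.g. /pattern/m. Ruby is an exception: there ^ and $ always match at line boundaries, and Ruby's /m modifier instead makes . match newlines.

In .NET it is enabled with (?m) inside the pattern itself, or with the RegexOptions.Multiline option.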
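As a rough illustration of the two ways to switch the mode on, here is a small Python sketch. Python is not among the flavors listed above; it is used only because it accepts both an option flag (re.MULTILINE) and the inline (?m) form. The sample text and variable names are assumptions for the example.

```python
import re

text = "alpha\nbeta"

# Option passed alongside the pattern.
with_flag = re.findall(r"^\w+$", text, re.MULTILINE)

# Same mode requested inside the pattern; Python expects (?m) at the very start.
with_inline = re.findall(r"(?m)^\w+$", text)

# No multiline mode: the whole string is not a single word, so nothing matches.
without = re.findall(r"^\w+$", text)

print(with_flag)    # ['alpha', 'beta']
print(with_inline)  # ['alpha', 'beta']
print(without)      # []
```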


To answer your 3rd question...

\A - The match must occur at the start of the string.

\Z - The match must occur at the end of the string or before \n at the end of the string.

\z - The match must occur at the end of the string.

These three are constants that are not affected by any modifiers. I agree that \Z looks like the odd one out next to \A and \z. It doesn't make a great deal of sense to me either. But in a case where you may have a trailing line feed that you wish to ignore, \Z might be preferred. Support varies by flavor: JavaScript has none of the three, and Python's re module uses \Z for the strict end of the string.
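A minimal Python sketch of the fixed anchors follows. One caveat matters here: Python's re module writes the strict end-of-string anchor as \Z, so it corresponds to what the list above calls \z, while Python's $ with multiline off plays the role of the lenient \Z described above. The sample string and variable names are made up for the example.

```python
import re

text = "one\ntwo\n"  # note the trailing newline

# \A ignores multiline mode: only the very start of the string qualifies.
a_multi = re.findall(r"\A\w+", text, re.MULTILINE)
caret_multi = re.findall(r"^\w+", text, re.MULTILINE)

# Strict end of string (Python spells it \Z): fails, because "\n" follows "two".
strict_end = re.search(r"two\Z", text)

# Lenient end ($ with multiline off): accepts the spot before the final "\n".
lenient_end = re.search(r"two$", text)

print(a_multi)                 # ['one']
print(caret_multi)             # ['one', 'two']
print(strict_end)              # None
print(lenient_end is not None) # True
```

The lenient form is handy for input read line by line, where each line may or may not still carry its trailing newline.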
