Why isn't my nested lookaround working correctly in a Perl substitution?

Posted on 2024-12-09 13:35:57

I have a Perl substitution which converts hyperlinks to lowercase:

's/(?<=<a href=")([^"]+)(?=")/\L$1/g'

I want the substitution to ignore any links which begin with a hash, for example I want it to change the path in <a href="FooBar/Foo.bar">Foo Bar</a> to lowercase but skip if it comes across <a href="#Bar">Bar</a>.

Nesting lookaheads to instruct it to skip these links isn't working correctly for me. This is the one-liner I've written:

perl -pi -e 's/(?<=<a href=" (?! (?<=<a href="#) ) )([^"]+)(?=")/\L$1/g' *;

Could anyone point out where I have gone wrong with this substitution? It executes just fine, but does not change anything.

Comments (2)

ぺ禁宫浮华殁 2024-12-16 13:35:57

As near as I can tell, your initial regex will work just fine, if you add the condition that the first character in the link may not be a hash # or a double quote, e.g. [^#"]

s/(?<=<a href=")([^#"][^"]+)(?=")/\L$1/gi;

If you have links which contain a hash but do not start with one, e.g. <a href="FooBar/Foo.bar#BarBar">Foo Bar</a>, and you want to lowercase only the path, it becomes slightly more complicated:

s{(?<=<a href=")([^#"]+)(#[^"]+)*(?=")}{ lc($1) . ($2 // "") }gei;

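We now have to evaluate the replacement as an expression (the /e flag), since otherwise we get uninitialized value warnings for $2 when the optional anchor reference is not present. Note that the defined-or operator // requires Perl 5.10 or later.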
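
A minimal sketch for trying both substitutions above on the sample links from the question. The helper names lc_simple and lc_keep_fragment are hypothetical (they are not part of the answer), and the script assumes Perl 5.10 or later for // and say.

```perl
use strict;
use warnings;
use 5.010;    # defined-or (//) and say need Perl 5.10+

# Hypothetical helper: lowercase href values that do not start with '#'.
sub lc_simple {
    my ($html) = @_;
    $html =~ s/(?<=<a href=")([^#"][^"]+)(?=")/\L$1/gi;
    return $html;
}

# Hypothetical helper: lowercase the path, keep any '#fragment' as written.
sub lc_keep_fragment {
    my ($html) = @_;
    $html =~ s{(?<=<a href=")([^#"]+)(#[^"]+)*(?=")}{ lc($1) . ($2 // "") }gei;
    return $html;
}

say lc_simple('<a href="FooBar/Foo.bar">Foo Bar</a>');
# prints <a href="foobar/foo.bar">Foo Bar</a>

say lc_simple('<a href="#Bar">Bar</a>');
# prints <a href="#Bar">Bar</a> (unchanged)

say lc_keep_fragment('<a href="FooBar/Foo.bar#BarBar">Foo Bar</a>');
# prints <a href="foobar/foo.bar#BarBar">Foo Bar</a>
```

Wrapping each substitution in a sub is only for convenience when testing; the same s/// expressions can be dropped into a perl -pi -e one-liner as in the question.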

城歌 2024-12-16 13:35:57

You don't need look-arounds, from what I can see:

use 5.010;
...

s/<a \s+ href \s* = \s* "\K([^#"][^"]*)"/\L$1"/gx;

\K means "keep" everything before it. It amounts to a variable-length look-behind.

perlre:

For various reasons \K may be significantly more efficient than the equivalent (?<=...) construct, and it is especially useful in situations where you want to efficiently remove something following something else in a string.
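
A short sketch of the \K version in action, to show that the prefix before \K may vary in length (extra whitespace around href and =), which a plain (?<=...) lookbehind would not accept on older Perls. The helper name lc_href is hypothetical, and the script assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use 5.010;    # \K and say need Perl 5.10+

# Hypothetical helper wrapping the \K-based substitution from above.
sub lc_href {
    my ($html) = @_;
    # Everything matched before \K is kept as-is; only the text after it
    # (the href value and its closing quote) is replaced.
    $html =~ s/<a \s+ href \s* = \s* "\K([^#"][^"]*)"/\L$1"/gx;
    return $html;
}

say lc_href('<a href="FooBar/Foo.bar">Foo Bar</a>');
# prints <a href="foobar/foo.bar">Foo Bar</a>

say lc_href('<a   href = "FooBar/Foo.bar">Foo Bar</a>');
# prints <a   href = "foobar/foo.bar">Foo Bar</a>

say lc_href('<a href="#Bar">Bar</a>');
# prints <a href="#Bar">Bar</a> (unchanged)
```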
