Perl pattern-matching variable problem

Posted on 2024-10-10 16:52:07

I'm trying to open a file, match a particular line, and then wrap HTML tags around that line. Seems terribly simple but apparently I'm missing something and don't understand the Perl matched pattern variables correctly.

I'm matching the line with this:

$line =~ m/(Number of items:.*)/i;

Which puts the entire line into $1. I try to then print out my new line like this:

print "<p>" . $1 . "<\/p>";

I expect it to print this:

<p>Number of items: 22</p>

However, I'm actually getting this:

</p>umber of items: 22

I've tried all kinds of variations - printing each bit on a separate line, setting $1 to a new variable, using $+ and $&, etc. and I always get the same result.

What am I missing?

Comments (3)

欢烬 2024-10-17 16:52:07

You have a \r in your match, which produces the malformed output when it is printed.

Edit:
To explain further, chances are your file has Windows-style \r\n line endings. chomp won't remove the \r, so it gets slurped into your greedy match and results in the unpleasant output (\r means go back to the start of the line and continue printing).

You can remove the \r by adding something like

$line =~ tr/\015//d;
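
To make this concrete, here is a minimal sketch of the effect. The input string is a hypothetical stand-in for a line read from a file with Windows-style endings, and the variable names $has_cr and $wrapped are only for illustration. It assumes the default input record separator, so chomp removes only the "\n".

```perl
use strict;
use warnings;

# Hypothetical line as it might arrive from a file with \r\n endings.
my $line = "Number of items: 22\r\n";

chomp $line;    # with the default $/, only the trailing "\n" is removed

# The carriage return is still in the string at this point.
my $has_cr = ( $line =~ /\r/ ) ? 1 : 0;

$line =~ tr/\015//d;    # delete every carriage return (octal 015)

my $wrapped = '';
if ( $line =~ m/(Number of items:.*)/i ) {
    $wrapped = "<p>" . $1 . "</p>";
}
print "$wrapped\n";
```

If only a trailing line ending should go, something like s/\r?\n\z// in place of chomp is another option; tr/\015//d removes carriage returns anywhere in the line.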

花之痕靓丽 2024-10-17 16:52:07

Can you provide a complete code snippet that demonstrates your problem? I'm not seeing it.

One thing to be cautious of is that $1 and friends refer to the captures from the last successful match in that dynamic scope. You should always verify that a match succeeded before using them:

$line = "Foo Number of items: 97\n";
if ( $line =~ m/(Number of items:.*)/i ) {
    print "<p>" . $1 . "<\/p>\n";
}
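
As a rough illustration of why the check matters, the sketch below runs a failing match after a successful one in the same scope. The sample strings and the names $first, $stale, and $safe are made up for the example. It also shows one alternative, assigning the match in list context, which is not part of the answer above.

```perl
use strict;
use warnings;

my $good = "Number of items: 22";
my $bad  = "Something unrelated";

$good =~ m/(Number of items:.*)/i;    # succeeds and sets $1
my $first = $1;

$bad =~ m/(Number of items:.*)/i;     # fails, so $1 is left untouched
my $stale = $1;                       # still the capture from $good

# In list context a failed match returns an empty list,
# so $safe ends up undefined instead of holding a stale value.
my ($safe) = $bad =~ m/(Number of items:.*)/i;

print "stale: $stale\n";
print "safe is ", ( defined $safe ? "defined" : "undefined" ), "\n";
```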

风筝在阴天搁浅。 2024-10-17 16:52:07

You've just learned (for future reference) how dangerous .* can be.

Having banged my head against similar unpleasantnesses, these days I like to be as precise as I can about what I expect to capture. Maybe

$line =~ m/(Number of items:\s+\d+)/;

Then I'm sure of not capturing the offending control character in the first place. Whatever Cygwin may be doing with Windows files, I can remain blissfully ignorant.
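
A small sketch comparing the two patterns on the same input may help. The input string is a hypothetical line with a \r\n ending, and $greedy and $precise are illustrative names. Both matches use list-context assignment so the captures land in ordinary variables.

```perl
use strict;
use warnings;

# Hypothetical line with a Windows-style ending, not chomped.
my $line = "Number of items: 22\r\n";

# "." matches any character except "\n", so ".*" swallows the "\r".
my ($greedy) = $line =~ m/(Number of items:.*)/i;

# "\s+\d+" stops after the digits, so the "\r" is never captured.
my ($precise) = $line =~ m/(Number of items:\s+\d+)/;

printf "greedy ends with CR:  %s\n", ( $greedy  =~ /\r\z/ ? "yes" : "no" );
printf "precise ends with CR: %s\n", ( $precise =~ /\r\z/ ? "yes" : "no" );
```

The trade-off is that the stricter pattern silently fails to match if the value is ever something other than digits, so checking the match result still applies.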
