How can I ignore empty values in the result of a Perl grep?

Posted on 2024-11-19 02:16:46

I am using the following to count the number of occurrences of a pattern in a file:

my @lines = grep /$text/, <$fp>;
print ($#lines + 1);

But sometimes it prints one more than the actual value. I checked and it is because the last element of @lines is null, and that is also counted.

How can the last element of the grep result be empty sometimes? Also, how can this issue be resolved?

Comments (4)

说不完的你爱 2024-11-26 02:16:46

It depends a lot on your pattern, but one thing you could do is combine two matches, the first one disqualifying any line that contains only whitespace (or nothing). This example rejects any line that is empty, newline only, or made up entirely of whitespace.

my @lines = grep { not /^\s*$/ and /$text/ } <$fp>;

Keep in mind that if the contents of $text happen to include regexp metacharacters, they either need to be intended as metacharacters or be escaped with quotemeta().

My theories are that you might have a line terminated in \n which is somehow matching your $text regexp, or that your $text regexp contains metacharacters that are affecting the match without you being aware of it. Either way, the snippet above will at least force rejection of "blank lines", where blank could mean completely empty (unlikely), newline terminated but otherwise empty (probable), or whitespace-only lines (possible) that appear blank when printed.
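
As a minimal sketch of the two suggestions above (rejecting blank lines and escaping metacharacters), the following reads from an in-memory string instead of a real file. The sample data and the name $data are assumptions made for illustration, as is the assumption that $text is meant as a literal string rather than a regex.

```perl
use strict;
use warnings;

# Hypothetical input: an in-memory file standing in for the asker's $fp.
my $data = "foo.bar\n\n   \nfooxbar\nfoo.bar again\n";
open my $fp, '<', \$data or die "Cannot open in-memory file: $!";

my $text = 'foo.bar';   # assumed to be a literal string, not a regex

# \Q...\E applies quotemeta, so the '.' only matches a literal dot.
# The first test drops empty and whitespace-only lines.
my @lines = grep { !/^\s*$/ && /\Q$text\E/ } <$fp>;
close $fp;

my $count = scalar @lines;   # number of matching lines
print "$count\n";
```

Here the blank lines are skipped and "fooxbar" is not counted, because the dot is no longer a wildcard. If $text is meant to be a regex, drop the \Q...\E.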

心的憧憬 2024-11-26 02:16:46

A regular expression that can match the empty string will also match undef. Perl warns about this, but it converts undef to '' before trying the match, at which point grep will quite happily pass the undef through to its results. If you don't want to pick up the empty string (or anything that is matched as though it were the empty string), you need to rewrite your regular expression so that it does not match it.
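
The behaviour described here can be reproduced with a small sketch. The list @input and the pattern /a*/ are hypothetical stand-ins, chosen only because /a*/ can match zero characters. The warning is silenced locally so the demonstration runs quietly.

```perl
use strict;
use warnings;

# Hypothetical data: a list that ends with undef.
my @input = ("alpha\n", "beta\n", undef);

my @loose;
{
    # Silence the "uninitialized value" warning for this demonstration only.
    no warnings 'uninitialized';
    @loose = grep { /a*/ } @input;   # /a*/ can match zero characters
}

# Requiring a defined, non-empty value keeps undef and '' out of the result.
my @strict = grep { defined($_) && length($_) && /a*/ } @input;

print scalar(@loose), " ", scalar(@strict), "\n";
```

The first grep keeps all three elements, including the undef. The second keeps only the two real lines.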

平生欢 2024-11-26 02:16:46

To see exactly what is in @lines, do:

use Data::Dumper;
$Data::Dumper::Useqq = 1;
print Dumper \@lines;
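
For illustration, here is a self-contained sketch of that technique on made-up data. The contents of @lines below are an assumption, picked so that the last element looks empty when printed normally but is really a tab followed by a newline.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;   # print control characters as escapes such as \n and \t

# Hypothetical result list whose last element only looks empty when printed.
my @lines = ("match one\n", "match two\n", "\t\n");

my $dump = Dumper(\@lines);
print $dump;
```

With Useqq set, each element is shown as a double-quoted string, so stray tabs, carriage returns and newlines become visible.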

日裸衫吸 2024-11-26 02:16:46

Ok, since no more information about the contents of $text (the regex) is forthcoming, I guess I'll toss out some general information.

Consider the following example:

use Data::Dumper;

my @array = (' ', 1, 2, 'a', '');
print Dumper [ grep /\s*/, @array ];

We get:

$VAR1 = [
          ' ',
          1,
          2,
          'a',
          ''
        ];

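All the values match. Why? Because \s* means "zero or more whitespace characters", so it succeeds on any string, including the empty one. To match only elements that actually contain whitespace, we need \s or \s+. (As a plain match test there is no practical difference between the two.)

You may have a problem like this.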
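
A short sketch comparing the quantifiers on the same array. The names @star, @plus and @word are introduced here for illustration, and /\S/ is added as one possible way to keep only non-blank elements.

```perl
use strict;
use warnings;

my @array = (' ', 1, 2, 'a', '');

my @star = grep { /\s*/ } @array;   # zero or more: matches every element
my @plus = grep { /\s+/ } @array;   # one or more: needs real whitespace
my @word = grep { /\S/  } @array;   # at least one non-whitespace character

print scalar(@star), " ", scalar(@plus), " ", scalar(@word), "\n";
```

Only the single-space element survives \s+, while /\S/ keeps 1, 2 and 'a' and drops both the space and the empty string.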
