What's the most defensive way to loop through lines in a file with Perl?
I usually loop through lines in a file using the following code:
open my $fh, '<', $file or die "Could not open file $file for reading: $!\n";
while ( my $line = <$fh> ) {
...
}
However, in answering another question, Evan Carroll edited my answer, changing my while statement to:
while ( defined( my $line = <$fh> ) ) {
...
}
His rationale was that if you have a line that's 0 (it'd have to be the last line, else it would have a carriage return), then your while would exit prematurely if you used my statement ($line would be set to "0", and the return value from the assignment would thus also be "0", which gets evaluated to false). If you check for defined-ness, then you don't run into this problem. It makes perfect sense.
So I tried it. I created a text file whose last line is 0 with no carriage return on it. I ran it through my loop and the loop did not exit prematurely.
I then thought, "Aha, maybe the value isn't actually 0, maybe there's something else there that's screwing things up!" So I used Dump() from Devel::Peek and this is what it gave me:
SV = PV(0x635088) at 0x92f0e8
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x962600 "0"\0
CUR = 1
LEN = 80
That seems to tell me that the value is actually the string "0", as I get a similar result if I call Dump() on a scalar I've explicitly set to "0" (the only difference is in the LEN field: from the file LEN is 80, whereas from the scalar LEN is 8).
So what's the deal? Why doesn't my while() loop exit prematurely if I pass it a line that's only "0" with no carriage return? Is Evan's loop actually more defensive, or does Perl do something crazy internally that means you don't need to worry about these things and while() actually only exits when you hit eof?
Comments (3)
Because
while (my $line = <$fh>) { ... }
actually compiles down to
while (defined( my $line = <$fh> ) ) { ... }
It may have been necessary in a very old version of perl, but not any more! You can see this by running B::Deparse on your script.
So you're already good to go!
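One way to check this yourself is sketched below. It wraps the loop from the question in an anonymous sub (the name $loop is only illustrative) and asks B::Deparse to print the code perl reconstructs from the compiled op tree. The exact text varies between perl versions, so treat the output as something to inspect rather than a fixed result; the thing to look for is a defined(...) wrapper around the assignment in the while condition.

```perl
use strict;
use warnings;
use B::Deparse;

# The loop from the question, wrapped in an anonymous sub so it can be deparsed.
# $loop is an illustrative name, not something from the original post.
my $loop = sub {
    my ($fh) = @_;
    while ( my $line = <$fh> ) {
        print $line;
    }
};

# coderef2text returns the source perl reconstructs from the compiled op tree.
my $deparsed = B::Deparse->new->coderef2text($loop);
print $deparsed, "\n";
```

For a whole script, running perl -MO=Deparse yourscript.pl from the command line does the same job without modifying the script.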
BTW, this is covered in the "I/O Operators" section of perldoc perlop.
While it is correct that the form
while (my $line=<$fh>) { ... }
gets compiled to
while (defined( my $line = <$fh> ) ) { ... }
consider that there are a variety of situations in which a legitimate read of the value "0" is misinterpreted if you do not have an explicit defined in the loop or when testing the return of <>. Here are several examples:
I am not saying these are good forms of Perl! I am saying that they are possible, especially Failures 3, 4 and 5. Note that numbers 4 and 5 fail with no Perl warning. The first two have their own issues...
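As an independent, minimal sketch of the general point (these are not the numbered failures referred to above), the code below reads the same three-line input, whose last line is "0" with no trailing newline, through three loops. The helper names count_plain, count_compound and count_explicit and the $limit parameter are made up for illustration. The assumption being demonstrated is that perl adds the implicit defined() only when the read-and-assign is the entire while condition, so a compound condition falls back to a plain truth test.

```perl
use strict;
use warnings;

# Plain idiom: the assignment is the whole condition, so perl tests defined-ness.
sub count_plain {
    my ($data) = @_;
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $count = 0;
    while ( my $line = <$fh> ) { $count++ }
    close $fh;
    return $count;
}

# Compound condition: the read is only one operand of &&, so its value is
# tested for truth and a final "0" line ends the loop early.
# perl may emit a compile-time warning for this form.
sub count_compound {
    my ( $data, $limit ) = @_;
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $count = 0;
    my $line;
    while ( ( $line = <$fh> ) && $count < $limit ) { $count++ }
    close $fh;
    return $count;
}

# Same compound condition with an explicit defined(): the "0" line is counted.
sub count_explicit {
    my ( $data, $limit ) = @_;
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $count = 0;
    my $line;
    while ( defined( $line = <$fh> ) && $count < $limit ) { $count++ }
    close $fh;
    return $count;
}

my $input = "1\n2\n0";    # last line is "0" with no trailing newline
printf "plain:    %d lines\n", count_plain($input);
printf "compound: %d lines\n", count_compound( $input, 100 );
printf "explicit: %d lines\n", count_explicit( $input, 100 );
```

With this input the plain and explicit versions count 3 lines, while the compound version stops at 2. Writing defined() explicitly costs nothing in the plain case and keeps the loop correct if the condition is later extended.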