How do I read a CR-only file line by line with Perl?

Posted on 2024-09-05 04:00:31, 283 words, 20 views, 0 comments

I'm trying to read a file which has only CR as line delimiter. I'm using Mac OS X and Perl v.5.8.8. This script should run on every platform, for every kind of line delimiter (CR, LF, CRLF).

My current code is the following:

open(FILE, "test.txt");

while($record = <FILE>){
    print $record;
}

close(FILE);

This currently prints only the last line (or worse). What is going on?
Obviously, I would rather not convert the file. Is that possible?

撩发小公举 2024-09-12 04:00:31

You can set the delimiter using the special variable $/:

local $/ = "\r"; # CR; use "\r\n" for CRLF or "\n" for LF
my $line = <FILE>;

See perldoc perlvar for further information.
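As for what is going on in the question: with the default $/ of "\n", a CR-only file is read as a single record, and printing it sends bare CR characters to the terminal. Each CR most likely moves the cursor back to the start of the line, so every line overwrites the previous one and only the last stays visible. Below is a minimal sketch of a complete reader for a CR-delimited file. The helper name read_cr_lines is made up for illustration, and it assumes the file really uses CR only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of a CR-delimited file,
# with the trailing CR removed from each one.
sub read_cr_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open \"$path\": $!\n";
    binmode($fh);             # no platform newline translation
    local $/ = "\r";          # records end at CR for the rest of this scope
    my @lines;
    while (my $line = <$fh>) {
        chomp $line;          # chomp removes the current value of $/
        push @lines, $line;
    }
    close($fh);
    return @lines;
}

# Usage: perl script.pl test.txt
if (@ARGV) {
    print "$_\n" for read_cr_lines($ARGV[0]);
}
```

Printing each line with an explicit "\n" keeps the terminal from overwriting earlier output.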

Another solution that works with all kinds of linebreaks would be to slurp the whole file at once and then split it into lines using a regex:

local $/ = undef;
my $content = <FILE>;
my @lines = split /\r\n|\n|\r/, $content;

You shouldn't do that with very large files though, as the file is read into memory completely. Note that setting $/ to the undefined value disables the line delimiter, meaning that everything is read until the end of the file.
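If the file is too large to slurp, the same split can be applied to fixed-size chunks instead. The following is only a sketch: the helper name each_line, its callback interface, and the 64 KB default chunk size are assumptions for illustration. The one subtle point is a chunk that ends in "\r", which may be the first half of a CRLF pair and is therefore held back until the next chunk arrives.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $callback once per line (terminator removed),
# accepting CR, LF and CRLF, without loading the whole file into memory.
sub each_line {
    my ($path, $callback, $chunk_size) = @_;
    $chunk_size ||= 65536;
    open(my $fh, '<', $path) or die "Can't open \"$path\": $!\n";
    binmode($fh);
    my $buffer = '';
    while (read($fh, my $chunk, $chunk_size)) {
        $buffer .= $chunk;
        # Hold back a trailing CR: it may be the first half of a CRLF pair.
        my $held = ($buffer =~ s/\r\z//) ? "\r" : '';
        # Limit -1 keeps trailing empty fields, so the last element is
        # always the (possibly empty) unfinished line.
        my @lines = split /\r\n|\n|\r/, $buffer, -1;
        my $tail = pop @lines;
        $buffer = (defined $tail ? $tail : '') . $held;
        $callback->($_) for @lines;
    }
    # A held-back CR at end of file terminated a line after all.
    my $terminated = ($buffer =~ s/\r\z//);
    $callback->($buffer) if $terminated or length $buffer;
    close($fh);
}

# Usage: perl script.pl test.txt
if (@ARGV) {
    each_line($ARGV[0], sub { print "$_[0]\n" });
}
```

Memory use is bounded by the chunk size plus the length of the longest line.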

花辞树 2024-09-12 04:00:31

I solved a more general problem that could be useful here:

How to parse a big file line by line when the line delimiter (CR, CRLF, or LF) is not known beforehand.

A 'big' file here means one that is too large to read into a single variable. The function 'detectEndOfLine' takes the name of a file and returns either "\r" or "\n", whichever is used for line endings. It searches for a "\r" or "\n" character one character at a time, starting from the end of the file. For a CRLF file it returns "\n", and the substitution in the read loop strips the remaining "\r".

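my $file = "test.txt";
# Fall back to "\n" if the file contains no line terminator at all.
my $eol = detectEndOfLine($file);
local $/ = defined $eol ? $eol : "\n";
open(my $in, '<', $file) or die "Can't open file \"$file\" for reading: $!\n";
while (<$in>) {
    s/(?:\r\n|\n|\r)$//;    # strip only the trailing terminator
    print "$_\n";
}
close($in);

# Scan backwards from the end of the file, one byte at a time, and
# return the first "\r" or "\n" found (undef if there is none).
sub detectEndOfLine {
    my ($file) = @_;
    my $size = (-s $file) || 0;

    open(my $fh, '<', $file) or die "Can't open file \"$file\" for reading: $!\n";
    binmode($fh);
    for (my $i = $size - 1; $i >= 0; --$i) {
        seek($fh, $i, 0);
        read($fh, my $sym, 1);
        return $sym if $sym eq "\n" or $sym eq "\r";
    }
    return undef;
}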
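A possible variant, sketched here under assumptions rather than taken from the answer, scans from the start of the file and distinguishes CRLF from a lone CR. Its result can be assigned to $/ directly, so chomp removes the full terminator. The name detect_eol_from_start and the 4096-byte default chunk size are invented for this example, and for a file with mixed endings it only reports the first terminator it sees.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "\r\n", "\n" or "\r" for the first line
# terminator found in the file, or undef if the file has none.
sub detect_eol_from_start {
    my ($path, $chunk_size) = @_;
    $chunk_size ||= 4096;
    open(my $fh, '<', $path) or die "Can't open \"$path\": $!\n";
    binmode($fh);
    my $carry = '';    # a CR seen at the very end of the previous chunk
    while (read($fh, my $chunk, $chunk_size)) {
        my $data = $carry . $chunk;
        # A CR counts as a lone CR only if another character follows it;
        # a CR at the end of the data might still turn out to be CRLF.
        return $1 if $data =~ /(\r\n|\n|\r(?=.))/s;
        $carry = ($data =~ /\r\z/) ? "\r" : '';
    }
    # A CR as the last byte of the file is a CR terminator.
    return $carry eq "\r" ? "\r" : undef;
}

# Usage: perl script.pl test.txt
if (@ARGV) {
    my $eol = detect_eol_from_start($ARGV[0]);
    local $/ = defined $eol ? $eol : "\n";
    open(my $in, '<', $ARGV[0]) or die "Can't open \"$ARGV[0]\": $!\n";
    binmode($in);
    while (my $line = <$in>) {
        chomp $line;
        print "$line\n";
    }
    close($in);
}
```

Scanning from the start usually finds a terminator within the first chunk, whereas the backward scan above is at its quickest when the file ends with one.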
