Perl - Start reading at a specific line and take only the first column of each line until the end

Posted 2024-10-04 11:20:34

I have a text file that looks like the following:

Line 1
Line 2
Line 3
Line 4
Line 5
filename2.tif;Smpl/Pix & Bits/Smpl are missing.

There are 5 lines that are always the same, and on the 6th line is where I want to start reading data. Upon reading data, each line (starting from line 6) is delimited by semicolons. I need to just get the first entry of each line (starting on line 6).

For example:

Line 1
Line 2
Line 3
Line 4
Line 5
filename2.tif;Smpl/Pix & Bits/Smpl are missing.
filename4.tif;Smpl/Pix & Bits/Smpl are missing.
filename6.tif;Smpl/Pix & Bits/Smpl are missing.
filename8.tif;Smpl/Pix & Bits/Smpl are missing.

Output desired would be:

filename2.tif
filename4.tif
filename6.tif
filename8.tif

Is this possible, and if so, where do I begin?

雪化雨蝶 2024-10-11 11:20:34

This uses the Perl 'autosplit' (or 'awk') mode:

perl -n -F'/;/' -a -e 'next if $. <= 5; print "$F[0]\n";' < data.file

See 'perlrun' (for the -n, -a and -F switches) and 'perlvar' (for $. and @F).
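For readers who prefer a script to switches, the one-liner can be spelled out by hand. This is a minimal sketch and not the answerer's code: print_first_fields is a hypothetical helper name, and the in-memory sample reuses the data from the question so the example runs without a file on disk.

```perl
use strict;
use warnings;

# Hypothetical helper: does explicitly what -n -a -F'/;/' does implicitly.
sub print_first_fields {
    my ($in, $out) = @_;
    while (my $line = <$in>) {       # -n: loop over every input line
        my @F = split /;/, $line;    # -a with -F'/;/': split the line into @F
        next if $. <= 5;             # $. is the current input line number
        print {$out} "$F[0]\n";      # first semicolon-separated field
    }
    return;
}

# Sample input taken from the question (shortened to two data lines).
my $sample = <<'END';
Line 1
Line 2
Line 3
Line 4
Line 5
filename2.tif;Smpl/Pix & Bits/Smpl are missing.
filename4.tif;Smpl/Pix & Bits/Smpl are missing.
END

open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
print_first_fields($in, \*STDOUT);
close $in;
```

With the sample above this prints filename2.tif and filename4.tif, one per line. To read a real file instead, pass a handle opened on that file, or \*STDIN.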

If you need to do this in a function which is given a file handle and a number of lines to skip, then you won't be using the Perl 'autosplit' mode.

sub skip_N_lines_read_column_1
{
    my($fh, $N) = @_;
    my $i = 0;
    my @files = ();
    while (my $line = <$fh>)
    {
        next if $i++ < $N;
        my($file) = split /;/, $line;
        push @files, $file;
    }
    return @files;
}

The function loops over the lines from the file handle, skips the first N of them, then splits each remaining line and keeps only the first field. The line with my($file) = split... is subtle: the parentheses put the split in list context, so it returns a list of fields (rather than a count of them) and the first one is assigned to the variable. Without the parentheses, split would be called in scalar context and $file would receive the number of fields, which is not what you need. Each file name is appended to the end of the array, and the array is returned.

Since the function did not open the file handle, it does not close it. An alternative interface would pass the file name (instead of an open file handle) into the function; the function would then open and close the file itself and take care of the error handling.
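The list-versus-scalar context point is easy to check in isolation. The sketch below is only an illustration: the variable names $first and $count are made up for the example, and the sample line is taken from the question.

```perl
use strict;
use warnings;

my $line = "filename2.tif;Smpl/Pix & Bits/Smpl are missing.\n";

# Parentheses on the left give split a list context:
# the first field of the returned list lands in $first.
my ($first) = split /;/, $line;

# No parentheses: split runs in scalar context and
# returns the number of fields instead.
my $count = split /;/, $line;

print "first: $first\n";    # first: filename2.tif
print "count: $count\n";    # count: 2
```

On Perl versions before 5.12 the scalar-context form also split into @_ and warned about it, which is one more reason to write the parenthesised form.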

And if you need help with opening the file, and so on:

use Carp;

sub open_skip_read
{
    my($name) = @_;
    open my $fh, '<', $name or croak "Failed to open file $name ($!)";
    my @list = skip_N_lines_read_column_1($fh, 5);
    close $fh or croak "Failed to close file $name ($!)";
    return @list;
}

记忆之渊 2024-10-11 11:20:34

#!/usr/bin/env perl
#
# name_of_program - what the program does as brief one-liner
#
# Your Name <your_email@your_host.TLA>
# Date program written/released
#################################################################

use 5.10.0;

use utf8;
use strict;
use autodie;
use warnings FATAL => "all";

#  ⚠ change to agree with your input: ↓
use open ":std" => IN    => ":encoding(ISO-8859-1)",
                   OUT   => ":utf8";
#  ⚠ change for your output: ↑ — *maybe*, but leaving as UTF-8 is sometimes better

END {close STDOUT}

our $VERSION = 1.0;

$| = 1;

if (@ARGV == 0 && -t STDIN) {
   warn "reading stdin from keyboard for want of file args or pipe";
}

while (<>) {
    next if 1 .. 5;    # range operator on constants compares against $.: skips input lines 1 to 5
    my $initial_field = /^([^;]+)/ ? $1 : next;
    #    ╔═══════════════════════════╗
    #   ☞ your processing goes here ☜
    #    ╚═══════════════════════════╝
} continue {
    close ARGV if eof;
}

__END__


绿萝 2024-10-11 11:20:34

Kinda ugly, but: read and discard the dummy lines, then split each of the remaining lines on ';'.

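my $logfile = '/path/to/logfile.txt';

# Three-argument open with a lexical file handle
open(my $fh, '<', $logfile) || die "Couldn't open $logfile: $!\n";

# Read and discard the five header lines
for (my $i = 0 ; $i < 5 ; $i++) {
   my $dummy = <$fh>;
}

# Print the first semicolon-separated field of every remaining line
while (<$fh>) {
   my (@fields) = split /;/;
   print $fields[0], "\n";
}

close($fh);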
