Using Perl Tie::File with UTF-encoded files

Posted 2024-12-09 23:02:21

Can I use Tie::File with a UTF-encoded file? I can't get this to work.
What I am trying to do is open a UTF-16LE-encoded file, remove the header line that matches a string, and rename the file.

Code:

use strict;
use warnings;
use Tie::File;
use File::Copy;

my ($input_file) = qw (test.txt);

open my $infh, "<:encoding(UTF-16LE)", $input_file or die "cannot open '$input_file': $!";

for (<$infh>) {
    tie my @lines, "Tie::File", $_;
    shift @lines if $lines[0] =~ m/MyHeader/;
    untie @lines;
    my ($name) = /^(.*).csv/i;
    move($_, $name . ".dat");
}

close $infh
    or die "Cannot close '$input_file': $!";

Code: (updated)

my ($input_file) = qw (test.txt);
my $qfn_in = $input_file;
my $qfn_out = $qfn_in . ".dat";

open(my $fh_in, "<:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_in)
   or die("Can't open \"$qfn_in\": $!\n");

open(my $fh_out, ">:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_out)
   or die("Can't open \"$qfn_out\": $!\n");

while (<$fh_in>) {
   next if $. == 1 && /MyHeader/; 
   print($fh_out $_)
      or die("Can't write to \"$qfn_out\": $!");
}

close($fh_in);
close($fh_out) or die("Can't write to \"$qfn_out\": $!");

rename($qfn_out, $qfn_in)
   or die("Can't rename: $!\n");

Comments (3)

白云不回头 2024-12-16 23:02:21

This is underdocumented in the Tie::File perldoc, but you want to pass the discipline => ':encoding(UTF-16LE)' option when you tie the file:

tie my @lines, 'Tie::File', $input_file, discipline => ':encoding(UTF-16LE)'
    or die "Cannot tie '$input_file': $!";

Note that the third argument is the name of the file to associate with the tied array. Tie::File will automatically open and manage the filehandle for you; there is no need to call open on the file yourself.

@lines now contains the contents of the file, so the next thing to do is check the first line:

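if ($lines[0] =~ m/pattern/) {
    my $line = shift @lines;   # removes the first record from the file
    untie @lines;   # rewrites, closes the file, w/o first line
    my ($name) = $line =~ /^(.*)\.csv/i;   # escaped dot: match a literal ".csv"
    rename $input_file, "$name.dat"
        or die "Cannot rename '$input_file': $!";
}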

But I concur with TLP that Tie::File is overkill for this job.

(My previous answer, which suggested opening a filehandle with the correct encoding and passing the glob as the third argument to Tie::File, won't work: (1) it didn't open the file in read/write mode, and (2) even if it had, Tie::File can't or doesn't apply the encoding to both reads from and writes to the filehandle.)

自由如风 2024-12-16 23:02:21
my $qfn_in = ...;
my $qfn_out = $qfn_in . ".tmp";

open(my $fh_in, "<:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_in)
   or die("Can't open \"$qfn_in\": $!\n");

open(my $fh_out, ">:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_out)
   or die("Can't open \"$qfn_out\": $!\n");

while (<$fh_in>) {
   next if $. == 1 && /MyHeader/;
   print($fh_out $_)
      or die("Can't write to \"$qfn_out\": $!");
}

close($fh_in);
close($fh_out) or die("Can't write to \"$qfn_out\": $!");

rename($qfn_out, $qfn_in)
   or die("Can't rename: $!\n");

(:perlio and :utf8 are workarounds for bugs that existed back then.)
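
The same read-filter-rename approach can be wrapped in a small subroutine. The following is only a sketch under stated assumptions: the name strip_header_utf16le is hypothetical, and the simpler layer string :raw:encoding(UTF-16LE) is used without :crlf, so line endings pass through unchanged. If the file starts with a byte order mark, it is part of the first line and is dropped together with the header.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): drop the first line of
# a UTF-16LE file when it matches $header_re, then replace the original
# file with the filtered copy. Returns 1 if a line was removed, else 0.
sub strip_header_utf16le {
    my ($path, $header_re) = @_;
    my $tmp = "$path.tmp";

    open(my $in, '<:raw:encoding(UTF-16LE)', $path)
        or die "Can't open \"$path\": $!\n";
    open(my $out, '>:raw:encoding(UTF-16LE)', $tmp)
        or die "Can't open \"$tmp\": $!\n";

    my $removed = 0;
    while (my $line = <$in>) {
        # $. is the current line number of the last-read input handle.
        if ($. == 1 && $line =~ $header_re) {
            $removed = 1;
            next;
        }
        print {$out} $line
            or die "Can't write to \"$tmp\": $!\n";
    }

    close($in);
    close($out) or die "Can't write to \"$tmp\": $!\n";

    rename($tmp, $path)
        or die "Can't rename \"$tmp\": $!\n";

    return $removed;
}
```

A call such as strip_header_utf16le('test.txt', qr/MyHeader/) would then rewrite test.txt in place, via the temporary file test.txt.tmp.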

溺深海 2024-12-16 23:02:21

The line:

tie my @lines, "Tie::File", $_;

tries to tie @lines to a file named after each line of test.txt in turn. Since test.txt does not seem to be a file with filenames in it, I suspect that the tie fails.

What you are probably after is using Tie::File on test.txt. If you only want to check the first line of that file, you do not need a loop.

So you'd need something like:

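use autodie;  # makes built-ins such as open/close die on failure
tie my @lines, "Tie::File", $input_file
    or die "Cannot tie '$input_file': $!";   # tie is not covered by autodie
shift @lines if @lines && $lines[0] =~ /MyHeader/;
untie @lines;
if ($input_file =~ /(.+)\.csv$/i) {   # escaped dot: match a literal ".csv"
    move($input_file, $1)
        or die "Cannot move '$input_file': $!";   # File::Copy::move returns false on failure
}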

But there are simpler ways to check the first line of a file. This will check one file:

perl -we '$_=<>; print unless /MyHeader/; print <>;' test.txt > test.dat
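
For a UTF-16LE file like the one in the question, checking only the first line without a loop might look like the sketch below. The name first_line_matches is hypothetical, and the layer string assumes the file carries no :crlf translation requirements; a leading byte order mark, if present, stays at the start of the line, so the pattern should not be anchored with ^.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 when the first line of a UTF-16LE file
# matches $re, otherwise 0. Only one line is read, so no loop is needed.
sub first_line_matches {
    my ($path, $re) = @_;

    open(my $fh, '<:raw:encoding(UTF-16LE)', $path)
        or die "Can't open \"$path\": $!\n";
    my $first = <$fh>;    # undef for an empty file
    close($fh);

    return defined($first) && $first =~ $re ? 1 : 0;
}
```

For example, first_line_matches('test.txt', qr/MyHeader/) reports whether the header is present before any rewriting is attempted.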