Why can't gedit recognize the encoding of the output file created by my Perl program?
#!/usr/bin/perl -w
use strict;

open (EVENTLOGFILE, "<eventlog.txt") || die("Could not open file eventlog file");
open (EVENTLOGFILE_NODATETIME, ">eventlog_nodatetime.txt") || die("Could not open new event log file");

my($line) = "";
while ($line = <EVENTLOGFILE>) {
    my @fields = split /[ \t]/, $line;
    my($newline) = "";
    my($i) = 1;
    foreach( @fields )
    {
        my($field) = $_;
        if( $i ne 3 )
        {
            $newline = $newline . $field;
        }
        $i++;
    }
    print EVENTLOGFILE_NODATETIME "$newline";
}

close(EVENTLOGFILE);
close(EVENTLOGFILE_NODATETIME);
If I print $line each time instead of $newline, gedit detects the encoding with no problem. It only gets messed up when I try to modify the lines.
Comments (1)
My guess is that the problem isn't the encoding (as in, say, ISO 8859-1 vs. UTF-8) but the line endings (CRLF vs. LF).
If you used chomp and printed "\n", you'd probably get the line endings converted to the platform's native form.
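
As an illustration of the chomp-and-print-"\n" idea, here is a minimal sketch. It is not code from this thread: the helper name normalize_line_ending is hypothetical, and it assumes the input may carry CRLF endings while the script runs on a Unix-like system, where chomp removes only the trailing "\n" and a stray "\r" has to be stripped separately.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove whatever line
# ending the line carries and append a single "\n" instead.
sub normalize_line_ending {
    my ($line) = @_;
    chomp $line;            # removes a trailing "\n" (the default $/)
    $line =~ s/\r\z//;      # assumption: input may be CRLF, so drop a leftover CR
    return "$line\n";
}

# Small demo on in-memory strings rather than eventlog.txt.
for my $sample ("first line\r\n", "second line\n") {
    print normalize_line_ending($sample);
}
```

In the script from the question, the same two steps would be applied to $line right after it is read, before the split.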
I think your script might be better written something like this (Untested):
Or
Or use a splice on a split?
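
One way the splice-on-a-split idea could look is sketched below. This is an illustrative guess, not the answerer's code: the helper name drop_third_field is hypothetical, and joining the remaining fields with a single space is an assumption (the script in the question concatenates them with no separator at all).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop the third space/tab-separated field of a line.
sub drop_third_field {
    my ($line) = @_;
    chomp $line;
    $line =~ s/\r\z//;                      # assumption: input may have CRLF endings
    my @fields = split /[ \t]/, $line;      # same separator as in the question
    splice(@fields, 2, 1) if @fields > 2;   # remove one element at index 2
    return join(' ', @fields) . "\n";       # assumption: rejoin with single spaces
}

# Small demo on a made-up log line.
print drop_third_field("host1 app 2009-01-01 started\r\n");
```

The guard on the field count keeps splice from being called with an offset past the end of the array on short lines.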
Or ...