Perl: substituting with a regular expression

Posted 2024-12-28 12:51:57

When I run this command as a Perl one-liner, it picks up the regular expression, so that part works.

more tagcommands | perl -nle 'print /(\d{8}_\d{9})/' | sort 

12012011_000005769
12012011_000005772
12162011_000005792
12162011_000005792

But when I run the script below, it does not pick up the regex.

#!/usr/bin/perl
use strict; 
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
        $line =~ s/\d{8}_\d{9}/$switch/g;
        print $line;
        sleep 1;
}

This is the data that I am feeding into the script:

/CASPERBOT/START URL=simplefile:///data/tag/squirrels/squirrels    /12012011_000005777N.dart.gz CASPER=SeqRashMessage
/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12012011_0000057770.dart.trans.gz
/CASPERRIP/newApp multistitch CASPER_BIN
/CASPER_BIN/START URLS=simplefile:///data/tag/squirrels    /12012011_000005777R.rash.gz?exitOnEOF=false;binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz?exitOnEOF=false;simplefile:///data/tag/squirrels/12012011_000005777E.bean.trans.gz?exitOnEOF=false EXTRACTORS=rash;island;rash BINARY=T

Comments (2)

流云如水 2025-01-04 12:51:57

You should study your one-liner to see how it works. First check perl -h to learn about the switches used:

-l[octal]         enable line ending processing, specifies line terminator
-n                assume "while (<>) { ... }" loop around program

The first one is not exactly self-explanatory, but what -l actually does is chomp each input line (when combined with -n or -p), and set the output record separator $\ to the current value of $/, which is normally a newline. So, your one-liner:

perl -nle 'print /(\d{8}_\d{9})/'

Actually does this:

$\ = "\n";
while (<>) {
    chomp;
    print /(\d{8}_\d{9})/;
}

A very easy way to see this is to use the B::Deparse module:

$ perl -MO=Deparse -nle 'print /(\d{8}_\d{9})/'
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
    chomp $_;
    print /(\d{8}_\d{9})/;
}
-e syntax OK

So, that's how you transform that into a working script.

I have no idea how you went from that to this:

use strict; 
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
        $line =~ s/\d{8}_\d{9}/$switch/g;
        print $line;
        sleep 1;
}

First of all, why are you opening a pipe from the more command to read a text file? That is like calling a tow truck to fetch you a cab. Just open the file. Or better yet, don't. Just use the diamond operator, like you did the first time.

You don't need to first copy the lines of a file to an array (let alone to two arrays), and then loop over the array. while (<FILE>) is a simple way to read one line at a time.

In your one-liner, you print the regex. Well, you print the return value of the match, which in list context is the captured string. In this script, you print $line, the whole line after the substitution. I'm not sure how you thought that would do the same thing.

Your substitution here will replace every string that matches the pattern (8 digits, an underscore, 9 digits) with the value of $switch. Nothing else.
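To make that concrete, here is a minimal sketch of the same substitution wrapped in a function. The name replace_ids and the shortened sample lines are made up for illustration; only the pattern and the $switch value come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every ID (8 digits, underscore, 9 digits)
# with $switch. Works on copies, so the caller's lines are not modified.
sub replace_ids {
    my ($switch, @lines) = @_;
    s/\d{8}_\d{9}/$switch/g for @lines;
    return @lines;
}

# Shortened sample lines in the style of the input data (illustrative only).
my @sample = (
    "/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12162011_000005792D.dart.trans.gz\n",
    "/CASPERRIP/newApp multistitch CASPER_BIN\n",
);

print replace_ids('12012011_000005777', @sample);
```

Lines without an ID pass through unchanged, and lines that already contain the value of $switch look the same before and after, which can make it seem as if the substitution did nothing.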

You should also be aware that sleep 1 may not do what you think. Try this one-liner, for example:

perl -we 'for (1 .. 10) { print "line $_\n"; sleep 1; }'

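What you see depends on where the output goes. On a terminal, standard output is line-buffered, so each line appears as it is printed. When the output is piped or redirected (try appending | cat), it will simply wait 10 seconds and then print everything at once. That's because perl then buffers standard output in blocks, and the buffer is not written until it is full or flushed (at the latest when the perl execution ends). So, it can be a perception problem. Everything works like it should, you just don't see it yet.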

If you absolutely want to have a sleep statement in your script, you'll probably want to turn on autoflush, e.g. STDOUT->autoflush(1); (on older perls, load IO::Handle first) or $| = 1;
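As a small sketch of the autoflush variant (the three-iteration loop is just a shortened version of the one-liner above):

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush method on older perls

# Flush STDOUT after every print, even when the output is piped.
STDOUT->autoflush(1);

for my $n (1 .. 3) {
    print "line $n\n";
    sleep 1;
}
```

With autoflush on, each line should show up about one second apart even when the output goes through a pipe.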

However, why are you doing that? Is it so you will have time to read the numbers? If so, put the more command at the end of your one-liner instead:

perl ...... | more

That will pipe the output into the more command, so you can read it at your own pace. Now, for your one-liner:

Always use -w as well, unless you specifically want to avoid getting warnings (which you basically never should).

Your one-liner will only print the first match on each line. If you want to print every match, each on its own line:

perl -wnle 'print for /(\d{8}_\d{9})/g'

If you want to print all the matches, but keep the ones from the same line on the same line:

perl -wnle 'print "@a" if @a = /(\d{8}_\d{9})/g'
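The difference between the two forms comes down to matching with and without /g in list context. A rough sketch, where first_id, all_ids and the shortened sample string are made up for illustration:

```perl
use strict;
use warnings;

# First ID on the line only, which is what the original one-liner prints.
# Returns undef when the line contains no ID.
sub first_id {
    my ($line) = @_;
    my ($id) = $line =~ /(\d{8}_\d{9})/;
    return $id;
}

# Every ID on the line, in order of appearance (list-context match with /g).
sub all_ids {
    my ($line) = @_;
    my @ids = $line =~ /(\d{8}_\d{9})/g;
    return @ids;
}

# Shortened sample in the style of the input data (illustrative only).
my $sample = 'simplefile:///data/tag/squirrels/12012011_000005777R.rash.gz;'
           . 'binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz';

print first_id($sample), "\n";
print join(' ', all_ids($sample)), "\n";
```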

Well, that should cover it.

放肆 2025-01-04 12:51:57

Your open call may be failing (you should always check the result of open to make sure it succeeded if the rest of the program depends on it), but I believe your problem lies in complicating things by opening a pipe from a more command instead of simply opening the file itself. Change the open to simply

open FILE, "/home/shortcasper/work/tagcommands" or die $!;

and things should improve.
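A slightly more defensive variant, sketched here as an assumption rather than a required change, uses the three-argument form of open with a lexical filehandle. The helper name for_each_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per line of $path,
# and die with the system error message if the file cannot be opened.
sub for_each_line {
    my ($path, $callback) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!\n";
    while (my $line = <$fh>) {
        $callback->($line);
    }
    close($fh);
    return;
}

# Print every file named on the command line; does nothing without arguments.
for_each_line($_, sub { print $_[0] }) for @ARGV;
```

Passing the mode separately means a file name that happens to contain characters such as | or > is never interpreted as a pipe or a redirection.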
