Perl substitution with a regular expression
When I run this command as a Perl one-liner, it picks up the regular expression,
so that part can't be bad.
more tagcommands | perl -nle 'print /(\d{8}_\d{9})/' | sort
12012011_000005769
12012011_000005772
12162011_000005792
12162011_000005792
But when I run the script below, it does not pick up the regex.
#!/usr/bin/perl
use strict;
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
$line =~ s/\d{8}_\d{9}/$switch/g;
print $line;
sleep 1;
}
This is the data that I am feeding into the script
/CASPERBOT/START URL=simplefile:///data/tag/squirrels/squirrels /12012011_000005777N.dart.gz CASPER=SeqRashMessage
/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12012011_0000057770.dart.trans.gz
/CASPERRIP/newApp multistitch CASPER_BIN
/CASPER_BIN/START URLS=simplefile:///data/tag/squirrels /12012011_000005777R.rash.gz?exitOnEOF=false;binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz?exitOnEOF=false;simplefile:///data/tag/squirrels/12012011_000005777E.bean.trans.gz?exitOnEOF=false EXTRACTORS=rash;island;rash BINARY=T
Comments (2)
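You should study your one-liner to see how it works. First check perl -h to learn about the switches used. The -l switch is not exactly self-explanatory: combined with -n, it chomps each input line and sets the output record separator $\ to the current value of $/, which is a newline by default. So your one-liner
perl -nle 'print /(\d{8}_\d{9})/'
actually loops over the input one line at a time, chomps the line, and prints the return value of the match followed by a newline.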
A very easy way to see this is to run the one-liner through Deparse, that is, perl -MO=Deparse -nle '...', which prints the equivalent script.
So, that's how you transform that into a working script.
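As a rough sketch, the one-liner could be spelled out as a script along these lines. The sample lines are shortened from the question's data and stand in for the real file, and the sub name run_one_liner is made up for illustration.

```perl
use strict;
use warnings;

# Sample input, shortened from the question's data (stands in for tagcommands).
my $data = <<'END';
/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12012011_0000057770.dart.trans.gz
/CASPERRIP/newApp multistitch CASPER_BIN
END

# Roughly what perl -nle 'print /(\d{8}_\d{9})/' does, with the switches spelled out.
sub run_one_liner {
    my ($in, $out) = @_;
    local $\ = "\n";                 # -l: print appends a newline
    while (my $line = <$in>) {       # -n: loop over the input line by line
        chomp $line;                 # -l: strip the trailing newline
        # List context: prints the captured ID, or nothing if the line has none
        # (the newline in $\ is still printed, giving an empty line).
        print {$out} $line =~ /(\d{8}_\d{9})/;
    }
}

open my $in, '<', \$data or die "Cannot read sample data: $!";
run_one_liner($in, \*STDOUT);
close $in;
```

Note that a line without an ID still produces an empty output line, because print always appends $\.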
I have no idea how you went from that to the script in your question.
First of all, why are you opening a pipe from the more command to read a text file? That is like calling a tow truck to fetch you a cab. Just open the file. Or better yet, don't. Just use the diamond operator, like you did the first time.

You don't need to first copy the lines of a file to an array, and then use the array. while (<FILE>) is a simple way to do it.

In your one-liner, you print the regex. Well, you print the return value of the regex. In this script, you print $line. I'm not sure how you thought that would do the same thing. Your regex here will replace every set of numbers that matches the pattern with the one in your script. Nothing else.
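To make the difference concrete, here is a small sketch that runs both operations on one sample string. The string is a fragment of the question's data; the variable names $found and $changed are only illustrative.

```perl
use strict;
use warnings;

my $switch = "12012011_000005777";
my $line   = "binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz";

# Match in list context: returns the captured ID and leaves $line alone.
my ($found) = $line =~ /(\d{8}_\d{9})/;

# Substitution on a copy: every ID in the copy is replaced by $switch.
(my $changed = $line) =~ s/\d{8}_\d{9}/$switch/g;

print "found:   $found\n";
print "changed: $changed\n";
```

Most IDs in the sample data already equal $switch, so the original script's output would probably look almost identical to its input, which may be why it seemed that the regex was not picked up.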
You should also be aware that sleep 1 will not do what you think. Try a one-liner that prints ten short strings without newlines and sleeps for a second after each one: it will simply wait 10 seconds and then print everything at once. That's because perl buffers standard output by default, and the buffer is not written out until it is full or flushed (for example when the perl execution ends); when the output goes to a terminal, a newline typically flushes it as well. So, it's a perception problem. Everything works like it should, you just don't see it.
If you absolutely want to have a sleep statement in your script, you'll probably want to autoflush, e.g.
STDOUT->autoflush(1);
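A minimal sketch of the autoflush approach follows. The helper name print_slowly and its arguments are invented for illustration; loading IO::Handle is only strictly needed on older perls, but it does no harm elsewhere.

```perl
use strict;
use warnings;
use IO::Handle;   # provides the autoflush method on filehandles

# Hypothetical helper: print items one at a time, pausing between them.
sub print_slowly {
    my ($fh, $delay, @items) = @_;
    $fh->autoflush(1);              # flush after every print instead of waiting
    for my $item (@items) {
        print {$fh} "$item ";       # no newline, so buffering would normally hide this
        sleep $delay if $delay;
    }
    print {$fh} "\n";
}

print_slowly(\*STDOUT, 1, 1 .. 3);
```

With autoflush enabled the numbers should appear one per second; without it they would likely all show up together at the end.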
However, why are you doing that? Is it so you will have time to read the numbers? If so, put that more command at the end of your one-liner instead, for example
perl -nle 'print /(\d{8}_\d{9})/' tagcommands | more
That will pipe the output into the more command, so you can read it at your own pace.

Now, for your one-liner: always use -w as well, unless you specifically want to avoid getting warnings (which basically you never should). Your one-liner will only print the first match on each line. If you want every match, use the /g modifier in list context; you can then either print each match on its own line, or keep the matches from the same input line together on one output line.
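Both variants can be sketched in a few lines. The sample string below is a shortened fragment of the question's data, and @ids is just an illustrative name.

```perl
use strict;
use warnings;

my $line = "squirrels/12012011_000005777R.rash.gz;squirrels/12162011_000005792D.binaryBlob.gz";

# /g in list context returns every match on the line, not just the first.
my @ids = $line =~ /\d{8}_\d{9}/g;

print "$_\n" for @ids;   # each match on its own line
print "@ids\n";          # matches from the same line kept together, space-separated
```

As a one-liner, the first variant could be written as perl -nlwe 'print for /(\d{8}_\d{9})/g'.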
Well, that should cover it.
Your open call may be failing (you should always check the result of an open to make sure it succeeded if the rest of the program depends on it), but I believe your problem is in complicating things by opening a pipe from a more command instead of simply opening the file itself. Change the open to simply open /home/shortcasper/work/tagcommands for reading, and things should improve.
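One way this might look is sketched below, with a checked three-argument open and a plain while loop. The sub name replace_ids is hypothetical; the path and the $switch value are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: read a file directly and swap every ID for $switch.
sub replace_ids {
    my ($path, $switch) = @_;
    # Open the file itself (no pipe from more) and fail loudly if that does not work.
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @out;
    while (my $line = <$fh>) {
        $line =~ s/\d{8}_\d{9}/$switch/g;
        push @out, $line;
    }
    close $fh;
    return @out;
}

# Path taken from the question; skipped quietly here if the file is absent.
my $path = "/home/shortcasper/work/tagcommands";
print replace_ids($path, "12012011_000005777") if -e $path;
```

If the open fails, the die message includes $!, which usually says why (for instance a missing file or a permissions problem).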