How do I get started writing a web log analyzer in Perl?
Taking information from a file that outputs entry after entry in this format:
IPAddress x x [date:time -x] "method url httpversion" statuscode bytes "referer" "useragent"
How would you go about accessing that file as a command-line argument and storing that information so that you could arrange it alphabetically by the IP addresses while keeping all of the information together? I assume I would need to use hashes and arrays somehow.
You could theoretically have as many text files as you want as command-line arguments but so far I haven't gotten that part to work, I just have:
./logprocess.pl monster.log #monster.log is the file that contains entries
then in the code, assume all variables not specified have been declared as scalars
my $x = 0;
my @hashstuff;
my $importPage = $ARGV[0];
my @pageFile = `$importPage`;
foreach my $line (@pageFile)
{
$ipaddy, $date, $time, $method, $url, $httpvers, $statuscode, $bytes, $referer, $useragent =~ m#(\d+.\d+.\d+.\d+) \S+ \S+ [(\d+/\S+/\d+):(\d+:\d+:\d+) \S+] "(\S+) (\S+) (\S+)" (\d+) (\d+) "(\S+)" "(\S+ \S+ \S+ \S+ \S+)"#
%info = ('ipaddy' => $ipaddy, 'date' => $date, 'time' => $time, 'method' => $method, 'url' => $url, 'httpvers' => $httpvers, 'statuscode' => $statuscode, 'bytes' => $bytes, 'referer' => $referer, 'useragent' => $useragent);
$hashstuff[$x] = %info;
$x++;
}
There is definitely a better way to do this, as my compiler says I have global symbol errors like:
Ambiguous use of % resolved as operator % at ./logprocess.pl line 51 (#2)
(W ambiguous)(S) You said something that may not be interpreted the way
you thought. Normally it's pretty easy to disambiguate it by supplying
a missing quote, operator, parenthesis pair or declaration.
and it won't execute. I can't use any modules.
Comments (1)
If the log is produced by Apache, you could use the Apache::ParseLog module. Look at the examples at the end of the page for inspiration.
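Since the question rules out modules, the same fields can also be pulled out with one regular expression. The following is a minimal sketch, assuming the log layout shown in the question; the sub name parse_line and the sample line are hypothetical, and the pattern may need adjusting for real logs.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one log line into a hash reference,
# or return undef when the line does not match the expected layout.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ m{
        ^(\S+)\s+\S+\s+\S+\s+                     # IP address, two skipped fields
        \[([^:\]]+):(\d+:\d+:\d+)\s+[^\]]+\]\s+   # date, time, skipped zone
        "(\S+)\s+(\S+)\s+(\S+)"\s+                # method, url, http version
        (\d+)\s+(\S+)\s+                          # status code, bytes
        "([^"]*)"\s+"([^"]*)"                     # referer, user agent
    }x or return;

    # Fill all ten keys at once with a hash slice.
    my %info;
    @info{qw(ipaddy date time method url httpvers
             statuscode bytes referer useragent)} = @fields;
    return \%info;
}

# Made-up sample line in the format described in the question.
my $rec = parse_line('192.168.0.1 - - [25/Dec/2010:10:15:30 -0500] '
    . '"GET /index.html HTTP/1.1" 200 1234 "http://example.com/" "Mozilla/5.0 (X11; Linux)"');
print "$rec->{ipaddy} $rec->{url}\n" if $rec;
```

Using [^"]* for the referer and user agent, rather than a fixed number of \S+ groups, lets the user agent contain any number of words.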
Regarding the error you mention, you should declare your hash (%info) with my and store a reference to it in the array. Also, a single item is accessed with $hashstuff[$x] (note the dollar sign at the beginning), or you can get rid of $x completely by using push.
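The points above can be sketched as follows. This is a minimal sketch, not the asker's final program. It assumes every file named on the command line is read in turn, and it uses a deliberately simplified pattern that captures only the IP address and the request line. IP addresses are sorted as plain strings because the question asks for alphabetical order. The sub names read_entries and sort_by_ip are hypothetical. The "Ambiguous use of %" warning is probably caused by the missing semicolon after the match in the question's code, which makes Perl read %info as a modulus operation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read every file in the argument list and return a reference to an
# array of hash references, one per line that matches.
sub read_entries {
    my @files = @_;
    my @hashstuff;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            # Simplified pattern: IP address and the quoted request only.
            next unless $line =~ m{^(\S+) .*? "(\S+) (\S+) (\S+)"};
            # Declaring the hash with my gives each line a fresh hash.
            my %info = (ipaddy => $1, method => $2, url => $3, httpvers => $4);
            # Store a reference; push removes the need for an index variable.
            push @hashstuff, \%info;
        }
        close $fh;
    }
    return \@hashstuff;
}

# Sort the records alphabetically by IP address, keeping each record whole.
sub sort_by_ip {
    my ($entries) = @_;
    return [ sort { $a->{ipaddy} cmp $b->{ipaddy} } @$entries ];
}

# With no command-line arguments there is nothing to read or print.
my $entries = sort_by_ip(read_entries(@ARGV));
for my $entry (@$entries) {
    print "$entry->{ipaddy} $entry->{method} $entry->{url}\n";
}
```

It could be run as ./logprocess.pl monster.log other.log. To order the addresses numerically instead of alphabetically, the comparison block would have to compare the four octets as numbers.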