How do I get started writing a web log analyzer in Perl?

Posted 2024-10-20 13:49:42

I'm reading information from a log file that contains entry after entry in this format:
IPAddress x x [date:time -x] "method url httpversion" statuscode bytes "referer" "useragent"

How would you go about accessing that file as a command-line argument and storing that information so that you could arrange it alphabetically by the IP addresses while keeping all of the information together? I assume I would need to use hashes and arrays somehow.

In theory you could pass as many text files as you want as command-line arguments, but so far I haven't gotten that part to work. I just have:

./logprocess.pl monster.log #monster.log is the file that contains entries

Then, in the code (assume all variables not declared here have been declared as scalars):

my $x = 0;
my @hashstuff;
my $importPage = $ARGV[0];
my @pageFile = `$importPage`;
foreach my $line (@pageFile)
{

    $ipaddy, $date, $time, $method, $url, $httpvers, $statuscode, $bytes, $referer, $useragent =~ m#(\d+.\d+.\d+.\d+) \S+ \S+ [(\d+/\S+/\d+):(\d+:\d+:\d+) \S+] "(\S+) (\S+) (\S+)" (\d+) (\d+) "(\S+)" "(\S+ \S+ \S+ \S+ \S+)"#
    %info = ('ipaddy' => $ipaddy, 'date' => $date, 'time' => $time, 'method' => $method, 'url' => $url, 'httpvers' => $httpvers, 'statuscode' => $statuscode, 'bytes' => $bytes, 'referer' => $referer, 'useragent' => $useragent);
    $hashstuff[$x] = %info;
    $x++;
}

There is definitely a better way to do this, as my compiler says I have global symbol errors like:

Ambiguous use of % resolved as operator % at ./logprocess.pl line 51 (#2)
(W ambiguous)(S) You said something that may not be interpreted the way
you thought. Normally it's pretty easy to disambiguate it by supplying
a missing quote, operator, parenthesis pair or declaration.

and it won't execute. I can't use any modules.

暗喜 2024-10-27 13:49:42

If the log is produced by Apache, you could use the Apache::ParseLog module. Look at the examples at the end of the module's page for inspiration.

Regarding the error you mention, you should declare your array with my:

my @hashstuff;

and add references to it. Also, a single item is accessed with $hashstuff[$x] (note the dollar sign at the beginning):

$hashstuff[$x] = { %info };

or you can get rid of $x completely:

push @hashstuff, { %info };
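
Since modules are off the table, here is a minimal core-Perl sketch of the whole flow: read every file named on the command line, parse each line into a hash reference, push it onto an array, and print the records sorted by IP as strings. A few things in the question's code probably contribute to the errors as well. Backticks around $importPage run the file name as a shell command instead of reading the file. The regex line has no terminating semicolon, which is likely why the %info on the next line is parsed as the modulus operator. The left-hand list needs parentheses and should be assigned (=) from a match against $line. The literal [ and ] and the dots in the pattern need escaping. The sub name parse_line, the array name @records, and the exact pattern are my own assumptions based on the format described, so adjust them to your real log lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one log line into a hash reference.
# Returns nothing when the line does not match the assumed format.
sub parse_line {
    my ($line) = @_;
    my @fields = $line =~ m{
        ^(\d+\.\d+\.\d+\.\d+)       # IP address, dots escaped
        \s+\S+\s+\S+\s+             # two fields we do not keep
        \[([^:\]]+):(\d+:\d+:\d+)   # date, then time
        \s+[^\]]+\]\s+              # timezone offset and closing bracket
        "(\S+)\s+(\S+)\s+(\S+)"\s+  # method, URL, HTTP version
        (\d+)\s+(\S+)\s+            # status code, byte count
        "([^"]*)"\s+                # referer
        "([^"]*)"                   # user agent, any number of words
    }x or return;

    my %info;
    @info{qw(ipaddy date time method url httpvers
             statuscode bytes referer useragent)} = @fields;
    return \%info;                  # a reference keeps the fields together
}

my @records;
for my $file (@ARGV) {              # works for any number of files
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        my $rec = parse_line($line) or next;   # skip unparseable lines
        push @records, $rec;
    }
    close $fh;
}

# String comparison gives the alphabetical order asked for in the question.
for my $rec (sort { $a->{ipaddy} cmp $b->{ipaddy} } @records) {
    print join(' ', @{$rec}{qw(ipaddy date time method url statuscode bytes)}), "\n";
}
```

Run it as ./logprocess.pl monster.log other.log. Note that cmp sorts the addresses as text, so 10.0.0.10 comes before 10.0.0.2; if you want numeric order you would have to compare the four octets individually.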