How do I analyze log files with regular expressions? Are there alternatives?

Posted 2024-12-06 03:57:45


I want to analyze some logs to get usage statistics.
Basically, I want to use regular expressions to ease the pain of the analysis.

I have a text file with log entries along these lines:

2011-09-17 09:16:33,531 INFO  [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,
2011-09-17 09:16:33,976 INFO  [someJavaB.class] fooBar
2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala
2011-09-17 09:16:33,987 INFO  [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser

....
I want to count the occurrences of every value that appears at this position:

[someJava.class.special] ctrlPageId=....

In this case that is fooBar, and only occurrences at this position should count. There are many different values in place of fooBar, and I want to count how often each one occurred.

My idea was to match with a capturing group and repeat the pattern, something along these lines:

((?s).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId=([^,]*)(?-s).*)*

and replace the match with capturing group \2.

Afterwards I would analyze the list in Excel.
But my grep tool does not repeat the regexp; it only matches once. I use grepWin. Is there maybe a different tool or regexp for this?

Well, it basically was a problem with wingrep or grepWin. The modifier (?s), which lets the dot match line breaks, and (?-s), which turns that off again, do not work if you use them repeatedly.
So I replaced the regexp with something along these lines:

([\n-\[\(\]\.,:0-9a-zA-Z]).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId ([^,]*)(?-s).*

So basically I replaced the first dot, the one meant to match line breaks, with a character class containing all symbols that might occur in the string, including line breaks. It works, but I'm sure there is a better solution, and I'm always open to one.
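For comparison, a scripting language can sidestep the (?s)/(?-s) toggling entirely, because a "find all" call repeats the pattern by itself. The following is only a minimal sketch in Python, not something grepWin does: the `LOG` sample is copied from the excerpt above, and the function name `extract_page_ids` is made up for illustration.

```python
import re

# Sample lines copied from the log excerpt above.
LOG = """\
2011-09-17 09:16:33,531 INFO  [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,
2011-09-17 09:16:33,976 INFO  [someJavaB.class] fooBar
2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala
2011-09-17 09:16:33,987 INFO  [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser
"""

# Without re.DOTALL, "." stops at line breaks, so each match stays on one line.
PATTERN = re.compile(
    r"\[someJava\.class\.special\] sendRequest: .*?ctrlPageId=([^,\s]*)"
)


def extract_page_ids(text):
    """Return every ctrlPageId value logged by someJava.class.special."""
    # findall repeats the search over the whole text and returns group 1 of each match.
    return PATTERN.findall(text)


print(extract_page_ids(LOG))  # prints ['fooBar']
```

Only the first line matches: the second and third come from other classes, and the fourth logs toPageId rather than ctrlPageId.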


难以启齿的温柔 2024-12-13 03:57:45

I'm not sure I understand, but if the output you are looking for is:
someJava fooBar

Something like this should work (PHP script):

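<?php
// Read the whole log into memory (fine for small files).
$log = file_get_contents('file.log');

// Capture the class name and the ctrlPageId value on each matching line.
$pattern = '#\[(?<className>\w+)\.class(\.special)?\](.*?)ctrlPageId=(?<controllerName>\w+)#i';
preg_match_all($pattern, $log, $m);

// Print one "className controllerName" pair per match.
for ($i = 0; $i < count($m[0]); $i++) {
  echo $m['className'][$i] . ' ' . $m['controllerName'][$i] . "\n";
}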
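The PHP snippet lists the pairs but does not count them, which is what the question asks for. As a rough sketch only, here is a Python counterpart that tallies each pair and renders CSV text that Excel can open. The pattern is adapted from the PHP one above; the names `count_pairs` and `to_csv` and the `SAMPLE` lines are assumptions made for illustration, not part of the original log.

```python
import csv
import io
import re
from collections import Counter

# Python counterpart of the PHP pattern above; "." does not cross line breaks here.
PATTERN = re.compile(
    r"\[(?P<className>\w+)\.class(?:\.special)?\].*?ctrlPageId=(?P<controllerName>\w+)",
    re.IGNORECASE,
)


def count_pairs(lines):
    """Count how often each (className, controllerName) pair is logged."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match["className"], match["controllerName"])] += 1
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequent pair first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["className", "controllerName", "count"])
    for (class_name, controller), n in counts.most_common():
        writer.writerow([class_name, controller, n])
    return buffer.getvalue()


# Hypothetical sample lines in the format from the question.
SAMPLE = [
    "2011-09-17 09:16:33,531 INFO  [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,",
    "2011-09-17 09:16:33,976 INFO  [someJavaB.class] fooBar",
    "2011-09-17 09:16:34,101 INFO  [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=open,",
    "2011-09-17 09:16:34,250 INFO  [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=otherPage, actionId=search,",
]

print(to_csv(count_pairs(SAMPLE)), end="")
```

For a real file, passing an open file object works as well, since it yields one line at a time: `with open("file.log", encoding="utf-8") as f: counts = count_pairs(f)`.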
