How to analyze log files with regular expressions? Alternatives?
I want to analyze some logs to get some usage statistics.
Basically, what I want to do is use regular expressions to ease the pain of the analysis.
So I have a text file with log entries along these lines:
2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,
2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar
2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala
2011-09-17 09:16:33,987 INFO [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser
....
I want to count the occurrences of all words at position
[someJava.class.special] ctrlPageId=....
in this case fooBar, and only these occurrences. Many different values can appear in place of fooBar, and I want to count how often each one occurs.
My idea was to replace with a matching group and repeat it, something along these lines:
((?s).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId=([^,]*)(?-s).*)*
and replace it with the matching group \2.
Afterwards I would analyse the list in Excel.
But my grep tool does not repeat the regexp; it only matches once. I use grepWin. Is there maybe a different tool or regexp for this?
Well, it was basically a problem with wingrep or grepWin. The modifier (?s), which lets the dot match line breaks, and (?-s), which disables that again, do not work if you use them repeatedly.
So I replaced the regexp with something along these lines:
([\n-\[\(\]\.,:0-9a-zA-Z]).*\[someJava.class.special\] sendRequest: fromGevoname=.* ctrlPageId=([^,]*)(?-s).*
So basically I replaced the first line-break-matching dot with a character class of all the symbols that might occur in the string, including line breaks. It works, but I'm sure there is a better solution and I'm always open to one.
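One possible alternative, sketched here in Python rather than grepWin, is to skip the dot-matches-newline modifiers altogether and test the log one line at a time, counting the captured values directly instead of exporting a list. The pattern, the function name `count_ctrl_page_ids`, and the sample list are assumptions made for illustration; the pattern only targets lines shaped like the sample in the question.

```python
import re
from collections import Counter

# Assumed pattern: one "sendRequest:" line from the logger in the question,
# capturing the value after "ctrlPageId=" up to the next comma or whitespace.
CTRL_PAGE_ID = re.compile(
    r"\[someJava\.class\.special\] sendRequest: .*?\bctrlPageId=([^,\s]*)"
)

def count_ctrl_page_ids(lines):
    """Count each ctrlPageId value, examining one log line at a time."""
    counts = Counter()
    for line in lines:
        match = CTRL_PAGE_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The sample lines from the question.
sample = [
    "2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,",
    "2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar",
    "2011-09-17 09:16:33,982 DEBUG [someOtherJava.class] abc blabala",
    "2011-09-17 09:16:33,987 INFO [someJava.class.special] sendRequest completed: fromGevoName=XYZ, toPageId=fooBar, userId=someUser",
]

for page_id, n in count_ctrl_page_ids(sample).most_common():
    print(page_id, n)  # prints "fooBar 1"
```

For a real log, the same function can be fed an open file object, since iterating over a file yields its lines. Because each line is matched on its own, the dot never needs to cross a line break.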
Comments (1)
I'm not sure I understand, but if the output you are looking for is:
someJava fooBar
Something like this should work (PHP script):
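The PHP script itself is not included in this copy. As a stand-in, and not the answerer's code, here is a minimal Python sketch of one way to produce that kind of output. It assumes that "someJava" means the first dot-separated part of the bracketed logger name and that the second column is the ctrlPageId value; the pattern and the name `logger_and_page_id` are hypothetical.

```python
import re

# Assumption: first capture is the logger name up to its first dot,
# second capture is the value that follows "ctrlPageId=".
LINE = re.compile(r"\[(\w+)[^\]]*\].*?\bctrlPageId=([^,\s]*)")

def logger_and_page_id(lines):
    """Yield (logger prefix, ctrlPageId) for each line that has a ctrlPageId."""
    for line in lines:
        match = LINE.search(line)
        if match:
            yield match.group(1), match.group(2)

# Two of the sample lines from the question.
sample_lines = [
    "2011-09-17 09:16:33,531 INFO [someJava.class.special] sendRequest: fromGevoName=null, ctrlPageId=fooBar, actionId=search,",
    "2011-09-17 09:16:33,976 INFO [someJavaB.class] fooBar",
]

for logger, page_id in logger_and_page_id(sample_lines):
    print(logger, page_id)  # prints "someJava fooBar"
```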