Parsing log files in BASH for "ERROR" entries between certain timestamps
I am writing a script in BASH that needs to check log files for ERROR entries. I plan to run it hourly from cron, so I want it to return only ERROR entries that occurred within the last hour (all server times are GMT). I set up the following variables:
# Log file directory
LOGPATH="/path/to/logs/"
# Current date and time
CURDATE=$(date +%Y-%m-%d)
CURTIME=$(date +%H:%M:%S)
# Old date and time (one hour ago)
OLDDATE=$(date -d "1 hour ago" +%Y-%m-%d)
OLDTIME=$(date -d "1 hour ago" +%H:%M:%S)
All log files follow the file name format ktYEAR-MONTH-DAY.root.log.txt, where YEAR/MONTH/DAY are replaced with the date on which the entries were recorded. For instance, today's log file would be kt2011-08-15.root.log.txt. An example entry is:
2011-08-15 | 19:30:02 | ERROR | 18333 | 337 | n/a | dms | default | error | XMLRPC Lucene - addDocument - Reason: Failed to parse XML-RPC request: An invalid XML character (Unicode: 0xb) was found in the element content of the document.
The columns of interest are the 1st, 2nd, and 3rd (the 3rd may be "INFO", "DEBUG", etc., but I am only interested when it is "ERROR") and the last column, which is the body of the log message.
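For illustration, the columns of interest could be pulled out with awk by splitting on the " | " separator. This is only a sketch: show_columns is a hypothetical helper name, and it assumes every entry uses exactly one space on each side of the pipe, as in the sample line, and that the message body itself never contains " | ".

```shell
#!/usr/bin/env bash
# Sketch only: show_columns is a hypothetical helper, not part of the original script.
# Assumes fields are separated by exactly " | ", as in the sample entry.

show_columns() {
    # -F' [|] ' sets the field separator to space, pipe, space.
    # $1 = date, $2 = time, $3 = level, $NF = last field (the message body).
    awk -F' [|] ' '{
        printf "date=%s time=%s level=%s message=%s\n", $1, $2, $3, $NF
    }' "$@"
}
```

Called as show_columns kt2011-08-15.root.log.txt, it prints one labelled line per log entry; with no file argument it reads standard input.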
What I am trying to accomplish is to have this BASH script parse the file(s) with entries spanning the last hour of activity (as defined by the 1st and 2nd columns) and, if the 3rd column contains the string "ERROR", display the right-most column's contents. My confusion comes when trying to determine how to parse the log file(s) based on $CURTIME and $OLDTIME, which gets worse when midnight comes and I then have to search the previous day's log file as well. I would prefer not to do a blanket grep-style search through all the log files, as their quantity and size can be excessive, but if that's how it has to be done, then so be it.
Comments (2)
This is as simple as doing a string comparison in awk. When you pass midnight, simply add the $OLDDATE file to the search. This can be combined with glenn's solution to be much shorter.
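The string comparison described above might look roughly like the following. This is a sketch under stated assumptions, not the answer's original code: filter_errors and errors_last_hour are hypothetical helper names, fields are assumed to be separated by exactly " | ", the message body is assumed not to contain " | ", and date -d requires GNU date (as in the question). It works because "YYYY-MM-DD HH:MM:SS" stamps are fixed-width, so comparing them as strings orders them chronologically.

```shell
#!/usr/bin/env bash
# Sketch only: both function names below are hypothetical helpers.

# Print the last column of every ERROR entry whose "date time" stamp lies in
# the inclusive range [start, end]. Remaining arguments are the files to scan.
filter_errors() {
    local start=$1 end=$2
    shift 2
    awk -F' [|] ' -v start="$start" -v end="$end" '
        # Rebuild the stamp from columns 1 and 2, then compare it as a string.
        { stamp = $1 " " $2 }
        $3 == "ERROR" && stamp >= start && stamp <= end { print $NF }
    ' "$@"
}

# Scan the last hour. The argument is the log directory with a trailing slash,
# like LOGPATH in the question.
errors_last_hour() {
    local logpath=$1 start end old_date cur_date
    local -a files
    end=$(date '+%Y-%m-%d %H:%M:%S')
    start=$(date -d '1 hour ago' '+%Y-%m-%d %H:%M:%S')   # GNU date only
    old_date=${start%% *}   # date part of the stamp from one hour ago
    cur_date=${end%% *}     # date part of the current stamp
    files=("${logpath}kt${cur_date}.root.log.txt")
    # Within an hour after midnight the window reaches into yesterday's file,
    # so search that file first to keep the output in chronological order.
    if [ "$old_date" != "$cur_date" ]; then
        files=("${logpath}kt${old_date}.root.log.txt" "${files[@]}")
    fi
    filter_errors "$start" "$end" "${files[@]}"
}
```

From cron this could be invoked as errors_last_hour "/path/to/logs/". Only one or two files are ever opened, which avoids a blanket search over every log. If a log file does not exist, awk reports an error for that file and continues with the rest.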