Using awk to check between two dates

Posted 2024-10-01 23:18:14 · 2098 words · 3 views · 0 comments

I have a file with multiple data structures in it like so:

eventTimestamp: 2010-03-23T07:56:19.166
result: Allowed
protocol: SMS
payload: RCOMM_SMS

eventTimestamp: 2010-03-23T07:56:19.167
result: Allowed
protocol: SMS
payload: RCOMM_SMS

eventTimestamp: 2010-03-23T07:56:19.186
result: Allowed
protocol: SMS
payload: SMS-MO-FSM

eventTimestamp: 2010-03-23T07:56:19.197
result: Allowed
protocol: SMS
payload: COPS

eventTimestamp: 2010-03-23T07:56:29.519
result: Blocked
protocol: SMS
payload: COPS
type: URL_IWF
result: Blocked

I want to find all of the events that are payload: SMS-MO-FSM or payload: SMS-MO-FSM-INFO that occurred between the times 2010-03-23 12:56:47 and 2010-03-23 13:56:47. When querying this file so far I have used awk in the following manner:

cat checkThis.txt |
awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"}
     $1~/eventTimestamp: 2010-03-23T14\:16\:35/ && $4~/SMS-MO-FSM-INFO|SMS-MO-FSM$/ {$1=$1 ""; print $0}'

This gives me all of the events that occurred during the second 14:16:35 on 2010-03-23. I am struggling, however, to think of how to put a date range into the query. I could use the following to convert the dates to epoch time, but how can I use it in my awk to check whether a date falls between the required times?

python -c "import time; ENGINE_TIME_FORMAT='%Y-%m-%dT%H:%M:%S'; print int(time.mktime(time.strptime('2010-03-23T12:52:52', ENGINE_TIME_FORMAT)))"

I know this could be done in Python, but I have already written a parser in Python for this and I want this method as an alternative checker, so I want to use awk if at all possible.

I took this a little further and created a python script for time conversion:

#!/usr/local/bin/python
import time, sys
ENGINE_TIME_FORMAT='%Y-%m-%dT%H:%M:%S'
testTime = sys.argv[1]
try:
    print int(time.mktime(time.strptime(testTime, ENGINE_TIME_FORMAT)))
except:
    print "Time to convert %s" % testTime
    raise

I then tried to use getline to assign the conversion to a variable for comparison:

cat checkThis.txt| awk 'BEGIN {FS="\n"; RS=""; OFS=";"; ORS="\n"; "./firstDate '2010-03-23T12:56:47'" | getline start_time; close("firstDate"); "./firstDate '2010-03-23T13:56:47'" | getline end_time; close("firstDate");} ("./firstDate $1" | getline) > start_time {$1=$1 ""; print $0}'
Traceback (most recent call last):
  File "./firstDate", line 4, in <module>
testTime = sys.argv[1]
IndexError: list index out of range

The getline calls in the BEGIN block work (I checked the values in the final print), but the comparison part of the script seems to be where it fails.

Comments (2)

遮了一弯 2024-10-08 23:18:14

The key observation is that you can compare your timestamps using alphanumeric comparisons and get the correct answer - that is the beauty of ISO 8601 notation.

Thus, adapting your code slightly - and formatting to avoid scroll bars:

awk 'BEGIN {
        FS  = "\n"
        RS  = ""
        OFS = ";"
        ORS = "\n"
        t1  = "2010-03-23T07:45:00"
        t2  = "2010-03-23T08:00:00"
        m1  = "eventTimestamp: " t1
        m2  = "eventTimestamp: " t2
        }
$1 ~ /eventTimestamp:/ && $4 ~ /SMS-MO-FSM(-INFO)?$/ {
    if ($1 >= m1 && $1 <= m2) print $1, $2, $3, $4;
}' "$@"

Obviously, you could put this into a script file - you wouldn't want to type it often. And getting the date range entered accurately and conveniently is one of the hard parts. Note that I've adjusted the time range to match the data.

When run on the sample data, it outputs one record:

eventTimestamp: 2010-03-23T07:56:19.186;result: Allowed;protocol: SMS;payload: SMS-MO-FSM
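
Since entering the date range conveniently is the awkward part, here is a minimal sketch of one way to pass it in from the command line with awk's -v option. The function name events_between and the variable names are my own placeholders, not anything from the question. Unlike the script above, it strips the label and the fractional seconds before comparing, so an event at 08:00:00.500 still counts as inside a range that ends at 08:00:00; that is a design choice and easy to undo.

```shell
#!/bin/sh
# events_between START END [FILE...]   (hypothetical helper name)
# Prints SMS-MO-FSM / SMS-MO-FSM-INFO records whose timestamp lies in
# [START, END]. Timestamps are compared as plain strings, which works
# because ISO 8601 timestamps sort in chronological order.
events_between() {
    start=$1
    end=$2
    shift 2
    awk -v t1="$start" -v t2="$end" '
    BEGIN { FS = "\n"; RS = ""; OFS = ";" }
    $1 ~ /^eventTimestamp: / && $4 ~ /SMS-MO-FSM(-INFO)?$/ {
        ts = substr($1, length("eventTimestamp: ") + 1)   # strip the label
        ts = substr(ts, 1, 19)      # drop fractional seconds: END is inclusive
        if (ts >= t1 && ts <= t2) print $1, $2, $3, $4
    }' "$@"
}
```

Usage, after sourcing the function: events_between 2010-03-23T12:56:47 2010-03-23T13:56:47 checkThis.txt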

清音悠歌 2024-10-08 23:18:14

A bit of a kludge: this script assumes you have the Unix "date" command (the GNU version, since it relies on -d). I have also hard-coded your start and end timestamps in the BEGIN block. Note that the test data listed above does not fall within your sample start/end times.

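#!/usr/bin/awk -f
# Assumes GNU date: "+%s" prints epoch seconds and "-d" parses a date string.
BEGIN {
        command = "date +%s -d \"2010-03-23 12:56:47\""; command | getline startTime; close(command)
        command = "date +%s -d \"2010-03-23 13:56:47\""; command | getline endTime; close(command)
}

# With default field splitting, $2 is the timestamp, e.g. 2010-03-23T07:56:19.166
$0 ~ /^eventTimestamp:/ {
        stamp = $2
        sub(/T/, " ", stamp)    # use a space separator, as in the BEGIN block
        command = "date +%s -d \"" stamp "\""; command | getline currTime; close(command)

        # printIt stays set for the remaining lines of the same record
        if (currTime >= startTime && currTime <= endTime) {
                printIt = "true"
        } else {
                printIt = "false"
        }
}

printIt == "true" { print }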
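
If gawk is available, the per-record call to date can be avoided. Below is a rough sketch, assuming gawk (mktime is a gawk extension, not POSIX awk) and using made-up names (epoch_between, to_epoch). It filters on time only, like the script above, and reads the file in paragraph mode so each event is printed as a whole record.

```shell
#!/bin/sh
# epoch_between START END [FILE...]   (hypothetical helper name)
# Assumes gawk: mktime() is a gawk extension and is not available in POSIX awk.
epoch_between() {
    start=$1
    end=$2
    shift 2
    gawk -v start="$start" -v end="$end" '
    # Convert "YYYY-MM-DDTHH:MM:SS[.fff]" to epoch seconds (local time zone).
    function to_epoch(ts) {
        ts = substr(ts, 1, 19)      # drop fractional seconds
        gsub(/[-:T]/, " ", ts)      # mktime expects "YYYY MM DD HH MM SS"
        return mktime(ts)           # returns -1 if the string is not a valid date
    }
    BEGIN {
        RS = ""; FS = "\n"; ORS = "\n\n"    # one record per blank-line-separated block
        s = to_epoch(start); e = to_epoch(end)
    }
    {
        split($1, parts, ": ")      # parts[2] is the timestamp text
        t = to_epoch(parts[2])
        if (t >= s && t <= e) print
    }' "$@"
}
```

Usage, after sourcing the function: epoch_between 2010-03-23T12:56:47 2010-03-23T13:56:47 checkThis.txt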