How do I group records into buckets based on timestamp?

Posted 2024-09-08 18:12:26

I have a list of entries from the logs:

15:38:52.363 1031
15:41:06.347 1259
15:41:06.597 1171
15:48:44.115 1588
15:48:44.125 1366
15:48:44.125 1132
15:53:14.525 1348
15:53:15.121 1553
15:53:15.181 1286
15:53:15.187 1293

The first field is the timestamp, the second is the value.

Now I'm trying to group them by an interval of, say, 20 seconds. I want to either sum the values or get their average. What's the easiest way to do this? Preferably with a simple shell script, so I can pipe my grep output into it and get a divided list. Thanks!


苦行僧 2024-09-15 18:12:26

This gawk script starts a new group whenever a record arrives more than `interval` seconds after the previous one. It completely ignores fractional seconds, and it does not handle crossing from one day to the next (past 00:00:00):

grep ... | awk -v interval=20 'function groupout() {print "----", "Timespan ending:", strftime("%T", prevtime), "Sum:", sum, "Avg:", sum/count, "----"} BEGIN {prevtime = 0} {split($1, a, "[:.]"); time = mktime(strftime("%Y %m %d") " " a[1] " " a[2] " " a[3]); if (time > prevtime + interval) {if (NR != 1) {groupout(); sum=0; count=0}}; print; sum+=$2; count++; prevtime = time} END {if (count) groupout()}'

Output:

15:38:52.363 1031
---- Timespan ending: 15:38:52 Sum: 1031 Avg: 1031 ----
15:41:06.347 1259
15:41:06.597 1171
---- Timespan ending: 15:41:06 Sum: 2430 Avg: 1215 ----
15:48:44.115 1588
15:48:44.125 1366
15:48:44.125 1132
---- Timespan ending: 15:48:44 Sum: 4086 Avg: 1362 ----
15:53:14.525 1348
15:53:15.121 1553
15:53:15.181 1286
15:53:15.187 1293
---- Timespan ending: 15:53:15 Sum: 5480 Avg: 1370 ----

Here it is again in a more readable form:

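awk -v interval=20 '
# Print the summary line for the group that just ended.
function groupout() {
    print "----", "Timespan ending:", strftime("%T", prevtime), "Sum:", sum, "Avg:", sum/count, "----"
}
BEGIN {
    prevtime = 0
}
{
    # Split HH:MM:SS.mmm on ":" and "."; the fractional part in a[4] is ignored.
    split($1, a, "[:.]")
    # Build an epoch time from the current date plus the time of day in the log.
    time = mktime(strftime("%Y %m %d") " " a[1] " " a[2] " " a[3])
    # A gap longer than interval seconds closes the current group.
    if (time > prevtime + interval && NR != 1) {
        groupout(); sum = 0; count = 0
    }
    print
    sum += $2
    count++
    prevtime = time
}
# Skip the summary on empty input to avoid dividing by zero.
END {if (count) groupout()}'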
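
The script above closes a group when the gap to the previous record exceeds the interval. If fixed, clock-aligned 20-second buckets are wanted instead (another way to read the question), a minimal sketch along these lines might work. It assumes the same `HH:MM:SS.mmm value` input in time order, uses only plain awk arithmetic rather than `mktime`, so fractional seconds are kept, and it still does not handle crossing midnight. The name `bucket_stats` and the output layout are made up for illustration.

```shell
#!/bin/sh
# bucket_stats: hypothetical helper, not part of the original answer.
# Reads "HH:MM:SS.mmm value" lines on stdin and prints one line per
# fixed-width bucket: bucket start time, record count, sum, average.
# Usage: grep ... | bucket_stats 20
bucket_stats() {
    awk -v interval="${1:-20}" '
    function emit() {
        # Convert the bucket start (seconds since midnight) back to HH:MM:SS.
        start = bucket * interval
        printf "%02d:%02d:%02d count=%d sum=%s avg=%.2f\n",
            int(start / 3600), int(start % 3600 / 60), start % 60,
            count, sum, sum / count
    }
    {
        # Seconds since midnight, fractional part included.
        split($1, t, ":")
        secs = t[1] * 3600 + t[2] * 60 + t[3]
        b = int(secs / interval)
        # Input is assumed to be in time order, so a new bucket number
        # means the previous bucket is complete.
        if (count > 0 && b != bucket) { emit(); sum = 0; count = 0 }
        bucket = b
        sum += $2
        count++
    }
    END { if (count > 0) emit() }'
}
```

With the sample log, the two records at 15:41:06 fall into the bucket starting at 15:41:00, so `bucket_stats 20` would print `15:41:00 count=2 sum=2430 avg=1215.00` for them. Unlike the gap-based script, two records only a second apart can land in different buckets if a 20-second boundary falls between them.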