How to sort by hh:mm:ss.xx in ksh on AIX 5.3?

Posted on 2024-10-08 02:52:40

I have many log files like this:


......
......
cpu time 9.05 seconds
real time 8:02.07
......
......
cpu time 2:25.23
real time 1:39:44.15
......
......


To collect all the times, I simply grep for the cpu time and real time lines,
then sort the grep output.
I am using AIX 5.2, whose sort can order by string or by number,
but not by hour:minute:second.

To work around this, I pass the grep output lines to a while loop.
For each line, I create a new variable using sed 's/:/00/g',
which turns hh:mm:ss.xx into hh00mm00ss.xx,
and then sort numerically by this new variable.

This way, I can find the most time-consuming steps.
The workaround does the job, but it is a little slow.

Does anyone have a better alternative?
Thanks in advance.

Alvin SIU

Comments (2)

从此见与不见 2024-10-15 02:52:40

In the paper 'Theory and Practice in the Construction of a Working Sort Routine', J P Linderman shows that the best way to get good performance out of the system sort command (the 'sort routine' he was working on) with complex keys is to generate keys that make the comparisons simple. In the example, the sort command with the complex key was:

sort -t' ' -k 9,9.2 -k3 -k17

The alternative mechanism used a key generator to make it easy to sort:

keygen | sort | keystrip

and the key generator was:

awk -F' ' '{printf "%s:%s:%s:%s\n", substr($9, 1, 2), $3, $17, $0}'

and the key stripper was:

awk -F':' '{printf "%s\n", $4}'

For the test data Linderman was working with, this reduced the elapsed time from around 2100 seconds for the elaborate sort command to about 600 seconds for the awk | sort | awk combination.


Adopting that idea here, I'd use a Perl script to present the disparate time values uniformly in a format that sort can handle trivially.

In this case, you seem to have a variety of time formats to worry about:

cpu time 9.05 seconds
real time 8:02.07
cpu time 2:25.23
real time 1:39:44.15

It is not clear whether you need to preserve the context of the lines you are sorting, but it seems to me that I'd convert the times to a canonical form. Do you need to allow for 3-digit hours of real time? If the time goes to 20.05 seconds, does the suffix remain? If the time goes to 80.05 seconds, is that printed as 1:20.05? I'm assuming yes...

#!/usr/bin/env perl
use strict;
use warnings;

while (<>)
{
    if ($_ =~ m/ (?:cpu|real)\stime\s
                 (?:
                 (?:(\d+):)?      # Hours
                 (\d\d?):         # Minutes
                 )?
                 (\d\d?(?:\.\d+)) # Seconds
               /msx)
    {
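        my($hh, $mm, $ss) = ($1, $2, $3);
        # Missing hours/minutes default to 0 (written without //=, which needs Perl 5.10+)
        $hh = 0 unless defined $hh;
        $mm = 0 unless defined $mm;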
        $_ = sprintf "%03d:%02d:%05.2f|%s", $hh, $mm, $ss, $_;
    }
    print;
}

Given the input data:

cpu time 9.05 seconds
real time 8:02.07
cpu time 2:25.23
real time 1:39:44.15
cpu time 25.23 seconds
real time 39:44.15
cpu time 5.23 seconds
real time 44.15 seconds
real time 1:44.15
real time 1:04.15
real time 21:04.15
real time 1:01:04.15
real time 32:21:04.15
real time 122:21:04.15

This generates the output data:

000:00:09.05|cpu time 9.05 seconds
000:08:02.07|real time 8:02.07
000:02:25.23|cpu time 2:25.23
001:39:44.15|real time 1:39:44.15
000:00:25.23|cpu time 25.23 seconds
000:39:44.15|real time 39:44.15
000:00:05.23|cpu time 5.23 seconds
000:00:44.15|real time 44.15 seconds
000:01:44.15|real time 1:44.15
000:01:04.15|real time 1:04.15
000:21:04.15|real time 21:04.15
001:01:04.15|real time 1:01:04.15
032:21:04.15|real time 32:21:04.15
122:21:04.15|real time 122:21:04.15

This can be fed into a plain sort to yield:

000:00:05.23|cpu time 5.23 seconds
000:00:09.05|cpu time 9.05 seconds
000:00:25.23|cpu time 25.23 seconds
000:00:44.15|real time 44.15 seconds
000:01:04.15|real time 1:04.15
000:01:44.15|real time 1:44.15
000:02:25.23|cpu time 2:25.23
000:08:02.07|real time 8:02.07
000:21:04.15|real time 21:04.15
000:39:44.15|real time 39:44.15
001:01:04.15|real time 1:01:04.15
001:39:44.15|real time 1:39:44.15
032:21:04.15|real time 32:21:04.15
122:21:04.15|real time 122:21:04.15

The sort column can then be stripped with sed to yield:

cpu time 5.23 seconds
cpu time 9.05 seconds
cpu time 25.23 seconds
real time 44.15 seconds
real time 1:04.15
real time 1:44.15
cpu time 2:25.23
real time 8:02.07
real time 21:04.15
real time 39:44.15
real time 1:01:04.15
real time 1:39:44.15
real time 32:21:04.15
real time 122:21:04.15

So, given that the data file is 'xx.data' and the Perl script is xx.pl, the command line is:

perl xx.pl xx.data | sort | sed 's/^[^|]*|//'
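
If Perl is not convenient, the same keygen | sort | keystrip idea can be sketched with awk alone. This is only a sketch under assumptions: the function name sort_by_time is made up for illustration, the time is assumed to be the third whitespace-separated field of each line, and the key is total seconds rather than the hhh:mm:ss.xx form used above. Unlike the Perl script, it drops lines that are not cpu time or real time lines.

```shell
# Hypothetical helper: reads log lines on stdin and prints the
# "cpu time" / "real time" lines ordered by duration (shortest first).
sort_by_time() {
    awk '
    /^(cpu|real) time / {
        # $3 is the time value: ss.xx, mm:ss.xx or hh:mm:ss.xx
        n = split($3, part, ":")
        secs = 0
        for (i = 1; i <= n; i++)
            secs = secs * 60 + part[i]
        # Fixed-width, zero-padded key so a plain sort orders it correctly
        printf "%012.2f|%s\n", secs, $0
    }' |
    LC_ALL=C sort |
    sed 's/^[^|]*|//'
}
```

With the data file from above, `sort_by_time < xx.data | tail -1` would print the longest-running entry.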

眼眸里的那抹悲凉 2024-10-15 02:52:40

It would help if you showed your script, but I suspect the while loop is unnecessary. Try something like this:

grep -E '^(cpu|real) time' | sed 's/:/00/g' | sort -k3,3n
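
A variation that keeps the original lines intact might look like the sketch below. It builds the hh00mm00ss.xx key from the question in a single awk pass, sorts on that key numerically, and then strips it again. The function name sort_by_padded_key is hypothetical, and the sketch assumes that minutes and seconds are always written with two digits whenever a larger unit precedes them, as in the sample data.

```shell
# Hypothetical helper: reads log lines on stdin, no per-line shell loop.
sort_by_padded_key() {
    grep -E '^(cpu|real) time ' |
    awk '{
        key = $3                 # ss.xx, mm:ss.xx or hh:mm:ss.xx
        gsub(/:/, "00", key)     # 1:39:44.15 becomes 100390044.15
        print key "|" $0         # prepend the numeric key
    }' |
    sort -t'|' -k1,1n |
    sed 's/^[^|]*|//'
}
```

Each line is handled once by awk instead of spawning a sed process per line, which is usually where a while loop of this kind spends its time.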