How do I sort timestamps in dd:mm:yyyy hh24:mi:ss format in descending order in Perl?
I have to sort my hash keys, which are timestamps (dd:mm:yyyy hh24:mi:ss), in descending order.
sort { $b <=> $a } keys %time_spercent
This does not give me what I want. Instead, entries with later hours and minutes come first even when their date is not later. For example, this is what I get from the sort above:
21:01:2011 16:51:09
21:01:2011 16:49:54
26:01:2011 11:02:55
26:01:2011 11:01:40
05:04:2011 11:51:13
05:04:2011 11:51:13
05:04:2011 11:48:37
05:04:2011 11:48:37
Instead, I want them ordered by both date and time, like this:
05:04:2011 11:51:13
05:04:2011 11:51:13
05:04:2011 11:48:37
26:01:2011 11:02:55
26:01:2011 11:01:40
05:04:2011 11:48:37
21:01:2011 16:51:09
21:01:2011 16:49:54
Any pointers or suggestions on how this could be done would be gratefully received.
Update
foreach my $status_date(
map { $_->[0] }
sort { $b->[1] cmp $a->[1] }
map { [$_, sorting_desc($_)] } keys % {$com_sam->{ $s1 } } )
and
sub sorting_desc {
$_ = shift;
if (/(\d+):(\d+):(\d+) (\d+):(\d+):(\d+)/) {
return "$2:$1:$3:$4:$5:$6";
}
}
is the subroutine for sorting.
I also tried
foreach my $status_date(
map { $_->[0] }
sort { $b->[1] cmp $a->[1] }
map { [$_, (split/[:\s][1]] } keys % {$com_sam->{ $s1 } } )
but that did not give the expected results either.
All I get is:
WGA_PD7124a WGA_PD7124a 95(2) 95(2) 95 100.00 193 Unknown(Unknown) 192654 01:07:2011 16:13:55
WGA_PD7124a WGA_PD7124a 95(2) 95(2) 95 100.00 193 Unknown(Unknown) 192655 01:07:2011 16:11:23
WGA_PD7124a WGA_PD7124a 95(2) 95(2) 95 100.00 193 Male(Unknown) 192656 01:07:2011 11:04:26
WGA_PD6355b WGA_PD6355b 96(1) 96(1) 96 100.00 388 Unknown(Unknown) 184558 04:05:2011 17:35:52
WGA_PD6355b WGA_PD6355a 96(1) 66(31) 66 95.45 388 Unknown(Unknown) 184558 04:05:2011 17:35:52
WGA_PD6355b WGA_PD6355b 96(1) 96(1) 96 100.00 388 Unknown(Unknown) 184557 04:05:2011 17:34:27
WGA_PD6355b WGA_PD6355a 96(1) 66(31) 66 95.45 388 Unknown(Unknown) 184557 04:05:2011 17:34:27
3074 3074 87(10) 87(10) 87 100.00 109 Unknown(Unknown) 174878 15:02:2011 09:24:31
3074 3074 87(10) 87(10) 87 100.00 109 Unknown(Unknown) 174970 15:02:2011 09:21:19
3074 3074 87(10) 87(10) 87 100.00 109 Female(Unknown) 174860 15:02:2011 09:16:32
3163 3163 90(7) 90(7) 90 100.00 176 Unknown(Unknown) 173382 09:02:2011 09:54:48
3163 3163 90(7) 90(7) 90 100.00 176 Unknown(Unknown) 173284 09:02:2011 09:51:02
CHP-212 CHP-212 94(3) 94(3) 94 100.00 269 Unknown(Unknown) 173382 09:02:2011 09:54:48
CHP-212 CHP-212 94(3) 94(3) 94 100.00 269 Unknown(Unknown) 173284 09:02:2011 09:51:02
MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Male(Unknown) 200943 01:09:2011 10:48:18
MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Unknown(Unknown) 200944 25:08:2011 10:20:16
MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Unknown(Unknown) 200945 25:08:2011 10:19:05
MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Male(Unknown) 200946 25:08:2011 10:17:26
MGH_2101 MGH_2101 80(18) 80(18) 80 100.00 359 Male(Unknown) 200943 01:09:2011 10:48:18
MGH_2101 MGH_2101 80(18) 80(18) 80 100.00 359 Unknown(Unknown) 200944 25:08:2011 10:20:16
MGH_2101 MGH_2101 80(18) 80(18) 80 100.00 359 Unknown(Unknown) 200945 25:08:2011 10:19:05
MGH_2101 MGH_2101 80(18) 80(18) 80 100.00 359 Male(Unknown) 200946 25:08:2011 10:17:26
PD4294c PD4294c 95(2) 95(2) 95 100.00 221 Unknown(Unknown) 179502 23:03:2011 10:03:23
PD4294c PD4294c 95(2) 95(2) 95 100.00 221 Unknown(Unknown) 179470 23:03:2011 10:02:30
Can you change your format to yyyy:mm:dd hh24:mi:ss? At that point you'd have a natural ordering. Basically, it's a lot more machine-friendly to have everything in decreasing order of importance :)
EDIT: Then just order using string comparisons, as it will naturally sort the right way.
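A minimal sketch of this suggestion, assuming the timestamps can be rewritten before sorting. The helper name to_sortable and the sample list are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite "dd:mm:yyyy hh:mi:ss" as "yyyy:mm:dd hh:mi:ss".
sub to_sortable {
    my ($ts) = @_;
    my ($d, $m, $y, $time) = $ts =~ /^(\d{2}):(\d{2}):(\d{4}) (\d{2}:\d{2}:\d{2})$/
        or die "Unexpected timestamp format: $ts";
    return "$y:$m:$d $time";
}

my @stamps = ('21:01:2011 16:51:09', '05:04:2011 11:48:37', '26:01:2011 11:02:55');

# With the most significant field first, plain string comparison (cmp)
# orders the values chronologically; $b cmp $a puts the newest first.
my @sorted = sort { $b cmp $a } map { to_sortable($_) } @stamps;
print "$_\n" for @sorted;
```

Note that the output is in the new format; keeping the original dd:mm:yyyy strings requires sorting on a derived key instead.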
From your question it is unclear to me how you really want to sort and how you produced the examples. I cannot detect any order in the example of your expected sort order.
A likely solution is at the bottom.
Let me clarify:
Given a textfile "ts" with the following content (your example):
A standard sort produces the following output:
While the numerically descending sort you proposed produces the following order:
To clarify the numerical sort: the spaceship operator <=> forces a numerical interpretation of its two operands. So the strings $a and $b, each containing the date and time, are treated as if they were numbers. To do this, Perl reads the leading digits (the day of the month) and stops at the first ':'. That's why the time, and even the month and year, are completely ignored and we're only sorting by the day of the month in descending order.
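The numeric conversion can be checked directly. This is a small sketch with made-up sample values; adding 0 forces the same string-to-number conversion that <=> applies:

```perl
use strict;
use warnings;

my @stamps = ('21:01:2011 16:51:09', '26:01:2011 11:02:55', '05:04:2011 11:51:13');

my @nums;
{
    # Numifying a non-numeric string warns under "use warnings";
    # silence that here so only the values are shown.
    no warnings 'numeric';
    @nums = map { $_ + 0 } @stamps;
}

# Only the leading day-of-month digits survive the conversion.
printf "%-20s => %d\n", $stamps[$_], $nums[$_] for 0 .. $#stamps;
```

Running a script with warnings enabled would have flagged the original sort with "Argument ... isn't numeric" messages.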
Finally, if you really want to reverse sort by date, then by time, and need to keep the format, you can use this code:
Here's a nicer formatted version (which I did not test):
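A sketch of what such a helper and the sort that uses it could look like. The name sort_key and the fixed-width key layout are assumptions, not necessarily the code originally posted:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "dd:mm:yyyy hh:mi:ss" into "yyyymmddhhmiss",
# a fixed-width key whose string order matches chronological order.
sub sort_key {
    my ($ts) = @_;
    my ($d, $m, $y, $h, $mi, $s) = $ts =~ /^(\d+):(\d+):(\d+) (\d+):(\d+):(\d+)$/
        or die "Unexpected timestamp format: $ts";
    return sprintf '%04d%02d%02d%02d%02d%02d', $y, $m, $d, $h, $mi, $s;
}

my @stamps = (
    '21:01:2011 16:51:09', '21:01:2011 16:49:54',
    '26:01:2011 11:02:55', '26:01:2011 11:01:40',
    '05:04:2011 11:51:13', '05:04:2011 11:48:37',
);

# The helper runs twice for every comparison the sort makes,
# so each timestamp is converted several times.
my @sorted = sort { sort_key($b) cmp sort_key($a) } @stamps;
print "$_\n" for @sorted;
```

The original strings are returned unchanged; only the comparison uses the converted key.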
This sort function then calls the above helper quite a lot. In your example we have 8 entries in the list to sort and the function gets called 24 times, so it is not efficient. But for small lists of up to a couple of hundred or even a thousand entries it may be all right for you.
If you have large lists, you should do the format conversion only once, but it still costs memory. So for large lists, you need to trade off memory against execution time, as is often the case.
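One way to convert each entry only once is to cache the converted keys in a hash before sorting. This is a sketch under that assumption; the name %key_for is made up:

```perl
use strict;
use warnings;

my @stamps = (
    '21:01:2011 16:51:09', '26:01:2011 11:02:55',
    '05:04:2011 11:51:13', '05:04:2011 11:48:37',
);

# Convert each timestamp exactly once and remember the result.
# The cache costs one extra string per entry.
my %key_for;
for my $ts (@stamps) {
    my ($d, $m, $y, $time) = $ts =~ /^(\d\d):(\d\d):(\d{4}) (\d\d:\d\d:\d\d)$/
        or die "Unexpected timestamp format: $ts";
    $key_for{$ts} = "$y$m$d $time";
}

# The comparator now only does hash lookups.
my @sorted = sort { $key_for{$b} cmp $key_for{$a} } @stamps;
print "$_\n" for @sorted;
```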
If performance is the optimization criterion, you could do the transformation on the fly, as has been shown in other answers and comments, like this:
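A minimal sketch of the on-the-fly transformation (a map, sort, map chain), using made-up sample data rather than the answer's original listing:

```perl
use strict;
use warnings;

my @stamps = (
    '21:01:2011 16:51:09', '26:01:2011 11:02:55',
    '05:04:2011 11:51:13', '05:04:2011 11:48:37',
);

my @sorted =
    map  { $_->[0] }                       # keep only the original string
    sort { $b->[1] cmp $a->[1] }           # newest first, by converted key
    map  {
        my ($d, $m, $y, $time) = /^(\d\d):(\d\d):(\d{4}) (\d\d:\d\d:\d\d)$/
            or die "Unexpected timestamp format: $_";
        [ $_, "$y$m$d $time" ]             # [original, sortable key]
    } @stamps;

print "$_\n" for @sorted;
```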
That way you do the conversion only once per element. Still, we have to hold a temporary list in memory. I'm not exactly sure how well Perl can optimize such a construct. One may think that the following is easier to optimize:
which would work for the test set, too. The sort falls back to its default string comparison, which gives the same order as a numerical comparison when the strings all have the same length and none has a space where another has a digit.
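The claim about string versus numerical comparison can be checked on fixed-width, digits-only keys. The yyyymmddhhmiss values below are an assumed key format for illustration:

```perl
use strict;
use warnings;

# Fixed-width, digits-only keys in yyyymmddhhmiss form (already converted).
my @keys = ('20110121165109', '20110405115113', '20110126110255');

my @by_string = reverse sort @keys;          # default sort compares as strings
my @by_number = sort { $b <=> $a } @keys;    # explicit numeric comparison

print "string:  @by_string\n";
print "numeric: @by_number\n";
```

Both lines print the keys in the same newest-first order.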
Hope this helps!
Jon Skeet's answer is better! (i.e., just change your time stamp, if you can, to the ISO 8601 format.)
But if you can't change the format, you could do something like:
(I assume you have your own logic for dealing with the duplicate time stamps. By hashing them, the duplicates are counted, and I am just printing their count.)
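A sketch of that approach, under the assumption that iso_8601() rewrites the timestamp with the year first. The function name comes from the answer, but its body and the sample data here are illustrative:

```perl
use strict;
use warnings;

# Assumed implementation: rewrite "dd:mm:yyyy hh:mi:ss" as "yyyy-mm-ddThh:mi:ss".
sub iso_8601 {
    my ($ts) = @_;
    my ($d, $m, $y, $time) = $ts =~ /^(\d\d):(\d\d):(\d{4}) (\d\d:\d\d:\d\d)$/
        or die "Unexpected timestamp format: $ts";
    return "${y}-${m}-${d}T$time";
}

# Using the timestamps as hash keys collapses duplicates and counts them.
my %h;
$h{$_}++ for (
    '21:01:2011 16:51:09', '26:01:2011 11:02:55',
    '05:04:2011 11:51:13', '05:04:2011 11:51:13',
    '05:04:2011 11:48:37',
);

# Ascending order; swap $a and $b in the block for newest first.
my @order = sort { iso_8601($a) cmp iso_8601($b) } keys %h;
print "$_\t$h{$_}\n" for @order;
```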
Result:
Edit
OK, if you are concerned about efficiency, the (sort {iso_8601($a) cmp iso_8601($b)} keys %h) is not the best, since the iso_8601() function is called many times per hash element. For a form of "Schwartzian Transform" you can do:
Which will produce the same output as above. Then you are calling iso_8601() only once per hash key, not multiple times. To dissect that (it goes right to left, bottom to top):
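The stages of the transform can be written out as separate statements to make the right-to-left flow visible. This is a sketch; the body of iso_8601() and the contents of %h are assumptions:

```perl
use strict;
use warnings;

# Assumed implementation of the answer's helper.
sub iso_8601 {
    my ($ts) = @_;
    my ($d, $m, $y, $time) = $ts =~ /^(\d\d):(\d\d):(\d{4}) (\d\d:\d\d:\d\d)$/
        or die "Unexpected timestamp format: $ts";
    return "${y}-${m}-${d}T$time";
}

my %h = (
    '21:01:2011 16:51:09' => 1,
    '05:04:2011 11:51:13' => 2,
    '26:01:2011 11:02:55' => 1,
);

# Stage 1 (bottom of the chain): pair each key with its converted form.
# iso_8601() runs exactly once per key here.
my @pairs = map { [ $_, iso_8601($_) ] } keys %h;

# Stage 2 (middle): sort the pairs by the converted form.
my @sorted_pairs = sort { $a->[1] cmp $b->[1] } @pairs;

# Stage 3 (top): discard the converted form and keep the original keys.
my @keys = map { $_->[0] } @sorted_pairs;

print "$_\t$h{$_}\n" for @keys;
```

Chaining the three stages without the intermediate arrays gives the usual compact map, sort, map form.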
EDIT 2
I am having a hard time understanding what you want. Try this:
Output:
Is this what you are thinking? It parses the time stamp at the end of line and sorts those records in descending order. What is the issue with this?
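A sketch of sorting whole records by the timestamp at the end of each line. The records are taken from the question's output; the helper name record_key is made up:

```perl
use strict;
use warnings;

my @records = (
    '3163 3163 90(7) 90(7) 90 100.00 176 Unknown(Unknown) 173382 09:02:2011 09:54:48',
    'MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Male(Unknown) 200943 01:09:2011 10:48:18',
    'PD4294c PD4294c 95(2) 95(2) 95 100.00 221 Unknown(Unknown) 179502 23:03:2011 10:03:23',
    'MGH_2631 MGH_2631 90(8) 90(8) 90 100.00 211 Unknown(Unknown) 200944 25:08:2011 10:20:16',
);

# Hypothetical helper: pull the trailing timestamp off a record and return
# it as "yyyymmdd hh:mi:ss" so that string order equals chronological order.
sub record_key {
    my ($line) = @_;
    my ($d, $m, $y, $time) = $line =~ /(\d\d):(\d\d):(\d{4}) (\d\d:\d\d:\d\d)\s*$/
        or die "No trailing timestamp in: $line";
    return "$y$m$d $time";
}

# Newest record first.
my @sorted = sort { record_key($b) cmp record_key($a) } @records;
print "$_\n" for @sorted;
```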
I had the same problem some time ago, and I solved it by transforming the format while sorting the list, in the same way as Jon Skeet has proposed. This is my piece of code:
The result is:
First, understand what you are trying to do. Next, get it to work. Then, if necessary, optimize.
One way to easily compare the time stamps is to convert them to offsets from an epoch. You can use Time::Local. Given that you are not getting arbitrary values, but rather well-defined timestamps, you could engage in a little premature optimization and use the _nocheck version of timelocal or timegm.
Here is one way to do it using the sample data you provided:
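A sketch of the epoch approach using timegm_nocheck from Time::Local. It assumes the timestamps can be treated as UTC; the helper name to_epoch and the %epoch cache are made up for illustration:

```perl
use strict;
use warnings;
use Time::Local qw(timegm_nocheck);

# Hypothetical helper: convert "dd:mm:yyyy hh:mi:ss" to seconds since
# the epoch, treating the timestamp as UTC (an assumption).
sub to_epoch {
    my ($ts) = @_;
    my ($d, $m, $y, $h, $mi, $s) = $ts =~ /^(\d+):(\d+):(\d+) (\d+):(\d+):(\d+)$/
        or die "Unexpected timestamp format: $ts";
    # Time::Local expects a 0-based month.
    return timegm_nocheck($s, $mi, $h, $d, $m - 1, $y);
}

my @stamps = (
    '21:01:2011 16:51:09', '21:01:2011 16:49:54',
    '26:01:2011 11:02:55', '05:04:2011 11:51:13',
);

# Convert once per timestamp, then compare the numbers.
my %epoch;
$epoch{$_} = to_epoch($_) for @stamps;

my @sorted = sort { $epoch{$b} <=> $epoch{$a} } @stamps;
print "$_\n" for @sorted;
```

Because the keys are now real numbers, the numeric <=> operator is the right comparison here. The _nocheck variant skips range validation, so malformed dates are not rejected.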