Pythonic way to check if a list is sorted
Is there a Pythonic way to check if a list is already sorted in ASC or DESC order?
listtimestamps = [1, 2, 3, 5, 6, 7]
Something like listtimestamps.isSorted() that returns True or False.
I want to input a list of timestamps for some messages and check if the transactions appeared in the correct order.
The third-party package method more_itertools.is_sorted hasn't been mentioned yet:
I use this one-liner based on numpy.diff():
I haven't really timed it against any other method, but I assume it's faster than any pure Python method, especially for large n, since the loop in numpy.diff (probably) runs directly in C (n-1 subtractions followed by n-1 comparisons).
However, you need to be careful if x holds unsigned integers: the subtraction in numpy.diff() can then wrap around silently, resulting in a false positive. Here's a modified version:
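A minimal sketch of the two variants described above. The function names are illustrative, and the "modified version" shown here is one possible fix (comparing neighbours directly instead of subtracting), not necessarily the one the answer originally used.

```python
import numpy as np


def is_sorted_diff(x):
    """Ascending check via numpy.diff(); unsafe for unsigned integer arrays."""
    return bool(np.all(np.diff(x) >= 0))


def is_sorted_cmp(x):
    """Ascending check that compares neighbours directly, with no subtraction."""
    x = np.asarray(x)
    return bool(np.all(x[:-1] <= x[1:]))


# A descending array stored as unsigned 8-bit integers.
unsigned_desc = np.array([3, 2, 1], dtype=np.uint8)

# The differences wrap around to large positive values, so the diff-based
# check reports "sorted" even though the data is descending.
false_positive = is_sorted_diff(unsigned_desc)

# The comparison-based check is not affected by the dtype.
correct_answer = is_sorted_cmp(unsigned_desc)
```

Both functions scan the whole array and do not stop at the first out-of-order pair.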
This is similar to the top answer, but I like it better because it avoids explicit indexing. Assuming your list has the name lst, you can generate (item, next_item) tuples from your list with zip:
In Python 3, zip already returns a lazy iterator; in Python 2 you can use itertools.izip for better memory efficiency.
Small demo:
The last one fails when the tuple (3, 2) is evaluated.
Bonus: checking finite (!) generators which cannot be indexed:
Make sure to use itertools.izip here if you are using Python 2, otherwise you would defeat the purpose of not having to create lists from the generators.
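A sketch of the zip idea for Python 3, written under some assumptions: the sample data, the timestamps generator and the helper is_sorted_gen are hypothetical names introduced only for illustration.

```python
def is_sorted(lst):
    # Pair every item with its successor; all() stops at the first bad pair.
    return all(a <= b for a, b in zip(lst, lst[1:]))


# The pairs that get compared for a list ending in an out-of-order item.
demo_pairs = list(zip([1, 2, 3, 2], [1, 2, 3, 2][1:]))


def timestamps():
    # Hypothetical generator standing in for a stream that cannot be indexed.
    yield from (1, 2, 3, 5, 6, 7)


def is_sorted_gen(make_gen):
    # Create two independent generators and advance the second by one item,
    # so zip() yields (item, next_item) pairs without building any list.
    first, second = make_gen(), make_gen()
    next(second, None)
    return all(a <= b for a, b in zip(first, second))
```

The list version copies the list once through the slice lst[1:]; the generator version trades that copy for producing the stream twice.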
Lazy
Although I don't think there is a guarantee that the sorted built-in calls its cmp function with i+1, i, it does seem to do so for CPython.
So you could do something like:
Or this way (without if statements -> EAFP gone wrong? ;-) ):
As noted by @aaronsterling, the following solution is the shortest and seems fastest when the array is sorted and not too small:
def is_sorted(lst):
    return (sorted(lst) == lst)
If most of the time the array is not sorted, it would be desirable to use a solution that does not scan the entire array and returns False as soon as an unsorted prefix is discovered. Following is the fastest solution I could find; it is not particularly elegant:
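A sketch of an early-exit check in the spirit described above, written for Python 3. It is an illustration of the technique (walk the list once, stop at the first descent), not a reproduction of the answer's exact code; the function name is an assumption.

```python
def is_sorted_early_exit(lst):
    it = iter(lst)
    try:
        prev = next(it)          # first element, if any
    except StopIteration:
        return True              # an empty input counts as sorted
    for cur in it:
        if prev > cur:
            return False         # stop as soon as an unsorted prefix is found
        prev = cur
    return True
```

Unlike sorted(lst) == lst, this never builds a second list and touches only as many elements as needed to find the first out-of-order pair.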
Using Nathan Farrington's benchmark, this achieves better runtime than using sorted(lst) in all cases except when running on the large sorted list.
Here are the benchmark results on my computer.
sorted(lst)==lst solution
Second solution:
Just to add another way (even if it requires an additional module): iteration_utilities.all_monotone:
To check for DESC order:
There is also a strict parameter if you need to check for strictly monotonic sequences (where successive elements must not be equal).
It's not a problem in your case, but if your sequences contain nan values then some methods will fail, for example with sorted:
Note that iteration_utilities.all_monotone performs faster compared to the other solutions mentioned here, especially for unsorted inputs (see benchmark).
Not very Pythonic at all, but we need at least one reduce() answer, right?
The accumulator variable simply stores the last-checked value, and if any value is smaller than the previous value, the accumulator is set to infinity (and thus will still be infinity at the end, since the 'previous value' will always be bigger than the current one).
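One way the accumulator trick described above might be written in Python 3, where reduce lives in functools. This is a sketch that assumes numeric input; the function name is illustrative.

```python
from functools import reduce

INF = float("inf")


def is_sorted_reduce(numbers):
    # The accumulator holds the last-checked value. On the first descent it
    # jumps to +inf and stays there, because no later value can exceed it.
    last = reduce(lambda acc, x: x if acc <= x else INF, numbers, -INF)
    # Caveat: a sorted list that legitimately ends in +inf is misreported.
    return last < INF
```

reduce always walks the entire input, so there is no early exit on unsorted data.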
This approach using Pandas is very slow, but it's noted for completeness.
SapphireSun is quite right. You can just use lst.sort(). Python's sort implementation (TimSort) checks whether the list is already sorted. If so, sort() will complete in linear time. Sounds like a Pythonic way to ensure a list is sorted ;)
Python 3.6.8
A solution using assignment expressions (added in Python 3.8):
Gives:
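A sketch of how an assignment expression can carry the previous element through a single pass. It requires Python 3.8 or later; the function name and the sentinel object are assumptions made for this illustration.

```python
def is_sorted_walrus(iterable):
    it = iter(iterable)
    sentinel = object()
    # Take the first element; an empty input counts as sorted.
    if (prev := next(it, sentinel)) is sentinel:
        return True
    # The left operand is read first (the old prev), then prev is rebound
    # to the current element for the next comparison.
    return all(prev <= (prev := cur) for cur in it)
```

Because it only uses iter() and next(), the check also works on generators and stops at the first out-of-order pair.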
If you want the fastest way for numpy arrays, use numba, which should already be installed if you use conda.
The code will be fast because it will be compiled by numba
and then:
This uses recursion:
Note that this will raise RuntimeError: maximum recursion depth exceeded for long sequences.
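A sketch of a recursive check that shows the depth limit in practice. The function name and the start parameter are illustrative. In Python 3 the error raised is RecursionError, which is a subclass of RuntimeError.

```python
import sys


def is_sorted_recursive(lst, start=0):
    # Base case: fewer than two elements remain to compare.
    if start >= len(lst) - 1:
        return True
    # Compare one pair, then recurse on the rest of the list.
    return lst[start] <= lst[start + 1] and is_sorted_recursive(lst, start + 1)


def overflows_on_long_input():
    # A sorted list longer than the recursion limit exhausts the call stack.
    long_list = list(range(sys.getrecursionlimit() * 2))
    try:
        is_sorted_recursive(long_list)
    except RecursionError:
        return True
    return False
```

An unsorted list may still return early, since the and operator stops recursing at the first failing pair.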
Try this:
The derived reduction value is a 3-part tuple of (sortedSoFarFlag, firstTimeFlag, lastElementValue). It initially starts with (True, True, None), which is also used as the result for an empty list (regarded as sorted because there are no out-of-order elements). As it processes each element it calculates new values for the tuple (using the previous tuple values with the next elementValue):
The final result of the reduction is a tuple of:
The first value is the one we're interested in, so we use [0] to grab that from the reduce result.
This solution works for any iterable containing element types that can be compared with each other. That includes lists of booleans (checks that the False values occur before the True values), lists of numbers, lists of strings (alphabetical order), lists of sets (subsets occur before supersets), etc.
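The reduction described above could be sketched as follows in Python 3. The tuple layout and the initial value come from the description; the function and helper names are assumptions.

```python
from functools import reduce


def is_sorted_tuple_reduce(iterable):
    def step(state, element):
        sorted_so_far, first_time, last_element = state
        # The very first element has nothing to be compared against.
        in_order = first_time or last_element <= element
        return (sorted_so_far and in_order, False, element)

    # (sortedSoFarFlag, firstTimeFlag, lastElementValue), starting from
    # (True, True, None); [0] picks out the sorted-so-far flag.
    return reduce(step, iterable, (True, True, None))[0]
```

For sets, <= is the subset test, which is why a list of sets counts as sorted when each set is contained in the next.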
How about this one? Simple and straightforward.
Simplest way:
Definitely works in Python 3 and above for integers or strings:
=====================================================================
Another way of finding if the given list is sorted or not
Here is a one-liner:
If using Python 2, use xrange instead of range.
For reverse=True, use >= instead of <=.
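A sketch of the index-based one-liner for Python 3, wrapped in a function so that both directions can be shown. The function name and the reverse parameter are assumptions made for illustration.

```python
def is_sorted_indexed(l, reverse=False):
    if reverse:
        # Descending order: every item is >= its successor.
        return all(l[i] >= l[i + 1] for i in range(len(l) - 1))
    # Ascending order: every item is <= its successor.
    return all(l[i] <= l[i + 1] for i in range(len(l) - 1))
```

For lists with fewer than two elements, range(len(l) - 1) is empty and all() returns True.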
I would just use
unless it's a very big list, in which case you might want to create a custom function.
If you are just going to sort it if it's not sorted, then forget the check and sort it.
And don't think about it too much.
If you want a custom function, you can do something like
This will be O(n) if the list is already sorted though (and O(n) in a for loop at that!), so unless you expect it to be not sorted (and fairly random) most of the time, I would, again, just sort the list.
This iterator form is 10-15% faster than using integer indexing:
A beautiful way to implement this is to use the imap function from itertools:
This implementation is fast and works on any iterables.
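A sketch of the same idea for Python 3, where itertools.imap no longer exists and the built-in map is already lazy. Splitting the input with itertools.tee is an assumption about how the pairs are formed; the function and parameter names are illustrative.

```python
import operator
from itertools import tee


def is_sorted_map(iterable, compare=operator.le):
    a, b = tee(iterable)
    next(b, None)  # shift the second copy forward by one element
    # map() applies compare(a_i, b_i) lazily; all() stops at the first False.
    return all(map(compare, a, b))
```

Passing operator.ge as compare checks for descending order instead.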
Starting in Python 3.10, the new pairwise function provides a way to slide through pairs of consecutive elements, and thus find if all of these pairs satisfy the same predicate of ordering:
The intermediate result of pairwise:
I'd do this (stealing from a lot of answers here [Aaron Sterling, Wai Yip Tung, sorta from Paul McGuire] and mostly Armin Ronacher):
One nice thing: you don't have to realize the second iterable for the series (unlike with a list slice).
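One way to avoid realizing the second iterable is itertools.islice, sketched here for Python 3. Whether the answer used exactly this form is an assumption; the function name and the key parameter are illustrative.

```python
from itertools import islice


def is_sorted_islice(lst, key=lambda a, b: a <= b):
    # islice(lst, 1, None) walks the list from its second element without
    # copying it, unlike the slice lst[1:].
    return all(key(a, b) for a, b in zip(lst, islice(lst, 1, None)))
```

This form iterates over lst twice, so it suits lists and other re-iterable sequences rather than one-shot generators.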
I ran a benchmark, and sorted(lst, reverse=True) == lst was the fastest for long lists, and all(l[i] >= l[i+1] for i in xrange(len(l)-1)) was the fastest for short lists. These benchmarks were run on a MacBook Pro 2010 13" (Core2 Duo 2.66GHz, 4GB 1067MHz DDR3 RAM, Mac OS X 10.6.5).
UPDATE: I revised the script so that you can run it directly on your own system. The previous version had bugs. Also, I have added both sorted and unsorted inputs.
all(l[i] >= l[i+1] for i in xrange(len(l)-1))
sorted(l, reverse=True) == l
all(l[i] >= l[i+1] for i in xrange(len(l)-1))
all(l[i] >= l[i+1] for i in xrange(len(l)-1))
So in most cases there is a clear winner.
UPDATE: aaronasterling's answers (#6 and #7) are actually the fastest in all cases. #7 is the fastest because it doesn't have a layer of indirection to look up the key.
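A small timeit harness for comparing the two expressions above on your own machine. It is a sketch for Python 3 (range instead of xrange); the function names, input sizes and repeat count are arbitrary assumptions, and it does not reproduce the original benchmark script or its numbers.

```python
import random
import timeit


def by_sorting(l):
    return sorted(l, reverse=True) == l


def by_scanning(l):
    return all(l[i] >= l[i + 1] for i in range(len(l) - 1))


def bench(func, data, number=100):
    # Total time in seconds for `number` calls of func(data).
    return timeit.timeit(lambda: func(data), number=number)


def make_inputs(size, seed=0):
    # One descending (sorted) and one shuffled (unsorted) list of `size` items.
    ordered = list(range(size, 0, -1))
    shuffled = ordered[:]
    random.Random(seed).shuffle(shuffled)
    return {"sorted": ordered, "unsorted": shuffled}


if __name__ == "__main__":
    for size in (10, 10_000):
        for label, data in make_inputs(size).items():
            for func in (by_sorting, by_scanning):
                print(size, label, func.__name__, bench(func, data))
```

Timings depend heavily on the interpreter version and hardware, so results may differ from those reported above.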
As I don't see this option above, I will add it to all the answers.
Let l denote the list, then: