What is the most Pythonic way to ensure that all elements in a list are different?
I have a list in Python that I generate as part of the program. I have a strong assumption that these are all different, and I check this with an assertion.
This is the way I do it now:
If there are two elements:
    try:
        assert x[0] != x[1]
    except AssertionError:
        print(debug_info)
        raise Exception("throw to caller")
If there are three:
    try:
        assert x[0] != x[1]
        assert x[0] != x[2]
        assert x[1] != x[2]
    except AssertionError:
        print(debug_info)
        raise Exception("throw to caller")
And if I ever have to do this with four elements I'll go crazy.
Is there a better way to ensure that all the elements of the list are unique?
Comments (7)
Maybe something like this:
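The snippet that went with this answer is not shown here. One common way to do it, as a sketch only and assuming every element is hashable, is to compare the length of the list with the length of its `set`. The helper name `all_unique` is made up for illustration.

```python
def all_unique(items):
    """Return True if no value appears more than once (items must be hashable)."""
    # A set drops duplicates, so the lengths differ exactly when a value repeats.
    return len(items) == len(set(items))


x = [3, 1, 2]
if all_unique(x):
    print("All elements are unique")
else:
    print("Elements are not unique")
```

This works for a list of any length, so there is no need to write one assertion per pair.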
The most popular answers are O(N) (good!-) but, as @Paul and @Mark point out, they require the list's items to be hashable. Both @Paul and @Mark's proposed approaches for unhashable items are general but take O(N squared), i.e., a lot.
If your list's items are not hashable but are comparable, you can do better... here's an approach that always works as fast as feasible given the nature of the list's items.
This is O(N) where feasible (all items hashable), O(N log N) as the most frequent fallback (some items unhashable, but all comparable), O(N squared) where inevitable (some items unhashable, e.g. dicts, and some non-comparable, e.g. complex numbers).
Inspiration for this code comes from an old recipe by the great Tim Peters, which differed by actually producing a list of unique items (and was also so long ago that `set` was not around; it had to use a `dict`...!-), but basically faced identical issues.
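A minimal sketch of the tiered strategy described above (this is not the answer's original code): try a `set` first, fall back to sorting, and finally compare every pair. The name `all_distinct` is hypothetical, and the sorting tier assumes the items have a total order consistent with `==` (sets, for example, compare by subset and would not qualify).

```python
from itertools import combinations


def all_distinct(items):
    """Return True if no two items compare equal, using the fastest check that works."""
    items = list(items)

    # Tier 1: O(N) when every item is hashable.
    try:
        return len(set(items)) == len(items)
    except TypeError:
        pass  # at least one item is unhashable

    # Tier 2: O(N log N) when items are unhashable but mutually comparable.
    # Assumes a total order, so equal items end up next to each other.
    try:
        ordered = sorted(items)
    except TypeError:
        pass  # at least one pair cannot be ordered
    else:
        return all(a != b for a, b in zip(ordered, ordered[1:]))

    # Tier 3: O(N^2) pairwise equality, which only needs ==.
    return not any(a == b for a, b in combinations(items, 2))
```

Lists of lists take the sorting path, while lists of dicts fall through to the pairwise check because dicts can be neither hashed nor ordered.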
How about this:
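The answer's own code is not shown here. As a hedged sketch of a hash-based check that also gives the debug information the question wants, the hypothetical helper `check_all_distinct` below counts occurrences with `collections.Counter` and names the repeated values in the exception.

```python
from collections import Counter


def check_all_distinct(items):
    """Raise ValueError naming the repeated values; items must be hashable."""
    counts = Counter(items)
    duplicates = [value for value, n in counts.items() if n > 1]
    if duplicates:
        raise ValueError("duplicate elements: %r" % (duplicates,))


check_all_distinct([1, 2, 3])  # passes silently
```

Raising an ordinary exception instead of using `assert` means the check still runs when Python is started with `-O`.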
This assumes that elements in `x` are hashable.
Hopefully all the items in your sequence are immutable; if not, you will not be able to call `set` on the sequence.
If you do have mutable items such as lists or dicts, you can't hash the items and you will pretty much have to repeatedly check through the list:
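A sketch of that repeated check, assuming only that the items support `==`. The function name `has_duplicates` is illustrative; `itertools.combinations` yields each pair exactly once, so the cost is O(N squared).

```python
from itertools import combinations


def has_duplicates(items):
    """Return True if any two items compare equal; no hashing required."""
    return any(a == b for a, b in combinations(items, 2))


pairs = [[1, 2], [3, 4], [1, 2]]
# set(pairs) would raise TypeError here because lists are unhashable.
print(has_duplicates(pairs))
```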
As you build the list, you can check whether each value already exists before adding it.
The benefit of this is that the clashing value will be reported.
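A sketch of checking at build time, under the assumption that values are appended one at a time. The helper `add_unique` and the sample values are hypothetical; the membership test uses `==`, so it also works for unhashable values, at O(N) per insertion.

```python
def add_unique(collected, value):
    """Append value to collected, or raise ValueError naming the clash."""
    if value in collected:
        raise ValueError("value already in list: %r" % (value,))
    collected.append(value)


results = []
for value in [10, 20, 30]:
    add_unique(results, value)
```

Because the error is raised at the moment the duplicate is produced, the traceback points at the code that generated it.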
You could process the list to create a known-to-be-unique copy.
There is a variant for the case where the seq elements are not hashable, and it keeps the items in order (omitting duplicates, of course).
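Both variants might look roughly like the sketch below. The function names are illustrative, and rebuilding the result with `type(seq)` assumes `seq` is a list or tuple.

```python
def make_unique(seq):
    """Hash-based variant: fast, but the resulting order is arbitrary."""
    return type(seq)(set(seq))


def make_unique_ordered(seq):
    """Variant for unhashable items: keeps first occurrences in order, O(N^2)."""
    unique = []
    for item in seq:
        if item not in unique:  # equality test only, no hashing
            unique.append(item)
    return type(seq)(unique)
```

Note that this silently drops duplicates rather than reporting them, so it fits cases where repeated values are acceptable input rather than a bug.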
I would use this: