在Python中搜索列表的最快方法
当您在 a 中执行诸如 "test" in a
之类的操作(其中 a
是一个列表)时,Python 是否会对列表进行顺序搜索,或者是否创建哈希表表示形式来优化查找?在应用程序中,我需要这个,因为我将在列表中进行大量查找,因此最好执行类似 b = set(a)
的操作,然后在中执行 "test" b?另请注意,我将拥有的值列表不会有重复的数据,而且我实际上并不关心它的顺序;我只需要能够检查一个值是否存在。
When you do something like "test" in a
where a
is a list does python do a sequential search on the list or does it create a hash table representation to optimize the lookup? In the application I need this for I'll be doing a lot of lookups on the list so would it be best to do something like b = set(a)
and then "test" in b
? Also note that the list of values I'll have won't have duplicate data and I don't actually care about the order it's in; I just need to be able to check for the existence of a value.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
不要使用列表,使用
set()
相反。它完全具有您想要的属性,包括速度极快的in
测试。我发现在更改一组列表的地方(主要是繁重的数字运算),速度提高了 20 倍甚至更高。
Don't use a list, use a
set()
instead. It has exactly the properties you want, including a blazing fastin
test.I've seen speedups of 20x and higher in places (mostly heavy number crunching) where one list was changed for a set.
"test" in a
与列表a
将进行线性搜索。动态设置哈希表比线性搜索要昂贵得多。另一方面,b 中的“test”将执行 O(1) 哈希查找。在您描述的情况下,似乎没有理由使用列表而不是集合。
"test" in a
with a lista
will do a linear search. Setting up a hash table on the fly would be much more expensive than a linear search."test" in b
on the other hand will do an amoirtised O(1) hash look-up.In the case you describe, there doesn't seem to be a reason to use a list over a set.
我认为按照既定的实施方式会更好。我知道集合的查找时间为 O(1)。我认为列表需要 O(n) 查找时间。但即使列表也是 O(1) 查找,切换到集合也不会造成任何损失。
此外,集合不允许重复值。这也会使你的程序的内存效率稍微提高一些
I think it would be better to go with the set implementation. I know for a fact that sets have O(1) lookup time. I think lists take O(n) lookup time. But even if lists are also O(1) lookup, you lose nothing with switching to sets.
Further, sets don't allow duplicate values. This will make your program slightly more memory efficient as well
列表和元组似乎有相同的时间,并且使用“in”对于大数据来说很慢:
这是更好的解决方案:在巨大列表中查找/搜索的最有效方法(python)
它非常快:
List and tuples seems to have the same time, and using "in" is slow for large data:
Here is much better solution: Most efficient way for a lookup/search in a huge list (python)
It's super fast: