Efficiently knowing whether the intersection of two lists is empty, in Python

Posted 2024-08-19 21:07:14

Suppose I have two lists, L and M. Now I want to know if they share an element.
What would be the fastest way (in Python) to check whether they share an element?
I don't care which elements they share, or how many, just if they share or not.

For example, in this case

L = [1,2,3,4,5,6]
M = [8,9,10]

I should get False, and here:

L = [1,2,3,4,5,6]
M = [5,6,7]

I should get True.

I hope the question's clear.
Thanks!

Manuel

一萌ing 2024-08-26 21:07:14

Or more concisely

if set(L) & set(M):
    pass  # there is an intersection
else:
    pass  # no intersection

If you really need True or False

bool(set(L) & set(M))
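
A related option that is not part of the answer above is the standard `set.isdisjoint` method, which reports whether two collections have no element in common without building the intersection set. A minimal sketch, assuming all items are hashable; the helper name `share_element` is made up for illustration:

```python
def share_element(L, M):
    """Return True if L and M have at least one element in common."""
    # isdisjoint accepts any iterable and does not build an intersection set
    return not set(L).isdisjoint(M)


print(share_element([1, 2, 3, 4, 5, 6], [8, 9, 10]))  # False
print(share_element([1, 2, 3, 4, 5, 6], [5, 6, 7]))   # True
```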

After running some timings, this seems to be a good option to try too

m_set=set(M)
any(x in m_set  for x in L)

If the items in M or L are not hashable you have to use a less efficient approach like this

any(x in M for x in L)
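
The two variants above can be combined so that the set lookup is tried first and the slower scan is used only when an item turns out to be unhashable. This is only a sketch; the function name `lists_intersect` is hypothetical and not from the original answer:

```python
def lists_intersect(L, M):
    """Return True if L and M share an element, hashable or not."""
    try:
        m_set = set(M)                     # fast path: needs hashable items in M
        return any(x in m_set for x in L)  # raises TypeError on an unhashable x
    except TypeError:
        # slow path: up to len(L) * len(M) equality comparisons
        return any(x in M for x in L)


print(lists_intersect([1, 2, 3], [3, 4]))          # True
print(lists_intersect([[1], [2]], [[2], [3]]))     # True
print(lists_intersect([1, [2]], [5, 6]))           # False
```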

Here are some timings for 100-item lists. Using sets is considerably faster when there is no intersection, and a bit slower when there is a considerable intersection. (The session below is Python 2, where range() returns a list and xrange exists.)

M=range(100)
L=range(100,200)

timeit set(L) & set(M)
10000 loops, best of 3: 32.3 µs per loop

timeit any(x in M for x in L)
1000 loops, best of 3: 374 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
10000 loops, best of 3: 31 µs per loop

L=range(50,150)

timeit set(L) & set(M)
10000 loops, best of 3: 18 µs per loop

timeit any(x in M for x in L)
100000 loops, best of 3: 4.88 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
100000 loops, best of 3: 9.39 µs per loop


# Now for some random lists
import random
L=[random.randrange(200000) for x in xrange(1000)]
M=[random.randrange(200000) for x in xrange(1000)]

timeit set(L) & set(M)
1000 loops, best of 3: 420 µs per loop

timeit any(x in M for x in L)
10 loops, best of 3: 21.2 ms per loop

timeit m_set=set(M);any(x in m_set  for x in L)
1000 loops, best of 3: 168 µs per loop

timeit m_set=frozenset(M);any(x in m_set  for x in L)
1000 loops, best of 3: 371 µs per loop
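
The random-list measurement above can be repeated on Python 3 with the standard `timeit` module. This is a rough harness rather than a rigorous benchmark: the function names, the seed, and the repeat count are arbitrary choices made here, and the numbers will differ by machine and Python version.

```python
import random
import timeit

random.seed(0)  # arbitrary seed so the lists are the same on every run
L = [random.randrange(200000) for _ in range(1000)]
M = [random.randrange(200000) for _ in range(1000)]


def with_intersection():
    return bool(set(L) & set(M))


def with_list_scan():
    return any(x in M for x in L)


def with_set_lookup():
    m_set = set(M)
    return any(x in m_set for x in L)


REPEATS = 10
results = {}
for func in (with_intersection, with_list_scan, with_set_lookup):
    results[func.__name__] = func()
    seconds = timeit.timeit(func, number=REPEATS)
    print(f"{func.__name__}: {seconds / REPEATS * 1e6:.1f} us per call")
```

All three functions should agree on the answer; only their running time differs.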

記憶穿過時間隧道 2024-08-26 21:07:14

To avoid the work of constructing the intersection, and produce an answer as soon as we know that they intersect:

def have_common_element(L, M):
    m_set = frozenset(M)
    return any(x in m_set for x in L)

Update: gnibbler tried this out and found it to run faster with set() in place of frozenset(). Whaddayaknow.
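
To see the early exit in action, the sketch below counts how many elements of L are actually looked up before `any` stops. The helper `count_lookups` and its counter are illustrative additions, not part of the answer:

```python
def count_lookups(L, M):
    """Return (result, number of elements of L that were examined)."""
    m_set = set(M)
    examined = 0

    def probe(x):
        nonlocal examined
        examined += 1
        return x in m_set

    # any() stops pulling from the generator at the first True
    return any(probe(x) for x in L), examined


print(count_lookups([5, 1, 2, 3], [5, 6, 7]))   # (True, 1)
print(count_lookups([1, 2, 3, 4], [8, 9, 10]))  # (False, 4)
```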

请叫√我孤独 2024-08-26 21:07:14

First of all, if you do not need them ordered, then switch to the set type.

If you still need the list type, then do it this way; a length of 0 is falsy (0 == False), so the result can be used directly in a condition:

len(set.intersection(set(L), set(M)))

姜生凉生 2024-08-26 21:07:14

Note: this answer may seem too complicated for what at first glance needs only a set operation, but sets can contain only hashable items, and the original question does not specify what items will be in the lists. So this code first tries sets and then falls back to more generic code.

This is the best balance of generality and efficiency I could come up with (the comments should make the code easy to understand):

import itertools, operator

def _compare_product(list1, list2):
    "Return if any item in list1 equals any item in list2 exhaustively"
    return any(
        itertools.starmap(
            operator.eq,
            itertools.product(list1, list2)))

def do_they_intersect(list1, list2):
    "Return if any item is common between list1 and list2"

    # do not try to optimize for small list sizes
    if len(list1) * len(list2) <= 100: # pick a small number
        return _compare_product(list1, list2)

    # first try to make a set from one of the lists
    try: a_set= set(list1)
    except TypeError:
        try: a_set= set(list2)
        except TypeError:
            a_set= None
        else:
            a_list= list1
    else:
        a_list= list2

    # here either a_set is None, or we have a_set and a_list

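    if a_set is not None:
        try:
            # map is lazy on Python 3 (itertools.imap was Python 2 only)
            return any(map(a_set.__contains__, a_list))
        except TypeError:
            pass # a_list holds unhashable items, fall through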
    
    # try to sort the lists
    try:
        a_list1= sorted(list1)
        a_list2= sorted(list2)
    except TypeError: # sorry, not sortable
        return _compare_product(list1, list2)

    # they could be sorted, so let's take the N+M road,
    # not the N*M
    
    iter1= iter(a_list1)
    iter2= iter(a_list2)
    try:
        item1= next(iter1)
        item2= next(iter2)
    except StopIteration: # one of the lists is empty
        return False # ie no common items

    while 1:
        if item1 == item2:
            return True
        while item1 < item2:
            try: item1= next(iter1)
            except StopIteration: return False
        while item2 < item1:
            try: item2= next(iter2)
            except StopIteration: return False
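
The sort-and-merge fallback at the end of the function can also be read on its own. Below is a compact, index-based sketch of the same idea for items that are unhashable but orderable, such as lists of lists. The name `sorted_lists_intersect` is hypothetical, and the code assumes the items have a total order, so that "neither is smaller" means "equal":

```python
def sorted_lists_intersect(list1, list2):
    """Return True if the two lists share an item, using sort + merge."""
    a, b = sorted(list1), sorted(list2)  # raises TypeError if not orderable
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1          # a[i] is too small to match anything left in b
        elif b[j] < a[i]:
            j += 1          # b[j] is too small to match anything left in a
        else:
            return True     # assumed total order: neither smaller means equal
    return False


print(sorted_lists_intersect([[3], [1], [2]], [[5], [2]]))  # True
print(sorted_lists_intersect([[1], [2]], [[3]]))            # False
```

After the two sorts, the merge walks each list at most once, which is the N+M road mentioned in the comments above.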

HTH.
