Python 用交集合并多个列表

发布于 2025-01-06 18:09:16 字数 406 浏览 3 评论 0原文

可能的重复：
Python：基于交叉点的简单列表合并

我有一个多个列表：

 list=[[1,2,3],[3,5,6],[8,9,10],[11,12,13]]

是否有一种智能且快速的方法来获取至少具有一个交集的所有子列表。在我的示例中，我希望代码返回

 result=[[1,2,3,5,6],[8,9,10],[11,12,13]]

原文

Possible Duplicate:
Python: simple list merging based on intersections

I have a multiple list:

 list=[[1,2,3],[3,5,6],[8,9,10],[11,12,13]]

Is there a smart and fast way to get all the sublists with at least one intersection. In my example I want that the code return

 result=[[1,2,3,5,6],[8,9,10],[11,12,13]]

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

油饼 2025-01-13 18:09:16

这可行，但可能不是很优雅：

def merge_lists(l):
        s=map(set, l)
        i, n=0, len(s)
        while i < n-1:
                for j in xrange(i+1, n):
                        if s[i].intersection(s[j]):
                                s[i].update(s[j])
                                del s[j]
                                n-=1
                                break
                else:
                        i+=1
        return [sorted(i) for i in s]

This works, but maybe isn't very elegant:

def merge_lists(l):
        s=map(set, l)
        i, n=0, len(s)
        while i < n-1:
                for j in xrange(i+1, n):
                        if s[i].intersection(s[j]):
                                s[i].update(s[j])
                                del s[j]
                                n-=1
                                break
                else:
                        i+=1
        return [sorted(i) for i in s]

回复收藏 0 原文

神妖 2025-01-13 18:09:16

好问题！如果您将其视为图中的连通分量问题，那就简单多了。以下代码使用了优秀的 networkx 图形库和 pairs< /code> 函数来自这个问题。

def pairs(lst):
    i = iter(lst)
    first = prev = item = i.next()
    for item in i:
        yield prev, item
        prev = item
    yield item, first

lists = [[1,2,3],[3,5,6],[8,9,10],[11,12,13]]

import networkx
g = networkx.Graph()
for sub_list in lists:
    for edge in pairs(sub_list):
            g.add_edge(*edge)

networkx.connected_components(g)
[[1, 2, 3, 5, 6], [8, 9, 10], [11, 12, 13]]

解释

我们创建一个新的（空）图g。对于 lists 中的每个子列表，将其元素视为图形的节点，并在它们之间添加一条边。（因为我们只关心连通性，所以不需要添加所有边 - 只需添加相邻的边！）请注意，add_edge 接受两个对象，将它们视为节点（如果它们尚不存在，则添加它们），并在它们之间添加边缘。

然后，我们只需找到图的连通分量——问题就解决了！ -- 并将它们输出为我们的相交集。

Nice question! It's much simpler if you think of it as a connected-components problem in a graph. The following code uses the excellent networkx graph library and the pairs function from this question.

def pairs(lst):
    i = iter(lst)
    first = prev = item = i.next()
    for item in i:
        yield prev, item
        prev = item
    yield item, first

lists = [[1,2,3],[3,5,6],[8,9,10],[11,12,13]]

import networkx
g = networkx.Graph()
for sub_list in lists:
    for edge in pairs(sub_list):
            g.add_edge(*edge)

networkx.connected_components(g)
[[1, 2, 3, 5, 6], [8, 9, 10], [11, 12, 13]]

Explanation

We create a new (empty) graph g. For each sub-list in lists, consider its elements as nodes of the graph and add an edge between them. (Since we only care about connectedness, we don't need to add all the edges -- only adjacent ones!) Note that add_edge takes two objects, treats them as nodes (and adds them if they aren't already there), and adds an edge between them.

Then, we just find the connected components of the graph -- a solved problem! -- and output them as our intersecting sets.

回复收藏 0 原文

鹿港巷口少年归 2025-01-13 18:09:16

您基本上可以通过一次数据传递来完成此操作：

list_of_lists = [[1, 2, 3], [3, 5, 6], [8, 9, 10], [11, 12, 13]]
sets = {}
for lst in list_of_lists:
    s = set(lst)
    t = set()
    for x in s:
        if x in sets:
            t.update(sets[x])
        else:
            sets[x] = s
    for y in t:
        sets[y] = s
    s.update(t)
ids = set()
for s in sets.itervalues():
    if id(s) not in ids:
        ids.add(id(s))
        print s

这将创建一个字典sets，将每个元素映射到它所属的集合。如果某个元素之前已经见过，则其集合将被包含到当前集合中，并且映射到前一个集合的所有字典条目都会被更新。

最后我们需要在字典sets的值中找到所有唯一的集合。请注意，虽然我将其实现为集合字典，但该代码也适用于列表而不是集合。

You can do this with essentially a single pass through the data:

list_of_lists = [[1, 2, 3], [3, 5, 6], [8, 9, 10], [11, 12, 13]]
sets = {}
for lst in list_of_lists:
    s = set(lst)
    t = set()
    for x in s:
        if x in sets:
            t.update(sets[x])
        else:
            sets[x] = s
    for y in t:
        sets[y] = s
    s.update(t)
ids = set()
for s in sets.itervalues():
    if id(s) not in ids:
        ids.add(id(s))
        print s

This creates a dictionary sets mapping each element to the set it belongs to. If some element has been seen before, its set is subsumed into the current one and all dictinary entries mapping to the former set are updated.

Finally we need to find all unique sets in the values of the dictionary sets. Note that while I implemented this as a dictionary of sets, the code also works with lists instead of sets.

回复收藏 0 原文

~没有更多了~