Efficient algorithm to check whether a value is in a list of lists and return the indices of the matching elements

Posted on 2025-01-23 20:58:29


My goal is to efficiently find, in a large list of lists (for example, 1 million entries, each a list of 3 elements), the indices of the elements that contain a certain value.

For example, take the list a:

a = [[0,1,2],[0,5,6],[7,8,9]]

I want to retrieve the indices of the elements containing the value 0, so my function would return 0, 1.

My first attempt was the following:

def any_identical_value(elements, index):
    for el in elements:
        if el == index:
            return True
    return False


def get_dual_points(compliant_cells, index):
    compliant = [i for i, e in enumerate(compliant_cells) if any_identical_value(e, index)]
    return compliant


result = get_dual_points(a, 0)

The solution works correctly, but it is highly inefficient for a large list of lists. In particular, my goal is to perform as many queries as there are values in the primary list, hence n_queries = len(a)*3, which is 9 in the example above.

This raises two questions:

Is a list a good data structure for this task?
Is there a more efficient algorithmic solution?


颜 2025-01-30 20:58:29


You can hash all the indices in one go (a single O(N) pass), which lets you answer each query in O(1) time on average.

from collections import defaultdict

d = defaultdict(list)
a = [[0,1,2],[0,5,6],[7,8,9]]
queries = [0,1]
for i in range(len(a)):
    for element in a[i]:
        d[element].append(i)

for x in queries:
    print(d[x])

# prints
# [0, 1]
# [0]
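
One detail worth noting with the defaultdict version: looking up a value that never occurs via d[x] inserts an empty list under that key, so the dict grows with every miss. A minimal sketch of a lookup that avoids this with .get, reusing the names from the code above (the helper name lookup and the query value 42 are illustrative, not part of the answer):

```python
from collections import defaultdict

a = [[0, 1, 2], [0, 5, 6], [7, 8, 9]]

# Build the value -> row indices map in a single pass
d = defaultdict(list)
for i, row in enumerate(a):
    for element in row:
        d[element].append(i)


def lookup(index_map, value):
    # .get does not call the defaultdict factory, so a miss adds no key
    return index_map.get(value, [])


size_before = len(d)
hit = lookup(d, 0)    # value present in rows 0 and 1
miss = lookup(d, 42)  # value absent from every row
size_after = len(d)
```

With plain d[42] instead, the dict would gain a new key for each absent value queried.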


与酒说心事 2025-01-30 20:58:29


Here is a proposed algorithm: iterate on the list of lists once, to build a dict that maps every unique element to all the indices of the sublists it belongs to.

With this method, the dict-building takes time proportional to the total number of elements in the list of lists. Then every query is constant-time.

This requires a dict of lists:

def dict_of_indices(a):
    d = {}
    for i,l in enumerate(a):
        for e in l:
            d.setdefault(e, []).append(i)
    return d

a = [[0,1,2],[0,5,6],[7,8,9]]
d = dict_of_indices(a)
print( d[0] )
# [0, 1]
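
For the workload described in the question (one query per value in the list), the dict only has to be built once and can then serve every query. The following is a sketch under assumptions, not a benchmark: it checks the dict against a brute-force scan on small random input. The function naive_indices and the sizes (200 sublists, values below 50) are made up for illustration.

```python
import random


def dict_of_indices(a):
    d = {}
    for i, l in enumerate(a):
        for e in l:
            d.setdefault(e, []).append(i)
    return d


def naive_indices(a, value):
    # Brute-force scan, equivalent in result to the question's get_dual_points
    return [i for i, l in enumerate(a) if value in l]


random.seed(0)
# random.sample draws without replacement, so each sublist holds distinct values
a = [random.sample(range(50), 3) for _ in range(200)]

d = dict_of_indices(a)  # built once, reused for every query
all_values = {e for l in a for e in l}
mismatches = [v for v in all_values if d[v] != naive_indices(a, v)]
```

The two approaches agree here because every sublist holds distinct values. If a value repeated within a sublist, the dict of lists would record that index more than once.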


忘年祭陌 2025-01-30 20:58:29


You can create a dictionary that maps from a value to a set of row indices. Then, for each query, you can simply look up the value, returning an empty set if it doesn't exist anywhere in the 2D list:

from itertools import product

a = [[0,1,2],[0,5,6],[7,8,9]]

values = {}

for row, col in product(range(len(a)), range(len(a[0]))):
    value_at_index = a[row][col]
    values.setdefault(value_at_index, set()).add(row)
    
print(values.get(0, set()))

This outputs:

{0, 1}

If you know in advance that the elements within each sublist are unique, then you can change the dictionary update line to:

values.setdefault(value_at_index, []).append(row)

and change the .get() call to:

values.get(0, [])

to maintain the ordering of the indices in the output.
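
To illustrate why the uniqueness condition matters for the list variant: if a value repeats inside one sublist, its row index is appended once per occurrence. The sketch below uses made-up input with a repeated 0. It also shows one possible way to keep both ordering and uniqueness by comparing against the last appended row. That approach is an assumption of this example, not something the answer proposes.

```python
a = [[0, 0, 2], [0, 5, 6], [7, 8, 9]]  # value 0 appears twice in row 0

with_dupes = {}
deduped = {}
for row, sublist in enumerate(a):
    for value in sublist:
        # Plain append: records the row once per occurrence of the value
        with_dupes.setdefault(value, []).append(row)

        rows = deduped.setdefault(value, [])
        # Rows are visited in increasing order, so a repeat can only
        # match the most recently appended entry
        if not rows or rows[-1] != row:
            rows.append(row)
```

Here with_dupes[0] ends up containing row 0 twice, while deduped[0] lists each row once, in order.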
