如何将此电源集函数修改为仅包括＆＃x27;有效＆＃x27;子集？

发布于 2025-01-23 01:31:12 字数 2398 浏览 4 评论 0原文

我正在使用二项式树作为具有2^t -1'Nodes'的列表，并希望创建一组在列表元素上在某些给定标准（以下概述）中工作的子集。现在，我使用以下代码来生成树，

def gen_nodes(T):
    nodes = []
    for t in range(T):
        for i in range(2**t):
            nodes += [[t,1 + i]]
    return nodes

例如，对于t = 1，我们得到了根

gen_nodes(1) = [[0,1]],

，但对于t = 2和t = 3，我们得到了

gen_nodes(2) = [[0,1], [1,1], [1,2]]
gen_nodes(3) = [[0,1], [1,1], [1,2], [2,1], [2,2], [2,3], [2,4]],

et et cetera。现在，我使用的powerset函数由撰稿人，

def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1 << x):
        yield [ss for mask, ss in zip(masks, s) if i & mask]

这对我来说很棒，但是我敢肯定，您在这一点上遇到了时间，powerset的长度完全太大了，因为o（2^（2^） t））。最初，我将仅通过蛮力创建整个集合，然后在创建较大的子集之后对有效子集进行约束，但是如果我不修改该子集，我似乎会遇到一些溢出问题powerset功能与这些约束。

基本上，我只希望在ls = list的输出（powerset（gen_nodes（t））的输出中列表e，以便对于所有i的i。 1，e [i]]或[e [i] -1，e [i] -1]在e中。

返回二进制树模拟，这基本上是在[0，t] x [1，t+1]中的所有[t，i]中说的，仅当[t-1，i]或[t-1，i-1]在e中，基本上[t，i]在e中，必须至少有一个“路径”，从[0,1]到[t，i]，其中每个节点在路径也在e中。我怀疑这会大大凝结输出的子集的大小，但是我不确定如何实施它。我认为我可能必须放弃使用PowerSet功能，但是我不确定如何在这种情况下对其进行编码，因此会感谢我能获得的任何帮助。

编辑：我应该包括所评论的所需输出。此外，到目前为止，我还包括对我“有用”的功能，但目前效率非常低。首先，让PSET为解决此问题的函数。和pg（i）= pSet（gen_nodes（i））。然后

pg(1) = [[0,1]],
pg(2) = [[[0, 1]], [[0, 1], [1, 1]], [[0, 1],
[1, 2]], [[0, 1], [1, 1], [1, 2]]]

不幸的是，这套集仍然非常快（pg（3）是长度高达6对的17个列表，pg（4）是97个长度列表，最多可达10对等），所以我不能在上面分享更多这篇文章。但是，我确实开发了一个有效的功能，但似乎效率低下（PG（6）需要半秒钟，然后PG（7）需要4分钟才能完成）。它附加在下面：

import time
def pset(lst):
    pw_set = [[]]
    start_time = time.time()
    for i in range(0,len(lst)):
        for j in range(0,len(pw_set)):
            ele = pw_set[j].copy()
            if lst[i] == [0,1]:
                ele = ele + [lst[i]]
                pw_set = pw_set + [ele]
            else:
                if [lst[i][0] - 1,lst[i][1]] in ele or [lst[i][0] - 1,lst[i][1] - 1] in ele:
                    ele = ele + [lst[i]]
                    pw_set = pw_set + [ele]
    print("--- %s seconds ---" % (time.time() - start_time))
    return pw_set[1:]

在这里，我只是检查了是否添加的“节点”是否在集合之前至少有一个节点：如果没有，则跳过了。我检查了PG（3），并且输出的内容是必需的，因此我认为它在起作用，效率低下。因此，我（看似）解决了内存溢出问题，现在我只需要提高这个效率。

原文

I am using a binomial tree as a list with 2^T - 1 'nodes' and want to create a set of subsets that work within some given criteria (outlined below) on the elements of the list. Right now, I use the following code to generate a tree

def gen_nodes(T):
    nodes = []
    for t in range(T):
        for i in range(2**t):
            nodes += [[t,1 + i]]
    return nodes

For instance, for T = 1, we get the root

gen_nodes(1) = [[0,1]],

but for T = 2 and T = 3, we get

gen_nodes(2) = [[0,1], [1,1], [1,2]]
gen_nodes(3) = [[0,1], [1,1], [1,2], [2,1], [2,2], [2,3], [2,4]],

et cetera. Right now, I'm using a powerset function courtesy of this wonderful contributor,

def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1 << x):
        yield [ss for mask, ss in zip(masks, s) if i & mask]

This has worked great for me, but as I'm sure you've caught at this point, the length of the powerset gets entirely too large with the time complexity of something like O(2^(2^T)). Initially, I was going to just create the entire set by brute force and then apply constraints on valid subsets after creating the larger set of subsets, but it seem's like I'm going to run into some overflow problems if I don't modify the powerset function with those constraints.

Basically, I only want the lists e within the output of ls = list(powerset(gen_nodes(T))) such that for all i from 0:len(e), e[i] in e implies [e[i] - 1, e[i]] OR [e[i] - 1, e[i] - 1] are in e.

Returning to the binary tree analog, this is basically saying for all [t,i] in [0,t] x [1,t+1], [t,i] in e only if [t-1,i] OR [t-1,i-1] in e, basically if [t,i] is in e, there there must be at least one "path" from [0,1] to [t,i] where each node in the path is also in e. I suspect this will condense the size of subsets output immensely, but I'm unsure of how to implement it. I think I might have to forgo using the powerset function, but I'm not sure how to code it in that case, and would therefore appreciate any help I can get.

EDIT: I should include desired output as commented. Additionally, I've included the function that has been 'working' for me thus far, but it's horribly inefficient currently. First, let pset be the function that solves this problem. and pg(i) = pset(gen_nodes(i)) for brevity. Then

pg(1) = [[0,1]],
pg(2) = [[[0, 1]], [[0, 1], [1, 1]], [[0, 1],
[1, 2]], [[0, 1], [1, 1], [1, 2]]]

Unfortunately, this set still grows very fast (pg(3) is 17 lists of length up to 6 pairs, pg(4) is 97 lists of length up to 10 pairs, etc), so I can't share much more on this post. However, I did develop a function that works, but seems to be horribly inefficient (pg(6) takes half a second, and then pg(7) takes 4 minutes to complete). It is attached below:

import time
def pset(lst):
    pw_set = [[]]
    start_time = time.time()
    for i in range(0,len(lst)):
        for j in range(0,len(pw_set)):
            ele = pw_set[j].copy()
            if lst[i] == [0,1]:
                ele = ele + [lst[i]]
                pw_set = pw_set + [ele]
            else:
                if [lst[i][0] - 1,lst[i][1]] in ele or [lst[i][0] - 1,lst[i][1] - 1] in ele:
                    ele = ele + [lst[i]]
                    pw_set = pw_set + [ele]
    print("--- %s seconds ---" % (time.time() - start_time))
    return pw_set[1:]

Here, I just checked if the 'node' being added had at least one of the nodes preceding it in the set: if not, it was skipped. I checked up to pg(3) and the output is as desired, so I'm thinking it's working, just inefficient. Thus, I've (seemingly) solved the memory overflow problem, now I just need to make this efficient.

分享到QQ

分享到微博