Adding new elements to the next sublist depending on whether they have already been added (also involves a dictionary problem) python

Posted on 2025-01-15 05:19:47

Community of Stackoverflow:

I'm trying to create a list of sublists with a loop based on a random sampling of values of another list; and each sublist has the restriction of not having a duplicate or a value that has already been added to a prior sublist.

Let's say (example) I have a main list:

[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]

#I get:
[[1,13],[4,1],[8,13]]

#I WANT:
[[1,13],[4,9],[8,14]]          #(no duplicates when checking previous sublists)

The real code that I thought would work is the following (as a draft):

matrixvals=list(matrix.index.values)  #list where values are obtained
lists=[[]for e in range(0,3)]         #list of sublists that I want to feed
vls=[]                                #stores the values that have been added to prevent adding them again
for e in lists:                       #initiate main loop
    for i in range(0,5):              #each sublist will contain 5 different random samples
        x=random.sample(matrixvals,1) #it doesn't matter if the samples are 1 or 2
        if any(x) not in vls:         #if the sample isn't in the evaluation list
            vls.extend(x)
            e.append(x)
        else:             #if it IS, then do a sample but without those already added values (line below)
            x=random.sample([matrixvals[:].remove(x) for x in vls],1)
            vls.extend(x)
            e.append(x)

        
print(lists)
print(vls)

It didn't work as I get the following:

[[[25], [16], [15], [31], [17]], [[4], [2], [13], [42], [13]], [[11], [7], [13], [17], [25]]]
[25, 16, 15, 31, 17, 4, 2, 13, 42, 13, 11, 7, 13, 17, 25]

As you can see, the number 13 is repeated 3 times, and I don't understand why.

I would want:

[[[25], [16], [15], [31], [17]], [[4], [2], [13], [42], [70]], [[11], [7], [100], [18], [27]]]
[25, 16, 15, 31, 17, 4, 2, 13, 42, 70, 11, 7, 100, 18, 27]   #no dups

In addition, is there a way to get the random.sample results as plain values instead of lists? (to obtain):

[[25, 16, 15, 31, 17], [4, 2, 13, 42, 70], [11, 7, 100, 18, 27]]

Also, the final result in reality isn't a list of sublists but a dictionary (the code above is a draft attempt to solve the dict problem). Is there a way to apply the previous method to a dict? With my present code I get the following results:

{'1stkey': {'1stsubkey': {'list1': [41,
    40,
    22,
    28,
    26,
    14,
    41,
    15,
    40,
    33],
   'list2': [41, 40, 22, 28, 26, 14, 41, 15, 40, 33],
   'list3': [41, 40, 22, 28, 26, 14, 41, 15, 40, 33]},
  '2ndsubkey': {'list1': [21,
    7,
    31,
    12,
    8,
    22,
    27,...}

Instead of that result, I would want the following:

 {'1stkey': {'1stsubkey': {'list1': [41,40,22],
       'list2': [28, 26, 14],
       'list3': [41, 15, 40, 33]},
      '2ndsubkey': {'list1': [21,7,31],
       'list2':[12,8,22],
       'list3':[27...,...}#and so on

Is there a way to solve both the list and the dict problems? Any help will be very appreciated; I can make some progress even with only the list problem solved.

Thanks to all

夏有森光若流苏 2025-01-22 05:19:47

I realize you may be more interested in finding out why your particular approach isn't working. However, if I've understood your desired behavior, I may be able to offer an alternative solution. After posting my answer, I will take a look at your attempt.
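As a quick note on the attempt itself, the sketch below shows what the test `any(x) not in vls` appears to evaluate, and one way the loop could be written so that each draw is a plain value that cannot come up again. The helper name `draw_unique` and the stand-in population `range(1, 101)` are assumptions for illustration; the real values come from `matrix.index.values` in the question.

```python
import random

vls = [25, 16, 13]          # values already used
x = [13]                    # random.sample(population, 1) returns a 1-item list

# any(x) collapses the list to a bool, so the test compares True with vls.
print(any(x))               # True
print(any(x) not in vls)    # True: 13 is wrongly treated as new
print(x[0] not in vls)      # False: the membership test that was intended

# list.remove() mutates the list in place and returns None, so building a
# new population out of its return values yields only None entries.
leftover = vls[:].remove(13)
print(leftover)             # None

def draw_unique(population, num_sublists, per_sublist, rng=random):
    """Draw sublists of plain values, never reusing a value (hypothetical helper).

    Assumes the population itself holds no repeated values.
    """
    remaining = list(population)
    result = []
    for _ in range(num_sublists):
        sublist = []
        for _ in range(per_sublist):
            value = rng.choice(remaining)   # a plain value, not a 1-item list
            remaining.remove(value)         # so it cannot be drawn again
            sublist.append(value)
        result.append(sublist)
    return result

lists = draw_unique(range(1, 101), num_sublists=3, per_sublist=5)
print(lists)
```

Because `True == 1` in Python, the original test only fails once the value 1 is already in `vls`, which would explain why repeats such as 13 slip through.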

random.sample lets you sample k items from a population (collection, list, whatever). If there are no repeated elements in the collection, then you're guaranteed to have no repeats in your random sample:

from random import sample

pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

num_samples = 4

print(sample(pool, k=num_samples))

Possible output:

[9, 11, 8, 7]
>>>

It doesn't matter how many times you run this snippet, you will never have repeated elements in your random sample. This is because random.sample doesn't generate random objects, it just randomly picks items which already exist in a collection. This is the same approach you would take when drawing random cards from a deck of cards, or drawing lottery numbers, for example.
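A small check of that claim, as a sketch: the first part asserts that a sample from a pool of unique values has no repeats, and the second part shows one way to de-duplicate a source list first. The list `pool_with_dups` is made up for illustration; whether the real data contains repeated values is an assumption.

```python
from random import sample

pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

# sample() picks items by position without replacement, so repeats are only
# possible when the population itself holds repeated values.
picked = sample(pool, k=4)
assert len(set(picked)) == len(picked)  # every picked value is distinct

# If the source list may hold duplicates, de-duplicate it first.
# dict.fromkeys keeps the first occurrence of each value, in order.
pool_with_dups = [1, 1, 2, 2, 3, 3]
unique_pool = list(dict.fromkeys(pool_with_dups))
picked_unique = sample(unique_pool, k=3)
print(unique_pool)
```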

In your case, pool is the pool of possible unique numbers to choose your sample from. Your desired output seems to be a list of three lists, where each sublist has two samples in it. Rather than calling random.sample three times, once for each sublist, we should call it once with k=num_sublists * samples_per_sublist:

from random import sample

pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

num_sublists = 3
samples_per_sublist = 2

num_samples = num_sublists * samples_per_sublist

assert num_samples <= len(pool)

print(sample(pool, k=num_samples))

Possible output:

[14, 10, 1, 8, 6, 3]
>>>

OK, so we have six samples rather than four. No sublists yet. Now you can simply chop this list of six samples up into three sublists of two samples each:

from random import sample

pool = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

num_sublists = 3
samples_per_sublist = 2

num_samples = num_sublists * samples_per_sublist

assert num_samples <= len(pool)

def pairwise(iterable):
    yield from zip(*[iter(iterable)]*samples_per_sublist)

print(list(pairwise(sample(pool, num_samples))))

Possible output:

[(4, 11), (12, 13), (8, 15)]
>>>

Or if you really want sublists, rather than tuples:

def pairwise(iterable):
    yield from map(list, zip(*[iter(iterable)]*samples_per_sublist))
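If the `zip(*[iter(iterable)]*n)` idiom is hard to read, slicing does the same chopping more explicitly. This is only a sketch, and `chunked` is a hypothetical helper name. One difference worth knowing: the zip idiom silently drops a trailing chunk that is shorter than `n`, while slicing keeps it.

```python
from random import sample

def chunked(items, size):
    """Split a list into consecutive sublists of `size` items (hypothetical helper)."""
    return [items[i:i + size] for i in range(0, len(items), size)]

pool = list(range(1, 16))

samples = sample(pool, k=6)      # six distinct values in one draw
sublists = chunked(samples, 2)   # three sublists of two plain values each
print(sublists)

# A final chunk that is shorter than `size` is kept rather than dropped.
uneven = chunked([1, 2, 3, 4, 5], 2)
print(uneven)                    # [[1, 2], [3, 4], [5]]
```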

EDIT - just realized that you don't actually want a list of lists, but a dictionary. Something more like this? Sorry I'm obsessed with generators, and this isn't really easy to read:

keys = ["1stkey"]
subkeys = ["1stsubkey", "2ndsubkey"]
num_lists_per_subkey = 3
num_samples_per_list = 5
num_samples = num_lists_per_subkey * num_samples_per_list

min_sample = 1
max_sample = 50

pool = list(range(min_sample, max_sample + 1))

def generate_items():

    def generate_sub_items():
        from random import sample

        samples = sample(pool, k=num_samples)

        def generate_sub_sub_items():

            def chunkwise(iterable, n=num_samples_per_list):
                yield from map(list, zip(*[iter(iterable)]*n))
        
            for list_num, chunk in enumerate(chunkwise(samples), start=1):
                key = f"list{list_num}"
                yield key, chunk

        for subkey in subkeys:
            yield subkey, dict(generate_sub_sub_items())
    
    for key in keys:
        yield key, dict(generate_sub_items())

print(dict(generate_items()))

Possible output:

{'1stkey': {'1stsubkey': {'list1': [43, 20, 4, 27, 2], 'list2': [49, 44, 18, 8, 37], 'list3': [19, 40, 9, 17, 6]}, '2ndsubkey': {'list1': [43, 20, 4, 27, 2], 'list2': [49, 44, 18, 8, 37], 'list3': [19, 40, 9, 17, 6]}}}
>>>
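Note that in the output above both subkeys hold the same lists, because the samples are drawn once per key and then reused for every subkey. If the values should instead be unique across the whole dictionary (which is how I read the desired output in the question, so treat that as an assumption), a plainer sketch is to draw everything in one call and hand the values out in order. `build_nested` is a hypothetical helper name.

```python
from random import sample

def build_nested(keys, subkeys, lists_per_subkey, samples_per_list, pool):
    """Build {key: {subkey: {"listN": [...]}}} with no value used twice."""
    total = len(keys) * len(subkeys) * lists_per_subkey * samples_per_list
    if total > len(pool):
        raise ValueError("pool is too small for that many unique samples")

    # One draw for the whole dictionary, so no value can appear twice.
    draws = iter(sample(pool, k=total))

    result = {}
    for key in keys:
        result[key] = {}
        for subkey in subkeys:
            result[key][subkey] = {}
            for n in range(1, lists_per_subkey + 1):
                # Take the next `samples_per_list` values from the single draw.
                values = [next(draws) for _ in range(samples_per_list)]
                result[key][subkey][f"list{n}"] = values
    return result

nested = build_nested(
    keys=["1stkey"],
    subkeys=["1stsubkey", "2ndsubkey"],
    lists_per_subkey=3,
    samples_per_list=5,
    pool=list(range(1, 51)),
)
print(nested)
```

With these numbers the call needs 30 unique values out of a pool of 50; a pool smaller than the total raises `ValueError` instead of repeating values.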
