Exclude a set of elements from a random sample in Python

Posted on 2025-01-27 04:16:01

I randomly sample N tuples from two different sets of numbers as follows:

import numpy as np

N = 4
set1 = list(range(10))
set2 = list(range(10, 20))

c1 = np.random.choice(set1, N)  # e.g. [9, 3, 7, 8]
c2 = np.random.choice(set2, N)  # e.g. [15, 13, 19, 12]
tuples = np.stack([c1, c2], axis=1)  # e.g. [[9, 15], [3, 13], [7, 19], [8, 12]]

For the next iteration I want to sample c1,c2 again but excluding the unique tuples I already have. The numbers can appear again but just not the same combination of (number1,number2). Ideally that would be something like:

new_tuples = np.random.choice([set1,set2],exclude=tuples)

One could just check them with np.unique and resample, but I was hoping for a more efficient way.

EDIT: Getting all possible combinations beforehand would be too expensive.

錯遇了你 2025-02-03 04:16:01

After the asker's comment, I fixed the code:

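import time

import numpy as np

begin = time.time()
N = 4
# Candidate values for each coordinate, materialized as lists.
set1 = list(range(10**6))
set2 = list(range(10**6, 20**6))  # 20**6 == 64,000,000

# First iteration: draw N pairs and keep them in a set for fast lookup.
c1 = np.random.choice(set1, N)
c2 = np.random.choice(set2, N)
tuples = set(zip(c1, c2))
print(tuples)

# Next iteration: draw N pairs again and drop any pair already seen.
# Note: this can yield fewer than N pairs when a draw collides.
c1 = np.random.choice(set1, N)
c2 = np.random.choice(set2, N)
new_tuples = {pair for pair in zip(c1, c2) if pair not in tuples}
print(new_tuples)
print(tuples | new_tuples)  # every pair seen so far
print(time.time() - begin)

The comments explain each step. I tested it with values up to 20**6 (64 million), and it returned in about 13 seconds.
Output obtained: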

{(315090, 13207382), (175935, 7922219), (249258, 59598185), (45681, 27246043)}
{(446782, 45042493), (122963, 12794175), (388061, 20418275), (328064, 48911155)}
{(315090, 13207382), (175935, 7922219), (328064, 48911155), (446782, 45042493), (249258, 59598185), (45681, 27246043), (122963, 12794175), (388061, 20418275)}
12.917975664138794
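
If you always need exactly N fresh pairs, one option is to top up after filtering, which is plain rejection sampling. The following is only a rough sketch under some assumptions of mine: both sets are contiguous integer ranges (as in the question), and sample_new_pairs, its parameters and max_rounds are names I made up for illustration. It draws the numbers directly with Generator.integers, so no large lists are built and no combinations are enumerated.

```python
import numpy as np


def sample_new_pairs(n, low1, high1, low2, high2, seen, rng=None, max_rounds=1000):
    """Return a set of n distinct (a, b) pairs that are not in `seen`.

    Hypothetical helper: a is drawn from [low1, high1) and b from
    [low2, high2). Colliding draws are rejected and redrawn.
    """
    rng = np.random.default_rng() if rng is None else rng
    new = set()
    rounds = 0
    while len(new) < n:
        if rounds == max_rounds:
            # Guard against looping forever when too few unseen pairs remain.
            raise RuntimeError("could not find enough unseen pairs")
        rounds += 1
        need = n - len(new)  # only draw as many pairs as are still missing
        a = rng.integers(low1, high1, size=need)
        b = rng.integers(low2, high2, size=need)
        for pair in zip(a.tolist(), b.tolist()):
            if pair not in seen:
                new.add(pair)  # the set also removes duplicates within a draw
    return new


rng = np.random.default_rng(0)
first = sample_new_pairs(4, 0, 10**6, 10**6, 20**6, set(), rng)
fresh = sample_new_pairs(4, 0, 10**6, 10**6, 20**6, first, rng)
all_seen = first | fresh
print(fresh)
```

This stays cheap as long as the number of seen pairs is small compared to the number of possible pairs, because collisions are then rare and the loop usually finishes in one round. When most combinations are already used up, rejection sampling slows down and a different approach would be needed.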
