Exclude a set of elements from a random sample in Python

Posted 2025-01-27 04:16:01

I randomly sample N tuples from two different sets of numbers as follows:

import numpy as np

N = 4
set1 = list(range(10))
set2 = list(range(10, 20))

c1 = np.random.choice(set1, N)  # e.g. [9, 3, 7, 8]
c2 = np.random.choice(set2, N)  # e.g. [15, 13, 19, 12]
tuples = np.stack([c1, c2], axis=1)  # e.g. [[9, 15], [3, 13], [7, 19], [8, 12]]

For the next iteration I want to sample c1,c2 again but excluding the unique tuples I already have. The numbers can appear again but just not the same combination of (number1,number2). Ideally that would be something like:

new_tuples = np.random.choice([set1,set2],exclude=tuples)

One could just check them with np.unique and resample, but I was hoping for a more efficient way.

EDIT: Getting all possible combinations beforehand would be too expensive.
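
One way to avoid enumerating every combination, sketched here under the assumption that both sets are indexable sequences: each pair can be identified by a single integer `i * len(set2) + j`, so a pair is drawn by sampling one integer and decoding it with `divmod`. The helper name `decode_pair` and the use of `np.random.default_rng` are illustrative choices, not part of the question.

```python
import numpy as np

set1 = list(range(10))
set2 = list(range(10, 20))

def decode_pair(flat_index, set1, set2):
    """Map a flat index in [0, len(set1) * len(set2)) back to a (number1, number2) pair."""
    i, j = divmod(int(flat_index), len(set2))
    return set1[i], set2[j]

rng = np.random.default_rng(0)
total = len(set1) * len(set2)          # number of possible pairs, never materialised
flat = rng.integers(0, total, size=4)  # four flat indices, drawn with replacement
pairs = [decode_pair(k, set1, set2) for k in flat]
print(pairs)
```

With this encoding, the pairs already drawn can be remembered as a set of plain integers, which is cheaper to store and to test against than a set of tuples.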

Comments (1)

錯遇了你 2025-02-03 04:16:01

After the asker's comment, I fixed the code:

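import time

import numpy as np

begin = time.time()
N = 4
set1 = list(range(10**6))
set2 = list(range(10**6, 20**6))  # note: 20**6 == 64_000_000

# First round: draw N pairs and keep them in a set for fast membership tests.
c1 = np.random.choice(set1, N)  # e.g. [9, 3, 7, 8]
c2 = np.random.choice(set2, N)
tuples = set(zip(c1, c2))
print(tuples)

# Second round: draw N more pairs and drop any pair that was already seen.
# If a draw repeats an earlier pair, fewer than N new pairs are returned.
c1 = np.random.choice(set1, N)
c2 = np.random.choice(set2, N)
new_tuples = {pair for pair in zip(c1, c2) if pair not in tuples}
print(new_tuples)
print(tuples | new_tuples)  # all pairs seen so far
print(time.time() - begin)

The comments explain each step. Tested with set2 extending to 20**6 (64 million), it returned in about 13 seconds.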
Output obtained:

{(315090, 13207382), (175935, 7922219), (249258, 59598185), (45681, 27246043)}

{(446782, 45042493), (122963, 12794175), (388061, 20418275), (328064, 48911155)}
{(315090, 13207382), (175935, 7922219), (328064, 48911155), (446782, 45042493), (249258, 59598185), (45681, 27246043), (122963, 12794175), (388061, 20418275)}
12.917975664138794
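
The filter above can return fewer than N pairs when a draw repeats an earlier pair. A minimal sketch of one way to top the sample up, assuming both sets hold distinct values and that the number of possible pairs is much larger than the number already seen, so rejections are rare. The function name `sample_new_pairs` and its parameters are hypothetical.

```python
import numpy as np

def sample_new_pairs(set1, set2, n, seen, rng=None):
    """Draw n distinct pairs (a, b), a from set1 and b from set2, none of them in `seen`.

    `seen` is a set of tuples and is not modified. Assumes set1 and set2 each
    hold distinct values; raises ValueError when fewer than n unseen pairs remain.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(set1)
    b = np.asarray(set2)
    if a.size * b.size - len(seen) < n:
        raise ValueError("not enough unseen pairs left")
    new_pairs = set()
    while len(new_pairs) < n:
        need = n - len(new_pairs)
        # Draw only as many candidates as are still missing.
        c1 = rng.choice(a, size=need)
        c2 = rng.choice(b, size=need)
        for pair in zip(c1.tolist(), c2.tolist()):
            if pair not in seen:
                new_pairs.add(pair)  # duplicates within this round collapse here
    return new_pairs

rng = np.random.default_rng(0)
set1 = range(10)
set2 = range(10, 20)
seen = sample_new_pairs(set1, set2, 4, set(), rng)
new = sample_new_pairs(set1, set2, 4, seen, rng)
print(new)
```

Rejection sampling of this kind slows down as `seen` approaches the full set of combinations; in that regime a different strategy would be needed.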
