Very fast sampling from a set with a fixed number of elements in Python
I need to sample a number uniformly at random from a set of fixed size, do some calculation, and put the new number back into the set. (The number of samples needed is very large.)
I've tried storing the numbers in a list, using random.choice() to pick an element, removing it, and then appending the new element. But that's way too slow!
I'm thinking of storing the numbers in a numpy array, sampling a list of indices, and performing the calculation for each index.
- Is there a faster way of doing this?
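The NumPy idea described in the question could look roughly like the sketch below. The names calc, pool and batch_size, the sizes, and the arithmetic inside calc are all assumptions made for illustration. Note that batching changes the process slightly: with replace=False an element cannot be picked twice within one batch, which keeps the write-back unambiguous.

```python
import numpy as np

def calc(values):
    # Hypothetical vectorised placeholder for the real computation.
    return (values * 31 + 7) % 1000

rng = np.random.default_rng(0)
pool = np.arange(1000)   # the fixed-size collection, stored as an array
batch_size = 100         # assumed batch size

for _ in range(50):
    # Draw a batch of distinct slot indices, uniformly at random.
    idx = rng.choice(pool.size, size=batch_size, replace=False)
    # Read the sampled values, compute, and write the results back in place.
    pool[idx] = calc(pool[idx])
```

This only pays off if the calculation itself can be expressed on whole arrays; a calculation that must see each previous update before the next draw still needs a one-at-a-time loop.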
Comments (3)
Python lists are implemented internally as arrays (like Java ArrayLists, C++ std::vectors, etc.), so removing an element from the middle is relatively slow: all subsequent elements have to be shifted down by one position. (See http://www.laurentluce.com/posts/python-list-implementation/ for more on this.) Since the order of elements doesn't seem to be relevant to you, I'd recommend you just use random.randint(0, len(L) - 1) to choose an index i, then use L[i] = calculation(L[i]) to update the i-th element.
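A minimal sketch of the in-place update suggested in the first answer, assuming a placeholder calculation; the helper name run, the seed parameter, and the arithmetic inside calculation are made up for illustration.

```python
import random

def calculation(x):
    # Hypothetical placeholder for the real computation.
    return (x * 31 + 7) % 1000

def run(pool, n_steps, seed=None):
    """Repeatedly pick a uniformly random slot and overwrite it in place."""
    rng = random.Random(seed)
    n = len(pool)
    for _ in range(n_steps):
        i = rng.randrange(n)             # same distribution as randint(0, n - 1)
        pool[i] = calculation(pool[i])   # constant-time update: no remove, no append
    return pool

L = list(range(10))
run(L, 1000, seed=0)
```

The list never changes length, so each step costs one index draw plus the calculation, instead of a linear-time removal from the middle of the list.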
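random.sample(a set or list or Numpy array, Nsample) is very fast, but it's not clear to me if that is the kind of thing you want. (Note that in Python 3.11 and later the population must be a sequence such as a list or tuple; a set or Numpy array has to be converted first, for example with list().)
You could use Numpy arrays or bitarray instead of set, but I'd expect the time in calc() to dominate.
What are your Setsize and Samplesize, roughly?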
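One possible reading of the random.sample suggestion, as a rough sketch rather than the answerer's own code: draw a batch of distinct indices and update those slots in place. Setsize, Samplesize and calc are the names used in the answer; their values and the body of calc are assumptions.

```python
import random

def calc(x):
    # Hypothetical placeholder for the real computation.
    return x + 1

Setsize = 10_000     # assumed size of the collection
Samplesize = 100     # assumed number of elements handled per batch
n_batches = 20

rng = random.Random(0)
pool = list(range(Setsize))

for _ in range(n_batches):
    # Sampling from a range yields Samplesize distinct indices
    # without building a copy of the pool.
    for i in rng.sample(range(Setsize), Samplesize):
        pool[i] = calc(pool[i])
```

Sampling indices rather than values keeps the collection a plain list, so every write-back is a constant-time assignment.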