集群环境下的伪随机数生成器
如何在集群上生成独立的伪随机数,例如用于蒙特卡罗模拟?我可以有许多计算节点(例如 100 个),并且我需要在每个节点上生成数百万个数字。我需要保证一个节点上的 PRN 序列不会与另一个节点上的 PRN 序列重叠。
- 我可以在根节点上生成所有 PRN,然后将它们发送到其他节点。但这太慢了。
- 我可以在每个节点上跳转到序列中的已知距离。但是对于 Mersenne-Twister 或任何其他好的 PRNG 是否有这样的算法,可以用合理的时间和内存来完成?
- 我可以在每个节点上使用不同的生成器。但是像 Mersenne-Twister 这样的优秀生成器有可能实现吗?怎么可能呢?
- 还有其他的吗?
How can I generate independent pseudo-random numbers on a cluster, for Monte Carlo simulation for example? I can have many compute nodes (e.g. 100), and I need to generate millions of numbers on each node. I need a warranty that a PRN sequence on one node will not overlap the PRN sequence on another node.
- I could generate all PRN on root node, then send them to other nodes. But it would be far too slow.
- I could jump to a know distance in the sequence, on each node. But is there such an algorithm for Mersenne-Twister or for any other good PRNG, which can be done with a reasonable amount of time and memory?
- I could use different generators on each node. But is it possible with good generators like Mersenne-Twister? How could it be done?
- Any other though?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
您永远不应该使用从同一原始流获得的可能重叠的随机流。如果您尚未测试生成的交错流,则您不知道其统计质量。
幸运的是,Mersenne Twister (MT) 将帮助您完成分发任务。使用其称为Dynamic Creator(以下简称DC)的专用算法,您可以创建独立的随机数生成器,从而生成高度独立的随机流。
每个流都将在使用它的节点上创建。基本上,可以将 DC 视为面向对象范式中的构造函数,用于创建 MT 的不同实例。每个不同的实例都被设计为产生高度独立的随机序列。
您可以在这里找到 DC:http: //www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html
它使用起来非常简单,您将能够修复不同的参数,例如您想要获取的不同 MT 实例的数量或这些 MT 的周期。根据其输入参数,DC 的运行时间会发生变化。
除了 DC 附带的自述文件之外,请查看 DC 存档中的文件
example/new_example2.c
。它显示了调用获取独立序列给定不同的输入标识符的示例,这基本上是您识别集群作业所必须的。最后,如果您打算了解有关如何在并行或分布式环境中使用 PRNG 的更多信息,我建议您阅读以下科学文章:
随机高性能计算的随机流的实用分布,David RC Hill,高性能计算与仿真 (HPCS) 国际会议,2010 年
You should never use potentially overlapping random streams obtained from the same original stream. If you have not tested the resulting interleaved stream, you have no idea of its statistic quality.
Fortunately, Mersenne Twister (MT) will help you in your distribution task. Using its dedicated algorithm, called Dynamic Creator (DC hereafter), you can create independent random number generators that will produce highly independent random streams.
Each stream will be created on the node that will be using it. Basically, think of DC as a constructor in object oriented paradigm that creates different instances of MT. Each different instance is designed to produce highly independent random sequences.
You can find DC here: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/DC/dc.html
It's quite straightforward to use and you'll be able to fix different parameters such as the number of different MT instances you want to obtain or the period of these MTs. Depending on its input parameter, DC will runtime will change.
In addition of the README coming along with DC, take a look at the file
example/new_example2.c
in the DC archive. It shows example of calls to get independent sequences given a different input identifier, which is basically what you have to identify cluster jobs.Finally, if you intend to learn more about how to use PRNGs in parallel or distributed environments, I suggest you read this scientific articles:
Practical distribution of random streams for stochastic High Performance Computing, David RC Hill, in International Conference on High Performance Computing and Simulation (HPCS), 2010
好的,回答#2 ;-)
我想说……保持简单。只需使用“短”种子来启动 MT(假设该种子为 232 位,因为缺乏更好的限制)。这假设短种子生成“充分分布”的 MT 起始状态(例如,希望我的其他答案中的代码中的
init_genrand
)。这并不能保证均匀分布的起始状态,而是“足够好”,见下文。每个节点将使用自己预先选择的种子序列(传输的随机种子列表或类似
number_nodes * node_number * iteration
的公式)。重要的是,最初的“短”种子永远不会在节点之间重复使用。然后,每个节点将使用使用此种子初始化的 MT PRNG n 次,其中 n
<< MT_period / max_value_of_short_seed
(TT800 为 2800-1 和 MT19937是 219937-1,所以n
仍然可以是非常大数量)。n
次后,节点移动到所选列表中的下一个种子。虽然我没有(也不能)提供任何节点都不会同时(或根本)有重复序列的“保证”,但以下是 AMD 关于使用不同 Seends 的说明:(显然初始种子算法发挥了作用)。
快乐编码。
Okay, answer #2 ;-)
I am going to say ... keep it simple. Just use a "short" seed to prime the MT (imagine that this seed is 232 bits for lack of better restriction). This assumes that the the short seed generates "sufficiently distributed" MT starting states (e.g.
init_genrand
in the code in my other answer, hopefully). This doesn't guarantee an equally distributed starting state but rather goes for "good enough", see below.Each node will use it's own sequence of seeds which are pre-selected (either a list of random seeds which is transmitted or a formula like
number_nodes * node_number * iteration
). The important thing is that the initial "short" seed will never be re-used across nodes.Each node will then use a MT PRNG initialized with this seed
n
times wheren <<< MT_period / max_value_of_short_seed
(TT800 is 2800-1 and MT19937 is 219937-1, son
can still be a very large number). Aftern
times, the node moves onto the next seed in the chosen list.While I do not (nor can I) provide a "guarantee" that no node will ever have a duplicate sequence at the same time (or at all), here is what AMD says about Using Different Seends: (Obviously the initial seeding algorithm plays a role).
Happy coding.
是的,请参阅 http://theo.phys.sci .hiroshima-u.ac.jp/~ishikawa/PRNG/mt_stream_en.html。这是获取独立随机数流的绝佳解决方案。通过进行大于每个流创建每个流的开头所需的随机数数量的跳转,流将不会重叠。
Yes, see http://theo.phys.sci.hiroshima-u.ac.jp/~ishikawa/PRNG/mt_stream_en.html. This is a excellent solution to obtaining independent random number streams. By making jumps that are larger than the number of random numbers needed from each stream to create the starts of each stream, the streams won't overlap.
免责声明:我不确定当从任意“uint”(或 x,其中 x 是较小的任意但唯一值)种子开始时,MT 在循环重叠方面有什么保证,但这可能值得研究,好像有是一个保证,那么在不同的“uint”种子上启动每个节点可能就足够了,这篇文章的其余部分在很大程度上变得毫无意义。 (MT 的周期长度/周期令人震惊并且除out UINT_MAX 仍然留下一个难以理解的——除了纸上的数字。)
好吧,这是我要回答的评论……
我喜欢使用预先生成的一组状态的方法#2;然后用给定的起始状态初始化每个节点中的 MT。
当然,只有初始状态必须保留,一旦生成,
考虑到 MT生成速度很快,我不建议上面的#3,因为它很复杂并且附加了许多条件。选项 #1 很简单,但可能不够动态。
选项#2 似乎是一个很好的可能性。服务器(“快速机器”,不一定是节点)只需要传输下一个“未使用的序列块”的起始状态(例如,十亿个周期)——节点在询问之前将使用生成器十亿个周期为一个新块。这将使其成为帖子中的#1 和非常罕见的消息的混合体。
在我的 Core2 Duo 系统上,我可以使用下面提供的代码在 17 秒内生成十亿个随机数(它在 LINQPad)。我不确定这是什么 MT 变体。
快乐编码。
Disclaimer: I am not sure what guarantee MT has in terms of cycle overlap when starting from an arbitrary "uint" (or x, where x is a smaller arbitrary but unique value) seed, but that may be worth looking into, as if there is a guarantee then it may be sufficient to just start each node on a different "uint" seed and the rest of this post becomes largely moot. (The cycle length/period of MT is staggering and dividing out UINT_MAX still leaves an incomprehensible -- except on paper -- number.)
Well, here goes my comments to answer...
I like approach #2 with a pre-generated set of states; the MT in each node is then initialized with a given starting state.
Only the initial states must be preserved, of course, and once this is generated these states can
Considering that MT is fast to generate, I would not recommend #3 from above as it's just complicated and has a number of strings attached. Option #1 is simple, but might not be dynamic enough.
Option #2 seems like a very good possibility. The server (a "fast machine", not necessarily a node) only needs to transmit the starting state of the next "unused sequence block" (say, one billion cycles) -- the node would use the generator for one billion cycles before asking for a new block. This would make it a hybrid of #1 in the post with very infrequent messaging.
On my system, a Core2 Duo, I can generate one billion random numbers in 17 seconds using the code provided below (it runs in LINQPad). I am not sure what MT variant this is.
Happy coding.
TRNG 是专门针对并行集群环境而构建的随机数生成器(特别是为德国的 TINA 超级计算机而构建)。因此,创建独立的随机数流并生成非标准分布非常容易。这里有一个关于如何设置的教程:
http://www.lindonslog.com/programming/parallel-random-编号-生成-trng/
TRNG is a random number generator built specifically with parallel cluster environments in mind (specifically it was built for the TINA super computer in Germany). Hence it is very eas to create independent random number streams and also generate non standard distributions. There is a tutorial on how to set it up here:
http://www.lindonslog.com/programming/parallel-random-number-generation-trng/