Boost Mersenne Twister:如何使用多个值作为种子?

发布于 2024-09-03 01:41:11 字数 575 浏览 3 评论 0原文

我正在使用 boost mt19937 实现进行模拟。

模拟需要可重复,这意味着存储并可能在以后重复使用 RNG 种子。我使用 Windows crypto api 来生成种子值,因为我需要种子的外部源,而不是因为任何特定的随机性保证。任何模拟运行的输出都会有一个包含 RNG 种子的注释 - 因此种子需要相当短。另一方面,作为模拟分析的一部分,我将比较几次运行 - 但为了确保这些运行实际上是不同的,我需要使用不同的种子 - 所以种子需要足够长以避免意外碰撞。

我确定 64 位种子应该足够了;大约 2^32 次运行后,碰撞的几率将达到 50%——这个概率足够低,以至于它引起的平均误差对我来说可以忽略不计。仅使用 32 位种子是很棘手的; 2^16 次运行后,碰撞几率已达到 50%;这对我的口味来说有点太可能了。

不幸的是,boost 实现要么使用完整的状态向量作为种子 - 这太长了 - 要么使用单个 32 位无符号长整型 - 这并不理想。

我如何为生成器提供种子超过 32 位但小于完整状态向量?我尝试仅填充向量或重复种子来填充状态向量,但即使粗略地浏览一下结果也会发现,这会产生很差的结果。

I'm using the boost mt19937 implementation for a simulation.

The simulation needs to be reproducible, and that means storing and potentially reusing the RNG seeds later. I'm using the windows crypto api to generate the seed values because I need an external source for the seeds and not because of any particular guarantees of randomness. The output of any simulation run will have a note including the RNG seed - so the seed needs to be reasonably short. On the other hand, as part of the analysis of the simulation, I'll be comparing several runs - but to be sure that these runs are actually different, I'll need to use different seeds - so the seed needs to be long enough to avoid accidental collisions.

I've determined that 64-bits of seeding should suffice; the chance of a collision will reach 50% after about 2^32 runs - that probability is low enough that the average error caused by it is negligible to me. Using just 32-bits of seed is tricky; the chance of a collision reaches 50% already after 2^16 runs; and that's a little too likely for my tastes.

Unfortunately, the boost implementation either seeds with a full state vector - which is far, far too long - or a single 32-bit unsigned long - which isn't ideal.

How can I seed the generator with more than 32-bits but less than a full state vector? I tried just padding the vector or repeating the seeds to fill the state vector, but even a cursory glance at the results shows that that generates poor results.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

长梦不多时 2024-09-10 01:41:11

你的假设是错误的。对于模拟,您不需要加密强度高的种子。事实上,使用种子 1、2、3、4 等通常是更好的主意。 Mersenne Twister 的输出值将是不相关的,但没有人会质疑您是否精心挑选了种子以获得所需的模拟输出。

对于确实有真正需要的其他人,一种简单的方法是丢弃生成的第一个(种子>>32)值。这为您提供了大约 log2(seed>>32) 额外的状态位。然而,只有当您需要一些额外的位时,它才能有效地工作。以这种方式添加 32 位可能太慢了。

更快的算法是为好的随机生成器生成完整的状态向量。由于结果状态向量的随机性有限,问题中提到的解决方案(重复或填充)不太好。但是,如果您从 mersenne_twister(seed1) ^ mersenne_twister(seed2) 的输出填充初始状态向量,则这根本不是问题。

Your assumptions are mistaken. For a simulation, you don't need cryptographically strong seeds. In fact, using seeds 1,2,3,4, etcetera is often a better idea. The output values of the Mersenne Twister will be uncorrelated, yet nobody will question whether you cherry-picked your seeds to get desired simulation outputs.

For other people who do have a real need, one easy way is to discard the first (seed>>32) values generated. This gives you about log2(seed>>32) extra bits of state. However, it only works efficiently if you need a few extra bits. Adding 32 bits this way is probably too slow.

A faster algorithm is to generate the full state vector for the good random generator. The solutions mentioned in the question (repeating or padding) aren't so good due to the limited randomness in the resulting state vector. But if you fill the initial state vector from the output of mersenne_twister(seed1) ^ mersenne_twister(seed2), this is not an issue at all.

笑饮青盏花 2024-09-10 01:41:11

查看 mersenne_twister 模板的升压源

  void seed(UIntType value)
  {
    // New seeding algorithm from 
    // http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html
    // In the previous versions, MSBs of the seed affected only MSBs of the
    // state x[].
    const UIntType mask = ~0u;
    x[0] = value & mask;
    for (i = 1; i < n; i++) {
      // See Knuth "The Art of Computer Programming" Vol. 2, 3rd ed., page 106
      x[i] = (1812433253UL * (x[i-1] ^ (x[i-1] >> (w-2))) + i) & mask;
    }
  }

对于 mt19937 UIntTypeuint32_tw 是 32。对于 64 位种子,也许您可​​以使用较低的 32 位来计算每个偶数索引状态 (x) 和高 32 位,使用该算法计算状态的奇数索引。

(尽管这是货物崇拜的建议)

Looking at boost sources of mersenne_twister template:

  void seed(UIntType value)
  {
    // New seeding algorithm from 
    // http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html
    // In the previous versions, MSBs of the seed affected only MSBs of the
    // state x[].
    const UIntType mask = ~0u;
    x[0] = value & mask;
    for (i = 1; i < n; i++) {
      // See Knuth "The Art of Computer Programming" Vol. 2, 3rd ed., page 106
      x[i] = (1812433253UL * (x[i-1] ^ (x[i-1] >> (w-2))) + i) & mask;
    }
  }

For mt19937 UIntType is uint32_t, w is 32. For 64-bit seed, maybe you could use the lower 32 bits to calculate every even index of the state (x) and the higher 32 bits to calculate the odd indices of the state, using that algorithm.

(This is cargo cult suggestion though)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文