为什么 java.util.Random 使用掩码？

发布于 2024-10-31 12:18:15 字数 437 浏览 5 评论 0原文

简化（即，排除并发性）Random.next(int bits) 看起来像是

protected int next(int bits) {
    seed = (seed * multiplier + addend) & mask;
    return (int) (seed >>> (48 - bits));
}

使用掩码将种子减少到 48 位。为什么它比仅仅更好

protected int next(int bits) {
    seed = seed * multiplier + addend;
    return (int) (seed >>> (64 - bits));
}

？我读过很多关于随机数的文章，但没有发现这样做的原因。

原文

Simplified (i.e., leaving concurrency out) Random.next(int bits) looks like

protected int next(int bits) {
    seed = (seed * multiplier + addend) & mask;
    return (int) (seed >>> (48 - bits));
}

where masking gets used to reduce the seed to 48 bits. Why is it better than just

protected int next(int bits) {
    seed = seed * multiplier + addend;
    return (int) (seed >>> (64 - bits));
}

? I've read quite a lot about random numbers, but see no reason for this.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

撕心裂肺的伤痛 2024-11-07 12:18:15

原因是较低的位往往具有较低的周期（至少对于 Java 使用的算法而言）

来自 Wikipedia -线性同余发生器：

如上所示，LCG 并不总是使用它们生成的值中的所有位。 Java 实现每次迭代都会生成 48 位，但仅返回这些值中的 32 个最高有效位。这是因为高阶位比低阶位具有更长的周期（见下文）。使用这种技术的 LCG 比不使用这种技术的 LCG 产生更好的值。

编辑：

进一步阅读后（方便地在维基百科上），a、c 和 m 的值必须满足这些条件才能具有完整的周期：

c 和 m 必须互质
a-1 可被 m 的所有质因数整除
如果 m 是 4 的倍数，则 a-1 是 4 的倍数< /p>

我能清楚地看出唯一仍然满意的是 #3。 #1 和 #2 需要检查，我有一种感觉，其中一个（或两个）都失败了。

回复收藏 0 原文

放飞的风筝 2024-11-07 12:18:15

来自 java.util.Random 顶部的文档：

《计算机编程艺术》第 2 卷第 3.2.1 节中进行了描述

该算法在Donald Knuth 的。它是一个 48 位种子、
线性同余公式。

因此整个算法被设计为操作 48 位种子，而不是 64 位种子。我想你可以向 Knuth 先生提出这个问题；p

回复收藏 0 原文

雪化雨蝶 2024-11-07 12:18:15

来自 wikipedia （@helloworld922 发布的引文所暗示的引文）：

LCG 的另一个问题是，如果 m 设置为 2 的幂，则生成序列的低位比特的周期比整个序列的周期短得多。一般来说，序列中的第 n 个最低有效数字以 b 为基数表示输出序列，其中 bk = m 对于某个整数 k，最多重复周期 bn。

此外，它继续（我的斜体）：

当 m 是 2 的幂时，LCG 的低位位无论如何都不应依赖于任何程度的随机性。事实上，简单地用 2n 代替模数项就可以看出低阶位经历的周期非常短。特别是，当 m 为 2 的幂时，任何全周期 LCG 将交替产生奇数和偶数结果。

最后，原因可能是历史性的：Sun 的人们想要可靠地工作的东西，而 Knuth 公式给出了 32 个有效位。请注意 java.util.Random API 是这么说的（我的斜体）：

如果使用相同的种子创建两个 Random 实例，并且对每个实例进行相同的方法调用序列，则它们将生成并返回相同的数字序列。 为了保证这个属性，为 Random 类指定了特定的算法。为了 Java 代码的绝对可移植性，Java 实现必须使用此处为 Random 类显示的所有算法。但是，Random 类的子类允许使用其他算法，只要它们遵守通用契约即可对于所有方法。

所以我们坚持将其作为参考实现。然而，这并不意味着您不能使用另一个生成器（以及 Random 子类或创建新类）：

来自同一维基百科页面：

MMIX，作者：Donald Knuth m=2⁶⁴ a=6364136223846793005 c=1442695040888963407

有一个适合您的 64 位公式。

随机数很棘手（正如 Knuth 指出的那样），根据您的需要，如果您需要 64 位数字，您可能只需调用 java.util.Random 两次并连接这些位就可以了。如果您确实关心统计属性，请使用类似 Mersenne Twister 的东西，或者如果您关心信息泄漏/不可预测性使用java.security.SecureRandom。

From wikipedia (the quote alluded to by the quote that @helloworld922 posted):

A further problem of LCGs is that the lower-order bits of the generated sequence have a far shorter period than the sequence as a whole if m is set to a power of 2. In general, the nth least significant digit in the base b representation of the output sequence, where bk = m for some integer k, repeats with at most period bn.

And furthermore, it continues (my italics):

The low-order bits of LCGs when m is a power of 2 should never be relied on for any degree of randomness whatsoever. Indeed, simply substituting 2n for the modulus term reveals that the low order bits go through very short cycles. In particular, any full-cycle LCG when m is a power of 2 will produce alternately odd and even results.

In the end, the reason is probably historical: the folks at Sun wanted something to work reliably, and the Knuth formula gave 32 significant bits. Note that the java.util.Random API says this (my italics):

If two instances of Random are created with the same seed, and the same sequence of method calls is made for each, they will generate and return identical sequences of numbers. In order to guarantee this property, particular algorithms are specified for the class Random. Java implementations must use all the algorithms shown here for the class Random, for the sake of absolute portability of Java code. However, subclasses of class Random are permitted to use other algorithms, so long as they adhere to the general contracts for all the methods.

So we're stuck with it as a reference implementation. However that doesn't mean you can't use another generator (and subclass Random or create a new class):

from the same Wikipedia page:

MMIX by Donald Knuth m=2⁶⁴ a=6364136223846793005 c=1442695040888963407

There's a 64-bit formula for you.

Random numbers are tricky (as Knuth notes) and depending on your needs, you might be fine with just calling java.util.Random twice and concatenating the bits if you need a 64-bit number. If you really care about the statistical properties, use something like Mersenne Twister, or if you care about information leakage / unpredictability use java.security.SecureRandom.

回复收藏 0 原文