如何量化伪随机数生成器的质量?

发布于 2024-07-13 06:16:36 字数 696 浏览 7 评论 0原文

这是基于这个问题。 提出了许多产生不均匀分布的答案,我开始想知道如何量化输出的不均匀性。 我不是在寻找模式问题,只是在寻找单一值方面。

接受的程序是什么?


我目前的想法是通过计算每个值的熵来计算每次调用的平均香农熵取加权平均值。 然后可以将其与预期值进行比较。

我担心的是

  1. 这是正确的吗?
  2. 如何在不损失精度的情况下计算这些值?

对于#1,我想知道我是否正确。

对于#2,我担心的是,我将处理像 1/7 +/- 1e-18 这样的数字,并且我担心浮点错误会因为除了最小的问题之外的任何问题而杀死我。 计算的确切形式可能会导致这里出现一些重大差异,我似乎记得对于某些特殊的日志情况有一些 ASM 选项,但我似乎找不到有关此的文档。


在这种情况下,使用的是针对范围[1,n]采用“好的”PRNG并为范围[1,m]生成SRNG。 问题是结果比输入差多少?

我所得到的是每个输出值的预期发生率。

This is based on this question. A number of answers were proposed that generate non-uniform distributions and I started wondering how to quantify the non uniformity of the output. I'm not looking for patterning issues, just single value aspects.

What are the accepted procedures?


My current thinking is to computer the average Shannon entropy per call by computing the entropy of each value and taking a weighted average. This can then be compered to the expected value.

My concerns are

  1. Is this correct?
  2. How to compute these value without loosing precision?

For #1 I'm wondering if I've got it correct.

For #2 the concern is that I would be processing numbers with magnitudes like 1/7 +/- 1e-18 and I'm worried that the floating point errors will kill me for any but the smallest problems. The exact form of the computation could result in some major differences here and I seem to recall that there are some ASM options for some special log cases but I can't seem to find the docs about this.


In this case the use is take a "good" PRNG for the range [1,n] and generate a SRNG for the range [1,m]. The question is how much worse is the results than the input?

What I have is expected occurrence rates for each output value.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

甜心 2024-07-20 06:16:36

NIST 拥有一组文档和工具,用于跨各种指标对随机数生成器进行统计分析。

http://csrc.nist.gov/groups/ST/toolkit/ rng/index.html

其中许多测试也被纳入 Dieharder PRNG 测试套件中。

http://www.phy.duke.edu/~rgb/General/ rand_rate.php

有大量不同的指标,因为有很多很多不同的方法来使用 PRNG。 您无法在真空中分析 PRNG - 您必须了解用例。 这些工具和文档提供了大量信息来帮助您,但最终您仍然必须了解自己实际需要什么,然后才能确定算法是否合适。 NIST 文档非常详尽,尽管有些密集。

-亚当

NIST has a set of documents and tools for statistically analyzing random number generators cross a variety of metrics.

http://csrc.nist.gov/groups/ST/toolkit/rng/index.html

Many of these tests are also incorporated into the Dieharder PRNG test suite.

http://www.phy.duke.edu/~rgb/General/rand_rate.php

There are a ton of different metrics, because there are many, many different ways to use PRNGs. You can't analyze a PRNG in a vacuum - you have to understand the use case. These tools and documents provide a lot of information to help you in this, but at the end of the day you'll still have to understand what you actually need before you can determine of the algorithm is suitable. The NIST documentation is thorough, if somewhat dense.

-Adam

⊕婉儿 2024-07-20 06:16:36

此页面讨论了一种检查您是否成绩不佳的方法分布:在字段中绘制伪随机值,然后查看它们。

This page discusses one way of checking if you are getting a bad distribution: plotting the pseudo-random values in a field and then just looking at them.

玩世 2024-07-20 06:16:36

TestU01 拥有比 Dieharder 更严格的测试集。 最大的测试集称为“BigCrush”,但执行时间较长,因此还有称为“Crush”和“SmallCrush”的子集。 我们的想法是首先尝试 SmallCrush,如果 PRNG 通过了该测试,则尝试 Crush,如果通过了该测试,则尝试 BigCrush。 如果它也通过了,那应该就足够了。

您可以此处获取 TestU01。

TestU01 has an even more exacting test set than Dieharder. The largest test set is called "BigCrush", but it takes a long time to execute, so there are also subsets called just "Crush" and "SmallCrush". The idea is to first try SmallCrush, and if the PRNG passes that, try Crush, and if it passes that, BigCrush. If it passes that too, it should be good enough.

You can get TestU01 here.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文