How do I adjust the distribution of values in a random data stream?

Posted 2024-08-24 06:11:55

Given an infinite stream of random 0's and 1's that comes from a biased (e.g. 1's are more common than 0's by a known factor) but otherwise ideal random number generator, I want to convert it into a (shorter) infinite stream that is just as ideal but also unbiased.

Looking up the definition of entropy turns up this graph, which shows how many bits of output I should, in theory, be able to get from each bit of input.

[Figure: entropy of a coin flip, in bits, versus the fairness of the coin]
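For reference, the curve in that graph is the binary entropy function, H(p) = -p log2(p) - (1 - p) log2(1 - p). Here is a minimal Python sketch of it, assuming p is the probability that an input bit is 1. The name `binary_entropy` is my own choice for illustration and does not come from the question.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a coin that shows 1 with probability p."""
    if p in (0.0, 1.0):
        # A certain outcome carries no information (and log2(0) is undefined).
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

The value returned is the theoretical upper bound on unbiased output bits per input bit, so any practical converter can be compared against `binary_entropy(p)` for the known bias `p`.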

The question: Is there any practical way to actually implement a converter that is nearly ideally efficient?

聚集的泪 2024-08-31 06:11:55

There is a well-known device due to von Neumann for turning an unfair coin into a fair coin. We can use it to solve the problem here.

Repeatedly draw two bits from your biased source until you obtain a pair whose bits differ. Then return the first bit and discard the second. This produces an unbiased source. It works because, as long as the bits are independent and the bias is fixed, a 01 is exactly as probable as a 10: if a 1 occurs with probability p, each of the two pairs occurs with probability p(1 - p). Therefore the probability of a 0 conditional on 01 or 10 is 1/2, and the probability of a 1 conditional on 01 or 10 is 1/2.
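The procedure can be sketched in Python roughly as follows. This is only an illustration under the stated assumption of independent bits with a fixed bias; `von_neumann_extract` and `biased_bits` are hypothetical helper names, and `biased_bits` just simulates a biased source for trying the idea out.

```python
import random
from typing import Iterable, Iterator


def von_neumann_extract(bits: Iterable[int]) -> Iterator[int]:
    """Yield unbiased bits from a stream of independent, identically biased bits."""
    it = iter(bits)
    # zip(it, it) walks the stream in non-overlapping pairs, lazily,
    # so this also works on an infinite generator.
    for first, second in zip(it, it):
        if first != second:
            # 01 and 10 are equally likely, so the first bit is a fair coin.
            yield first


def biased_bits(p_one: float, n: int, seed: int = 0) -> Iterator[int]:
    """Simulated source (an assumption for the demo): n bits, each 1 with probability p_one."""
    rng = random.Random(seed)
    return (1 if rng.random() < p_one else 0 for _ in range(n))


if __name__ == "__main__":
    out = list(von_neumann_extract(biased_bits(0.8, 200_000)))
    print(len(out), sum(out) / len(out))
```

Note that each pair yields an output bit with probability 2p(1 - p), so the output rate is p(1 - p) bits per input bit. That is at most 1/4, which is well below the entropy bound the question asks about, so this scheme is simple and exactly unbiased but not close to ideally efficient.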

眼眸里的快感 2024-08-31 06:11:55

Huffman-encode the input.

Given that the input has a known bias, you can compute a probability distribution over the checksums of each n-bit segment. From that, construct a Huffman code and then just encode the sequence.

I'm not sure, but one potential problem is that this might introduce some correlation between sequential bits.
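One way to read this suggestion is sketched below in Python. It is an interpretation, not necessarily what the answer intends: it assumes independent bits with known probability `p_one` of a 1, and it builds the code over the n-bit blocks themselves rather than over a checksum. The names `block_probabilities`, `huffman_code` and `encode` are hypothetical.

```python
import heapq
from itertools import product


def block_probabilities(p_one: float, n: int) -> dict:
    """Probability of every n-bit block, assuming independent bits."""
    return {
        blk: p_one ** sum(blk) * (1 - p_one) ** (n - sum(blk))
        for blk in product((0, 1), repeat=n)
    }


def huffman_code(probs: dict) -> dict:
    """Build a binary Huffman code {symbol: codeword string}."""
    # Heap entries are (probability, tiebreak, partial code); the unique
    # integer tiebreak keeps Python from ever comparing the dicts.
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, code0 = heapq.heappop(heap)  # two least probable subtrees
        p1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in code0.items()}
        merged.update({s: "1" + w for s, w in code1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]


def encode(bits: list, code: dict, n: int) -> str:
    """Encode complete n-bit blocks; a trailing partial block is dropped."""
    return "".join(
        code[tuple(bits[i:i + n])] for i in range(0, len(bits) - n + 1, n)
    )


if __name__ == "__main__":
    code = huffman_code(block_probabilities(0.9, 4))
    print(encode([1, 1, 1, 1, 1, 0, 1, 1], code, 4))
```

Larger blocks bring the average codeword length closer to the entropy, but the output bits are exactly fair only when every block probability is a power of 1/2. For other biases some residual bias or correlation should be expected, which matches the caveat in the answer.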
