使用 PRNG 而不是打乱生成打乱范围
是否有任何已知的算法可以在给定任意种子值的情况下在线性时间和恒定空间(当迭代产生输出时)生成打乱范围 [0..n)?
假设n可能很大,例如数百万,因此不需要潜在地产生每种可能的排列的要求,尤其是因为它是不可行的(种子值空间需要很大)。 这也是需要恒定空间的原因。 (所以,我特别不寻找数组洗牌算法,因为这要求范围存储在长度为 n 的数组中,因此将使用线性空间。)
我知道 问题 162606,但它没有给出这个特定问题的答案 - 从排列索引到该问题中给出的排列将需要巨大的种子值空间。
理想情况下,它就像一个周期和范围为 n 的 LCG >,但是为 LCG 选择 a
和 c
的艺术是微妙的。 简单地满足全周期 LCG 中的 a
和 c
约束可能会满足我的要求,但我想知道是否有更好的想法。
Is there any known algorithm that can generate a shuffled range [0..n) in linear time and constant space (when output produced iteratively), given an arbitrary seed value?
Assume n may be large, e.g. in the many millions, so a requirement to potentially produce every possible permutation is not required, not least because it's infeasible (the seed value space would need to be huge). This is also the reason for a requirement of constant space. (So, I'm specifically not looking for an array-shuffling algorithm, as that requires that the range is stored in an array of length n, and so would use linear space.)
I'm aware of question 162606, but it doesn't present an answer to this particular question - the mappings from permutation indexes to permutations given in that question would require a huge seed value space.
Ideally, it would act like a LCG with a period and range of n
, but the art of selecting a
and c
for an LCG is subtle. Simply satisfying the constraints for a
and c
in a full period LCG may satisfy my requirements, but I am wondering if there are any better ideas out there.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
根据 Jason 的回答,我用 C# 做了一个简单直接的实现。 找到大于 N 的下一个最大的二的幂。这使得生成 a 和 c 变得微不足道,因为 c 需要是互质的(意味着它不能被 2 整除,也称为奇数),并且 (a-1) 需要能被 2 整除,(a-1) 需要能被 4 整除。从统计上看,应该需要 1-2 个同余才能生成下一个数字(因为 2N >= M >= N)。
Based on Jason's answer, I've made a simple straightforward implementation in C#. Find the next largest power of two greater than N. This makes it trivial to generate a and c, since c needs to be relatively prime (meaning it can't be divisible by 2, aka odd), and (a-1) needs to be divisible by 2, and (a-1) needs to be divisible by 4. Statistically, it should take 1-2 congruences to generate the next number (since 2N >= M >= N).
下面是 线性同余生成器的 Python 实现,来自 FryGuy 的回答。 因为无论如何我都需要写它并且认为它可能对其他人有用。
Here is a Python implementation of the Linear Congruential Generator from FryGuy's answer. Because I needed to write it anyway and thought it might be useful for others.
听起来你想要一个保证产生从 0 到 n-1 的循环而没有任何重复的算法。 几乎肯定有一大堆,具体取决于您的要求; 如果您想深入研究其背后的理论,群论将是数学中最有帮助的分支。
如果您想要快速且不关心可预测性/安全性/统计模式,LCG 可能是最简单的方法。 您链接到的维基百科页面包含这组(相当简单的)要求:
或者,您可以选择周期 N >= n,其中 N 是具有方便数值属性的最小值,并丢弃 n 和 N-1 之间产生的任何值。 例如,最低的 N = 2k - 1 >= n 将允许您使用 线性反馈移位寄存器 (LFSR)。 或者找到您最喜欢的加密算法(RSA、AES、DES 等等)并给定一个特定的密钥,计算出它排列的数字的空间 N,并为每个步骤应用一次加密。
如果 n 很小,但您希望安全性很高,这可能是最棘手的情况,因为任何序列 S 的周期 N 都可能比 n 高得多,但导出周期较短的非重复数字序列也很重要(例如,如果您可以获取 S mod n 的输出并保证数字序列不重复,那么这将提供攻击者可能使用的有关 S 的信息)
Sounds like you want an algorithm which is guaranteed to produce a cycle from 0 to n-1 without any repeats. There are almost certainly a whole bunch of these depending on your requirements; group theory would be the most helpful branch of mathematics if you want to delve into the theory behind it.
If you want fast and don't care about predictability/security/statistical patterns, an LCG is probably the simplest approach. The wikipedia page you linked to contains this (fairly simple) set of requirements:
Alternatively, you could choose a period N >= n, where N is the smallest value that has convenient numerical properties, and just discard any values produced between n and N-1. For example, the lowest N = 2k - 1 >= n would let you use linear feedback shift registers (LFSR). Or find your favorite cryptographic algorithm (RSA, AES, DES, whatever) and given a particular key, figure out the space N of numbers it permutes, and for each step apply encryption once.
If n is small but you want the security to be high, that's probably the trickiest case, as any sequence S is likely to have a period N much higher than n, but is also nontrivial to derive a nonrepeating sequence of numbers with a shorter period than N. (e.g. if you could take the output of S mod n and guarantee nonrepeating sequence of numbers, that would give information about S that an attacker might use)
请参阅我的文章 使用分组密码进行安全排列是实现此目的的一种方法。
See my article on secure permutations with block ciphers for one way to do it.
查看线性反馈移位寄存器,它们正是可以用于此目的。
解释它们的简短方法是,您从种子开始,然后使用公式进行迭代,
其中 f(x) 只能返回 0 或 1。
如果您选择一个好的函数
f
,x 将循环遍历以良好的伪随机方式计算 1 到 2^n-1 之间的所有值(其中 n 是某个数字)。示例函数可以在此处找到,例如您可以使用 63 个值
Look into Linear Feedback Shift Registers, they can be used for exactly this.
The short way of explaining them is that you start with a seed and then iterate using the formula
where f(x) can only return 0 or 1.
If you choose a good function
f
, x will cycle through all values between 1 and 2^n-1 (where n is some number), in a good, pseudo-random way.Example functions can be found here, e.g. for 63 values you can use