位和字节:存储洗牌指令
给定一个长度为 2 的字节数组,我们有两种洗牌的可能性。 01
和 10
长度为 3 将允许这些随机播放选项 012
、021
、102
代码>、<代码>120、<代码>102、<代码>201、<代码>210。总共 2x3=6 个选项。
长度为 4 将有 6x4=24。长度为 5 将有 24x5=120 个选项,等等。
因此,一旦您随机选择了这些随机选项之一,您如何存储它?您可以存储 23105
来指示如何打乱四个字节。但这需要 5x3=15 位。我知道它可以用 7 位完成,因为只有 120 种可能性。
有什么想法如何更有效地存储洗牌指令吗?它应该是一个可以扩展长度的算法。
编辑:在发布之前请参阅下面的我自己的答案一个新的。我确信许多已经存在的答案中都有很好的信息,但我只是无法理解其中的大部分内容。
Given a byte array with a length of two we have two possibilities for a shuffle. 01
and 10
A length of 3 would allow these shuffle options 012
,021
,102
,120
,102
,201
,210
. Total of 2x3=6 options.
A length of 4 would have 6x4=24. Length of 5 would have 24x5=120 options, etc.
So once you have randomly picked one of these shuffle options, how do you store it? You could store 23105
to indicate how to shuffle four bytes.. But that takes 5x3=15 bits. I know it can be done in 7 bits because there are only 120 possibilities.
Any ideas how to more efficiently store a shuffle instruction? It should be an algorithm that will scale in length.
Edit: See my own answer below before you post a new one. I am sure that there is good information in many of these already existing answers, but I just could not understand much of it.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
如果您对要打乱的元素集有良好排序,那么您可以为所有排列的集合创建一个良好排序,并仅存储一个表示排列在顺序中的位置的整数。
例子:
打乱
1 4 5
:可能性是:要存储排列
415
,您只需存储2(零索引)。如果您对原始元素集有良好的排序,则可以通过从最左边的元素从最小顺序到最大顺序迭代元素,同时迭代下一个元素的剩余元素来对排列集进行良好排序。向右放置,依此类推,直到到达最右边的元素。您不需要存储这个数组,您只需要能够再次以相同的顺序生成排列以“解包”存储的整数。
然而,尝试一一生成所有排列将花费超出最小集合的相当多的时间。您可以观察到第一个
(N-1)!
排列从第一个元素开始,第二个(N-1)!
排列从第二个元素开始,然后,对于以特定元素开头的每个排列,第一个(N-2)!
排列从第一个剩余元素开始,依此类推。这将允许您“打包”或“解包”O(n)
中的元素,但实际生成阶乘以及任意长度整数的除法和模数的复杂性除外,这将是相当大的。If you have a well-ordering of the set of elements you are shuffling, then you can create a well-ordering for the set of all the permutations and just store a single integer representing which place in the order a permutation falls.
Example:
Shuffling
1 4 5
: the possibilities are:To store the permutation
415
, you would just store 2 (zero indexed).If you have a well-ordering for the original set of elements, you can make a well-ordering for the set of permutations by iterating through the elements from least order to greatest for the leftmost element, while iterating through the leftover elements for the next place to the right and so on until you get to the rightmost element. You wouldn't need to store this array, you would just need to be able to generate the permutations in the same order again to "unpack" the stored integer.
However, attempting to generate all the permutations one by one will take a considerable amount of time beyond the smallest of sets. You can use the observation that the first
(N-1)!
permutations start with the 1st element, the second(N-1)!
permutations start with the second element, then for each permutation that starts with a specific element, the 1st(N-2)!
permutations start with the first of the leftover elements and so on and so forth. This will allow you to "pack" or "unpack" the elements inO(n)
, excepting the complexity of actually generating the factorials and the division and modulus of arbitrary length integers, which will be somewhat substantial.您是对的,要仅存储数据的排列,而不是数据本身,您只需要 ceil(log2(permutations)) 那么多的位。对于 N 个项目,排列的数量为阶乘 (N) 或 N!,因此您需要 ceil(log2(factorial(N))) 位来仅存储 N 个项目的排列,而不存储项目。
无论您熟悉哪种语言,都应该有一种现成的方法来制作一个 M 位的大数组,用排列填充它,然后将其存储在存储设备上。
You are right that to store just a permutation of data, and not the data itself, you will need only as many bits as ceil(log2(permutations)). For N items, the number of permutations is factorial(N) or N!, so you would need ceil(log2(factorial(N))) bits to store just the permutation of N items without also storing the items.
In whatever language you're familiar, there should be a ready way to make a big array of M bits, fill it up with a permutation, and then store it on a storage device.
一种常见的洗牌算法,也是少数无偏的算法之一,是 Fisher-Yates 洗牌。该算法的每次迭代都会采用一个随机数,并根据该数字交换两个位置。通过存储这些随机数的列表,您可以稍后重现完全相同的排列。
此外,由于每个数字的有效范围都是预先知道的,因此您可以通过将每个数字乘以较低数字的有效范围的乘积将它们全部打包成一个大整数,就像一种可变基数位置表示法。
A common shuffling algorithm, and one of the few unbiased ones, is the Fisher-Yates shuffle. Each iteration of the algorithm takes a random number and swaps two places based on that number. By storing a list of those random numbers, you can later reproduce the exact same permutation.
Furthermore, since the valid range for each of those numbers is known in advance, you can pack them all into a big integer by multiplying each number by the product of the lower number's valid ranges, like a kind of variable-base positional notation.
对于 L 个项目的数组,为什么不将顺序打包到 L*ceil(log2(L)) 位中? (ceil(log2(L)) 是保存 L 个唯一值所需的位数)。 例如,以下是“未洗牌”洗牌的表示,按顺序取出项目
:与@user470379的答案相比,该方案的优点是提取索引非常容易,只需移位和掩码即可。无需重新生成排列表。这对于大 L 来说应该是一个巨大的胜利:(对于 128 个项目,有 128!= 3.8562e+215 种可能的排列)。
(排列 == “可能性”;阶乘 =
L!= L * (L-1) * ... * 1
= 正是您计算可能性的方式)方法也不比存储排列索引大多少。您可以将 128 个项目随机存储在 1024 位(32 x 32 位整数)中。存储 128! 需要 717 位(23 个整数)。
考虑到更快的解码速度和计算排列不需要临时存储的事实,存储额外的 9 个整数可能是非常值得的。
这是 Ruby 中的一个实现,它应该适用于任意大小。 “随机播放指令”包含在数组
指令
中。第一部分使用@Theran提到的Fisher-Yates算法的版本来计算洗牌。这是将洗牌应用于字符数组的示例:
希望这不会太难转换为javascript或任何其他语言。
For an array of L items, why not pack the order into L*ceil(log2(L)) bits? (ceil(log2(L)) is the number of bits needed to hold L unique values). For example, here is the representation of the "unshuffled" shuffle, taking the items in order:
The main advantage to this scheme compared to @user470379's answer, is that it is really easy to extract the indexes, just shift and mask. No need to regenerate the permutation table. This should be a big win for large L: (For 128 items, there are 128! = 3.8562e+215 possible permutations).
(Permutations == "possibilities"; factorial =
L! = L * (L-1) * ... * 1
= exactly the way you are calculating possibilities)This method also isn't that much larger than storing the permutation index. You can store a 128 item shuffle in 1024 bits (32 x 32-bit integers). It takes 717 bits (23 ints) to store 128!.
Between the faster decoding speed and the fact that no temporary storage is required for caclulating the permutation, storing the extra 9 ints may be well worth their cost.
Here is an implementation in Ruby that should work for arbitrary sizes. The "shuffle instruction" is contained in the array
instruction
. The first part calculates the shuffle using a version of the Fisher-Yates algorithm that @Theran mentionedHere is an example of applying the shuffle to an array of characters:
Hopefully this won't be too hard to convert to javascript, or any other language.
如果之前的答案已经涵盖了这一点,我很抱歉,但这是第一次,这些答案对我来说完全陌生。我可能会提到,我了解 Java 和 JavaScript,但我对数学一无所知……所以
log2
、排列、
factorial
、well-ordering
对我来说都是陌生的词。最重要的是,我最终(再次)使用 StackOverflow 作为白板来写下我的问题,并在 20 分钟后回答了我脑海中的问题。我陷入了与计算机无关的生活中,而且,在了解 StackOverflow 后,我发现现在要节省每个人容易浪费的 20% 以上的时间已经太晚了。
无论如何,迷失在所有三个现有答案中。这是我知道的答案
(用 Javascript 编写,但应该很容易将 20 行外国代码翻译成您选择的语言)
(在此处查看它的实际操作:http://jsfiddle.net/M3vHC)
编辑: 感谢 AShelly 的捕捉:这将失败(变得高度有偏差)当给定的密钥长度超过 12 时,假设您的整数是 32 位(如果您的整数是 64 位,则超过 19)
I am sorry if this was already covered in a previous answer,, but for the first time,, these answers are completely foreign to me. I might have mentioned that I know Java and JavaScript and that I know nothing of mathematics... So
log2
,permutations
,factorial
,well-ordering
are all unknown words to me.And on top of that I ended up (again) using StackOverflow as a white board to write out my question and answered the question in my head 20 minutes later. I was tied up in non computer life and,, knowing StackOverflow I figured it was too late to save more than 20% of everybody's easily wasted time.
Anyway, having gotten lost in all three existing answers. Here is the answer I know of
(written in Javascript but it should be easy to translate 20 lines of foreign code to your language of choice)
(see it in action here: http://jsfiddle.net/M3vHC)
Edit: Thanks to AShelly for this catch: This will fail (become highly biased) when given a key length of more than 12 assuming your ints are 32 bit (more than 19 if your ints are 64 bit)