如何实现随机的重复洗牌 - 但又不太随机
我有一个包含 n 项的列表。我想要一个算法让我从该集合中随机选择一个可能无限的项目序列,但有几个限制:
- 一旦选择了一个项目,它就不应该在接下来的几个项目中再次出现(比如在接下来的m个项目,其中m显然严格n),
- 您不必等待太长时间才能出现任何项目 - 项目应该;平均出现一次 n 选择
- 序列应该是非循环
的 基本上,我想要一个算法来为打开“随机播放”和“重复”的 MP3 播放器生成播放列表,以确保它不会将同一首歌曲播放得太接近,并确保它均匀地播放所有歌曲,没有明显的模式。
这些约束消除了两个明显的争用解决方案:
- 我们不能简单地选择 rnd(n) 作为下一项的索引,因为这不能保证不重复;挑选某些物品也可能需要很长时间。
- 我们不能只是用 Fisher-Yates 洗牌来预先洗牌列表,然后反复迭代它,每次到达末尾时都重新洗牌;虽然这保证了最多 2n - 1 次选择后出现物品,但它并不能完全防止物品重复出现。
一个简单的解决方案可能是随机挑选,但如果它们出现在最后 m 个挑选中,则拒绝挑选;这意味着保留一个 m 个先前选择的列表,并每次都根据该列表检查每个选择,这使得算法具有不确定性,同时速度很慢 - 双输。除非我遗漏了一些明显的东西......
所以我现在使用的算法我有点不满意。我通过类比一副牌得出了它,其中有一个抽牌堆和一个弃牌堆。我从完整的列表开始,在抽签堆中打乱顺序,丢弃堆是空的。从抽取堆的顶部读取下一个项目,然后将其放入丢弃堆中。一旦弃牌堆达到一定大小(m 件),我就会将其洗牌,然后将其移动到抽牌堆的底部。
这满足了要求,但是每m选择一次的洗牌让我很困扰。通常是 O(1),但在 m 中一次是 O(m)。平均而言,这相当于恒定的时间,但必须有一个更清洁的解决方案来将丢弃的东西洗掉。
在我看来,这是一个如此简单、通用和常见的问题,它一定有双管算法之一,比如 Fisher-Yates 或 Boyer-Moore。但我的 google-fu 显然不强,因为我还没有找到一组术语来定位不可避免的 1973 年 ACM 论文,该论文可能准确地解释了如何在 O(1) 时间内做到这一点,以及为什么我的算法存在严重缺陷以某种方式。
I've got a list of n items. I want an algorithm to let me pick a potentially infinite sequence of items from that collection at random, but with a couple of constraints:
- once an item has been picked, it shouldn't turn up again in the next few items (say in the next m items, with m obviously strictly < n)
- you shouldn't have to wait too long for any item to appear - items should appear on average once every n picks
- the sequence should be non-cyclical
Basically, I want an algorithm to generate the playlist for an MP3 player with 'shuffle' and 'repeat' turned on, that makes sure it doesn't play the same song too close to itself, and makes sure it plays all your songs evenly, with no discernible pattern.
Those constraints eliminate two obvious solutions from contention:
- We can't simply pick rnd(n) as the index for the next item, because that will not guarantee no repetition; it may also take a long time to pick some items.
- We can't just pre-shuffle the list with a Fisher-Yates shuffle, and iterate over it repeatedly, reshuffling it each time we reach the end; while that guarantees items turn up at most after 2n - 1 picks, it doesn't completely prevent an item repeating.
A naive solution might be to pick at random but reject picks if they occurred in the last m picks; that means keeping a list of m previous picks, and checking each pick against that list every time, which makes the algorithm nondeterministic and slow at the same time - lose-lose. Unless I'm missing something obvious..
So I have an algorithm I'm using now which I'm a little dissatisfied with. I've derived it by analogy with a deck of cards, where I have a draw-pile and a discard-pile. I start off with the complete list, shuffled, in the draw pile, the discard pile empty. The next item is read from the top of the draw pile, and then placed in the discard pile. Once the discard pile reaches a certain size (m items) I shuffle it, and move it to the bottom of the draw pile.
This meets the requirement, but that shuffle once every m picks bothers me. It's O(1) normally, but O(m) one time in m. That amounts to constant time, on average, but there must be a cleaner solution that shuffles the discards in as it goes.
It seems to me that this is such a simple, generic, and common problem, it must have one of those double-barreled algorithms, like Fisher-Yates or Boyer-Moore. But my google-fu is clearly not strong, as I've yet to find the set of terms that locates the inevitable 1973 ACM paper which probably explains exactly how to do this in O(1) time, and why my algorithm is deeply flawed in some way.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
要生成列表,请执行以下操作。拥有抽牌堆和弃牌堆。最初,弃牌堆是空的。从抽奖堆中随机选择您的第一个项目。将其添加到播放列表中,然后将其放入丢弃堆中。当丢弃堆中有 m 个物品时,从丢弃堆中取出最后一个物品(最近最少使用的)并将其添加到抽签堆中。播放列表将是随机的,但不需要随机播放。
这是 ruby 中的:
编辑:添加了 playlist_of_length 方法,使您更清楚如何调用 next_item 来生成播放列表
To generate your list do the following. Have a draw and discard pile. Initially the discard pile is empty. Pick your first item at random from the draw pile. Add it to the play list and then put it in the discard pile. When there are m items in the discard pile, take the last item (least recently used) from the discard pile and add it to the draw pile. The playlist will be random, but without shuffle required.
Here it is in ruby:
EDIT: Added playlist_of_length method to make it clearer how you call next_item to generate the playlist
抛开队列算法实现和视觉验证
在 Mathematica 中:
让我们看看没有明显的重复模式:
比较不同的周期长度:
Aside queue algorithm implemententation and visual verification
In Mathematica:
Let's see that there are not obvious repetitive patterns:
Comparing different cycles lengths:
播放给定歌曲后,使用伪追加将其放置在列表末尾附近。您可能需要真正附加大约 1/2 到 2/3,而其他 1/3 到 1/2 分布在列表中最后 5 个左右的项目中。
显然,这不适用于非常短的列表,但对于 10 个或更多的列表应该没问题。
编辑(提供有关“PseudoAppend”的更多详细信息):
以下伪代码使用了多种语言结构的混合,但应该足够容易转化为真实代码。
给定 List[songs]
在没有备份列表的情况下从列表中删除刚刚完成的歌曲可能会导致信息丢失,但这只是旨在传达一个想法的伪代码。
正如你所看到的,刚刚播放的歌曲有 2/3 的时间会移到列表的后面,1/3 的时间会移到最后一首歌曲的前面。
歌曲向前移动的概率为 1/3,其中 2/3 的情况只会移动到一首歌曲之前,另外 1/3 的情况会移动到两首或更多歌曲之前。歌曲移动到最后一个位置的几率=66%,倒数第二个位置=22%,倒数第三个位置=12%。
PseudoAppend 的实际行为全部由
While
语句的条件控制。您可以更改与随机
数字生成器进行比较的值,以增加或减少某首歌曲排在其他歌曲前面的可能性,并且您可以更改与目标
相比的值> 调整刚刚完成的歌曲在列表中前进的距离。Edit II (Python 3 code and sample output for a list of 11 items)
下面是一个示例输出,它运行了该列表约 9 次。第一列只是一个运行索引,第二列显示当前选择的歌曲,第三列显示列表的当前顺序:
After playing a given song, use a pseudo-append to place it near the end of the list. You'll probably want about 1/2 to 2/3 to be truly appended and the other 1/3 to 1/2 spread within the last 5 or so items in the list.
Obviously this won't work for very short lists, but should be fine for lists of 10 or more.
Edit (provide more detail about 'PseudoAppend'):
The following pseudo-code uses a mix of language constructs but should be easy enough to turn into real code.
Given List[songs]
Removing the just-completed song from the list without first having a backup list can result in information loss, but this is just pseudo-code meant to convey an idea.
As you can see, 2/3 of the time the song that was just played will be moved to the back of the list, and 1/3 of the time it will be moved ahead of the last song.
Of the 1/3 chance that the song is moved forward, 2/3 of the time it will only be moved ahead of one song, and the other 1/3 of the time it will be moved ahead of two or more songs. Chance that song moves to last position=66%, second to last position=22%, third to last=12%.
The actual behavior of the PseudoAppend is all governed within the condition of the
While
statement. You can change the value to compare against therandom
number generator to make it more or less likely that a song is moved ahead of others, and you can change the value compared totarget
to adjust how far the just-completed song can move ahead in the list.Edit II (Python 3 code and sample output for a list of 11 items)
Here's a sample output running through the list ~9 times. The first column is simply a running index, the second column shows the currently selected song, and the third column shows the current order of the list:
我的想法是有一个要玩的牌队列。队列被打乱,然后一次播放一个,直到清空。当每张牌被打出时,如果该牌打出的时间少于 m 回合,则将其添加到队列末尾并选择下一张牌。一旦队列被清空,它就可以被再次填充并重新洗牌。数组可用于跟踪最后打出一张牌的回合。每首歌曲的平均运行时间为 O(1)。
这是我在 F# 中的解决方案。
m = 4 时 0 .. 7 交易的一些测试数据。
完整的测试程序。
My idea is to have a queue of cards to be played. The queue is shuffled and then played one at a time until emptied. As each card is being played, if the card was played less than m turns ago add it to the end of the queue and pick the next card. Once the queue is emptied it can be filled again and reshuffled. An array can be used to keep track of what turn a card was last played at. This runs O(1) per song on average.
Here's my solution in F#.
Some test data for a deal of 0 .. 7 with m = 4.
Full test program.