算法在不改变播放的情况下最小化播放列表
我正在寻找一种算法来减少有序但非唯一项目的列表(播放列表)。 搜索集合论,但还没有找到任何合适的
例子
[a, b, b, c] -> [a, b, b, c] Cannot be reduced.
[a, b, b, a, b, b] -> [a, b, b].
[b, b, b, b, b] -> [b].
[b, b, b, b, a] -> [b, b, b, b, a] Cannot be reduced.
考虑获取所有现有的子列表并计算每个实例。 如果存在这样一个子列表,其中计数乘以子列表长度等于原始列表,则取符合此条件的最短子列表。
这看起来有点蛮力,必须有一个更简单/更快的解决方案可用。
Im looking for an algorithm to reduce a list (playlist) of ordered but not unique items.
Searched for set theory but havent found anything suitable yet
Examples
[a, b, b, c] -> [a, b, b, c] Cannot be reduced.
[a, b, b, a, b, b] -> [a, b, b].
[b, b, b, b, b] -> [b].
[b, b, b, b, a] -> [b, b, b, b, a] Cannot be reduced.
Thinking of getting all existing sublists and count each instance.
If there is such a sublist where the count times the sublist length is equal to the orginal list, take the shortest sublist matching this criteria.
This seems a bit brute force, there must be a simpler/faster solution available.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
对于初学者,您不需要检查所有子列表 - 只需检查那些长度是完整列表长度的因子的子列表。
如果您主要关心的是编码简单性而不是原始速度,只需让正则表达式引擎解决问题即可:
这是 Abigail 很棒的 Perl 正则表达式来查找素数。
For starters, you don't need to check all sublists -- just those with lengths that are factors of the length of the full list.
If your main concern is coding simplicity rather than raw speed, just let a regex engine solve the problem:
Which is a variant on Abigail's awesome Perl regex to find prime numbers.
对于每个
n <= N
(其中N
是列表的长度),如果n
是N< 的因子/代码>。如果是,则检查重复前 n 个字符的子列表是否生成原始列表。如果是,那么您就找到了一个潜在的答案(答案是最短的)。这应该会让你的效率低于
O(N^2)
,但在最坏的情况下仍然与暴力破解相同的效率。您可以通过注意以下事项进行一些修剪:例如,如果长度为 2 的子列表成功生成前 4 个字符但未生成完整列表,则长度为 4 的子列表将失败。您可以保留所有此类子列表长度的列表以不检查,这将减少一些计算。
For each
n <= N
(whereN
is the length of the list), ifn
is a factor ofN
. If it is, then check if repeating the sublist of the firstn
characters generates the original list. If it does, then you've found a potential answer (the answer is the shortest). This should get you down to less thanO(N^2)
but still the same order of efficiency as brute force in the worst case.You can do some pruning by noting that, if for example a sublist of length 2 successfully generates the first 4 characters but not the full list, then a sublist of length 4 will fail. You can keep a list of all such sublist lengths to not check and this will cut down on some computation.
用素数对每个集合元素进行编码。
例如:
等等。
现在,您还需要维护两个列表。
第一个列表是素数,第二个列表是它们的指数。
这个想法是;当你偶然发现一个元素时,记录它的素数以及它连续出现的次数。
对于
[a, b, b, c]
,您会得到:可以记录为:
或者,更准确地说:
并且您维护两个列表:
现在,您从末尾迭代这两个列表中间,检查第一个元素 [p]^[e] 是否与最后一个元素相同;如果是,那么第二个是倒数第二个,依此类推...如果所有这些都相同,则可以减少您的列表。
在此示例中,您检查是否
2^1*5^1 == 3^2*3^2
;既然不是,就不能减少。让我们尝试一下
[a, b, b, a, b, b]
:这被编码为
or,
现在,我们检查是否
2^1 * 3^2 == 3^2 * 2^1
(第一个素数,第一个指数乘以最后一个素数,最后一个指数,然后与第二个素数和倒数第二个进行比较)由于这个成立,所以它是可约的。
让我们尝试一下
[b, b, b, b, b]
:这可以编码为
or,
这是一种特殊情况:如果您有 1 个元素列表,那么您的原始列表是可简化的。
让我们尝试一下
[b, b, b, b, a]
:这可以编码为
or,
我们检查是否
3^4 == 2^1
,因为它不是,你的列表是不可缩减的。让我们尝试一下
[a, b, a, b, a, b]
:这可以编码为
or,
尝试上述过程是有效的,因为
2^1 * 3^1 == 3 ^1 * 2^1 == 2^1 * 3^1
因此,算法将如下所示:
将所有数字编码为素数。
迭代列表,创建两个列表并按照描述填充它们
现在您有了两个列表,
p
和e
,它们的长度都是n 执行此操作:
注意:我并没有真正编写该算法并针对各种输入进行尝试。这只是一个想法。
另外,如果列表是可简化的,从它的长度和
n
的长度来看,应该不难看出如何将原始列表简化为其基本形式。第二个注意:如果有人看到上面的错误,请纠正我。可能这一切都不起作用,因为已经很晚了,而且我的注意力也没有达到最佳状态。
Encode every set element with a prime number.
Ex:
etc.
Now, you need two more lists to maintain.
First list is for primes, second is for their exponents.
The idea is; when you stumble upon an element, record it's prime number and how many times in succession it appears.
For
[a, b, b, c]
, you get this:Which can be recorded as:
or, more precisely:
and you maintain two lists:
Now, you iterate through these two lists from ends to middle, by checking if first element [p]^[e] is the same as last element; if it is, then second with next to last and so on... If all of them are the same, your list can be reduced.
In this example, you check if
2^1*5^1 == 3^2*3^2
; and since it's not, it cannot be reduced.Let's try
[a, b, b, a, b, b]
:This is encoded as
or,
Now, we check if
2^1 * 3^2 == 3^2 * 2^1
(first prime, first exponent multiplied with last prime, last exponent, and then compared with second against next to last)Since this holds, it is reducible.
Let's try
[b, b, b, b, b]
:This can be encoded as
or,
This is a special case: if you've got 1 element lists, then your original list is reducible.
Let's try
[b, b, b, b, a]
:This can be encoded as
or,
We check if
3^4 == 2^1
, and since it is not, your list is not reducible.Let's try
[a, b, a, b, a, b]
:This can be encoded as
or,
Trying the above procedure works, because
2^1 * 3^1 == 3^1 * 2^1 == 2^1 * 3^1
So, the algorithm would be something like this:
Encode all numbers to primes.
Iterating through your list, make two lists and populate them as described
Now that you have your two lists,
p
ande
, both of them having lengthn
do this:Note: I didn't really code this algorithm and try it out for various inputs. It's just an idea.
Also, if a list is reducible, from it's length and length of
n
, it shouldn't be too hard to see how to reduce the original list to its basic form.Second note: if anyone sees a mistake above, please correct me. It's possible that nothing of this really works since it's late and my concentration isn't optimal.
这是一些简单的代码,应该以接近线性的时间运行(我认为最坏的情况是 O(n lg lg n),依赖于一些更高的数学)。
本质上,上面执行的是朴素算法,但跳过了一些周期性,这些周期性由于早期的“未遂事件”而已知是不可能的。例如,如果您尝试 5 的周期并看到“aaaabaaaabaaaabaaaabab”,则可以安全地跳过 6、7、...、10,因为我们看到了 4 个 5 重复周期,然后失败。
最终,您最终会完成线性工作量加上 sigma(n) 线性工作量,sigma(n) 是 n 的除数之和,其边界为 O(n lg lg n)。
*请注意,证明这种跳过的正确性非常微妙,我可能在细节上犯了错误——欢迎评论。
Here's some simple code that should run in close to linear time (at worst O(n lg lg n) I think, relying on some higher math).
Essentially, the above performs the naive algorithm, but skips some periodicities which are known to be impossible due to earlier "near-misses". For example, if you try a period of 5 and see "aaaabaaaabaaaabaaaabab", you can safely skip 6, 7,..., 10 since we saw 4 cycles of 5 repeat and then a failure.
Ultimately, you end up doing a linear amount of work plus an amount of work that is linear in sigma(n), the sum of the divisors of n, which is bounded by O(n lg lg n).
*Note that proving the correctness of this skipping is pretty subtle, and I may have made a mistake on the details -- comments welcome.