将数字列表分为 2 个等和列表的算法
有一个数字列表。
<前><代码>#示例: >>>que = [2,3,10,5,8,9,7,3,5,2] >>>make_teams(que) 27 27
该列表将被分为 2 个大小相等的列表,并且总和相差最小。 必须打印总数。
对于某些情况,以下代码算法是否存在错误?
我如何优化和/或Python化这个?
def make_teams(que):
que.sort()
if len(que)%2: que.insert(0,0)
t1,t2 = [],[]
while que:
val = (que.pop(), que.pop())
if sum(t1)>sum(t2):
t2.append(val[0])
t1.append(val[1])
else:
t1.append(val[0])
t2.append(val[1])
print min(sum(t1),sum(t2)), max(sum(t1),sum(t2)), "\n"
There is a list of numbers.
The list is to be divided into 2 equal sized lists, with a minimal difference in sum. The sums have to be printed.#Example: >>>que = [2,3,10,5,8,9,7,3,5,2] >>>make_teams(que) 27 27
Is there an error in the following code algorithm for some case?
How do I optimize and/or pythonize this?
def make_teams(que):
que.sort()
if len(que)%2: que.insert(0,0)
t1,t2 = [],[]
while que:
val = (que.pop(), que.pop())
if sum(t1)>sum(t2):
t2.append(val[0])
t1.append(val[1])
else:
t1.append(val[0])
t2.append(val[1])
print min(sum(t1),sum(t2)), max(sum(t1),sum(t2)), "\n"
Question is from http://www.codechef.com/problems/TEAMSEL/
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(14)
动态编程就是您正在寻找的解决方案。
以 [4, 3, 10, 3, 2, 5] 为例:
12 是我们的幸运数字! 回溯得到组:
然后可以计算另一组:{4,3,10,3,2,5} - {5,3,4} = {10,3,2}
所有带有数字的字段都是可能的一袋解决方案。 选择右下角最远的一个。
顺便说一句:这被称为背包问题。
Dynamic programming is the solution you're looking for.
Example with [4, 3, 10, 3, 2, 5]:
12 is our lucky number! Backtracing to get the group:
The other set can then be calculated: {4,3,10,3,2,5} - {5,3,4} = {10,3,2}
All fields with a number are possible solutions for one bag. Choose the one that is furthest in the bottom right corner.
BTW: It's called the knapsack-problem.
新解决方案
这是一种采用启发式剔除的广度优先搜索。 树的深度限制为players/2。 玩家总分限制为总分/2。 玩家池为 100 人时,大约需要 10 秒才能解决。
另请注意,我尝试使用 GS 的描述来解决此问题,但仅通过存储运行总计不可能获得足够的信息。 如果您同时存储了项目数和总数,那么它与此解决方案相同,只是您保留了不必要的数据。 因为你只需要保持n-1和n次迭代最多为numplayers/2。
我有一个基于二项式系数的旧详尽系数(查看历史)。 它很好地解决了长度为 10 的示例问题,但后来我看到比赛的人长度达到了 100。
New Solution
This is a breadth-first search with heuristics culling. The tree is limited to a depth of players/2. The player sum limit is totalscores/2. With a player pool of 100, it took approximately 10 seconds to solve.
Also note that I attempted to solve this using GS's description, but it is impossible to get enough information simply by storing the running totals. And if you stored both the number of items and totals, then it would be the same as this solution except you kept needless data. Because you only need to keep the n-1 and n iterations up to numplayers/2.
I had an old exhaustive one based on binomial coefficients (look in history). It solved the example problems of length 10 just fine, but then I saw that the competition had people of up to length 100.
问:给定一个整数多重集 S,有没有办法将 S 分成两个子集 S1 和 S2,使得数字的总和 S1中的数字等于S2中的数字之和?
A.设置分区问题。
祝你好运。 :)
Q. Given a multiset S of integers, is there a way to partition S into two subsets S1 and S2 such that the sum of the numbers in S1 equals the sum of the numbers in S2?
A.Set Partition Problem.
Best of luck approximating. : )
好吧,您可以在多项式时间内找到百分比精度的解决方案,但要真正找到最佳(绝对最小差异)解决方案,该问题是 NP 完全的。 这意味着该问题不存在多项式时间解。 因此,即使数字列表相对较小,计算量也太大而无法解决。 如果您确实需要解决方案,请查看一些近似算法。
http://en.wikipedia.org/wiki/Subset_sum_problem
Well, you can find a solution to a percentage precision in polynomial time, but to actually find the optimal (absolute minimal difference) solution, the problem is NP-complete. This means that there is no polynomial time solution to the problem. As a result, even with a relatively small list of numbers, it is too compute intensive to solve. If you really need a solution, take a look at some of the approximation algorithms for this.
http://en.wikipedia.org/wiki/Subset_sum_problem
请注意,这也是一种启发式方法,我将排序移出函数。
Note that it is also an heuristic and I moved the sort out of the function.
它实际上是 PARTITION,KNAPSACK 的一个特例。
它是 NP Complete,具有伪多项式 dp 算法。 伪多项式中的伪是指运行时间取决于权重的范围。
一般来说,在接受启发式解决方案之前,您必须首先确定是否存在精确解决方案。
It's actually PARTITION, a special case of KNAPSACK.
It is NP Complete, with pseudo-polynomial dp algorithms. The pseudo in pseudo-polynomial refers to the fact that the run time depends on the range of the weights.
In general you will have to first decide if there is an exact solution before you can admit a heuristic solution.
您的方法不起作用的测试用例是
问题是您成对分析事物,在本例中,您希望所有 50 都位于同一组中。 不过,如果您删除配对分析方面并一次只执行一个条目,那么这个问题应该可以解决。
这是执行此操作的代码
这给出了 27, 27 和 150, 1002 这对我来说是有意义的答案。
编辑:经过审查,我发现这实际上不起作用,尽管最后我不太确定为什么。 不过,我将在这里发布我的测试代码,因为它可能有用。 该测试只是生成具有相等总和的随机序列,将它们放在一起并进行比较(结果令人遗憾)。
编辑#2:基于Unknown,
[87,100,28,67,68,41,67,1]
指出的示例,很明显为什么我的方法不'不工作。 具体来说,要解决此示例,需要将两个最大的数字都添加到同一序列中以获得有效的解决方案。A test case where your method doesn't work is
The problem is that you're analyzing things in pairs, and in this example, you want all the 50's to be in the same group. This should be solved though if you remove the pair analysis aspect and just do one entry at a time.
Here's the code that does this
This give 27, 27 and 150, 1002 which are the answers that make sense to me.
Edit: In review, I find this to not actually work, though in the end, I'm not quite sure why. I'll post my test code here though, as it might be useful. The test just generates random sequence that have equal sums, puts these together and compares (with sad results).
Edit #2: Based in the example pointed out by Unknown,
[87,100,28,67,68,41,67,1]
, it's clear why my method doesn't work. Specifically, to solve this example, the two largest numbers need to both be added to the same sequence to get a valid solution.他们显然正在寻找动态规划背包解决方案。 因此,在我的第一次努力(我认为是一个相当好的原始启发式)和我的第二次努力(一个非常狡猾的精确组合解决方案,适用于较短的数据集,甚至适用于最多 100 个元素的集合,只要 的数量唯一值很低),我最终屈服于同行压力并编写了他们想要的一个(不是太难 - 处理重复条目是最棘手的部分 - 我基于它的底层算法仅在所有输入都是唯一的情况下才有效- 我很高兴long long足够大,可以容纳 50 位!)。
因此,对于我在测试前两项工作时汇总的所有测试数据和尴尬的边缘情况,它给出了相同的答案。 至少对于我用组合求解器检查的那些,我知道它们是正确的。 但我仍然因为一些错误的答案而提交失败!
我并不是要求任何人在这里修复我的代码,但如果有人能找到下面的代码生成错误答案的情况,我将非常感激。
谢谢,
Graham
PS 这段代码确实总是在时间限制内执行,但它距离优化很远。 我会保持简单,直到它通过测试,然后我有一些想法来加快它的速度,也许可以提高 10 倍或更多。
They are obviously looking for a dynamic programming knapsack solution. So after my first effort (a pretty good original heuristic I thought), and my second effort (a really sneaky exact combinatorial solution that worked for shortish data sets, and even for sets up to 100 elements as long as the number of unique values was low), I finally succumbed to peer pressure and wrote the one they wanted (not too hard - handling duplicated entries was the trickiest part - the underlying algorithm I based it on only works if all the inputs are unique - I'm sure glad that long long is big enough to hold 50 bits!).
So for all the test data and awkward edge cases I put together while testing my first two efforts, it gives the same answer. At least for the ones I checked with the combinatorial solver, I know they're correct. But I'm still failing the submission with some wrong answer!
I'm not asking for anyone to fix my code here, but I would be very greatful if anyone can find a case for which the code below generates the wrong answer.
Thanks,
Graham
PS This code does always execute within the time limit but it is far from optimised. i'm keeping it simple until it passes the test, then I have some ideas to speed it up, maybe by a factor of 10 or more.
为了提高性能,您可以通过用运行总计替换append()和sum()来节省计算。
For performance you save computations by replacing append() and sum() with running totals.
您可以使用以下方法收紧循环:
You could tighten your loop up a little by using the following:
由于列表必须相等,所以问题根本不是 NP 。
我使用模式 t1<-que(1st, last), t2<-que(2nd, last-1) ... 编辑 分割排序列表
:我认为这也是一个错误的方法。 结果错误!
Since the lists must me equal the problem isn't NP at all.
I split the sorted list with the pattern t1<-que(1st, last), t2<-que(2nd, last-1) ...
Edit: I suppose that this is also a wrong method. Wrong results!
经过一番思考,对于不太大的问题,我认为最好的启发式方法是:
如果问题更大,您可以调整 nb_iter 。
它几乎总是解决上述所有问题。
After some thinking, for not too big problem, I think that the best kind of heuristics will be something like:
You can adjust nb_iter if the problem is bigger.
It solves all the problem mentioned above mostly all the times.
在之前的评论中,我假设所设置的问题是可以处理的,因为他们仔细选择了测试数据,以便在分配的时间内与各种算法兼容。 事实证明并非如此——相反,这是问题的限制——数字不高于450,并且最终集合不超过50个数字是关键。 这些与使用我在后面的文章中提出的动态编程解决方案解决问题兼容。 其他算法(启发式算法或组合模式生成器的详尽枚举)都不可能起作用,因为会有足够大或足够难的测试用例来破坏这些算法。 说实话,这相当烦人,因为其他解决方案更具挑战性,当然也更有趣。 请注意,在没有大量额外工作的情况下,动态编程解决方案只是说明对于任何给定的总和是否可以使用 N/2 找到解决方案,但它不会告诉您任一分区的内容。
In an earlier comment I hypothesized that the problem as set was tractable because they had carefully chosen the test data to be compatible with various algorithms within the time alloted. This turned out not to be the case - instead it is the problem constraints - numbers no higher than 450 and a final set no larger than 50 numbers is the key. These are compatible with solving the problem using the dynamic programming solution I put up in a later post. None of the other algorithms (heuristics, or exhaustive enumeration by a combinatorial pattern generator) can possibly work because there will be test cases large enough or hard enough to break those algorithms. It's rather annoying to be honest because those other solutions are more challenging and certainly more fun. Note that without a lot of extra work, the dynamic programming solution just says whether a solution is possible with N/2 for any given sum, but it doesn't tell you the contents of either partition.