An algorithm for sorting and grouping a list of weighted objects
I have a number of chunks of data.
For arguments sake we'll say they are
File 1 - 150Kb
File 2 - 50Kb
File 3 - 70Kb
File 4 - 60Kb
File 5 - 70Kb
File 6 - 100Kb
File 7 - 90Kb
For transmission, I can package these up into a max payload of 300Kb.
If you just iterate through them in order you'd get
TRANS1: 150Kb + 50Kb + 70Kb = 270Kb - the next file puts you over the limit of 300Kb
TRANS2: 60Kb + 70Kb + 100Kb = 230Kb - the next file puts you over the limit of 300Kb
TRANS3: 90Kb
So three separate transmissions.
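For illustration, that in-order packing can be sketched like this (Python used only as pseudo-code; `pack_in_order` is a made-up name, sizes in Kb):

```python
def pack_in_order(sizes, limit):
    """Greedily fill each payload with the next files in sequence,
    starting a new payload whenever the next file would go over."""
    payloads = [[]]
    for size in sizes:
        if sum(payloads[-1]) + size > limit:
            payloads.append([])  # current payload is full: start a new one
        payloads[-1].append(size)
    return payloads

print(pack_in_order([150, 50, 70, 60, 70, 100, 90], 300))
# → [[150, 50, 70], [60, 70, 100], [90]]
```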
But if you re-organise them, you can send
Files 1,2,6 together = 300Kb
Files 3,4,5,7 together = 290Kb
So you cut the number of separate transmissions needed. Since there's a monetary cost associated with each transmission (these transmissions are actually API calls to a 3rd party system, where we're billed per API call), we'd like to keep the number of separate payload sends to a minimum.
Is there any sorting algorithm out there around this kind of optimization, one that will take a list of weighted objects and sort/group them so you end up with the fewest number of groups?
For reference, I'd be coding this up in .NET, but if you can provide an example in another language or a pseudo-code implementation/description, that'd be great also.
Thanks,
Eoin C
Your problem is exactly the Bin Packing problem, which is unfortunately NP-complete :(
If the number of packets is pretty small, you can brute-force every possible combination.
Otherwise, the dynamic programming solution I propose will not give the optimal answer, because it assumes you always group consecutive packets. But it will run fast and give something close to the optimum. I use recursion with memoization. It is best to sort the packets in increasing order at the beginning.
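A minimal sketch of that memoized, consecutive-grouping recursion (in Python, since the question welcomes other languages; the function name is illustrative):

```python
from functools import lru_cache

def min_consecutive_payloads(sizes, limit):
    """Minimum number of payloads when only consecutive packets
    (after sorting) may be grouped into the same payload."""
    sizes = sorted(sizes)  # sort in increasing order first, as suggested
    n = len(sizes)

    @lru_cache(maxsize=None)
    def best(i):
        # Fewest payloads needed for sizes[i:], assuming each single
        # packet fits within the limit on its own.
        if i == n:
            return 0
        total, answer = 0, float("inf")
        for j in range(i, n):  # try ending the current payload at index j
            total += sizes[j]
            if total > limit:
                break
            answer = min(answer, 1 + best(j + 1))
        return answer

    return best(0)
```

On the question's example ([150, 50, 70, 60, 70, 100, 90] with a 300Kb limit) this returns 3 rather than the optimal 2, which illustrates the caveat: restricting groups to consecutive packets can miss the best arrangement.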
What you are looking for is the knapsack algorithm or a modification of it.
The problem is not solvable efficiently, but you can get close to the perfect solution.
You may use this Google search for implementations.
This reminds me of the Knapsack problem. I have not followed up on it, but I remember it is an NP-complete problem: read 'hard'. You basically need to find a compromise between "right" and "fast".
That does not mean impossible, but this is clearly an area where perfection is the enemy of good.
I would start with the "Big Rocks First" approach and run a couple of simulations. Keep the best solution. Use a bounded time to break the search off and use the best one found so far.
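For what it's worth, the "Big Rocks First" idea matches the classic first-fit decreasing heuristic for bin packing; a minimal Python sketch (names are illustrative, not a definitive implementation):

```python
def first_fit_decreasing(sizes, limit):
    """'Big rocks first': place each item, largest first, into the
    first payload with room, opening a new payload when none fits."""
    payloads = []
    for size in sorted(sizes, reverse=True):
        for payload in payloads:
            if sum(payload) + size <= limit:
                payload.append(size)
                break
        else:  # no existing payload had room
            payloads.append([size])
    return payloads

print(first_fit_decreasing([150, 50, 70, 60, 70, 100, 90], 300))
# → [[150, 100, 50], [90, 70, 70, 60]]
```

On the question's example it happens to find the optimal two payloads (300Kb and 290Kb), though in general first-fit decreasing is only guaranteed to come within roughly 11/9 of the optimal payload count.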