C#:尽可能高效地将大量文件放入 DVD 的代码
我需要编写一个应用程序,它将获取文件列表(有些大,有些小)并尽可能有效地将它们放入 DVD(或 CD,或其他)。此应用程序的全部要点是在移动到第二个光盘之前用尽第一张光盘,在移动到第三个光盘之前尽可能多地填充第二个光盘,依此类推。
(注意:该应用程序没有实际刻录到 DVD 时,只需找出最佳的匹配方式)。
我最初认为我有一个很好的游戏计划,通过生成文件的排列,然后检查每个组合以查看最适合的组合。 (我对此的帮助请求可以找到 这里)
但是文件越多,花费的时间就越长......呈指数级增长。所以我想听听你们对如何最好地实现这一目标的一些意见。
有什么想法吗?而且,一如既往,C# 代码始终受到赞赏。
I need to write an application that will take a list of files (some large, some small) and fit them onto DVDs (or CDs, or whatever) as efficiently as possible. The whole point of this application is to use up as much of the 1st disc before moving onto the 2nd disc, filling the 2nd disc up as much as possible before moving onto the 3rd disc, etc.
(Note: The application doesn't have to do the actual burning to the DVD, it just has to figure out the best possible fit).
I initially thought I had a good game-plan by generating a permutation of the files and then checking each combination to see what fits the best. (My request for help on this can be found HERE)
But the more files there are, the longer it takes... exponentially. So I wanted some of your opinions on how to best achieve this.
Any ideas? And, as always, C# code is always appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
您遇到的问题与背包问题有关。链接的维基百科页面有更多信息,包括建议的解决方法。
What you're facing is related to the knapsack problem. The linked wikipedia page has lots more information, including suggested ways of solving it.
简单算法:
Simple algorithm:
对于任何仍然对这个问题感兴趣的人......
我编写了一个实用程序,用于将文件装入一组磁盘/光盘的类似目的。
它使用基于命令行/文件的界面。
版本有 C、C++ 和 C 语言版本。 Java(不是 C#)。
http://whizman.com/code/diskfit.tgz
更详细的信息位于 diskfit 中。 tgz:Doc/diskfit.txt 文件。
(AGPL3)
我们可以将问题描述为 0-1 多重背包,
或线性箱包装。
(感谢 jon-skeet 提供有关背包问题的链接。)
Dthorpe 解决了线性装箱问题,以恰好容纳足够的箱子/磁盘
所有文件 [很好,O(n) 或 O(n lg n) 快 - 在电子表格中也可能可行,无需
必须编写脚本]。
基本上,diskfit(上面链接的实用程序)输出合格的文件集
基于0-1个单背包,
并且用户选择单磁盘文件集来组装到磁盘集中 -
帮助用户(但不是完全自动化)实现以下两个目标:
(其中文件具有优先级,也称为值不同)。
完整的此类磁盘集的完全编程选择,
将是一个附加功能。
应用 0-1 单背包解决方案是不够的,
自动逐盘[贪婪地]。
(考虑3个容量为6的背包,以及同等价值的可用物品
以及权重:{1, 1, 2, 2, 3, 4, 5}。
对第一个背包单独应用0-1个背包会选择
{1, 1, 2, 2} 以获得总和值 4 - 之后我们无法拟合所有的
剩余第 2 项和第 3 项第三个背包——
而我们知道我们可以将所有物品放入 3 个背包中,如下所示
{1,2,3}& {1, 5} & {2, 4}。)
For anyone still interested in this question...
I wrote a utility which I used for a similar purpose of fitting files into a set of disks/discs.
It uses a command-line/file-based interface.
Versions are available in C, C++, & Java (not C#).
http://whizman.com/code/diskfit.tgz
More detailed information is in the diskfit.tgz:Doc/diskfit.txt file.
(AGPL3)
We might characterize the question as 0-1 multiple-knapsack,
or linear bin packing.
(Thanks jon-skeet for the link about knapsack problem.)
Dthorpe solves linear bin packing, for exactly enough bins/disks to fit
all files [nicely O(n) or O(n lg n) fast - also may be feasible in spreadsheet without
having to write a script].
Basically, diskfit (above-linked utility) outputs qualifying file-sets
based around 0-1 single-knapsack,
and the user chooses one-disk file-sets to assemble into the disk-set -
assisting the user (but not fully automating) toward both:
(where files are prioritized, aka differ in value).
Full programmatic choice of the complete such disk-set,
would be an additional feature.
It would be insufficient to apply 0-1 single-knapsack solution,
automatically disc by disc [greedily].
(Consider 3 knapsacks of capacity 6, and available items with equal value
and weights of: {1, 1, 2, 2, 3, 4, 5}.
Applying 0-1 knapsack to the first knapsack in isolation would choose
{1, 1, 2, 2} to obtain sum value 4 - after which we cannot fit all of the
remaining 3 items in the remaining second & third knapsacks -
whereas we know we can fit all items in the 3 knapsacks as
{1, 2, 3} & {1, 5} & {2, 4}.)
虽然对于某些应用程序来说这是一个很酷的问题...但是在您的应用程序中,为什么不只使用 WinRAR 或其他一些能够将存档分割成特定大小的文件块的存档程序呢?你可以将每个块制作成 DVD 大小,然后烧掉。
编辑:您会遇到的一个问题是,如果您的某个文件大于媒体的大小,您将无法刻录该文件。
While thats a cool problem to solve in a program for certain applications... however in your application, why not just use WinRAR or some other archiving program that has the capability to split up the archive into specific sized file chunks. You could make each chunk the size of a DVD and then just burn away.
EDIT: one issue you would run into is that if one of your files is greater than the size of your media, you are not going to be able to burn that file.
如果您一开始将尽可能多的最大文件放入一张 DVD 上,然后用尽可能多的最小文件填充它(从最小的文件开始)怎么样?
对每个磁盘的剩余文件重复此过程。
我不确定这是否会给您带来完美的覆盖/分布,但我认为这可能会在某种程度上解决您的需求。
How about if you started by putting as many of the largest files you can onto one DVD and then filling it up with as many of the smallest files that you can (starting with the smallest).
Repeat this process with the remaining files for each disk.
I'm not sure that's going to give you perfect coverage/distribution but I think it might go some way to solving your needs.
使用回溯来获取要刻录到 DVD 1 的最佳文件集,然后将它们从列表中排除,并对剩余文件使用回溯以获得 DVD 2 的最佳填充,依此类推
use backtracking to get the optimal set of files to burn to dvd 1, then exclude them from the list and use backtracking on the remaining files to get the optimal fill for dvd 2 and so on
我发现了很多应该解决这个问题的工具,但它们都试图最大限度地减少使用的光盘总数,而我只对最适合单个光盘的文件的单个子集感兴趣。
所以我已经结束了编写自己的工具“ss”(来自基于的“子集和”算法)。
该工具仍然有缺陷,并且无法递归目录,但它对我有用。 :)
I've found a lot of tools that are supposed to solve this problem, but they all try to minimize the TOTAL number of disc used, while I was just interested into the SINGLE subset of files that best fit a SINGLE disc.
So i've ended writing my own tool called "ss" (from the "subset sum" algorithm which is based from).
The tool is still buggy and can't recurse directories, but it's working for me. :)
这个问题是Bin Packing Problem并且是NP困难的,这意味着如果你想要一个真正最优的解决方案你将需要指数时间。然而,有些方法给出的解不是最优解,但运行速度却快得多。
假设我们有一个无限的磁盘列表。按大小降序排列每个文件,然后将每个文件添加到它适合的第一个磁盘。这称为“首次适合递减”,在最坏的情况下需要 11/9 OPT + 6/9 磁盘。如果您以随机顺序选择文件,则需要 11/9 OPT + 1 个磁盘。
有一些算法可以将事情打包得更紧,请参阅上面的维基百科链接以获取更多详细信息。
This problem is the Bin Packing Problem and is NP hard, which means if you want a truly optimal solution you will need exponential time. However there are methods that give less than optimal solutions but run much faster.
Assume we have an unbounded list of disks. Take each file ordered descending in size, then add each file to the first disk that it fits in. This is called First fit decreasing and takes 11/9 OPT + 6/9 disks in worst case. If you choose files in a random order you instead need 11/9 OPT + 1 disks.
There are algorithms that will pack things tighter, see the wikipedia link above for more details.