There is a well-known greedy approximation algorithm for set cover that is also easy to implement in the language of your choice. The algorithm itself is described here:
http://en.wikipedia.org/wiki/Set_cover_problem#Greedy_algorithm
It is so simple that the easiest thing is just to write it from scratch.
Notably, it is also the best polynomial-time approximation algorithm known for set cover: its logarithmic approximation factor is essentially optimal unless P = NP. That means that to get a better worst-case guarantee (a more compact result set) you would need non-polynomial running time (= slow algorithms for large sets).
Unfortunately the Wikipedia entry doesn't actually cover weighted set cover, which is the case here. The extension is simple, and is described e.g. here:
http://pages.cs.wisc.edu/~shuchi/courses/880-S07/scribe-notes/lecture03.pdf
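For concreteness, here is a minimal sketch of the weighted greedy rule: at each step, pick the set that minimizes cost divided by the number of elements it would newly cover. With all costs set to 1 this reduces to the plain greedy algorithm from the Wikipedia page. Names like `WeightedSet` and `greedyWeightedSetCover` are made up for illustration, and the inner rescan makes this the simple quadratic version rather than an optimized one:

```cpp
#include <cstdio>
#include <limits>
#include <unordered_set>
#include <vector>

// A weighted set: its elements plus a positive cost.
struct WeightedSet {
    std::vector<int> elements;
    double cost;
};

// Greedy weighted set cover: repeatedly pick the set with the smallest
// cost per newly covered element. Returns the indices of the chosen sets.
std::vector<int> greedyWeightedSetCover(const std::vector<WeightedSet>& sets,
                                        std::unordered_set<int> uncovered) {
    std::vector<int> chosen;
    while (!uncovered.empty()) {
        int best = -1;
        double bestRatio = std::numeric_limits<double>::infinity();
        for (int i = 0; i < (int)sets.size(); ++i) {
            int newlyCovered = 0;
            for (int e : sets[i].elements)
                if (uncovered.count(e)) ++newlyCovered;
            if (newlyCovered == 0) continue;
            double ratio = sets[i].cost / newlyCovered;  // price per new element
            if (ratio < bestRatio) { bestRatio = ratio; best = i; }
        }
        if (best < 0) break;  // remaining elements cannot be covered at all
        chosen.push_back(best);
        for (int e : sets[best].elements) uncovered.erase(e);
    }
    return chosen;
}

int main() {
    std::vector<WeightedSet> sets = {
        {{1, 2, 3}, 1.0}, {{2, 4}, 1.0}, {{3, 4, 5}, 1.0}, {{4, 5}, 0.5}};
    for (int i : greedyWeightedSetCover(sets, {1, 2, 3, 4, 5}))
        std::printf("picked set %d (cost %.1f)\n", i, sets[i].cost);
    return 0;
}
```

On this toy input the greedy picks the cheap {4, 5} set first (ratio 0.25) and then {1, 2, 3}, for a total cost of 1.5.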
Some more useful notes:
http://www.cs.ucr.edu/~neal/non_arxiv/Young08SetCover.pdf
http://www.cs.uiuc.edu/class/sp08/cs473/Lectures/lec20.pdf
My linear time / space implementation of greedy set cover in C++ is available on GitHub:
https://github.com/martin-steinegger/setcover
A run over 40,000,000 sets with an average of 10 elements per set takes around 4 minutes on an Amazon AWS m2.2xlarge instance.
I am still working on some tricks to improve the performance.
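For readers wondering how greedy can run in linear time without digging into the repository: the sketch below is my own illustration of the standard bucket-queue trick for the unweighted (unit-cost) case, not the code behind the link above. Sets sit in buckets indexed by an upper bound on their number of still-uncovered elements; a pointer walks from the largest bucket downward, and stale entries are re-filed lazily when popped. Since counts only decrease, every decrement and re-file is paid for once, giving time linear in the total number of (set, element) incidences:

```cpp
#include <cstdio>
#include <vector>

// Unweighted greedy set cover in time linear in the total input size.
// bucket[k] holds sets whose uncovered-element count was k when last
// filed; stale entries are re-filed lazily when popped.
std::vector<int> greedyLinear(const std::vector<std::vector<int>>& sets,
                              int numElements) {
    std::vector<std::vector<int>> containing(numElements);  // element -> sets
    for (int i = 0; i < (int)sets.size(); ++i)
        for (int e : sets[i]) containing[e].push_back(i);

    int maxCnt = 0;
    std::vector<int> cnt(sets.size());  // current uncovered elements per set
    for (int i = 0; i < (int)sets.size(); ++i) {
        cnt[i] = (int)sets[i].size();
        if (cnt[i] > maxCnt) maxCnt = cnt[i];
    }
    std::vector<std::vector<int>> bucket(maxCnt + 1);
    for (int i = 0; i < (int)sets.size(); ++i) bucket[cnt[i]].push_back(i);

    std::vector<bool> covered(numElements, false);
    std::vector<int> chosen;
    for (int p = maxCnt; p > 0; ) {
        if (bucket[p].empty()) { --p; continue; }
        int s = bucket[p].back(); bucket[p].pop_back();
        if (cnt[s] != p) {                 // stale: re-file at its true count
            if (cnt[s] > 0) bucket[cnt[s]].push_back(s);
            continue;
        }
        chosen.push_back(s);               // greedy pick: most new elements
        for (int e : sets[s]) {
            if (covered[e]) continue;
            covered[e] = true;
            for (int j : containing[e]) --cnt[j];  // amortized O(1) each
        }
    }
    return chosen;
}

int main() {
    // Elements are 0..5; greedy picks the 4-element set first.
    std::vector<std::vector<int>> sets = {{0, 1, 2, 3}, {3, 4}, {4, 5}, {0, 5}};
    for (int s : greedyLinear(sets, 6)) std::printf("picked set %d\n", s);
    return 0;
}
```

The key invariant is that a set's true count never exceeds its bucket index, so a stale entry is always re-filed at or below the pointer, and the pointer never has to move back up.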