Optimizing a bin placement algorithm

Posted 2024-08-24 08:46:06


Alright, I've got two collections, and I need to place elements from collection1 into the bins (elements) of collection2, based on whether their value falls within a given bin's range.

For a concrete example, assume I have a sorted collection of objects (bins), each of which has an int range ([1..4], [5..10], etc.). I need to determine the range an int falls in, and place it in the appropriate bin.

foreach(element n in collection1) {
 foreach(bin m in collection2) {
  if (m.inRange(n)) {
   m.add(n);
   break;
  }
 }
}

So the obvious O(N×M) algorithm is there, but I really would like to see O(N log M). To do this I'd like to use BinarySearch in place of the inner foreach loop. To use BinarySearch, I need to implement an IComparer class to do the searching for me. The problem I'm running into is that this approach would require me to write an IComparer.Compare function that compares two different types of objects (an element to its bin), and that doesn't seem possible or correct. So I'm asking, how should I write this algorithm?


迟月 2024-08-31 08:46:06

Let's restate the problem. You wish to write

foreach(int item in items)
    bins[GetBinIndex(item)].Add(item);

such that the performance of GetBinIndex is better than O(n) in the number of bins.

It depends on the topology of the bins.

If the bins are simply contiguous non-negative integer ranges of equal width, say, 0..4, 5..9, 10..14, and so on, then just integer-divide the item by 5, done. That's O(1).
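
As a minimal sketch of that equal-width case: the class name `UniformBins` and the `binWidth` parameter are made up for illustration, and the code assumes the first bin starts at 0.

```csharp
using System;

static class UniformBins
{
    // Assumes bins 0..(w-1), w..(2w-1), ... with a common width w = binWidth,
    // and non-negative items, as in the 0..4, 5..9, 10..14 example.
    public static int GetBinIndex(int item, int binWidth)
    {
        if (binWidth <= 0) throw new ArgumentOutOfRangeException(nameof(binWidth));
        if (item < 0) throw new ArgumentOutOfRangeException(nameof(item));
        return item / binWidth; // integer division truncates toward zero
    }
}
```

With a width of 5, items 0 through 4 map to bin 0, items 5 through 9 to bin 1, and so on.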

If the bins are contiguous integer ranges of different sizes, say, 0..4, 5..14, 15..16, 17..17, 18..32, ... then:

Make a List<int> that stores the top of each range. So in this case, {4, 14, 16, 17, 32, ...}.
BinarySearch the list for the item.
If the result is a non-negative number, then that is the index of the bin, and you have an item that is at the top of its bin.
If the result is a negative number, then it is the bitwise complement of the index of the first bin whose top element is larger than the item. Take the complement of the result, and that's the bin. (If that index equals the list's Count, the item is above the top of every bin.)

This is O(lg n) to search, and O(n) to build the list in the first place.
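
The steps above could be sketched roughly like this. `VariableBins` and `GetBinIndex` are illustrative names; the only library call is `List<T>.BinarySearch`, which returns the bitwise complement of the index of the next larger element when the item is not found.

```csharp
using System;
using System.Collections.Generic;

static class VariableBins
{
    // tops must be sorted ascending; tops[i] is the largest value bin i accepts.
    // Note: anything below the first bin's lower bound would land in bin 0,
    // so the caller has to check that bound separately if it matters.
    public static int GetBinIndex(List<int> tops, int item)
    {
        int result = tops.BinarySearch(item);
        // Hit: item is exactly the top of bin `result`.
        // Miss: ~result is the index of the first top greater than item.
        int index = result >= 0 ? result : ~result;
        if (index == tops.Count)
            throw new ArgumentOutOfRangeException(nameof(item), "Item is above the last bin.");
        return index;
    }
}
```

For tops of {4, 14, 16, 17, 32}, the item 5 misses, the first larger top is 14 at index 1, so it goes in bin 1 (the 5..14 range).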

If the bins are noncontiguous integer ranges -- that is, if the ranges have holes, or if they overlap -- then the data structure you want to build to efficiently find the best range is an interval tree. Interval trees are typically O(lg n) to search in nonpathological situations, and O(n lg n) to build the tree in the first place.
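
A full interval tree is more than fits in a short snippet. For the narrower case where the ranges have holes but do not overlap, a plain binary search over the sorted range starts is enough. The sketch below is that simpler technique, not an interval tree, and `GappedBins`, `starts` and `ends` are hypothetical names.

```csharp
using System;

static class GappedBins
{
    // Bin i covers starts[i]..ends[i] inclusive. The ranges are assumed to be
    // sorted by start and non-overlapping; holes between them are allowed.
    // Returns the bin index, or -1 when the item falls in a hole or outside all bins.
    public static int FindBin(int[] starts, int[] ends, int item)
    {
        int result = Array.BinarySearch(starts, item);
        // On a miss, ~result is the index of the first start greater than item,
        // so the only candidate is the bin immediately before it.
        int candidate = result >= 0 ? result : ~result - 1;
        if (candidate < 0) return -1; // item is below the first bin
        return item <= ends[candidate] ? candidate : -1;
    }
}
```

Overlapping ranges break the "only one candidate" assumption, which is where an interval tree earns its keep.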


執念 2024-08-31 08:46:06

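I'm not sure I totally understand the problem, as I didn't really get this part:

"The problem I'm running into is this approach would require me to make an IComparer.Compare function that compares two different types of objects (an element to its bin)"

I'll do my best nonetheless:

The IComparer is used to sort the collection so that you can then perform a binary search. Check out the MSDN article: http://msdn.microsoft.com/en-us/library/system.collections.icomparer.aspx

So basically, you need to make sure that collection2 is sorted first, using an IComparer that simply orders the bins from lowest to highest range. From the fact that you break inside the second foreach, it seems you don't have any overlaps, so this shouldn't be a problem.

You wouldn't use the Array.BinarySearch (http://msdn.microsoft.com/en-us/library/system.array.binarysearch.aspx) method, since it searches for a specific object in the array (maybe that's what you were referring to above?), but you could implement your own binary search without much difficulty.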
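
A hand-written binary search along those lines might look like the sketch below. The `Bin` class, its `Min`/`Max`/`Items` members and the `BinPlacer` helper are all hypothetical, since the question does not show the real types. The code assumes the bins are already sorted by range and do not overlap. Because the search compares an int directly against a bin's bounds, no IComparer across two types is needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bin type; the question does not show its definition.
sealed class Bin
{
    public int Min { get; }
    public int Max { get; }
    public List<int> Items { get; } = new List<int>();
    public Bin(int min, int max) { Min = min; Max = max; }
}

static class BinPlacer
{
    // Returns the index of the bin whose [Min, Max] range contains item, or -1.
    // bins must be sorted by Min and must not overlap.
    public static int FindBinIndex(IReadOnlyList<Bin> bins, int item)
    {
        int lo = 0, hi = bins.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (item < bins[mid].Min) hi = mid - 1;      // item lies left of this bin
            else if (item > bins[mid].Max) lo = mid + 1; // item lies right of this bin
            else return mid;                             // item is inside this bin
        }
        return -1;
    }

    // O(N log M) replacement for the nested foreach; unmatched items are skipped.
    public static void Place(IEnumerable<int> items, IReadOnlyList<Bin> bins)
    {
        foreach (int item in items)
        {
            int index = FindBinIndex(bins, item);
            if (index >= 0) bins[index].Items.Add(item);
        }
    }
}
```

Calling `BinPlacer.Place(collection1, collection2)` would then stand in for the two nested loops from the question.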