Get the N smallest [Comparable] items in a collection

Posted 2024-10-27 04:48:32 · 157 words · 0 views · 0 comments

I have an unsorted Collection of objects (all Comparable). Is it possible to get a sublist of the N smallest items without having to call sort?

I was looking at the possibility of doing a SortedList with a limited capacity, but that didn't look like the right option.

I could easily write this, but I was wondering if there was another way.

I am not able to modify the existing collection's structure.

Comments (4)

梦中的蝴蝶 2024-11-03 04:48:32

Since you don't want to call sort(), it seems like you are trying to avoid an O(n log(n)) runtime cost. There is actually a way to do that in O(n) time -- you can use a selection algorithm.

There are methods to do this in the Guava libraries (Google's core Java libraries); look in the Ordering class for the methods that return the k smallest (or largest) elements.

These are implementations of quickselect, and since they're written generically, you can call them directly on your Set and get a list of the k smallest things. If you don't want to pull in all of Guava, the docs link to the source code, and it should be straightforward to port the methods to your project.
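
For readers who want to see the idea without Guava, here is a minimal quickselect sketch using only the standard library. It is not Guava's code; the class name `QuickSelectSketch` and the method `smallest` are made up for illustration. It copies the collection first, so the original is left untouched, and runs in expected (not worst-case) linear time.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of any library.
final class QuickSelectSketch {

    /** Returns the k smallest items in no particular order; the input is not modified. */
    static <T extends Comparable<? super T>> List<T> smallest(Collection<T> items, int k) {
        List<T> a = new ArrayList<>(items);          // work on a copy
        if (k <= 0) return new ArrayList<>();
        if (k >= a.size()) return a;
        int lo = 0, hi = a.size() - 1;
        // Invariant: everything left of lo is <= everything in [lo, hi],
        // which is <= everything right of hi, and lo <= k - 1 <= hi.
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k - 1) break;                   // pivot landed exactly on the boundary
            if (p < k - 1) lo = p + 1; else hi = p - 1;
        }
        return new ArrayList<>(a.subList(0, k));
    }

    /** Lomuto partition around a random pivot; returns the pivot's final index. */
    private static <T extends Comparable<? super T>> int partition(List<T> a, int lo, int hi) {
        int r = ThreadLocalRandom.current().nextInt(lo, hi + 1);
        Collections.swap(a, r, hi);                  // move the pivot to the end
        T pivot = a.get(hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a.get(i).compareTo(pivot) < 0) {
                Collections.swap(a, i, store++);     // smaller items go to the left block
            }
        }
        Collections.swap(a, store, hi);              // put the pivot between the two blocks
        return store;
    }

    public static void main(String[] args) {
        System.out.println(smallest(List.of(5, 1, 4, 2, 3), 2));
    }
}
```

The returned list is unordered; sort it afterwards if order matters, which costs only O(k log k). With many equal elements this simple partition scheme degrades toward quadratic time, so a production version would want a three-way partition.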

If you don't want to deviate too far from the standard libraries, you can always use a sorted set like TreeSet, though this gets you logarithmic insert/remove time instead of the nice O(1) performance of a hash-based Set, so it ends up being O(n log(n)) overall. Others have mentioned using heaps. Pushing every element through a heap also costs O(n log(n)) (O(n log(k)) if the heap is capped at k elements), unless you use some of the fancier heap variants. There's a Fibonacci heap implementation in GraphMaker if you're looking for one of those.

Which of these makes sense really depends on your project, but I think that covers most of the options.

誰認得朕 2024-11-03 04:48:32

I would probably create a sorted set. Insert the first N items from your unsorted collection into your sorted set. Then for the remainder of your unsorted collection:

  1. insert each item in the sorted set
  2. delete the largest item from the sorted set
  3. Repeat until you've processed all items in the unsorted collection
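
The steps above can be sketched with `java.util.TreeSet` roughly as follows. The class name `BoundedSortedSetSketch` and the method `smallest` are illustrative only. Instead of pre-filling the first N items, the sketch inserts every item and trims whenever the set grows past N, which has the same effect. One caveat worth stating loudly: a TreeSet keeps only one of any group of items that compare as equal, so duplicates are collapsed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class BoundedSortedSetSketch {

    /** Returns up to n smallest distinct items in ascending order; the input is not modified. */
    static <T extends Comparable<? super T>> List<T> smallest(Collection<T> items, int n) {
        if (n <= 0) return new ArrayList<>();
        TreeSet<T> kept = new TreeSet<>();
        for (T item : items) {
            kept.add(item);                 // O(log n) insert into the sorted set
            if (kept.size() > n) {
                kept.pollLast();            // drop the largest so only n items remain
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        System.out.println(smallest(List.of(5, 1, 4, 2, 3, 1), 3));
    }
}
```

Because the set never holds more than n + 1 items, the total cost is O(M log n) for a collection of M items.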

善良天后 2024-11-03 04:48:32

Yes. Keep a max heap capped at N items: add the first N items unconditionally, then add each remaining item only if it is smaller than the largest item in the heap (check with the get() "peek" method), removing that largest item first. Once you have gone through the whole set, the heap holds, by definition, the N smallest. With a binary heap this takes O(M log N) time in the worst case (where M is the size of the set), which approaches O(M) when N is small compared with M. Here's some pseudocode:

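MaxHeap maxHeap = new MaxHeap(N);            // hypothetical max heap capped at N items
for (Item x : mySetOfItems) {
  if (maxHeap.size() < N) {
    maxHeap.add(x);                          // fill the first N slots unconditionally
  } else if (x.compareTo(maxHeap.get()) < 0) {
    maxHeap.remove();                        // evict the current largest
    maxHeap.add(x);
  }
}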

The Apache Commons Collections class PriorityBuffer seems to be their flagship binary heap data structure; try using that one.
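
If you would rather stay within the standard library, the same bounded max heap idea can be sketched with `java.util.PriorityQueue` and a reversed comparator. This is a swapped-in alternative to PriorityBuffer, not the answer's original suggestion, and the class name `BoundedMaxHeapSketch` and method `smallest` are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of any library.
final class BoundedMaxHeapSketch {

    /** Returns the n smallest items in ascending order; the input is not modified. */
    static <T extends Comparable<? super T>> List<T> smallest(Collection<T> items, int n) {
        if (n <= 0) return new ArrayList<>();
        // Reversed natural order turns PriorityQueue (a min heap) into a max heap.
        int capacity = Math.max(1, Math.min(n, items.size()));
        PriorityQueue<T> maxHeap = new PriorityQueue<>(capacity, Comparator.reverseOrder());
        for (T x : items) {
            if (maxHeap.size() < n) {
                maxHeap.add(x);                          // still filling the first n slots
            } else if (x.compareTo(maxHeap.peek()) < 0) {
                maxHeap.poll();                          // evict the current largest
                maxHeap.add(x);                          // keep the smaller newcomer
            }
        }
        List<T> result = new ArrayList<>(maxHeap);       // heap iteration order is unspecified
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(smallest(List.of(5, 1, 4, 2, 3, 1), 3));
    }
}
```

Unlike a sorted set, the heap keeps duplicates, so two equal items can both appear in the result.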
