Extracting a given number of the highest values in a List

Posted 2024-08-28 18:45:03

I'm seeking to display a fixed number of items on a web page according to their respective weight (represented by an Integer). The List where these items are found can be of virtually any size.

The first solution that comes to mind is to do a Collections.sort() and to get the items one by one by going through the List. Is there a more elegant solution though that could be used to prepare, say, the top eight items?

Comments (10)

后知后觉 2024-09-04 18:45:03

Just go for Collections.sort(..). It is efficient enough.

This algorithm offers guaranteed n log(n) performance.

You could try to implement something more efficient for your concrete case if you know some distinctive properties of your list, but that would not be justified. Furthermore, if your list comes from a database, for example, you can order it and LIMIT it there instead of in code.

冰火雁神 2024-09-04 18:45:03

Your options:

  1. Do a linear search, maintaining the top N weights found along the way. This should be quicker than sorting a lengthy list if, for some reason, you can't reuse the sorting results between page displays (e.g. the list is changing quickly).

    UPDATE: I stand corrected on the linear search necessarily being better than sorting. See Wikipedia article "Selection_algorithm - Selecting k smallest or largest elements" for better selection algorithms.

  2. Manually maintain a List (the original one or a parallel one) sorted in weight order. You can use methods like Collections.binarySearch() to determine where to insert each new item.

  3. Maintain a List (the original one or a parallel one) sorted in weight order by calling Collections.sort() after each modification, batch modifications, or just before display (possibly maintaining a modification flag to avoid sorting an already sorted list).

  4. Use a data structure that maintains sorted weight-order for you: priority queue, tree set, etc. You could also create your own data structure.

  5. Manually maintain a second (possibly weight-ordered) data structure of the top N items. This data structure is updated anytime the original data structure is modified. You could create your own data structure to wrap the original list and this "top N cache" together.
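
Option 2 above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the class name SortedWeights and its methods are made up for the example, and it stores bare Integer weights rather than real page items. Collections.binarySearch() finds the insertion point in O(log n), although inserting into an ArrayList still shifts elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not from the thread): keeps weights in ascending order as they are added.
final class SortedWeights {
    private final List<Integer> weights = new ArrayList<>();

    // Inserts a weight at the position that keeps the list sorted.
    void add(int weight) {
        int index = Collections.binarySearch(weights, weight);
        if (index < 0) {
            // Not found: binarySearch returns -(insertionPoint) - 1.
            index = -index - 1;
        }
        weights.add(index, weight);
    }

    // Returns up to n of the highest weights, highest first.
    List<Integer> top(int n) {
        int from = Math.max(0, weights.size() - n);
        List<Integer> top = new ArrayList<>(weights.subList(from, weights.size()));
        Collections.reverse(top);
        return top;
    }

    public static void main(String[] args) {
        SortedWeights sorted = new SortedWeights();
        for (int w : new int[] {5, 1, 9, 3, 7}) {
            sorted.add(w);
        }
        System.out.println(sorted.top(3)); // prints [9, 7, 5]
    }
}
```

With the list always sorted, reading the top N at display time is just a subList() call, so the cost moves from page rendering to each modification.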

甜`诱少女 2024-09-04 18:45:03

You could use a max-heap.

If your data originates from a database, put an index on that column and use ORDER BY and TOP or LIMIT to fetch only the records you need to display.
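
For the in-memory case, a max-heap can be had from java.util.PriorityQueue by giving it a reversed comparator. The sketch below is one possible reading of this suggestion; the class and method names are invented for the example, and plain Integer weights stand in for the real items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical example class (not from the thread).
final class MaxHeapTop {
    // Returns up to n of the largest values, largest first.
    static List<Integer> top(List<Integer> input, int n) {
        // PriorityQueue is a min-heap by default; reverseOrder() makes it a max-heap.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        heap.addAll(input);
        List<Integer> result = new ArrayList<>();
        while (result.size() < n && !heap.isEmpty()) {
            result.add(heap.poll()); // poll() removes the current maximum
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(top(Arrays.asList(4, 8, 1, 9, 3), 2)); // prints [9, 8]
    }
}
```

This puts every item on the heap, so memory grows with the list; it is simple, but not necessarily the cheapest choice for a very large input.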

等待我真够勒 2024-09-04 18:45:03

Using dollar:

List<Integer> topTen = $(list).sort().slice(10).toList();

Without dollar, sort the list with Collections.sort() (in descending order, e.g. by passing Collections.reverseOrder(), so that the highest weights come first), then get the first n items using list.subList(0, n).
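
The plain-JDK route might look like the sketch below. The class and method names are made up for the example; it copies the input so the caller's list is left untouched and clamps n to the list size so subList() cannot run past the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example class (not from the thread).
final class SortAndSlice {
    // Returns up to n of the highest weights, highest first.
    static List<Integer> top(List<Integer> weights, int n) {
        List<Integer> sorted = new ArrayList<>(weights); // work on a copy
        Collections.sort(sorted, Collections.reverseOrder()); // descending order
        int end = Math.max(0, Math.min(n, sorted.size()));
        // subList() is only a view, so copy it into an independent list.
        return new ArrayList<>(sorted.subList(0, end));
    }

    public static void main(String[] args) {
        System.out.println(top(Arrays.asList(4, 8, 1, 9, 3), 3)); // prints [9, 8, 4]
    }
}
```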

不语却知心 2024-09-04 18:45:03

Since you say the list from which to extract the top N may be of any size, and so, I assume, may be large, I'd augment the simple sort() answers above (which are entirely appropriate for reasonably sized input) by suggesting that most of the work here is finding the top N; sorting those N afterwards is trivial. That is:

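// Min-heap holding at most n items; its head is the smallest of the current top n.
Queue<Integer> topN = new PriorityQueue<Integer>(n);
for (Integer item : input) {
  if (topN.size() < n) {
    topN.add(item);
  } else if (item > topN.peek()) {
    // item beats the smallest retained value: add it, then evict that smallest.
    topN.add(item);
    topN.poll();
  }
}

// Copy the survivors out and order them from highest to lowest.
List<Integer> result = new ArrayList<Integer>(n);
result.addAll(topN);
Collections.sort(result, Collections.reverseOrder());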

The heap here (a min-heap) is at least bounded in size. There's no real need to make a heap out of all your items.

執念 2024-09-04 18:45:03

No, not really. At least not using Java's built-in methods.

There are clever ways to get the highest (or lowest) N items from a list quicker than an O(n*log(n)) operation, but that will require you to code the solution by hand. If the number of items stays relatively low (not more than a couple of hundred), sorting the list using Collections.sort() and then grabbing the top N items is the way to go IMO.

悸初 2024-09-04 18:45:03

Depends on how many. Let's define n as the total number of keys, and m as the number you wish to display.
Sorting the entire thing: O(n log n)
Scanning the array each time for the next highest number: O(n*m)
So the question is: what's the relation between n and m?
If m < log n, scanning will be more efficient.
Otherwise, m >= log n, which means sorting will be better. (For the edge case of m = log n it doesn't actually matter, but sorting also gives you the benefit of, well, a sorted array, which is always nice.)
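
The scanning approach described above could be sketched like this. The names are invented for the example, and it marks already-chosen positions in a boolean array so that duplicate keys are handled. As a rough feel for the threshold: with n = 1,000,000 keys, log2 n is about 20, so by this analysis m = 8 would favour scanning, though constant factors are ignored here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class (not from the thread).
final class RepeatedScan {
    // Makes m passes over keys, each pass picking the largest key not yet taken: O(n*m).
    static List<Integer> top(int[] keys, int m) {
        boolean[] taken = new boolean[keys.length];
        List<Integer> result = new ArrayList<>();
        int rounds = Math.min(m, keys.length);
        for (int round = 0; round < rounds; round++) {
            int best = -1; // index of the largest untaken key seen in this pass
            for (int i = 0; i < keys.length; i++) {
                if (!taken[i] && (best == -1 || keys[i] > keys[best])) {
                    best = i;
                }
            }
            taken[best] = true;
            result.add(keys[best]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(top(new int[] {4, 8, 1, 9, 3}, 3)); // prints [9, 8, 4]
    }
}
```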

小糖芽 2024-09-04 18:45:03

If the size of the list is N, and the number of items to be retrieved is k, you need to call Heapify on the list, which converts the list (which has to be indexable, e.g. an array) into a priority queue in O(N) time. (See the heapify function in http://en.wikipedia.org/wiki/Heapsort)

Removing the item at the top of the heap (the max item) takes O (lg N) time, and you do that k times. So your overall time would be:

O(N + k lg N)

which is better than O (N lg N) assuming k is much smaller than N.
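
A hand-written version of this idea might look like the sketch below. It is an illustration under assumptions, not the answerer's code: the names are made up, it works on a copy of an int[] rather than on real items, and it uses a bottom-up heapify followed by k removals of the maximum.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class (not from the thread).
final class HeapifyTopK {
    // Restores the max-heap property for the subtree rooted at i within heap[0..size).
    private static void siftDown(int[] heap, int i, int size) {
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int largest = i;
            if (left < size && heap[left] > heap[largest]) largest = left;
            if (right < size && heap[right] > heap[largest]) largest = right;
            if (largest == i) return;
            int tmp = heap[i];
            heap[i] = heap[largest];
            heap[largest] = tmp;
            i = largest;
        }
    }

    // Returns up to k of the largest values, largest first.
    static List<Integer> top(int[] values, int k) {
        int[] heap = values.clone(); // leave the caller's array untouched
        int size = heap.length;
        // Bottom-up heapify: O(N) overall.
        for (int i = size / 2 - 1; i >= 0; i--) {
            siftDown(heap, i, size);
        }
        List<Integer> result = new ArrayList<>();
        // Each removal of the maximum costs O(lg N); k removals give O(k lg N).
        while (result.size() < k && size > 0) {
            result.add(heap[0]);
            heap[0] = heap[size - 1];
            size--;
            siftDown(heap, 0, size);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(top(new int[] {4, 8, 1, 9, 3}, 3)); // prints [9, 8, 4]
    }
}
```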

一百个冬季 2024-09-04 18:45:03

If keeping a sorted array or using a different data structure is not an option, you could try something like the following. The big-O time is similar to sorting the large array, but in practice this should be more efficient.

small_array = big_array.slice( number_of_items_to_find );  // copy of the first few items
small_array.sort();  // ascending: index 0 holds the smallest value kept
least_found_value = small_array.get(0).value;

for ( item in big_array ) {  // must skip the items already copied into small_array
  if ( item.value > least_found_value ) {
    small_array.remove(0);  // drop the current smallest
    small_array.insert_sorted(item);
    least_found_value = small_array.get(0).value;
  }
}

small_array could be an Object[] and the inner loop could be done with swapping instead of actually removing and inserting into an array.
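
One way the pseudocode might translate to Java is sketched below. The names are invented, plain int values stand in for item.value, and the remove-then-insert step is done in a single pass by shifting smaller entries left, roughly in the spirit of the swapping remark above.

```java
import java.util.Arrays;

// Hypothetical example class (not from the thread).
final class SmallArrayTop {
    // Returns the n largest values in ascending order (fewer if big is shorter than n).
    static int[] top(int[] big, int n) {
        int size = Math.min(n, big.length);
        if (size <= 0) {
            return new int[0];
        }
        int[] small = Arrays.copyOfRange(big, 0, size);
        Arrays.sort(small); // ascending: small[0] is the least value kept
        for (int i = size; i < big.length; i++) { // skips the items already copied
            int item = big[i];
            if (item > small[0]) {
                // Shift smaller entries left (dropping the old minimum)
                // until the slot where item belongs is free.
                int j = 1;
                while (j < size && small[j] < item) {
                    small[j - 1] = small[j];
                    j++;
                }
                small[j - 1] = item;
            }
        }
        return small;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(top(new int[] {4, 8, 1, 9, 3, 7}, 3))); // prints [7, 8, 9]
    }
}
```

Most items in a large list fail the item > small[0] test and cost a single comparison, which is where the practical gain over a full sort would come from.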
