Java PriorityQueue with a fixed size

Posted on 2024-08-13 01:51:26

I am calculating a large number of possible resulting combinations of an algorithm. To sort these combinations I rate them with a double value and store them in a PriorityQueue. Currently, there are about 200k items in that queue, which is pretty memory intensive. Actually, I only need, let's say, the best 1000 or 100 of all the items in the list.
So I started to ask myself whether there is a way to have a priority queue with a fixed size in Java. It should behave like this:
Is the item better than one of those already stored? If yes, insert it at the appropriate position and throw away the element with the lowest rating.

Does anyone have an idea? Thanks very much again!

Marco

Comments (8)

梦冥 2024-08-20 01:51:26
que.add(d);
if (que.size() > YOUR_LIMIT)
     que.poll();

Or did I misunderstand your question?

Edit: I forgot to mention that for this to work you probably have to invert your compareTo function, since poll() throws away the element with the highest priority on each cycle. (If a is "better" than b, compare(a, b) should return a positive number.)

For example, to keep the biggest numbers, use something like this:

public int compare(Double first, Double second) {
    // keep the biggest values: the smallest sits at the head, so poll() evicts it
    // Double.compare also returns 0 for equal values, as the Comparator contract requires
    return Double.compare(first, second);
}
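
The add-then-poll idea above can be wrapped in a small helper. What follows is only a minimal sketch under stated assumptions: the class name BoundedTopN and its methods are made up for illustration (they are not part of the JDK), and the comparator is assumed to rank "better" items higher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper (not a JDK class): keeps only the `limit` best elements seen so far.
final class BoundedTopN<E> {
    private final int limit;
    private final Comparator<E> order;
    // Min-heap under `order`: the head is the worst element currently kept.
    private final PriorityQueue<E> heap;

    BoundedTopN(int limit, Comparator<E> order) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
        this.order = order;
        // One spare slot, because add() briefly holds limit + 1 elements.
        this.heap = new PriorityQueue<E>(limit + 1, order);
    }

    // Add first, then evict the worst element if the limit is exceeded.
    void add(E e) {
        heap.add(e);
        if (heap.size() > limit) {
            heap.poll();
        }
    }

    int size() {
        return heap.size();
    }

    // Sorted snapshot of the kept elements, best first.
    List<E> bestFirst() {
        List<E> out = new ArrayList<E>(heap);
        out.sort(order.reversed());
        return out;
    }
}
```

With a limit of 1000 the heap never holds more than 1001 elements, so memory stays bounded no matter how many combinations are rated.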
公布 2024-08-20 01:51:26

MinMaxPriorityQueue, Google Guava

There is indeed a class for maintaining a queue that, when adding an item that would exceed the maximum size of the collection, compares the items to find an item to delete and thereby create room: MinMaxPriorityQueue found in Google Guava as of version 8.

EvictingQueue

By the way, if you merely want to delete the oldest element without doing any comparison of the objects' values, Google Guava 15 gained the EvictingQueue class.

感情废物 2024-08-20 01:51:26

There is a fixed size priority queue in Apache Lucene: http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/util/PriorityQueue.html

It has excellent performance based on my tests.

源来凯始玺欢你 2024-08-20 01:51:26

Use SortedSet:

SortedSet<Item> items = new TreeSet<Item>(new Comparator<Item>(...));
...
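void addItem(Item newItem) {
    if (items.size() >= 100) {
        Item lowest = items.first();
        if (!newItem.greaterThan(lowest)) {
            // not better than the worst item we keep: discard it
            return;
        }
        // make room so the set never grows past 100 items
        items.remove(lowest);
    }

    items.add(newItem);
}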
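
One caveat with a TreeSet: it treats two elements as duplicates whenever the comparator returns 0, so two different items with the same rating would collapse into one. The sketch below works around that with a tie-breaker. It is an illustration only: the Item fields (id, rating) and the TopItems class are assumptions, not something given in the question.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical item type: a unique id plus the rating used for ordering.
final class Item {
    final long id;
    final double rating;

    Item(long id, double rating) {
        this.id = id;
        this.rating = rating;
    }
}

// Hypothetical bounded set keeping the `limit` best-rated items.
final class TopItems {
    private final int limit;
    // Order by rating, then by id, so equally rated items are not dropped as duplicates.
    private final TreeSet<Item> items = new TreeSet<Item>(
            Comparator.comparingDouble((Item i) -> i.rating).thenComparingLong(i -> i.id));

    TopItems(int limit) {
        this.limit = limit;
    }

    void addItem(Item newItem) {
        items.add(newItem);
        if (items.size() > limit) {
            items.pollFirst(); // evict the lowest-rated item
        }
    }

    int size() {
        return items.size();
    }

    Item worst() {
        return items.first();
    }

    Item best() {
        return items.last();
    }
}
```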
度的依靠╰つ 2024-08-20 01:51:26

Just poll() the queue if its least element is less than (in your case, has worse rating than) the current element.

static <V extends Comparable<? super V>> 
PriorityQueue<V> nbest(int n, Iterable<V> valueGenerator) {
    PriorityQueue<V> values = new PriorityQueue<V>();
    for (V value : valueGenerator) {
        if (values.size() == n && value.compareTo(values.peek()) > 0)
            values.poll(); // remove least element, current is better
        if (values.size() < n) // we removed one or haven't filled up, so add
            values.add(value);
    }
    return values;
}

This assumes that you have some sort of combination class that implements Comparable that compares combinations on their rating.
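
As a rough sketch of what such a class might look like: RatedCombination, its label and rating fields, and the bestLabels helper are all hypothetical names chosen for illustration. The nbest method is repeated unchanged in logic so that the example compiles on its own, and it assumes n is at least 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical combination type, ordered by rating (lowest rating first).
final class RatedCombination implements Comparable<RatedCombination> {
    final String label;
    final double rating;

    RatedCombination(String label, double rating) {
        this.label = label;
        this.rating = rating;
    }

    @Override
    public int compareTo(RatedCombination other) {
        return Double.compare(rating, other.rating);
    }
}

final class NBestDemo {
    // Same logic as nbest above; assumes n >= 1.
    static <V extends Comparable<? super V>> PriorityQueue<V> nbest(int n, Iterable<V> values) {
        PriorityQueue<V> best = new PriorityQueue<V>();
        for (V value : values) {
            if (best.size() == n && value.compareTo(best.peek()) > 0) {
                best.poll(); // remove least element, current is better
            }
            if (best.size() < n) {
                best.add(value);
            }
        }
        return best;
    }

    // Labels of the n best-rated combinations, best first.
    static List<String> bestLabels(int n, Iterable<RatedCombination> source) {
        List<RatedCombination> kept = new ArrayList<RatedCombination>(nbest(n, source));
        kept.sort(Collections.reverseOrder());
        List<String> labels = new ArrayList<String>();
        for (RatedCombination c : kept) {
            labels.add(c.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<RatedCombination> all = Arrays.asList(
                new RatedCombination("a", 0.3),
                new RatedCombination("b", 0.9),
                new RatedCombination("c", 0.1),
                new RatedCombination("d", 0.7));
        System.out.println(bestLabels(2, all));
    }
}
```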

Edit: Just to clarify, the Iterable in my example doesn't need to be pre-populated. For example, here's an Iterable<Integer> that will give you all natural numbers an int can represent:

Iterable<Integer> naturals = new Iterable<Integer>() {
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int current = 0;
            @Override
            public boolean hasNext() {
                return current >= 0;
            }
            @Override
            public Integer next() {
                return current++;
            }
            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
};

Memory consumption is very modest, as you can see - for over 2 billion values, you need two objects (the Iterable and the Iterator) plus one int.

You can of course rather easily adapt my code so it doesn't use an Iterable - I just used it because it's an elegant way to represent a sequence (also, I've been doing too much Python and C# ☺).

为你鎻心 2024-08-20 01:51:26

A better approach would be to moderate more tightly what goes on the queue, removing from and appending to it as the program runs. It sounds like there is some room to exclude some of the items before you add them to the queue. That would be simpler than reinventing the wheel, so to speak.

忆依然 2024-08-20 01:51:26

Most of my use cases for a priority queue required a size limit. It's not clear why the Java designers did not derive PriorityBlockingQueue from ArrayBlockingQueue, with a mandatory size limit.

I have not looked into the third-party classes from the answers above; without them, you need a synchronized limit control.

import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/**
 * Limited Priority Queue with put/offer/take
 * 
 * @implNote poll(), offer(time) not implemented
 */
public class LimitedPriorityBlockingQueue<E> extends PriorityBlockingQueue<E>
{
    /**
     * 
     * @param maxDim     initial and maximum size
     * @param comparator priority relation
     */
    public LimitedPriorityBlockingQueue(final int maxDim, final Comparator<E> comparator)
    {
        super(maxDim, comparator);
        this.maxDim = maxDim;
        this.putSync = new Object();
    }

    @Override
    public boolean offer(final E s)
    {
        synchronized (putSync)
        {
            if (size() >= maxDim)
            {
                // avoid growth
                return false;
            }
            else
            {
                return super.offer(s);
            }
        }
    }

    @Override
    public void put(final E s)
    {
        synchronized (putSync)
        {
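            // loop instead of a single check: wait() may wake up spuriously,
            // and another producer may have filled the freed slot already
            while (size() >= maxDim)
            {
                try
                {
                    putSync.wait();
                }
                catch (final InterruptedException e)
                {
                    // put() cannot declare InterruptedException (empty inherited throw list),
                    // so restore the interrupt flag and rethrow unchecked
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e);
                }
            }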
            super.put(s);
        }
    }

    @Override
    public synchronized E take() throws InterruptedException
    {
        final E res = super.take();
        synchronized (putSync)
        {
            putSync.notify();
        }
        return res;
    }

    protected int maxDim;
    protected final Object putSync;

    private static final long serialVersionUID = 1484945675769120529L;
}
〃温暖了心ぐ 2024-08-20 01:51:26

It seems natural to just keep the top 1000 each time you add an item, but the PriorityQueue doesn't offer anything to achieve that gracefully. Maybe you can, instead of using a PriorityQueue, do something like this in a method:

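List<Double> list = new ArrayList<Double>();
...
list.add(newOutput);
// sort descending so the best (highest) ratings come first
Collections.sort(list, Collections.reverseOrder());
if (list.size() > 1000) {
    // drop everything after the first 1000; clearing the subList view shrinks the list itself
    list.subList(1000, list.size()).clear();
}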