What is the most efficient way to access a particular element in a SortedSet?

Posted on 2024-10-24 06:58:14

I want to use a collection that is sorted, but one in which I can access elements by index, i.e. I want something that has characteristics of both a Set and a List. java.util.TreeSet comes really close to what I need, but doesn't permit access via an index.

I can think of several options:

  1. I could iterate through a TreeSet every time I needed a particular element.
  2. I could maintain a TreeSet and generate a List from it when I needed to access a particular element.
  3. Same as above, only cache the List until the Set changes.
  4. I could have a List and sort it myself whenever I needed to add an element.
  5. etc.

Each option has its own trade-offs. I'm hoping somebody can give me some good advice. To answer the potential question of "why would you ever want to do that?", please read about the Apriori algorithm.
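Option 3 from the list above could be sketched roughly as follows. This is only an illustration: `CachedIndexedSet` is a hypothetical helper name, not a JDK class, and it assumes reads by index are much more frequent than modifications, since every change forces an O(n) rebuild on the next indexed read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper illustrating option 3; not part of the JDK.
class CachedIndexedSet<E extends Comparable<E>> {
    private final TreeSet<E> set = new TreeSet<>();
    private List<E> snapshot; // null means the cached list is stale

    boolean add(E e) {
        boolean changed = set.add(e);
        if (changed) {
            snapshot = null; // invalidate the cache only on a real change
        }
        return changed;
    }

    boolean remove(E e) {
        boolean changed = set.remove(e);
        if (changed) {
            snapshot = null;
        }
        return changed;
    }

    E get(int index) {
        if (snapshot == null) {
            // O(n) rebuild; the TreeSet iterates in sorted order
            snapshot = new ArrayList<>(set);
        }
        return snapshot.get(index); // O(1) once the cache is warm
    }

    int size() {
        return set.size();
    }
}
```

With this shape, a burst of `get(i)` calls between modifications costs one copy plus constant-time lookups, while alternating `add` and `get` degrades to O(n) per access.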

Comments (4)

醉城メ夜风 2024-10-31 06:58:14

https://github.com/geniot/indexed-tree-map

I had the same problem. So I took the source code of java.util.TreeMap and wrote IndexedTreeMap. It implements my own IndexedNavigableMap:

public interface IndexedNavigableMap<K, V> extends NavigableMap<K, V> {
   K exactKey(int index);
   Entry<K, V> exactEntry(int index);
   int keyIndex(K k);
}

The implementation is based on updating node weights in the red-black tree whenever the tree changes. A node's weight is the number of nodes beneath it, plus one for the node itself. For example, when a tree is rotated to the left:

private void rotateLeft(Entry<K, V> p) {
    if (p != null) {
        Entry<K, V> r = p.right;

        int delta = getWeight(r.left) - getWeight(p.right);
        p.right = r.left;
        p.updateWeight(delta);

        if (r.left != null) {
            r.left.parent = p;
        }

        r.parent = p.parent;

        if (p.parent == null) {
            root = r;
        } else if (p.parent.left == p) {
            delta = getWeight(r) - getWeight(p.parent.left);
            p.parent.left = r;
            p.parent.updateWeight(delta);
        } else {
            delta = getWeight(r) - getWeight(p.parent.right);
            p.parent.right = r;
            p.parent.updateWeight(delta);
        }

        delta = getWeight(p) - getWeight(r.left);
        r.left = p;
        r.updateWeight(delta);

        p.parent = r;
    }
}

updateWeight simply updates weights up to the root:

void updateWeight(int delta) {
    weight += delta;
    Entry<K, V> p = parent;
    while (p != null) {
        p.weight += delta;
        p = p.parent;
    }
}

When we need to find an element by index, the implementation uses the weights:

public K exactKey(int index) {
    if (index < 0 || index > size() - 1) {
        throw new ArrayIndexOutOfBoundsException();
    }
    return getExactKey(root, index);
}

private K getExactKey(Entry<K, V> e, int index) {
    if (e.left == null && index == 0) {
        return e.key;
    }
    if (e.left == null && e.right == null) {
        return e.key;
    }
    if (e.left != null && e.left.weight > index) {
        return getExactKey(e.left, index);
    }
    if (e.left != null && e.left.weight == index) {
        return e.key;
    }
    return getExactKey(e.right, index - (e.left == null ? 0 : e.left.weight) - 1);
}

The weights also come in very handy for finding the index of a key:

public int keyIndex(K key) {
    if (key == null) {
        throw new NullPointerException();
    }
    Entry<K, V> e = getEntry(key);
    if (e == null) {
        throw new NullPointerException();
    }
    if (e == root) {
        return getWeight(e) - getWeight(e.right) - 1; // index to return
    }
    int index = 0;
    int cmp;
    index += getWeight(e.left);

    Entry<K, V> p = e.parent;
    // split comparator and comparable paths
    Comparator<? super K> cpr = comparator;
    if (cpr != null) {
        while (p != null) {
            cmp = cpr.compare(key, p.key);
            if (cmp > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    } else {
        Comparable<? super K> k = (Comparable<? super K>) key;
        while (p != null) {
            if (k.compareTo(p.key) > 0) {
                index += getWeight(p.left) + 1;
            }
            p = p.parent;
        }
    }
    return index;
}

You can find the result of this work at https://github.com/geniot/indexed-tree-map
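To see the weight idea in isolation, here is a much smaller sketch. It is not the IndexedTreeMap code: `SizeTree` is a hypothetical class, it stores plain `int` keys, and it is an unbalanced binary search tree (no red-black rotations), so its lookups are only O(log n) when the insertion order happens to keep the tree shallow. It returns -1 for a missing key, which is a choice made for this sketch only.

```java
// Hypothetical, simplified illustration of subtree weights; not IndexedTreeMap.
class SizeTree {
    private static final class Node {
        final int key;
        Node left, right;
        int weight = 1; // number of nodes in this subtree, including this node

        Node(int key) {
            this.key = key;
        }
    }

    private Node root;

    private static int weight(Node n) {
        return n == null ? 0 : n.weight;
    }

    boolean contains(int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    boolean add(int key) {
        if (contains(key)) return false; // keep set semantics
        root = insert(root, key);
        return true;
    }

    private static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else n.right = insert(n.right, key);
        n.weight++; // every node on the insertion path gains one descendant
        return n;
    }

    // Returns the key at the given position in sorted order.
    int exactKey(int index) {
        if (index < 0 || index >= weight(root)) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node n = root;
        while (true) {
            int leftWeight = weight(n.left);
            if (index < leftWeight) {
                n = n.left;
            } else if (index == leftWeight) {
                return n.key;
            } else {
                index -= leftWeight + 1; // skip the left subtree and this node
                n = n.right;
            }
        }
    }

    // Returns the sorted position of key, or -1 if it is absent.
    int keyIndex(int key) {
        int index = 0;
        Node n = root;
        while (n != null) {
            if (key < n.key) {
                n = n.left;
            } else if (key > n.key) {
                index += weight(n.left) + 1;
                n = n.right;
            } else {
                return index + weight(n.left);
            }
        }
        return -1;
    }
}
```

The balanced version in the answer has to do the extra bookkeeping shown in rotateLeft precisely because rotations move whole subtrees between nodes.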

感情洁癖 2024-10-31 06:58:14

A couple of points:

  • Sort of a non-answer, but when I last needed to re-implement a frequent itemset mining algorithm, I went with FP-growth, which performs on par with (or better than) Apriori and, in my opinion, is easier to implement. This technique was developed by Jiawei Han and others, and it has a dedicated chapter in Data Mining: Concepts and Techniques.

  • There are several open-source tools that take a pretty standardized input (one list of integers per line; integers represent items, lines represent itemsets). Some of them give you a choice of algorithms. Many of them are available here with permissive licenses: http://fimi.ua.ac.be/src/

  • Keep in mind that using just any List implementation doesn't get you O(1) element access; you only get that from an array-backed one (an array or vector). More likely, you'll get better mileage out of keeping a mostly or fully sorted array (with binary search for finding elements over a specific limit, and the usual indexing for random access).
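The sorted-array suggestion in the last point might look something like the sketch below. `SortedArraySet` and `firstIndexAtLeast` are hypothetical names chosen for this example. It keeps the backing `ArrayList` fully sorted on every insert, so `get` is O(1) and searches are O(log n), but each insertion shifts elements and costs O(n).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: a set kept as a fully sorted ArrayList.
class SortedArraySet<E extends Comparable<? super E>> {
    private final List<E> items = new ArrayList<>();

    boolean add(E e) {
        int pos = Collections.binarySearch(items, e);
        if (pos >= 0) {
            return false; // already present
        }
        // binarySearch returns -(insertionPoint) - 1 when the key is absent
        items.add(-pos - 1, e);
        return true;
    }

    E get(int index) {
        return items.get(index); // O(1) random access
    }

    // Index of the first element >= limit, or size() if there is none.
    int firstIndexAtLeast(E limit) {
        int pos = Collections.binarySearch(items, limit);
        return pos >= 0 ? pos : -pos - 1;
    }

    int size() {
        return items.size();
    }
}
```

If elements arrive in large batches, appending them all and sorting once would likely be cheaper than inserting one at a time.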

初懵 2024-10-31 06:58:14

Perhaps a combination of TreeSet and CollectionUtils.get() from the Apache Commons Collections API would solve your problem.
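For a collection without random access, a lookup like this has to walk the iterator, so it is effectively option 1 from the question. The JDK-only sketch below shows that idea without the library dependency; `SortedSetIndexing.elementAt` is a hypothetical helper written for this example, not the Commons Collections implementation.

```java
import java.util.Iterator;
import java.util.SortedSet;

// Hypothetical JDK-only helper; each call is O(index).
class SortedSetIndexing {
    static <E> E elementAt(SortedSet<E> set, int index) {
        if (index < 0 || index >= set.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Iterator<E> it = set.iterator();
        for (int i = 0; i < index; i++) {
            it.next(); // skip the elements before the requested position
        }
        return it.next();
    }
}
```

This is the simplest approach and may be fine for small sets, but the per-access cost grows linearly with the index.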

后来的我们 2024-10-31 06:58:14

I would look into LinkedHashSet. It maintains insertion order of a HashSet.
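A small sketch of what that ordering guarantee means in practice, compared with TreeSet. `OrderDemo` and its method names are hypothetical, made up for this illustration. Note that neither class offers access by index on its own, so both sets are copied into a list here.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class comparing iteration order of the two sets.
class OrderDemo {
    // Duplicates dropped, elements kept in the order they were first inserted.
    static List<Integer> insertionOrder(List<Integer> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // Duplicates dropped, elements in ascending (natural) order.
    static List<Integer> sortedOrder(List<Integer> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }
}
```

For the input 3, 1, 2, 1 the first method yields 3, 1, 2 and the second yields 1, 2, 3, so LinkedHashSet only matches the sorted requirement if elements are inserted already sorted.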
