Java:Map中的内部数据结构

发布于 2024-09-09 04:51:32 字数 94 浏览 3 评论 0原文

我正在尝试创建一个实现 Map 接口的类。所以我正在编写代码来检查调用对象是否为空。然而,我对应该在内部使用哪种数据结构有点困惑。目前我正在使用哈希表。 提前致谢

I am trying to create a class that implements the Map interface. So I am writing code that will check if the calling object is empty or not. However I am a little confused as to which data structure I should use internally. At present I am using a Hash Table.
Thanks in advance

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

秋凉 2024-09-16 04:51:32

来自维基百科

通常使用关联数组
当查找最频繁时
手术。为此原因,
通常设计实现
以允许快速查找,但代价是
较慢的插入和较大的
比其他数据的存储占用空间
结构(例如关联
列表)。

高效表示:
主要有两个高效数据
用于表示的结构
关联数组、哈希表和
自平衡二叉搜索树
(例如红黑树或 AVL
树)。跳过列表也是一个
替代方案,虽然相对较新且
没有那么广泛使用。 B 树(和
变体)也可以使用,并且是
关联时常用
数组太大,无法完全驻留
在内存中,例如在一个简单的
数据库。相对优势和
缺点包括:

  1. 渐近运算性能:哈希表具有更快的平均查找速度
    和插入时间 O(1) 相比
    平衡二叉搜索树的 θ(log
    n),而平衡树的速度更快
    最坏情况下的查找和插入时间,
    O(log n) 与 θ(n) 相比。跳过
    列表的最坏情况为 O(n),O(log
    n) 平均情况下的操作时间,但是
    插入和删除较少
    实际开销高于平衡开销
    二叉树。

  2. 顺序保留:平衡二叉树和跳跃列表保留
    排序——让人们能够高效地
    按顺序迭代键或
    有效地定位关联
    其键最接近给定值。
    哈希表不保留顺序
    因此无法执行这些
    运营同样高效(他们
    要求数据按顺序排序
    单独的步骤)。

  3. 范围查询:平衡二叉树可以轻松适应
    有效地将单个值分配给
    大范围的有序键,或者
    计算有序的键的数量
    范围。 (数组中有n个元素
    并在 a 上执行操作
    m 个键的连续范围,平衡
    二叉树将花费 O(log(n)+m) 时间
    而哈希表则需要 θ(n)
    时间,因为它需要搜索整个
    表。)

  4. 分配行为:具有开放寻址的哈希表将所有数据存储在
    一大块连续的内存
    很少重新分配,
    而树分配执行许多
    小而频繁的分配。作为一个
    结果哈希表可能很难
    在碎片堆中分配,并且
    相反,树可能会导致堆
    碎片化。树也比较多
    容易受到效率低下的影响
    分配器。

  5. 紧凑性:哈希表可以对小值进行更紧凑的存储
    类型,尤其是当值是
    位。

  6. 持久性:有平衡二进制的简单持久版本
    树木,特别突出
    使用函数式语言。

  7. 支持新的键类型:构建哈希表需要合理的
    键类型的哈希函数,其中
    可能很难写好,而
    平衡二叉树和跳跃列表
    只需要总订购
    键。

有时简单的实现
一种数据结构或另一种具有
可以克服的缺点
更好的设计。例如:

  • 使用不受信任的输入作为键的哈希表可能容易受到攻击
    拒绝服务攻击
    不受信任的用户提供了预期的数据
    来产生大量的
    碰撞。这可以通过以下方式克服
    随机选择哈希函数
    通用家族,或通过散列
    使用加密的不可信输入
    插入之前的哈希函数。

  • 简单的平衡树在指针和分配上浪费了空间
    元数据;这些问题可以是
    通过存储多个来缓解
    每个节点中的元素并使用
    内存池。

From Wikipedia,

Associative arrays are usually used
when lookup is the most frequent
operation. For this reason,
implementations are usually designed
to allow speedy lookup, at the expense
of slower insertion and a larger
storage footprint than other data
structures (such as association
lists).

Efficient representations:
There are two main efficient data
structures used to represent
associative arrays, the hash table and
the self-balancing binary search tree
(such as a red-black tree or an AVL
tree). Skip lists are also an
alternative, though relatively new and
not as widely used. B-trees (and
variants) can also be used, and are
commonly used when the associative
array is too large to reside entirely
in memory, for instance in a simple
database. Relative advantages and
disadvantages include:

  1. Asymptotic operation performance: Hash tables have faster average lookup
    and insertion time, O(1), compared to
    a balanced binary search tree's Θ(log
    n), while balanced trees have faster
    worst-case lookup and insertion time,
    O(log n) as compared to Θ(n). Skip
    lists have O(n) worst-case and O(log
    n) average-case operation times, but
    with less insertion and deletion
    overhead in practice than balanced
    binary trees.

  2. Ordering preservation: Balanced binary trees and skip lists preserve
    ordering — allowing one to efficiently
    iterate over the keys in order or to
    efficiently locate an association
    whose key is nearest to a given value.
    Hash tables do not preserve ordering
    and therefore cannot perform these
    operations as efficiently (they
    require the data to be sorted in a
    separate step).

  3. Range queries: Balanced binary trees can be easily adapted to
    efficiently assign a single value to a
    large ordered range of keys, or to
    count the number of keys in an ordered
    range. (With n elements in the array
    and performing the operation on a
    contiguous range of m keys, a balanced
    binary tree will take O(log(n)+m) time
    while a hash table would need Θ(n)
    time as it needs to search the entire
    table.)

  4. Allocation behavior: Hash tables with open addressing store all data in
    a large contiguous block of memory
    that is reallocated infrequently,
    whereas tree allocations perform many
    small, frequent allocations. As a
    result hash tables may be difficult to
    allocate in a fragmented heap, and
    conversely trees may cause heap
    fragmentation. Trees are also more
    vulnerable to inefficiencies in
    allocators.

  5. Compactness: Hash tables can have more compact storage for small value
    types, especially when the values are
    bits.

  6. Persistence: There are simple persistent versions of balanced binary
    trees, which are especially prominent
    in functional languages.

  7. Supporting new key types: Building a hash table requires a reasonable
    hash function for the key type, which
    can be difficult to write well, while
    balanced binary trees and skip lists
    only require a total ordering on the
    keys.

Sometimes simple implementations of
one data structure or the other have
disadvantages that can be overcome by
better design. For example:

  • Hash tables that use untrusted input as keys may be vulnerable to
    denial-of-service attacks where an
    untrusted user supplies data intended
    to generate large numbers of
    collisions. This may be overcome by
    choosing hash functions at random from
    a universal family, or by hashing
    untrusted input with a cryptographic
    hash function before insertion.

  • Simple balanced trees waste space on pointers and allocation
    metadata; these problems can be
    mitigated by storing multiple
    elements in each node and by using
    memory pools.

情独悲 2024-09-16 04:51:32

除了表本身之外,您还可以维护一个整数成员变量来跟踪集合的大小,每次放置新映射时递增它,每次删除映射时递减。这样,您可以简化 sizeisEmpty 接口方法:

public int size() {
    return this.size;
}

public boolean isEmpty() {
    return this.size == 0;
}

In addition to the table itself you could also maintain an integer member variable to track the size of the collection, incrementing it each time a new mapping is put and decrementing each time one is removed. This way, you can simplify the size and isEmpty interface methods:

public int size() {
    return this.size;
}

public boolean isEmpty() {
    return this.size == 0;
}
优雅的叶子 2024-09-16 04:51:32

我尝试了不同的方法,但最终扩展了一些数据结构,这些数据结构本身非常强大,不需要任何编码技能。因此,我决定使用普通字符串数组(2)来形成类似虚拟哈希图的结构,该结构会随着空间需求的增加而扩展。

下面是完整代码的链接。

http://code.google.com /p/rohan-oopconcept-assignment/source/browse/trunk/src/EmployeeMap.java

I tried different methods but ended up extending some data structure that was itself so strong that no coding skills were needed for it. So I decided to use normal string arrays( 2) to from a virtual hash map like structure which would expand as the need for space increased.

Below is the link to the complete code.

http://code.google.com/p/rohan-oopconcept-assignment/source/browse/trunk/src/EmployeeMap.java

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文