Java HashMap 如何处理具有相同哈希码的不同对象?

发布于 2024-11-17 03:25:14 字数 320 浏览 3 评论 0 原文

根据我的理解,我认为:

  1. 两个对象具有相同的哈希码是完全合法的。
  2. 如果两个对象相等(使用 equals() 方法),则它们具有相同的哈希码。
  3. 如果两个对象不相等,那么它们不能具有相同的哈希码,

我正确吗?

现在,如果正确的话,我有以下问题: HashMap 内部使用对象的哈希码。因此,如果两个对象可以具有相同的哈希码,那么 HashMap 如何跟踪它使用哪个键?

有人可以解释一下HashMap内部如何使用对象的哈希码吗?

As per my understanding I think:

  1. It is perfectly legal for two objects to have the same hashcode.
  2. If two objects are equal (using the equals() method) then they have the same hashcode.
  3. If two objects are not equal then they cannot have the same hashcode

Am I correct?

Now if am correct, I have the following question:
The HashMap internally uses the hashcode of the object. So if two objects can have the same hashcode, then how can the HashMap track which key it uses?

Can someone explain how the HashMap internally uses the hashcode of the object?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(15

过期以后 2024-11-24 03:25:14

哈希图的工作原理如下(这有点简化,但它说明了基本机制):

它有许多“桶”,用于存储键值对。每个桶都有一个唯一的编号 - 这就是标识水桶。当你将一个键值对放入map中时,hashmap会查看该键的哈希码,并将该键值对存储在其标识符为该键的哈希码的桶中。例如:密钥的哈希码是235->该键值对存储在编号为 235 的存储桶中。(请注意,一个存储桶可以存储多个键值对)。

当您在哈希映射中查找值时,通过给它一个键,它会首先查看您提供的键的哈希码。然后,hashmap 将查找相应的存储桶,然后通过使用 equals() 进行比较,将您提供的键与存储桶中所有对的键进行比较。

现在您可以看到这对于在映射中查找键值对是如何非常有效的:通过键的哈希码,哈希映射立即知道要在哪个存储桶中查找,因此它只需测试该存储桶中的内容即可。

通过上面的机制,你也可以看出对key的hashCode()equals()方法有什么要求:

  • 如果两个key相同(当您比较它们时,equals() 返回 true),它们的 hashCode() 方法必须返回相同的数字。如果键违反了这一点,则相等的键可能会存储在不同的存储桶中,并且哈希映射将无法找到键值对(因为它将在同一个存储桶中查找)。

  • 如果两个键不同,那么它们的哈希码是否相同并不重要。如果它们的哈希码相同,它们将存储在同一个桶中,在这种情况下,哈希映射将使用 equals() 来区分它们。

A hashmap works like this (this is a little bit simplified, but it illustrates the basic mechanism):

It has a number of "buckets" which it uses to store key-value pairs in. Each bucket has a unique number - that's what identifies the bucket. When you put a key-value pair into the map, the hashmap will look at the hash code of the key, and store the pair in the bucket of which the identifier is the hash code of the key. For example: The hash code of the key is 235 -> the pair is stored in bucket number 235. (Note that one bucket can store more then one key-value pair).

When you lookup a value in the hashmap, by giving it a key, it will first look at the hash code of the key that you gave. The hashmap will then look into the corresponding bucket, and then it will compare the key that you gave with the keys of all pairs in the bucket, by comparing them with equals().

Now you can see how this is very efficient for looking up key-value pairs in a map: by the hash code of the key the hashmap immediately knows in which bucket to look, so that it only has to test against what's in that bucket.

Looking at the above mechanism, you can also see what requirements are necessary on the hashCode() and equals() methods of keys:

  • If two keys are the same (equals() returns true when you compare them), their hashCode() method must return the same number. If keys violate this, then keys that are equal might be stored in different buckets, and the hashmap would not be able to find key-value pairs (because it's going to look in the same bucket).

  • If two keys are different, then it doesn't matter if their hash codes are the same or not. They will be stored in the same bucket if their hash codes are the same, and in this case, the hashmap will use equals() to tell them apart.

時窥 2024-11-24 03:25:14

你的第三个断言是不正确的。

两个不相等的对象具有相同的哈希码是完全合法的。它被HashMap用作“第一遍过滤器”,以便映射可以快速找到具有指定键的可能条目。然后测试具有相同哈希码的键与指定键的相等性。

您不希望要求两个不相等的对象不能具有相同的哈希码,否则这会将您限制为 232 个可能的对象。 (这也意味着不同的类型甚至不能使用对象的字段来生成哈希代码,因为其他类可以生成相同的哈希值。)

Your third assertion is incorrect.

It's perfectly legal for two unequal objects to have the same hash code. It's used by HashMap as a "first pass filter" so that the map can quickly find possible entries with the specified key. The keys with the same hash code are then tested for equality with the specified key.

You wouldn't want a requirement that two unequal objects couldn't have the same hash code, as otherwise that would limit you to 232 possible objects. (It would also mean that different types couldn't even use an object's fields to generate hash codes, as other classes could generate the same hash.)

恋竹姑娘 2024-11-24 03:25:14

HashMap结构图

HashMap是一个Entry对象的数组。

HashMap 视为一个对象数组。

看看这个 Object 是什么:

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
… 
}

每个 Entry 对象代表一个键值对。如果存储桶有多个 Entry,则字段 next 引用另一个 Entry 对象。

有时可能会发生两个不同对象的哈希码相同的情况。在这种情况下,两个对象将被保存在一个桶中,并以链表的形式呈现。
入口点是最近添加的对象。该对象通过 next 字段引用另一个对象,依此类推。最后一个条目引用null

当您使用默认构造函数创建 HashMap 时,

HashMap hashMap = new HashMap();

将创建大小为 16 且默认负载平衡为 0.75 的数组。

添加新的键值

  1. 计算键的哈希码 计算应放置元素的位置 hash % (arrayLength-1) (桶号)
  2. 如果您尝试使用已存在的键添加值已保存在HashMap中,然后值被覆盖。
  3. 否则元素将被添加到桶中。

如果存储桶已经至少有一个元素,则会添加一个新元素并将其放置在存储桶的第一个位置。它的 next 字段引用旧元素。

删除

  1. 计算给定键的哈希码
  2. 计算桶数 hash % (arrayLength-1)
  3. 获取桶中第一个 Entry 对象的引用,并通过 equals 方法迭代给定桶中的所有条目。最终我们会找到正确的Entry
    如果没有找到所需的元素,则返回null

HashMap structure diagram

HashMap is an array of Entry objects.

Consider HashMap as just an array of objects.

Have a look at what this Object is:

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
… 
}

Each Entry object represents a key-value pair. The field next refers to another Entry object if a bucket has more than one Entry.

Sometimes it might happen that hash codes for 2 different objects are the same. In this case, two objects will be saved in one bucket and will be presented as a linked list.
The entry point is the more recently added object. This object refers to another object with the next field and so on. The last entry refers to null.

When you create a HashMap with the default constructor

HashMap hashMap = new HashMap();

The array is created with size 16 and default 0.75 load balance.

Adding a new key-value pair

  1. Calculate hashcode for the key
  2. Calculate position hash % (arrayLength-1) where element should be placed (bucket number)
  3. If you try to add a value with a key which has already been saved in HashMap, then value gets overwritten.
  4. Otherwise element is added to the bucket.

If the bucket already has at least one element, a new one gets added and placed in the first position of the bucket. Its next field refers to the old element.

Deletion

  1. Calculate hashcode for the given key
  2. Calculate bucket number hash % (arrayLength-1)
  3. Get a reference to the first Entry object in the bucket and by means of equals method iterate over all entries in the given bucket. Eventually we will find the correct Entry.
    If a desired element is not found, return null
大姐,你呐 2024-11-24 03:25:14

您可以在 http://javarevisited.blogspot.com/2011/ 找到优秀的信息02/how-hashmap-works-in-java.html

总结一下:

HashMap 的工作原理是哈希

put(key, value): HashMap 将键和值对象存储为 Map.Entry。 Hashmap应用hashcode(key)来获取桶。如果发生碰撞,HashMap使用LinkedList来存储对象。

get(key): HashMap 使用 Key 对象的哈希码来查找存储桶位置,然后调用keys.equals() 方法来识别 LinkedList 中正确的节点,并返回 Java HashMap 中该键的关联值对象。

You can find excellent information at http://javarevisited.blogspot.com/2011/02/how-hashmap-works-in-java.html

To Summarize:

HashMap works on the principle of hashing

put(key, value): HashMap stores both key and value object as Map.Entry. Hashmap applies hashcode(key) to get the bucket. if there is collision ,HashMap uses LinkedList to store object.

get(key): HashMap uses Key Object's hashcode to find out bucket location and then call keys.equals() method to identify correct node in LinkedList and return associated value object for that key in Java HashMap.

止于盛夏 2024-11-24 03:25:14

以下是 HashMap 机制的粗略描述,适用于 Java 8 版本,(可能与 Java 6 略有不同)


数据结构

  • 哈希表
    哈希值是通过 key 上的 hash() 计算出来的,它决定给定 key 使用哈希表的哪个存储桶。
  • 链接列表 (单个)
    当桶中的元素数量较少时,使用单链表。
  • 红黑树
    当桶中的元素数量较多时,使用红黑树。

(内部)

  • Map.Entry
    表示映射中的单个实体,即键/值实体。
  • HashMap.Node
    节点的链表版本。

    它可以代表:

    • 哈希桶。
      因为它有一个哈希属性。
    • 单链表中的一个节点,(因此也是链表的头)
  • HashMap.TreeNode
    节点的树版本。

字段(内部)

  • Node[] 表
    桶表(链表的头部)。
    如果存储桶不包含元素,则它为空,因此只占用引用的空间。
  • 设置条目集
    实体集。
  • 整数大小
    实体数量。
  • 浮动负载因子
    指示在调整大小之前允许哈希表的填充程度。
  • 整数阈值
    下一个要调整大小的尺寸。
    公式:阈值 = 容量 * loadFactor

方法(内部)

  • int hash(key)
    通过key计算hash。
  • 如何将哈希映射到存储桶?
    使用以下逻辑:

    <块引用>

    static int hashToBucket(int tableSize, int hash) {
        返回(tableSize - 1) &哈希;
    }
    

关于容量

在哈希表中,容量指的是桶的数量,可以从table.length中获取。
也可以通过 thresholdloadFactor 计算,因此不需要定义为类字段。

获取有效容量


可以通过capacity()操作

  • 通过key查找实体。
    首先通过哈希值找到桶,然后循环链表或搜索排序树。
  • 使用键添加实体。
    首先根据key的哈希值找到bucket。
    然后尝试找到该值:

    • 如果找到,请替换该值。
    • 否则,在链表开头添加一个新节点,或插入到排序树中。
  • 调整大小
    当达到threshold时,会将哈希表的容量加倍(table.length),然后对所有元素执行重新哈希以重建表。
    这可能是一项昂贵的操作。

性能

  • 获取&把
    时间复杂度为 O(1),因为:

    • 通过数组索引访问存储桶,因此 O(1)
    • 每个桶中的链表长度都很小,因此可以看作O(1)
    • 树的大小也受到限制,因为会扩展容量和存储空间。当元素计数增加时重新哈希,因此可以将其视为 O(1),而不是 O(log N)

Here is a rough description of HashMap's mechanism, for Java 8 version, (it might be slightly different from Java 6).


Data structures

  • Hash table
    Hash value is calculated via hash() on key, and it decide which bucket of the hashtable to use for a given key.
  • Linked list (singly)
    When count of elements in a bucket is small, a singly linked list is used.
  • Red-Black tree
    When count of elements in a bucket is large, a red-black tree is used.

Classes (internal)

  • Map.Entry
    Represent a single entity in map, the key/value entity.
  • HashMap.Node
    Linked list version of node.

    It could represent:

    • A hash bucket.
      Because it has a hash property.
    • A node in singly linked list, (thus also head of linkedlist).
  • HashMap.TreeNode
    Tree version of node.

Fields (internal)

  • Node[] table
    The bucket table, (head of the linked lists).
    If a bucket don't contains elements, then it's null, thus only take space of a reference.
  • Set<Map.Entry> entrySet
    Set of entities.
  • int size
    Number of entities.
  • float loadFactor
    Indicate how full the hash table is allowed, before resizing.
  • int threshold
    The next size at which to resize.
    Formula: threshold = capacity * loadFactor

Methods (internal)

  • int hash(key)
    Calculate hash by key.
  • How to map hash to bucket?
    Use following logic:

    static int hashToBucket(int tableSize, int hash) {
        return (tableSize - 1) & hash;
    }
    

About capacity

In hash table, capacity means the bucket count, it could be get from table.length.
Also could be calculated via threshold and loadFactor, thus no need to be defined as a class field.

Could get the effective capacity via: capacity()


Operations

  • Find entity by key.
    First find the bucket by hash value, then loop linked list or search sorted tree.
  • Add entity with key.
    First find the bucket according to hash value of key.
    Then try find the value:

    • If found, replace the value.
    • Otherwise, add a new node at beginning of linked list, or insert into sorted tree.
  • Resize
    When threshold reached, will double hashtable's capacity(table.length), then perform a re-hash on all elements to rebuild the table.
    This could be an expensive operation.

Performance

  • get & put
    Time complexity is O(1), because:

    • Bucket is accessed via array index, thus O(1).
    • Linked list in each bucket is of small length, thus could view as O(1).
    • Tree size is also limited, because will extend capacity & re-hash when element count increase, so could view it as O(1), not O(log N).
白日梦 2024-11-24 03:25:14

哈希码确定哈希图要检查的存储桶。如果存储桶中存在多个对象,则执行线性搜索来查找存储桶中的哪一项等于所需的项目(使用 equals())方法。

换句话说,如果你有一个完美的 hashcode,那么 hashmap 的访问是恒定的,你将永远不需要迭代存储桶(从技术上讲,你还必须有 MAX_INT 存储桶,Java 实现可能会在同一个存储桶中共享一些哈希码来减少空间需求)。如果你有最差的哈希码(总是返回相同的数字),那么你的哈希映射访问就会变成线性,因为你必须搜索映射中的每个项目(它们都在同一个存储桶中)才能得到你想要的东西。

大多数时候,编写良好的哈希码并不完美,但足够独特,可以为您提供或多或少的持续访问。

The hashcode determines which bucket for the hashmap to check. If there is more than one object in the bucket then a linear search is done to find which item in the bucket equals the desired item (using the equals()) method.

In other words, if you have a perfect hashcode then hashmap access is constant, you will never have to iterate through a bucket (technically you would also have to have MAX_INT buckets, the Java implementation may share a few hash codes in the same bucket to cut down on space requirements). If you have the worst hashcode (always returns the same number) then your hashmap access becomes linear since you have to search through every item in the map (they're all in the same bucket) to get what you want.

Most of the time a well written hashcode isn't perfect but is unique enough to give you more or less constant access.

呆萌少年 2024-11-24 03:25:14

你在第三点上就错了。两个条目可以具有相同的哈希码,但不能相等。看一下 HashMap.从 OpenJdk 获取。您可以看到它检查哈希值是否相等以及键是否相等。如果第三点为真,那么就没有必要检查密钥是否相等。哈希码在键之前进行比较,因为前者的比较效率更高。

如果您有兴趣了解更多相关信息,请查看 Wikipedia 文章 Open解决冲突解决,我相信这是 OpenJdk 实现使用的机制。该机制与其他答案之一提到的“桶”方法略有不同。

You're mistaken on point three. Two entries can have the same hash code but not be equal. Take a look at the implementation of HashMap.get from the OpenJdk. You can see that it checks that the hashes are equal and the keys are equal. Were point three true, then it would be unnecessary to check that the keys are equal. The hash code is compared before the key because the former is a more efficient comparison.

If you're interested in learning a little more about this, take a look at the Wikipedia article on Open Addressing collision resolution, which I believe is the mechanism that the OpenJdk implementation uses. That mechanism is subtly different than the "bucket" approach one of the other answers mentions.

沒落の蓅哖 2024-11-24 03:25:14
import java.util.HashMap;

public class Students  {
    String name;
    int age;

    Students(String name, int age ){
        this.name = name;
        this.age=age;
    }

    @Override
    public int hashCode() {
        System.out.println("__hash__");
        final int prime = 31;
        int result = 1;
        result = prime * result + age;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        System.out.println("__eq__");
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Students other = (Students) obj;
        if (age != other.age)
            return false;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }

    public static void main(String[] args) {

        Students S1 = new Students("taj",22);
        Students S2 = new Students("taj",21);

        System.out.println(S1.hashCode());
        System.out.println(S2.hashCode());

        HashMap<Students,String > HM = new HashMap<Students,String > (); 
        HM.put(S1, "tajinder");
        HM.put(S2, "tajinder");
        System.out.println(HM.size());
    }
}

Output:

__ hash __

116232

__ hash __

116201

__ hash __

__ hash __

2

所以在这里我们看到,如果对象 S1 和 S2 都有不同的内容,那么我们非常确定我们重写的 Hashcode 方法将为这两个对象生成不同的 Hashcode(116232,11601)。现在因为有不同的哈希码,所以它甚至不需要调用 EQUALS 方法。因为不同的哈希码保证对象中的内容不同。

    public static void main(String[] args) {

        Students S1 = new Students("taj",21);
        Students S2 = new Students("taj",21);

        System.out.println(S1.hashCode());
        System.out.println(S2.hashCode());

        HashMap<Students,String > HM = new HashMap<Students,String > (); 
        HM.put(S1, "tajinder");
        HM.put(S2, "tajinder");
        System.out.println(HM.size());
    }
}

Now lets change out main method a little bit. Output after this change is 

__ hash __

116201

__ hash __

116201

__ hash __

__ hash __

__ eq __

1
We can clearly see that equal method is called. Here is print statement __eq__, since we have same hashcode, then content of objects MAY or MAY not be similar. So program internally  calls Equal method to verify this. 


Conclusion 
If hashcode is different , equal method will not get called. 
if hashcode is same, equal method will get called.

Thanks , hope it helps. 
import java.util.HashMap;

public class Students  {
    String name;
    int age;

    Students(String name, int age ){
        this.name = name;
        this.age=age;
    }

    @Override
    public int hashCode() {
        System.out.println("__hash__");
        final int prime = 31;
        int result = 1;
        result = prime * result + age;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        System.out.println("__eq__");
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Students other = (Students) obj;
        if (age != other.age)
            return false;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }

    public static void main(String[] args) {

        Students S1 = new Students("taj",22);
        Students S2 = new Students("taj",21);

        System.out.println(S1.hashCode());
        System.out.println(S2.hashCode());

        HashMap<Students,String > HM = new HashMap<Students,String > (); 
        HM.put(S1, "tajinder");
        HM.put(S2, "tajinder");
        System.out.println(HM.size());
    }
}

Output:

__ hash __

116232

__ hash __

116201

__ hash __

__ hash __

2

So here we see that if both the objects S1 and S2 have different content, then we are pretty sure that our overridden Hashcode method will generate different Hashcode(116232,11601) for both objects. NOW since there are different hash codes, so it won't even bother to call EQUALS method. Because a different Hashcode GUARANTEES DIFFERENT content in an object.

    public static void main(String[] args) {

        Students S1 = new Students("taj",21);
        Students S2 = new Students("taj",21);

        System.out.println(S1.hashCode());
        System.out.println(S2.hashCode());

        HashMap<Students,String > HM = new HashMap<Students,String > (); 
        HM.put(S1, "tajinder");
        HM.put(S2, "tajinder");
        System.out.println(HM.size());
    }
}

Now lets change out main method a little bit. Output after this change is 

__ hash __

116201

__ hash __

116201

__ hash __

__ hash __

__ eq __

1
We can clearly see that equal method is called. Here is print statement __eq__, since we have same hashcode, then content of objects MAY or MAY not be similar. So program internally  calls Equal method to verify this. 


Conclusion 
If hashcode is different , equal method will not get called. 
if hashcode is same, equal method will get called.

Thanks , hope it helps. 
牵强ㄟ 2024-11-24 03:25:14

两个对象相等,意味着它们具有相同的哈希码,但反之则不然。

2个相等的对象------>他们有相同的哈希码

2 个对象具有相同的哈希码 ----xxxxx-->他们不相等

HashMap 中的 Java 8 更新 -

您在代码中执行此操作 -

myHashmap.put("old","old-value");
myHashMap.put("very-old","very-old-value");

因此,假设您的哈希码返回了两个键 "old""very -old" 是一样的。然后会发生什么。

myHashMap 是一个 HashMap,假设最初您没有指定它的容量。因此,每个 java 的默认容量是 16。所以现在,只要您使用 new 关键字初始化 hashmap,它就会创建 16 个存储桶。现在,当你执行第一个语句时 -

myHashmap.put("old","old-value");

然后计算 "old" 的哈希码,并且因为哈希码也可能是非常大的整数,所以,java内部这样做 - (哈希是这里的哈希码并且>> ;> 是右移),

hash XOR hash >>> 16

因此为了给出更大的图片,它将返回一些索引,该索引介于 0 到 15 之间。现在您的键值对 "old""old -value" 将被转换Entry 对象的键和值实例变量。然后这个entry对象就会被存储到bucket中,或者你可以说在某个特定的索引处,这个entry对象会被存储。

仅供参考- Entry 是 Map 接口中的一个类 - Map.Entry,

class Entry{
          final Key k;
          value v;
          final int hash;
          Entry next;
}

当您执行下一条语句时,现在具有这些签名/定义 -

myHashmap.put("very-old","very-old-value");

并且 "very-old" 给出与 "old"< 相同的哈希码/code>,所以这个新的键值对再次被发送到相同的索引或相同的桶。但由于这个桶不为空,那么 Entry 对象的 next 变量就用来存储这个新的键值对。

对于具有相同哈希码的每个对象,这将存储为链表,但 TRIEFY_THRESHOLD 指定为值 6。因此,达到此值后,链表将转换为平衡树(红黑树),其中第一个元素为根。

two objects are equal, implies that they have same hashcode, but not vice versa.

2 equal objects ------> they have same hashcode

2 objects have same hashcode ----xxxxx--> they are NOT equal

Java 8 update in HashMap-

you do this operation in your code -

myHashmap.put("old","old-value");
myHashMap.put("very-old","very-old-value");

so, suppose your hashcode returned for both keys "old" and "very-old" is same. Then what will happen.

myHashMap is a HashMap, and suppose that initially you didn't specify its capacity. So default capacity as per java is 16. So now as soon as you initialised hashmap using the new keyword, it created 16 buckets. now when you executed first statement-

myHashmap.put("old","old-value");

then hashcode for "old" is calculated, and because the hashcode could be very large integer too, so, java internally did this - (hash is hashcode here and >>> is right shift)

hash XOR hash >>> 16

so to give as a bigger picture, it will return some index, which would be between 0 to 15. Now your key value pair "old" and "old-value" would be converted to Entry object's key and value instance variable. and then this entry object will be stored in the bucket, or you can say that at a particular index, this entry object would be stored.

FYI- Entry is a class in Map interface- Map.Entry, with these signature/definition

class Entry{
          final Key k;
          value v;
          final int hash;
          Entry next;
}

now when you execute next statement -

myHashmap.put("very-old","very-old-value");

and "very-old" gives same hashcode as "old", so this new key value pair is again sent to the same index or the same bucket. But since this bucket is not empty, then the next variable of the Entry object is used to store this new key value pair.

and this will be stored as linked list for every object which have the same hashcode, but a TRIEFY_THRESHOLD is specified with value 6. so after this reaches, linked list is converted to the balanced tree(red-black tree) with first element as the root.

鞋纸虽美,但不合脚ㄋ〞 2024-11-24 03:25:14

每个 Entry 对象代表键值对。如果一个桶有超过 1 个 Entry,那么 Field next 会引用其他 Entry 对象。

有时可能会发生两个不同对象的 hashCode 相同的情况。在这种情况下,2 个对象将保存在一个存储桶中,并以 LinkedList 的形式呈现。入口点是最近添加的对象。该对象引用具有下一个字段的其他对象,依此类推。最后一个条目引用 null。
当您使用默认构造函数创建 HashMap 时,

将创建大小为 16 且默认负载平衡为 0.75 的数组。

在此处输入图像描述

(来源)

Each Entry object represents key-value pair. Field next refers to other Entry object if a bucket has more than 1 Entry.

Sometimes it might happen that hashCodes for 2 different objects are the same. In this case 2 objects will be saved in one bucket and will be presented as LinkedList. The entry point is more recently added object. This object refers to other object with next field and so one. Last entry refers to null.
When you create HashMap with default constructor

Array is gets created with size 16 and default 0.75 load balance.

enter image description here

(Source)

李不 2024-11-24 03:25:14

哈希映射的工作原理是哈希

HashMap get(Key k) 方法在键对象上调用 hashCode 方法,并将返回的 hashValue 应用于其自己的静态哈希函数,以找到存储键和值的存储桶位置(支持数组)一个名为 Entry (Map.Entry) 的嵌套类。因此,从上一行可以得出结论,键和值都以 Entry 对象的形式存储在存储桶中。所以认为桶中只存储价值是不正确的,也不会给面试官留下好印象。

  • 每当我们在 HashMap 对象上调用 get(Key k) 方法时。首先它检查 key 是否为 null 。请注意,HashMap 中只能有一个空键。

如果 key 为 null ,则 Null 键总是映射到哈希 0,因此索引 0。

如果 key 不为 null 那么,它将在 key 对象上调用 hashfunction,请参见上面方法中的第 4 行,即 key.hashCode() ,因此在 key 之后.hashCode() 返回 hashValue,第 4 行看起来像

            int hash = hash(hashValue)

现在,它将返回的 hashValue 应用到自己的哈希函数中。

我们可能想知道为什么我们要使用 hash(hashValue) 再次计算哈希值。答案是它可以防御质量较差的哈希函数。

现在,最终的哈希值用于查找存储 Entry 对象的存储桶位置。 Entry对象像这样存储在bucket中(hash,key,value,bucketindex)

Hash map works on the principle of hashing

HashMap get(Key k) method calls hashCode method on the key object and applies returned hashValue to its own static hash function to find a bucket location(backing array) where keys and values are stored in form of a nested class called Entry (Map.Entry) . So you have concluded that from the previous line that Both key and value is stored in the bucket as a form of Entry object . So thinking that Only value is stored in the bucket is not correct and will not give a good impression on the interviewer .

  • Whenever we call get( Key k ) method on the HashMap object . First it checks that whether key is null or not . Note that there can only be one null key in HashMap .

If key is null , then Null keys always map to hash 0, thus index 0.

If key is not null then , it will call hashfunction on the key object , see line 4 in above method i.e. key.hashCode() ,so after key.hashCode() returns hashValue , line 4 looks like

            int hash = hash(hashValue)

and now ,it applies returned hashValue into its own hashing function .

We might wonder why we are calculating the hashvalue again using hash(hashValue). Answer is It defends against poor quality hash functions.

Now final hashvalue is used to find the bucket location at which the Entry object is stored . Entry object stores in the bucket like this (hash,key,value,bucketindex)

小清晰的声音 2024-11-24 03:25:14

我不会详细介绍 HashMap 的工作原理,但会给出一个示例,以便我们可以通过将其与现实联系起来来记住 HashMap 的工作原理。

我们有Key、Value、HashCode和bucket。

有时,我们会将它们与以下内容相关联:

  • Bucket ->社会
  • 哈希码 ->协会地址(始终唯一)
  • 值 ->社会
  • 钥匙中的一所房子 ->房屋地址。

使用 Map.get(key) :

Stevie 想要去他朋友(Josse)的家,他住在 VIP 社团的别墅里,假设是 JavaLovers Society。
Josse 的地址是他的 SSN(每个人都不同)。
有一个索引,我们可以根据 SSN 找到协会的名称。
这个索引可以被认为是一种找出HashCode的算法。

  • SSN 协会名称
  • 92313(Josse's) -- JavaLovers
  • 13214 -- AngularJSLovers
  • 98080 -- JavaLovers
  • 53808 -- BiologyLovers

  1. 这个 SSN(key) 首先给我们一个 HashCode(来自索引表),它只是协会的名称。
  2. 现在,多个房屋可以在同一个社会中,因此HashCode可以通用。
  3. 假设,社会对于两栋房子来说是通用的,我们如何识别我们要去哪栋房子,是的,通过使用(SSN)键,它只是房子地址

使用 Map.put(key,Value)

这发现通过查找 HashCode 找到适合该 Value 的社会,然后存储该值。

我希望这会有所帮助,并且可以进行修改。

I will not get into the details of how HashMap works, but will give an example so we can remember how HashMap works by relating it to reality.

We have Key, Value ,HashCode and bucket.

For sometime, we will relate each of them with the following:

  • Bucket -> A Society
  • HashCode -> Society's address(unique always)
  • Value -> A House in the Society
  • Key -> House address.

Using Map.get(key) :

Stevie wants to get to his friend's(Josse) house who lives in a villa in a VIP society, let it be JavaLovers Society.
Josse's address is his SSN(which is different for everyone).
There's an index maintained in which we find out the Society's name based on SSN.
This index can be considered to be an algorithm to find out the HashCode.

  • SSN Society's Name
  • 92313(Josse's) -- JavaLovers
  • 13214 -- AngularJSLovers
  • 98080 -- JavaLovers
  • 53808 -- BiologyLovers

  1. This SSN(key) first gives us a HashCode(from the index table) which is nothing but Society's name.
  2. Now, mulitple houses can be in the same society, so the HashCode can be common.
  3. Suppose, the Society is common for two houses, how are we going to identify which house we are going to, yes, by using the (SSN)key which is nothing but the House address

Using Map.put(key,Value)

This finds a suitable society for this Value by finding the HashCode and then the value is stored.

I hope this helps and this is open for modifications.

鞋纸虽美,但不合脚ㄋ〞 2024-11-24 03:25:14

记住这里对 hashmap 结构的解释,也许有人可以解释 Baeldung 上的以下段落:-

Java 有几种 Map 接口的实现,每一种都有自己的特殊性。

但是,现有的 Java 核心 Map 实现都不允许 Map 处理单个键的多个值。

正如我们所见,如果我们尝试为同一个键插入两个值,第二个值将被存储,而第一个将被删除。

它还将被返回(通过 put(K key, V value) 方法的每个正确实现):

Map<String, String> map = new HashMap<>();
assertThat(map.put("key1", "value1")).isEqualTo(null);
assertThat(map.put("key1", "value2")).isEqualTo("value1");
assertThat(map.get("key1")).isEqualTo("value2");

Bearing in mind the explanations here for the structure of a hashmap, perhaps someone could explain the following paragraph on Baeldung :-

Java has several implementations of the interface Map, each one with its own particularities.

However, none of the existing Java core Map implementations allow a Map to handle multiple values for a single key.

As we can see, if we try to insert two values for the same key, the second value will be stored, while the first one will be dropped.

It will also be returned (by every proper implementation of the put(K key, V value) method):

Map<String, String> map = new HashMap<>();
assertThat(map.put("key1", "value1")).isEqualTo(null);
assertThat(map.put("key1", "value2")).isEqualTo("value1");
assertThat(map.get("key1")).isEqualTo("value2");
执笏见 2024-11-24 03:25:14

这将是一个很长的答案,喝一杯并继续阅读......

散列就是在内存中存储一​​个可以更快读取和写入的键值对。 它将键存储在数组中,将值存储在 LinkedList 中。

假设我想存储 4 个键值对 -

{
“girl” => “ahhan” , 
“misused” => “Manmohan Singh” , 
“horsemints” => “guess what”, 
“no” => “way”
}

因此,为了存储键,我们需要一个包含 4 个元素的数组。现在如何将这 4 个键之一映射到 4 个数组索引 (0,1,2,3)?

因此java找到各个键的hashCode并将它们映射到特定的数组索引。
哈希码公式是 -

1) reverse the string.

2) keep on multiplying ascii of each character with increasing power of 31 . then add the components .

3) So hashCode() of girl would be –(ascii values of  l,r,i,g are 108, 114, 105 and 103) . 

e.g. girl =  108 * 31^0  + 114 * 31^1  + 105 * 31^2 + 103 * 31^3  = 3173020

哈希和女孩!我知道你在想什么。你对狂野二重唱的迷恋可能会让你错过一件重要的事情。

为什么java将它与31相乘?

这是因为,31 是一个奇素数,其形式为 2^5 – 1 。并且奇素数减少了哈希冲突的几率

现在如何将这个哈希代码映射到数组索引?

答案是,哈希代码%(数组长度-1) 。因此,在我们的例子中,“girl” 映射到 (3173020 % 3) = 1。这是数组的第二个元素。

值“ahhan”存储在与数组索引 1 关联的 LinkedList 中。

HashCollision - 如果您尝试使用描述的公式查找键 “misused”“horsemints”hasHCode在上面您会看到两者都给了我们相同的1069518484。哇啊!!吸取的教训——

2 个相等的对象必须具有相同的 hashCode,但不能保证是否
hashCode 匹配则对象相等。所以它应该存储
与存储桶 1 的“misused”和“horsemints”相对应的两个值
(1069518484%3)。

现在哈希映射看起来像 –

Array Index 0 –
Array Index 1 - LinkedIst (“ahhan” , “Manmohan Singh” , “guess what”)
Array Index 2 – LinkedList (“way”)
Array Index 3 – 

现在,如果有人试图找到键 “horsemints” 的值,java 很快就会找到它的 hashCode,对其进行模块并开始在 LinkedList 中搜索它的值对应的索引1。这样我们就不需要搜索所有 4 个数组索引,从而使数据访问更快。

但是,等一下,等一下。该linkedList中有3个值对应数组索引1,它如何找出哪个是键“horsemints”的值?

实际上我撒谎了,当我说 HashMap 只是将值存储在 LinkedList 中时。

它将两个键值对存储为映射条目。所以实际上 Map 看起来像这样。

Array Index 0 –
Array Index 1 - LinkedIst (<”girl” => “ahhan”> , <” misused” => “Manmohan Singh”> , <”horsemints” => “guess what”>)
Array Index 2 – LinkedList (<”no” => “way”>)
Array Index 3 – 

现在您可以看到,在遍历与 ArrayIndex1 对应的 linkedList 时,它实际上将该 LinkedList 的每个条目的键与“horsemints”进行比较,当找到一个时,它只返回它的值。

希望您在阅读时玩得开心:)

It gonna be a long answer , grab a drink and read on …

Hashing is all about storing a key-value pair in memory that can be read and written faster. It stores keys in an array and values in a LinkedList .

Lets Say I want to store 4 key value pairs -

{
“girl” => “ahhan” , 
“misused” => “Manmohan Singh” , 
“horsemints” => “guess what”, 
“no” => “way”
}

So to store the keys we need an array of 4 element . Now how do I map one of these 4 keys to 4 array indexes (0,1,2,3)?

So java finds the hashCode of individual keys and map them to a particular array index .
Hashcode Formulae is -

1) reverse the string.

2) keep on multiplying ascii of each character with increasing power of 31 . then add the components .

3) So hashCode() of girl would be –(ascii values of  l,r,i,g are 108, 114, 105 and 103) . 

e.g. girl =  108 * 31^0  + 114 * 31^1  + 105 * 31^2 + 103 * 31^3  = 3173020

Hash and girl !! I know what you are thinking. Your fascination about that wild duet might made you miss an important thing .

Why java multiply it with 31 ?

It’s because, 31 is an odd prime in the form 2^5 – 1 . And odd prime reduces the chance of Hash Collision

Now how this hash code is mapped to an array index?

answer is , Hash Code % (Array length -1) . So “girl” is mapped to (3173020 % 3) = 1 in our case . which is second element of the array .

and the value “ahhan” is stored in a LinkedList associated with array index 1 .

HashCollision - If you try to find hasHCode of the keys “misused” and “horsemints” using the formulae described above you’ll see both giving us same 1069518484. Whooaa !! lesson learnt -

2 equal objects must have same hashCode but there is no guarantee if
the hashCode matches then the objects are equal . So it should store
both values corresponding to “misused” and “horsemints” to bucket 1
(1069518484 % 3) .

Now the hash map looks like –

Array Index 0 –
Array Index 1 - LinkedIst (“ahhan” , “Manmohan Singh” , “guess what”)
Array Index 2 – LinkedList (“way”)
Array Index 3 – 

Now if some body tries to find the value for the key “horsemints” , java quickly will find the hashCode of it , module it and start searching for it’s value in the LinkedList corresponding index 1 . So this way we need not search all the 4 array indexes thus making data access faster.

But , wait , one sec . there are 3 values in that linkedList corresponding Array index 1, how it finds out which one was was the value for key “horsemints” ?

Actually I lied , when I said HashMap just stores values in LinkedList .

It stores both key value pair as map entry. So actually Map looks like this .

Array Index 0 –
Array Index 1 - LinkedIst (<”girl” => “ahhan”> , <” misused” => “Manmohan Singh”> , <”horsemints” => “guess what”>)
Array Index 2 – LinkedList (<”no” => “way”>)
Array Index 3 – 

Now you can see While traversing through the linkedList corresponding to ArrayIndex1 it actually compares key of each entry to of that LinkedList to “horsemints” and when it finds one it just returns the value of it .

Hope you had fun while reading it :)

深爱成瘾 2024-11-24 03:25:14

俗话说,一图胜千言。我说:有些代码胜过1000字。这是HashMap的源代码。 Get方法:

/**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

因此很明显,哈希用于查找“桶”,并且始终在该桶中检查第一个元素。如果没有,则使用键的 equals 来查找链表中的实际元素。

让我们看看 put() 方法:

  /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

它稍微复杂一些,但很明显,新元素被放置在选项卡中基于哈希计算的位置:

i = (n - 1 ) & hash 这里的 i 是放置新元素的索引(或者它是“桶”)。 ntab 数组(“桶”数组)的大小。

首先,尝试将其作为该“桶”中的第一个元素。如果已经有一个元素,则将一个新节点追加到列表中。

As it is said, a picture is worth 1000 words. I say: some code is better than 1000 words. Here's the source code of HashMap. Get method:

/**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

So it becomes clear that hash is used to find the "bucket" and the first element is always checked in that bucket. If not, then equals of the key is used to find the actual element in the linked list.

Let's see the put() method:

  /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

It's slightly more complicated, but it becomes clear that the new element is put in the tab at the position calculated based on hash:

i = (n - 1) & hash here i is the index where the new element will be put (or it is the "bucket"). n is the size of the tab array (array of "buckets").

First, it is tried to be put as the first element of in that "bucket". If there is already an element, then append a new node to the list.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文