How is hashCode() calculated in Java?

Posted on 2024-08-24 23:39:20

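What value does the hashCode() method return in Java?

I read that it is a memory reference of an object... The hash value of new Integer(1) is 1; the hash value of String("a") is 97.

I am confused: is it ASCII, or what type of value is it?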

救赎№ 2024-08-31 23:39:21

The value returned by hashCode() is by no means guaranteed to be the memory address of the object. I'm not sure of the implementation in the Object class, but keep in mind most classes will override hashCode() such that two instances that are semantically equivalent (but are not the same instance) will hash to the same value. This is especially important if the classes may be used within another data structure, such as Set, that relies on hashCode being consistent with equals.

There is no hashCode() that uniquely identifies an instance of an object no matter what. If you want a hashcode based on the underlying pointer (e.g. in Sun's implementation), use System.identityHashCode() - this will delegate to the default hashCode method regardless of whether it has been overridden.

Nevertheless, even System.identityHashCode() can return the same hash for multiple objects. The example program below keeps generating objects until it finds two with the same System.identityHashCode(). When I run it, it quickly finds a match, on average after adding about 86,000 Long wrapper objects (and Integer wrappers for the keys) to a map.

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// The class name is chosen here only so that the snippet compiles on its own.
public class IdentityHashCollision {
    public static void main(String[] args) {
        Map<Integer, Long> map = new HashMap<>();
        Random generator = new Random();

        Long object = generator.nextLong();
        // We use the identityHashCode as the key into the map.
        // This makes it easier to check whether any other object
        // has the same key.
        int hash = System.identityHashCode(object);
        while (!map.containsKey(hash)) {
            map.put(hash, object);
            object = generator.nextLong();
            hash = System.identityHashCode(object);
        }
        System.out.println("Identical maps for size:  " + map.size());
        System.out.println("First object value: " + object);
        System.out.println("Second object value: " + map.get(hash));
        System.out.println("First object identityHash:  " + System.identityHashCode(object));
        System.out.println("Second object identityHash: " + System.identityHashCode(map.get(hash)));
    }
}

Example output:

Identical maps for size:  105822
First object value: 7446391633043190962
Second object value: -8143651927768852586
First object identityHash:  2134400190
Second object identityHash: 2134400190

梦里兽 2024-08-31 23:39:21

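A hash code is an integer value that represents the state of the object on which it was called. That is why an Integer set to 1 returns a hash code of 1: an Integer's hash code and its value are the same thing. A character's hash code is equal to its character code (which is the ASCII code for ASCII characters). If you write a custom type, you are responsible for creating a good hashCode implementation that best represents the state of the current instance.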
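As a quick check of the values mentioned in the question, here is a minimal sketch that prints the hash codes of a few standard library types. The class and method names (WrapperHashDemo, integerHash, charHash, longHash) are made up for illustration; only java.lang types are used.

```java
// Hypothetical demo class; the helper names are illustrative only.
final class WrapperHashDemo {
    // Integer.hashCode() returns the wrapped int value.
    static int integerHash(int v) { return Integer.valueOf(v).hashCode(); }

    // Character.hashCode() returns the char's numeric (UTF-16 code unit) value.
    static int charHash(char c) { return Character.valueOf(c).hashCode(); }

    // Long.hashCode() XORs the upper and lower 32-bit halves of the value.
    static int longHash(long v) { return Long.valueOf(v).hashCode(); }

    public static void main(String[] args) {
        System.out.println(integerHash(1));      // 1
        System.out.println(charHash('a'));       // 97
        System.out.println("a".hashCode());      // 97
        System.out.println("ab".hashCode());     // 97 * 31 + 98 = 3105
        System.out.println(longHash(1L << 32));  // 1
    }
}
```

So 97 is not "ASCII" as such: it is simply what String's own hashCode() computes for a one-character string.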

铃予 2024-08-31 23:39:21

If you want to know how they are implemented, I suggest you read the source. If you are using an IDE, you can usually jump straight to the implementation of a method you are interested in (Ctrl+click in many IDEs). If you cannot do that, you can search for the source online.

For example, Integer.hashCode() is implemented as

   public int hashCode() {
       return value;
   }

and String.hashCode() (in older JDKs, where String still had offset and count fields) as

   public int hashCode() {
       int h = hash;
       if (h == 0) {
           int off = offset;
           char val[] = value;
           int len = count;

           for (int i = 0; i < len; i++) {
               h = 31*h + val[off++];
           }
           hash = h;
       }
       return h;
   }
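To see that the loop above is all there is to it, here is a small sketch that repeats the same recurrence outside the String class and compares the result with String.hashCode(). ManualStringHash and manualHash are hypothetical names used only for this illustration.

```java
// Hypothetical helper that re-implements the documented String hash recurrence.
final class ManualStringHash {
    // h = 31*h + c for every char, starting from 0.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);  // int arithmetic silently wraps on overflow
        }
        return h;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "a", "abc", "hashCode"}) {
            System.out.println("\"" + s + "\" -> " + manualHash(s) + " / " + s.hashCode());
        }
    }
}
```

The two numbers printed on each line should agree, because the String hash is specified in terms of this formula.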

笑红尘 2024-08-31 23:39:21

The hashCode() method is often used for identifying an object. I think the Object implementation returns the pointer (not a real pointer, but a unique id or something like that) of the object. But most classes override the method, like the String class. Two String objects are not the same instance, but they are equal and therefore have the same hash code:

new String("a").hashCode() == new String("a").hashCode()

I think the most common use of hashCode() is in Hashtable, HashSet, etc.
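A minimal sketch of that point, assuming nothing beyond java.util: two distinct String instances with the same content hash to the same value, so a HashSet treats them as one element. EqualStringsDemo and distinctCount are illustrative names, not part of any API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class for equal-but-not-identical strings.
final class EqualStringsDemo {
    // Adds all values to a HashSet and reports how many distinct elements remain.
    static int distinctCount(String... values) {
        Set<String> set = new HashSet<>();
        for (String v : values) {
            set.add(v);
        }
        return set.size();
    }

    public static void main(String[] args) {
        String a = new String("a");
        String b = new String("a");
        System.out.println(a == b);                        // false: two separate objects
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(distinctCount(a, b));           // 1
    }
}
```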

Java API Object hashCode()

Edit: (due to a recent downvote and based on an article I read about JVM parameters)

With the JVM parameter -XX:hashCode you can change how the hashCode is calculated (see Issue 222 of the Java Specialists' Newsletter).

HashCode==0: Simply returns random numbers with no relation to where
in memory the object is found. As far as I can make out, the global
read-write of the seed is not optimal for systems with lots of
processors.
HashCode==1: Counts up the hash code values, not sure at what value
they start, but it seems quite high.
HashCode==2: Always returns the exact same identity hash code of 1.
This can be used to test code that relies on object identity. The
reason why JavaChampionTest returned Kirk's URL in the example above
is that all objects were returning the same hash code.
HashCode==3: Counts up the hash code values, starting from zero. It
does not look to be thread safe, so multiple threads could generate
objects with the same hash code.
HashCode==4: This seems to have some relation to the memory location
at which the object was created.
HashCode>=5: This is the default algorithm for Java 8 and has a
per-thread seed. It uses Marsaglia's xor-shift scheme to produce
pseudo-random numbers.

謸气贵蔟 2024-08-31 23:39:21

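"I read that it is a memory reference of an object.."

No. Object.hashCode() used to return a memory address about 14 years ago. Not since.

"what type of value is it"

What it is depends entirely on which class you are talking about and whether or not it overrides Object.hashCode().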

ゞ记忆︶ㄣ 2024-08-31 23:39:21

From OpenJDK sources (JDK8):

The hashCode flag, which selects the generation algorithm, defaults to 5:

product(intx, hashCode, 5,                                                
      "(Unstable) select hashCode generation algorithm")

Each thread's generator state consists of one randomly seeded value and three constants:

// thread-specific hashCode stream generator state - Marsaglia shift-xor form
  _hashStateX = os::random() ;
  _hashStateY = 842502087 ;
  _hashStateZ = 0x8767 ;    // (int)(3579807591LL & 0xffff) ;
  _hashStateW = 273326509 ;

Then, this function creates the hashCode (defaulted to 5 as specified above):

static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations.  This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = cast_from_oop<intptr_t>(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}

So we can see that, at least in JDK 8, the default is a thread-local pseudo-random number generator (Marsaglia's xor-shift), not a value derived from the object's address.
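For readers who find the C++ hard to follow, here is a rough Java transcription of the final (default) branch only. It is a sketch under assumptions: the class name XorShiftSketch is invented, the seed is passed in by hand instead of coming from os::random(), and the 31-bit HASH_MASK is my assumption about the value of markOopDesc::hash_mask on a 64-bit VM.

```java
// Hypothetical Java transcription of the xor-shift branch of get_next_hash.
final class XorShiftSketch {
    // Assumption: 31 hash bits; the real mask is markOopDesc::hash_mask.
    static final int HASH_MASK = 0x7FFFFFFF;

    private int x;                  // per-thread random seed in HotSpot
    private int y = 842502087;
    private int z = 0x8767;
    private int w = 273326509;

    XorShiftSketch(int seed) {
        this.x = seed;
    }

    int nextHash() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));  // >>> mirrors the unsigned shifts in C++
        w = v;
        int value = v & HASH_MASK;
        return value == 0 ? 0xBAD : value;       // the original also avoids returning 0
    }

    public static void main(String[] args) {
        XorShiftSketch gen = new XorShiftSketch(12345);  // arbitrary seed for the demo
        for (int i = 0; i < 5; i++) {
            System.out.println(gen.nextHash());
        }
    }
}
```

Note that the object itself never appears in the calculation: the hash depends only on the generator state of the thread that first asks for it.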

薄凉少年不暖心 2024-08-31 23:39:21

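Object.hashCode(), if memory serves correctly (check the JavaDoc for java.lang.Object), is implementation-dependent and will vary from object to object (the Sun JVM derives the value from the value of the reference to the object).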

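Note that if you are implementing any nontrivial class and want to store its instances correctly in a HashMap or HashSet, you MUST override both hashCode() and equals(). hashCode() can do whatever you like (returning 1 is entirely legal, though suboptimal), but it is vital that whenever your equals() method returns true, hashCode() returns the same value for both objects.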
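A minimal sketch of such a pair of overrides, using a hypothetical GridPoint class invented for this example. Objects.hash is just one convenient way to combine fields; any implementation works as long as equal objects produce equal hash codes.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class that keeps equals() and hashCode() consistent.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals() compares.
        return Objects.hash(x, y);
    }
}

final class GridPointDemo {
    public static void main(String[] args) {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        // Found even though this is a different instance.
        System.out.println(points.contains(new GridPoint(1, 2)));  // true
    }
}
```

If hashCode() were left un-overridden here, the contains() call would most likely return false, because the two instances would usually land in different buckets.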

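Confusion about, and lack of understanding of, hashCode() and equals() is a big source of bugs. Make sure you thoroughly familiarize yourself with the JavaDocs for Object.hashCode() and Object.equals(); I guarantee that the time spent will pay for itself.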

篱下浅笙歌 2024-08-31 23:39:21

Definition: The String hashCode() method returns the hash code of the String as an int.

Syntax:
public int hashCode()

The hash code is calculated using the following formula:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

where:

s[i] is the i-th character of the string
n is the length of the string
^ denotes exponentiation

Example:
To calculate the hash code of the string "abc", we have the following details:

s[] = {'a', 'b', 'c'}
n = 3

So the hashcode value will be calculated as:

s[0]*31^(2) + s[1]*31^1 + s[2]
= a*31^2 + b*31^1 + c*31^0
= (ASCII value of a = 97, b = 98 and c = 99)
= 97*961 + 98*31 + 99 
= 93217 + 3038 + 99
= 96354

So the hash code of "abc" is 96354.
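The worked example can be checked mechanically. The sketch below evaluates the polynomial exactly with BigInteger and then keeps the low 32 bits, which is how int overflow behaves for longer strings. PolynomialHash and polynomialHash are hypothetical names for this illustration.

```java
import java.math.BigInteger;

// Hypothetical helper that evaluates the String hash formula term by term.
final class PolynomialHash {
    // Computes s[0]*31^(n-1) + ... + s[n-1] exactly, then truncates to 32 bits.
    static int polynomialHash(String s) {
        BigInteger base = BigInteger.valueOf(31);
        BigInteger sum = BigInteger.ZERO;
        int n = s.length();
        for (int i = 0; i < n; i++) {
            BigInteger term = BigInteger.valueOf(s.charAt(i)).multiply(base.pow(n - 1 - i));
            sum = sum.add(term);
        }
        return sum.intValue();  // low-order 32 bits, matching int wrap-around
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("abc"));  // 96354
        System.out.println("abc".hashCode());       // 96354
    }
}
```

For short strings such as "abc" nothing overflows, so the hand calculation and the method agree directly; for long strings the result is the same polynomial reduced modulo 2^32.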

樱＆纷飞 2024-08-31 23:39:21

I'm surprised that no one has mentioned this. For any class that overrides hashCode(), your first step should be to read its source code. Many classes, however, simply inherit hashCode() from Object, in which case several different interesting things may happen depending on your JVM implementation. Object.hashCode() returns the same value as System.identityHashCode(object).
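A small sketch of the difference between an inherited and an overridden hashCode(). IdentityHashDemo, Plain and usesIdentityHash are names invented for this example; the only API relied on is System.identityHashCode.

```java
// Hypothetical demo comparing hashCode() with System.identityHashCode().
final class IdentityHashDemo {
    // A class that does not override hashCode(), so it inherits Object's version.
    static final class Plain { }

    // True when the object's hashCode() matches its identity hash code.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        System.out.println(usesIdentityHash(new Plain()));  // true

        String s = "abc";
        System.out.println(s.hashCode());                   // 96354, computed from the characters
        System.out.println(System.identityHashCode(s));     // implementation-dependent value

        System.out.println(System.identityHashCode(null));  // 0
    }
}
```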

Indeed, using the object's address in memory is ancient history, but many do not realise that they can control this behaviour: how Object.hashCode() is computed can be selected via the JVM argument -XX:hashCode=N, where N can be a number from 0 to 5:

0 – Park-Miller RNG (default before JDK 8, blocking)
1 – f(address, global_statement)
2 – constant 1
3 – serial counter
4 – object address
5 – Thread-local Xorshift (default since JDK 8)

Depending on the application, you may see unexpected performance hits when hashCode() is called; when that happens, it is likely that you are using one of the algorithms that shares global state and/or blocks.

罗罗贝儿 2024-08-31 23:39:21

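From the Javadoc:

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)

https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode--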

笑，眼淚并存 2024-08-31 23:39:21

The Javadoc says that the "internal address of the object is converted into an integer", so it is clear that the hashCode() method does not return the internal address of the object as it is. The link is provided below.
https://docs.oracle.com/javase/8/docs/api/java/lang/Object.html#hashCode--

To make this clear, see the following sample code:

import java.util.HashMap;
import java.util.Map;

public class HashCodeDemo {
    public static void main(String[] args) {
        final int CAPACITY_OF_MAP = 10000000;

        // First map: hash code of a fresh Object as key, an arbitrary Object as value.
        Map<Integer, Object> hm1 = new HashMap<>(CAPACITY_OF_MAP);
        int noOfDistinctObjects = 0;
        for (int i = 0; i < CAPACITY_OF_MAP; i++) {
            Object obj = new Object();
            hm1.put(obj.hashCode(), new Object());
        }
        System.out.println("hm1.size() = " + hm1.size());

        // Second map: hash codes of new objects that were already seen in hm1.
        Map<Integer, Object> hm2 = new HashMap<>(CAPACITY_OF_MAP);
        for (int i = 0; i < CAPACITY_OF_MAP; i++) {
            Object obj = new Object();
            // If hashCode() were a unique memory location, no new object could
            // share a hash code with an earlier one, so hm2 would stay empty.
            // Count the hash codes not present in hm1; record the repeated ones in hm2.
            if (hm1.get(obj.hashCode()) == null) {
                noOfDistinctObjects++;
            } else {
                hm2.put(obj.hashCode(), new Object());
            }
        }

        System.out.println("hm2.size() = " + hm2.size());
        System.out.println("noOfDistinctObjects = " + noOfDistinctObjects);
    }
}

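Each object has a unique memory location, so if Object's hashCode method returned the memory location, the hash code of each object would also be unique. But if we run the sample code above, some objects have the same hash code value and some have unique values. (Strictly speaking, only objects that are alive at the same time are guaranteed distinct addresses, and this program does not keep the hashed objects alive, so the result is suggestive rather than conclusive.)

So we can say that the hashCode method of the Object class does not return the memory location.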
