Why does the default Object.toString() return a hex representation of the hashCode?

Posted 2024-09-19 21:23:59

I'm curious why Object.toString() returns this:

return getClass().getName() + "@" + Integer.toHexString(hashCode());

as opposed to this:

return getClass().getName() + "@" + hashCode();

What benefit is there in displaying the hash code in hexadecimal rather than in decimal?
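For reference, here is a minimal Java sketch of the two variants being compared. The class and method names (DefaultToStringDemo, describe, describeDecimal) are made up for illustration; only Object.toString(), getClass().getName(), hashCode() and Integer.toHexString() come from the JDK.

```java
public class DefaultToStringDemo {
    // Rebuilds the string that Object.toString() is documented to return.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    // Hypothetical alternative from the question: decimal instead of hex.
    static String describeDecimal(Object o) {
        return o.getClass().getName() + "@" + o.hashCode();
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(o);                  // default toString()
        System.out.println(describe(o));        // same text as the line above
        System.out.println(describeDecimal(o)); // hashCode() is a signed int, so this can be negative
    }
}
```

The actual value printed after the "@" varies from run to run, so no sample output is shown.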

忆梦 2024-09-26 21:23:59

The Short Answer:

Hash codes are usually displayed in hexadecimal because that makes them easier to retain in short-term memory: a hexadecimal number is shorter and has a larger character variety than the same number expressed in decimal.

Also, as supercat states in a comment, hexadecimal representation tends to discourage people from trying to assign meaning to the numbers, because they don't have any. (To use supercat's example, Fnord@194 is absolutely not the 194th Fnord; it is just Fnord with some identifying number next to it.)

The Long Answer:

Decimal is convenient for two things:

Doing arithmetic
Estimating magnitude

However, these operations are inapplicable to hash codes. You are certainly not going to be adding hash codes together in your head, nor would you ever care how big one hash code is compared to another.

What you are likely to be doing with hash codes is the one thing this output is useful for: telling whether two hash codes possibly refer to the same object, or definitely refer to different objects.

In other words, you will be using them as identifiers or mnemonics for objects (not guaranteed to be unique, since a match only means "possibly the same"). Thus, the fact that a hash code is a number is entirely irrelevant; you might as well think of it as a hash string.
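The "possibly the same, or definitely different" distinction can be sketched in Java as follows. HashCompareDemo and compare are hypothetical names, and the sketch assumes the classes involved honor the usual equals/hashCode contract; the strings "Aa" and "BB" are used because they are a well-known pair with the same String hash code.

```java
public class HashCompareDemo {
    // Differing hash codes rule out equality; matching hash codes only make it possible.
    static String compare(Object a, Object b) {
        if (a.hashCode() != b.hashCode()) {
            return "definitely different";
        }
        return a.equals(b) ? "equal" : "same hash code, but different";
    }

    public static void main(String[] args) {
        System.out.println(compare("Aa", "Aa")); // equal
        System.out.println(compare("Aa", "BB")); // same hash code, but different
        System.out.println(compare("Aa", "Ab")); // definitely different
    }
}
```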

Well, it just so happens that our brains find it a lot easier to retain in short-term memory (for the purpose of comparison) short strings consisting of 16 different characters, than longer strings consisting of only 10 different characters.

To further illustrate the analogy by taking it to absurdity, imagine if hash codes were represented in binary, where each number is far longer than in decimal, and has a much smaller character variety. If you saw the hash code 010001011011100010100100101011 now, and again 10 seconds later, would you stand the slightest chance of being able to tell that you are looking at the same hash code? (I can't, even if I am looking at the two numbers simultaneously. I have to compare them digit by digit.)

On the opposite end lies the tetrasexagesimal numbering system, which means base 64. Numbers in this system consist of:

the digits 0-9, plus:
the uppercase letters A-Z, plus:
the lowercase letters a-z, plus:
a couple of symbols like '+' and '/' to reach 64.

Tetrasexagesimal obviously has a much greater character variety than lower-base systems, and it should come as no surprise that numbers expressed in it are admirably terse. (I am not sure why the JVM does not use this system for hash codes; perhaps some prude feared that chance might lead to certain inconvenient four-letter words being formed?)

So, on a hypothetical JVM with 32-bit object hash codes, the hash code of your "Foo" object could look like any of the following:

Binary:           com.acme.Foo@11000001110101010110101100100011
Decimal:          com.acme.Foo@3251989283
Hexadecimal:      com.acme.Foo@C1D56B23
Tetrasexagesimal: com.acme.Foo@31rMiZ

Which one would you prefer?

I would definitely prefer the tetrasexagesimal one and, failing that, I would settle for the hexadecimal one. I suspect most people would agree.
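The four renderings above can be reproduced with a short Java sketch. HashRadixDemo and toBase64Digits are names invented for this illustration, and the digit alphabet (0-9, A-Z, a-z, '+', '/') simply follows the order listed in this answer; it is not the standard Base64 encoding alphabet. The value is treated as an unsigned 32-bit number, as in the example.

```java
public class HashRadixDemo {
    // Digit alphabet in the order the answer lists it: 0-9, A-Z, a-z, then '+' and '/'.
    // Assumption for illustration only; this is not the standard Base64 alphabet.
    static final String DIGITS =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz+/";

    // Renders the 32 bits of the hash as an unsigned number in base 64.
    static String toBase64Digits(int hash) {
        long value = Integer.toUnsignedLong(hash);
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(DIGITS.charAt((int) (value % 64)));
            value /= 64;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        int hash = 0xC1D56B23; // the example value used in the answer
        System.out.println("Binary:           " + Integer.toBinaryString(hash));
        System.out.println("Decimal:          " + Integer.toUnsignedString(hash));
        System.out.println("Hexadecimal:      " + Integer.toHexString(hash).toUpperCase());
        System.out.println("Tetrasexagesimal: " + toBase64Digits(hash));
    }
}
```

Two details differ on a real JVM: hashCode() returns a signed int, so plain string concatenation would print this value as -1042978013 rather than 3251989283, and Integer.toHexString() produces lowercase digits (c1d56b23).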

One web site where you can play with conversions is here:
https://www.mobilefish.com/services/big_number/big_number.php
