What actually happens (the actual process) when we hash a particular string or word

Posted on 2025-01-05 00:49:54

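Hello, I am trying to develop a counting Bloom filter in Java. I have gone through most of the resources I could find on Bloom filters. What I understand is that when we hash a particular string or word, the hash returns a value, and that value tells us where to store the content. My big question is how the hashing itself (the algorithm) is done: what exactly happens when we hash a particular string or word, and how is the final value arrived at? I have also read that collisions can occur. Could you also explain why the resulting hash value is not unique, that is, why different inputs sometimes return the same hash value? Do I really need to write the hashing code myself, or does Java have a built-in function for it?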

一抹微笑 2025-01-12 00:49:55

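You can simply call hashCode() on any object to get its hash code. For String in particular, the Javadoc says:

public int hashCode()
Returns a hash code for this string. The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)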
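To make the formula concrete, here is a minimal sketch that evaluates the same polynomial by hand and compares it with the built-in result. The class name PolynomialHashDemo, the method polynomialHash, and the table size of 16 are made up for illustration. "Using int arithmetic" means the intermediate values silently wrap around on overflow, so the result can be negative; Math.floorMod is one way to turn it into a usable array index.

```java
final class PolynomialHashDemo {

    // Evaluates s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using Horner's rule.
    // int overflow wraps around silently, so long strings may give negative values.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "bloom";
        // Both numbers printed on this line should be identical.
        System.out.println(polynomialHash(word) + " " + word.hashCode());

        // Hypothetical table size; floorMod keeps the index in [0, tableSize).
        int tableSize = 16;
        System.out.println(Math.floorMod(word.hashCode(), tableSize));
    }
}
```

So for plain strings you do not have to write the hashing code yourself; the loop above is only there to show what the built-in call does.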

树深时见影 2025-01-12 00:49:55

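A "hash" is a function

H: I -> O

Usually the set I is larger or more complex than O. In a hash table, I is the class of the elements and O is a set of non-negative integers (the bucket indices). In a Bloom filter in particular, you have n different hash functions. To develop a hash function, you extract different features from similar objects. For a string, for example, you could use:

the length
the first character
the number of occurrences of a specific character
the string evaluated as a polynomial, h(S) = sum(s(i)*31^i) mod d

When you use several hash features, avoid features that are closely related to each other; for example, using both the number of vowels and the number of non-vowels does not help much. A hash function also needs to have certain properties; see the Wikipedia entry on hash functions.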
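Since the question is about a counting Bloom filter, here is a minimal sketch of how several hash functions can be put to work. Everything in it is an assumption made for illustration: the class name CountingBloomFilterSketch, the second hash with multiplier 131, and the trick of deriving the i-th hash as h1 + i*h2 (double hashing) instead of writing n unrelated functions. It is not tuned for any particular false-positive rate.

```java
final class CountingBloomFilterSketch {
    private final int[] counters;  // one counter per slot instead of a single bit
    private final int numHashes;   // how many hash functions to simulate

    CountingBloomFilterSketch(int numCounters, int numHashes) {
        this.counters = new int[numCounters];
        this.numHashes = numHashes;
    }

    // A second polynomial hash with a different multiplier (arbitrary choice).
    // The lowest bit is forced to 1 so the step is never zero.
    private static int secondHash(String item) {
        int h = 7;
        for (int i = 0; i < item.length(); i++) {
            h = 131 * h + item.charAt(i);
        }
        return h | 1;
    }

    // Slot used by the i-th simulated hash function.
    private int slot(String item, int i) {
        return Math.floorMod(item.hashCode() + i * secondHash(item), counters.length);
    }

    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            counters[slot(item, i)]++;
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (counters[slot(item, i)] == 0) {
                return false;
            }
        }
        return true;
    }

    // Should only be called for items that were really added; removing a
    // false positive would corrupt the counters of other items.
    boolean remove(String item) {
        if (!mightContain(item)) {
            return false;
        }
        for (int i = 0; i < numHashes; i++) {
            counters[slot(item, i)]--;
        }
        return true;
    }

    public static void main(String[] args) {
        CountingBloomFilterSketch filter = new CountingBloomFilterSketch(1024, 3);
        filter.add("apple");
        System.out.println(filter.mightContain("apple"));
        filter.remove("apple");
        System.out.println(filter.mightContain("apple"));
    }
}
```

The counters are what make removal possible: a plain Bloom filter stores single bits, so clearing a bit for one item could also erase evidence of another item that hashed to the same slot.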

囍孤女 2025-01-12 00:49:55

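The code executed for String is this one:

public int hashCode() {
    int h = hash;          // cached hash; 0 means "not computed yet"
    int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;          // cache the result for later calls
    }
    return h;
}

A hash is a function, but not a bijection, so different inputs can produce the same result. This is a basic property of hash functions.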
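A small sketch of such a collision, assuming nothing beyond the standard String.hashCode(): the class name CollisionDemo and the helper collide are made up for illustration. The strings "Aa" and "BB" are a well-known colliding pair, because 65*31 + 97 and 66*31 + 66 both equal 2112. More generally, there are only 2^32 possible int values but unlimited possible strings, so some strings have to share a hash code.

```java
final class CollisionDemo {

    // True when two different strings share the same hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 'A' = 65, 'a' = 97: 65*31 + 97 = 2112
        System.out.println("BB".hashCode()); // 'B' = 66:           66*31 + 66 = 2112
        System.out.println(collide("Aa", "BB"));
    }
}
```

This is why hash-based collections still call equals() after the hash codes match, and why a Bloom filter can report false positives.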

半透明的墙 2025-01-12 00:49:55

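Java lets you override the hashCode() method in your own classes so that they use a hashing scheme of your choice:

public class Employee {

    // Unique identifier of the employee.
    private int id;

    // A default implementation might want to use "name" as part of hashCode.
    private String name;

    @Override
    public int hashCode() {
        // We know that id is always unique, so name is not used in calculating
        // the hash code. hashCode() returns an int, so id can be returned directly.
        return id;
    }
}

*(If you override hashCode, you should also override equals.)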
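As a rough sketch of that last point, the class below overrides both methods so that they agree with each other. EmployeeSketch and its constructor are hypothetical and exist only for this example; the assumption, as above, is that id alone identifies an employee.

```java
final class EmployeeSketch {
    private final int id;       // assumed to be unique per employee
    private final String name;  // deliberately ignored by equals and hashCode

    EmployeeSketch(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof EmployeeSketch)) {
            return false;
        }
        // Two employees are equal exactly when their ids match.
        return id == ((EmployeeSketch) other).id;
    }

    @Override
    public int hashCode() {
        // Uses the same field as equals, so equal objects get equal hash codes.
        return Integer.hashCode(id);
    }

    public static void main(String[] args) {
        java.util.Set<EmployeeSketch> staff = new java.util.HashSet<>();
        staff.add(new EmployeeSketch(7, "Ann"));
        // Found because hashCode picks the same bucket and equals confirms the match.
        System.out.println(staff.contains(new EmployeeSketch(7, "Ann")));
    }
}
```

If only hashCode were overridden, the lookup in main would land in the right bucket but then fail the default identity-based equals check, which is why the two overrides belong together.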

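The hash code is computed for each object stored in a hash-based collection.
Unless a class overrides it, the value comes from Object.hashCode(), whose algorithm is implementation-dependent.
You can indeed override the hashCode method; this is done per class rather than per object.
One way to implement a hashCode method is to use HashCodeBuilder from Apache Commons Lang.

Hope this helps. Searching Stack Overflow for related questions will turn up more detailed answers.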
