Confused about a string hash function

Posted 2025-01-21 04:43:52

As I was looking through some string hash functions, I came across this one (code below). The function processes the string four bytes at a time and interprets each four-byte chunk as a single long integer value. The integer values of the chunks are added together. In the end, the resulting sum is reduced to the range 0 to M-1 using the modulus operator.
The following is the function code:

// Use folding on a string, summed 4 bytes at a time
long sfold(String s, int M) {
    int intLength = s.length() / 4;
    long sum = 0;
    for (int j = 0; j < intLength; j++) {
        char c[] = s.substring(j * 4, (j * 4) + 4).toCharArray();
        long mult = 1;
        for (int k = 0; k < c.length; k++) {
            sum += c[k] * mult;
            mult *= 256;
        }
    }

    char c[] = s.substring(intLength * 4).toCharArray();
    long mult = 1;
    for (int k = 0; k < c.length; k++) {
        sum += c[k] * mult;
        mult *= 256;
    }

    return(Math.abs(sum) % M);
}

The part that confuses me is this chunk of code, especially the first line:

    char c[] = s.substring(intLength * 4).toCharArray();
    long mult = 1;
    for (int k = 0; k < c.length; k++) {
        sum += c[k] * mult;
        mult *= 256;
    }

As far as I know, the one-argument substring used on this line takes an inclusive begin index: the substring starts at the specified beginIndex and extends to the end of the string.
For the sake of example, let's assume we want to hash the string aaaabbbb. In this case intLength is 2 (second line of the function code). Substituting that value into s.substring(intLength * 4).toCharArray() gives s.substring(8).toCharArray(), which looks like an out-of-bounds index to me, since the string being hashed has only 8 characters (indices 0 to 7).
I don't quite understand what's going on!

Comments (1)

始于初秋 2025-01-28 04:43:52

This hash function is awful, but to answer your question:

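There is no IndexOutOfBoundsException, because "aaaabbbb".substring(8) is "". A begin index equal to the string's length is allowed and yields the empty string; substring only throws when the begin index is negative or greater than the length.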
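
To see the boundary behaviour directly, here is a minimal sketch. The class name SubstringBoundary and the helper describe are made up for illustration; only String.substring itself comes from the question.

```java
class SubstringBoundary {
    // Hypothetical helper: reports what s.substring(beginIndex) produces,
    // or that it throws, so the three cases can be compared side by side.
    static String describe(String s, int beginIndex) {
        try {
            return "\"" + s.substring(beginIndex) + "\"";
        } catch (StringIndexOutOfBoundsException e) {
            return "throws StringIndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        String s = "aaaabbbb"; // length 8, valid character indices 0..7
        System.out.println("substring(7): " + describe(s, 7)); // last character
        System.out.println("substring(8): " + describe(s, 8)); // empty string
        System.out.println("substring(9): " + describe(s, 9)); // past the length
    }
}
```

Only the last call fails: index 8 is one past the last character, which substring accepts as "start at the very end".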

The purpose of that last loop is to deal with leftovers when the string length isn't a multiple of 4. When s is "aaaabbbbcc", for example, then intLength == 2, and s.substring(8) is "cc".
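
As a rough illustration of how the leftover characters feed into the sum, the sketch below recomputes the value that sfold builds before the final modulus. FoldTrace, chunkValue and foldedSum are hypothetical names introduced here; the arithmetic is meant to mirror the question's code, not replace it.

```java
class FoldTrace {
    // Value of one chunk (0 to 4 chars), first character least significant,
    // mirroring the inner loops of sfold. An empty chunk contributes 0.
    static long chunkValue(String chunk) {
        long value = 0;
        long mult = 1;
        for (char ch : chunk.toCharArray()) {
            value += ch * mult;
            mult *= 256;
        }
        return value;
    }

    // The sum sfold accumulates before applying Math.abs and % M.
    static long foldedSum(String s) {
        int intLength = s.length() / 4;
        long sum = 0;
        for (int j = 0; j < intLength; j++) {
            sum += chunkValue(s.substring(j * 4, j * 4 + 4)); // full 4-char chunks
        }
        sum += chunkValue(s.substring(intLength * 4)); // leftovers, possibly ""
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(foldedSum("aaaabbbb"));   // 3284386755 (tail is "")
        System.out.println(chunkValue("cc"));        // 25443 = 99 + 99 * 256
        System.out.println(foldedSum("aaaabbbbcc")); // 3284412198
    }
}
```

For "aaaabbbb" the tail loop runs zero times and adds nothing; for "aaaabbbbcc" it adds the value of "cc" on top of the two full chunks.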
