如何使用 c++ 将字符串哈希为 int？

发布于 2024-08-26 12:02:53 字数 134 浏览 11 评论 0原文

我必须编写自己的哈希函数。如果我只想创建一个简单的哈希函数，将字符串中的每个字母映射到一个数值（即 a=1，b=2，c=3，...），有没有办法可以执行此哈希一个字符串而不必首先将其转换为 C 字符串来查看每个单独的字符？有没有更有效的散列字符串的方法？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

一袭水袖舞倾城 2024-09-02 12:02:53

从个人经验来看，我知道这很有效并且可以产生良好的发行版。（抄袭自http://www.cse.yorku.ca/~oz/hash.html ):

djb2

这个算法 (k=33) 是由 dan bernstein 多年前在 comp.lang.c 中首次报道的。该算法的另一个版本（现在受到 Bernstein 的青睐）使用异或： hash(i) = hash(i - 1) * 33 ^ str[i];数字 33 的魔力（为什么它比许多其他常量（无论是素数还是非素数）效果更好）从未得到充分解释。

unsigned long hash(unsigned char *str) {
    unsigned long hash = 5381;
    int c;

    while (c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }

    return hash;
}

From personal experience I know that this works and produces good distributions. (Plagiarised from http://www.cse.yorku.ca/~oz/hash.html):

djb2

this algorithm (k=33) was first reported by dan bernstein many years ago in comp.lang.c. another version of this algorithm (now favored by bernstein) uses xor: hash(i) = hash(i - 1) * 33 ^ str[i]; the magic of number 33 (why it works better than many other constants, prime or not) has never been adequately explained.

unsigned long hash(unsigned char *str) {
    unsigned long hash = 5381;
    int c;

    while (c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }

    return hash;
}

回复收藏 0 原文

别靠近我心 2024-09-02 12:02:53

关于第一个问题，当然，例如：

int hash = 0;
int offset = 'a' - 1;
for(string::const_iterator it=s.begin(); it!=s.end(); ++it) {
  hash = hash << 1 | (*it - offset);
}

关于第二个问题，有很多更好的方法来散列字符串。例如，请参阅此处获取一些 C 示例（可以轻松翻译为 C++上面代码片段的行）。

Re the first question, sure, e.g, something like:

int hash = 0;
int offset = 'a' - 1;
for(string::const_iterator it=s.begin(); it!=s.end(); ++it) {
  hash = hash << 1 | (*it - offset);
}

regarding the second, there are many better ways to hash strings. E.g., see here for a few C examples (easily translatable to C++ along the lines of the snippet above).

回复收藏 0 原文

爱她像谁 2024-09-02 12:02:53

您可以使用 [] 运算符检查 std::string 中的每个单独的字符。但是，您可以查看 Boost::Functional/Hash 获取更好的散列方案的指导。此处还提供了 c 中的哈希函数列表。

回复收藏 0 原文

狂之美人 2024-09-02 12:02:53

这是我在 Stroustrup 的书中找到的一个 C (++) 哈希函数：

int hash(const char *str)
{
    int h = 0;
    while (*str)
       h = h << 1 ^ *str++;
    return h;
}

如果您将它用于哈希表（Stroustrup 就是这样做的），那么您可以返回哈希模素数的绝对值。所以改为

    return (h > 0 ? h : -h) % N_BUCKETS;

最后一行。

Here's a C (++) hash function that I found in Stroustrup's book:

int hash(const char *str)
{
    int h = 0;
    while (*str)
       h = h << 1 ^ *str++;
    return h;
}

If you're using it for a hash table (which Stroustrup does) then you can instead return the abs of the hash modulo a prime number. So instead

    return (h > 0 ? h : -h) % N_BUCKETS;

for the last line.

回复收藏 0 原文

机场等船 2024-09-02 12:02:53

C++11 附带了一个标准的字符串哈希函数。

https://en.cppreference.com/w/cpp/string/basic_string/哈希

#include <string>
#include<functional> // hash
int main(){
    std::string s = "Hello";
    std::size_t hash = std::hash<std::string>{}(s);
}

C++11 ships with a standard hashing function for strings.

https://en.cppreference.com/w/cpp/string/basic_string/hash

#include <string>
#include<functional> // hash
int main(){
    std::string s = "Hello";
    std::size_t hash = std::hash<std::string>{}(s);
}

回复收藏 0 原文

蓦然回首 2024-09-02 12:02:53

只是发布对 Arnestig 的 djb2 算法的改进，使其对 constexpr 友好。
我必须删除参数的无符号限定符，以便它可以使用文字字符串。

constexpr unsigned long hash(const char *str) {
    unsigned long hash = 5381;

    while (int c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }

    return hash;
}

Just posting an improvement to Arnestig's djb2 algorithm to be constexpr-friendly.
I had to remove the unsigned qualifier of the argument so it can work with literal strings.

constexpr unsigned long hash(const char *str) {
    unsigned long hash = 5381;

    while (int c = *str++) {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }

    return hash;
}

回复收藏 0 原文

软糖 2024-09-02 12:02:53

将字符异或在一起，一次四个。

回复收藏 0 原文

素年丶 2024-09-02 12:02:53

#include <iostream>
#include <string>
#include <algorithm>

using namespace std;

// a variation on dan bernstein's algorithm
// [http://www.cse.yorku.ca/~oz/hash.html]
template<typename Int>
struct hash {
    hash() : acc(5381) { }
    template<typename Ch>
    void operator()(Ch ch) { acc = ((acc << 5) + acc) ^ ch; }
    operator Int() const { return acc; }
    Int acc;
};

int main(int argc, char* argv[])
{
    string s("Hellp, world");
    cout << hex << showbase
        << for_each(s.begin(), s.end(), hash<unsigned long long>()) << '\n';
    return 0;
}

#include <iostream>
#include <string>
#include <algorithm>

using namespace std;

// a variation on dan bernstein's algorithm
// [http://www.cse.yorku.ca/~oz/hash.html]
template<typename Int>
struct hash {
    hash() : acc(5381) { }
    template<typename Ch>
    void operator()(Ch ch) { acc = ((acc << 5) + acc) ^ ch; }
    operator Int() const { return acc; }
    Int acc;
};

int main(int argc, char* argv[])
{
    string s("Hellp, world");
    cout << hex << showbase
        << for_each(s.begin(), s.end(), hash<unsigned long long>()) << '\n';
    return 0;
}

回复收藏 0 原文

他是夢罘是命 2024-09-02 12:02:53

小字符串的另一种方法：

int hash(const char* str) {
    int hash = 0;
    int c = 0;

    while (c < std::strlen(str)) {
        hash += (int)str[c] << (int)str[c+1];
        c++;
    }
    return hash;
}

Another way for small strings:

int hash(const char* str) {
    int hash = 0;
    int c = 0;

    while (c < std::strlen(str)) {
        hash += (int)str[c] << (int)str[c+1];
        c++;
    }
    return hash;
}

回复收藏 0 原文

顾挽 2024-09-02 12:02:53

您可以使用成员函数 operator[]或字符串类或迭代器的 at 来访问字符串的各个字符对象而不将其转换为 C 风格的 char 数组。

要将字符串对象哈希为整数，您必须访问字符串对象的每个单独的字符，您可以执行以下操作：

for (i=0; i < str.length(); i++) {
    // use str[i] or str.at(i) to access ith element.
}

You can make use of the member functions operator[] or at of the string class or iterators to access individual char of a string object without converting it to c-style char array.

To hash a string object to an integer you'll have to access each individual char of the string object which you can do as:

for (i=0; i < str.length(); i++) {
    // use str[i] or str.at(i) to access ith element.
}

回复收藏 0 原文

~没有更多了~

关于作者

本王不退位尔等都是臣

暂无简介

文章

618 人气

关注发私信

友情链接

文江博客

如何使用 c++ 将字符串哈希为 int？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（10）

关于作者

相关话题

热门标签

推荐作者

╰ゝ天使的微笑

少女净妖师

朱洁

觉浅

滥情空心

hl1314520

友情链接

如何使用 c++ 将字符串哈希为 int？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（10）

关于作者

相关话题

热门标签

推荐作者

╰ゝ天使的微笑

少女净妖师

朱洁

觉浅

滥情空心

hl1314520

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。