如何用C语言编写哈希函数?

发布于 2024-08-21 13:33:54 字数 446 浏览 11 评论 0原文

哈希表被认为是存储/检索数据的最快/最好的方式。

我对哈希表、哈希的理解如下(如果我错了请纠正我,或者如果有更多请补充):

  • 哈希表只不过是一个数组(单维或多维) )来存储值。
  • 散列是在数组中查找索引/位置以插入/检索数据的过程。您获取一个数据项并将其作为键传递给哈希函数,您将获得插入/检索数据的索引/位置。

我有一个问题:

用于存储/检索数据的哈希函数是否与 安全应用程序中用于身份验证的加密哈希函数 像 MD5、HMAC、SHA-1 等...?

它们有何不同?

  • 如何用C语言编写哈希函数?
  • 有一些标准或指南吗?
  • 我们如何确保哈希函数的输出(即索引)不超出范围?

如果您能提及一些好的链接来更好地理解这些内容,那就太好了。

Hash Tables are said to be the fastest/best way of Storing/Retrieving data.

My understanding of a hash table, hashing is as follows (Please correct me if I am wrong or Please add If there is anything more):

  • A Hash Table is nothing but an array (single or multi-dimensional) to store values.
  • Hashing is the process to find the index/location in the array to insert/retrieve the data. You take a data item(s) and pass it as a key(s) to a hash function and you would get the index/location where to insert/retrieve the data.

I have a question:

Is the hash function used to store/retrieve the data DIFFERENT from a
cryptographic hash function used in security applications for authentication
like MD5, HMAC, SHA-1 etc...?

In what way(s) are they different?

  • How to write a hash function in C?
  • Is there some standard or guidelines to it?
  • How do we ensure that the output of a hash function i.e, index is not out of range?

It would be great if you could mention some good links to understand these better.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

岁月如刀 2024-08-28 13:33:54

加密哈希强调使任何人都难以故意制造冲突。对于哈希表,重点通常是快速生成合理分布的结果。因此,两者通常有很大不同(特别是,加密哈希通常要慢很多)。

对于典型的散列函数,结果仅受类型限制——例如,如果它返回 size_t,则它完全可以返回任何可能的 size_t。您可以将输出范围缩小到表格的大小(例如,使用除以表格大小的余数,这通常应该是素数)。

举个例子,一个相当典型的普通哈希函数可能看起来像这样:

// warning: untested code.
size_t hash(char const *input) { 

    const int ret_size = 32;
    size_t ret = 0x555555;
    const int per_char = 7;

    while (*input) { 
        ret ^= *input++;
        ret = ((ret << per_char) | (ret >> (ret_size - per_char));
   }
   return ret;
}

这里的基本思想是让输入字符串的每一位都影响结果,并且(尽快)让结果的每一位至少受到影响输入的一部分。请注意,我并不是特别推荐它作为一个很棒的哈希函数——只是试图说明您要完成的任务的一些基础知识。

A cryptographic hash emphasizes making it difficult for anybody to intentionally create a collision. For a hash table, the emphasis is normally on producing a reasonable spread of results quickly. As such, the two are usually quite different (in particular, a cryptographic hash is normally a lot slower).

For a typical hash function, the result is limited only by the type -- e.g. if it returns a size_t, it's perfectly fine for it to return any possible size_t. It's up to you to reduce that output range to the size of your table (e.g. using the remainder of dividing by the size of your table, which should often be a prime number).

As an example, a fairly typical normal hash function might look something like:

// warning: untested code.
size_t hash(char const *input) { 

    const int ret_size = 32;
    size_t ret = 0x555555;
    const int per_char = 7;

    while (*input) { 
        ret ^= *input++;
        ret = ((ret << per_char) | (ret >> (ret_size - per_char));
   }
   return ret;
}

The basic idea here is to have every bit of the input string affect the result, and to (as quickly as possible) have every bit of the result affected by at least part of the input. Note that I'm not particularly recommending this as a great hash function -- only trying to illustrate some of the basics of what you're trying to accomplish.

绝情姑娘 2024-08-28 13:33:54

Bob Jenkins 深入描述了他的优秀(虽然有点过时)哈希函数 。这篇文章提供了更新、更好的哈希函数的链接,但这篇文章解决了构建一个好的哈希函数的问题。

此外,大多数哈希表实现实际上使用链表数组来解决冲突。如果您只想使用数组,则哈希函数需要检查冲突并创建新的哈希索引。

您提到的加密哈希函数可以用作哈希表的哈希函数,
但它们比为哈希表设计的哈希函数慢得多。速度使暴力攻击变得更容易。

Bob Jenkins wrote an in-depth description of his good, if slightly outdated, hash function. The article has links to newer, better hash functions, but the writeup addresses the concerns of building a good one.

Also, most hash table implementations actually use an array of linked lists to resolve collisions. If you want to just use an array then the hash function needs to check for collisions and create a new hash index.

The cryptographic hash functions you mention could be used as hash functions for a hash table,
but they are much slower than hash functions designed for a hash table. Speed makes brute force attacks easier.

表情可笑 2024-08-28 13:33:54

设计目标不同。

例如,对于加密哈希函数,您希望无法使用哈希和哈希函数确定原始数据或任何其他会产生相同哈希值的数据。

与哈希表和哈希表一起使用的哈希函数其他数据结构不需要这样的安全属性。如果散列函数很快并且它将输入集均匀地分配到可能的散列集中(以避免不必要的聚类/冲突),那么通常就足够了。

The design goals are different.

With cryptographic hash functions you want, for example, that the hash and the hash function cannot be used to determine the original data or any other data that would produce the same hash.

Hash functions used with hash tables & other data structures do not need such security properties. It's often enough if the hash function is fast and it will distribute the input set evenly into the set of possible hashes (to avoid unnecessary clustering / collisions).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文