What is the best way to create a short hash, similar to what TinyURL does?

Posted on 2024-07-26 17:48:17

I'm currently using MD5 hashes but I would like to find something that will create a shorter hash that uses just [a-z][A-Z][0-9]. It only needs to be around 5-10 characters long.

Is there something out there that already does this?

Update 1:

I like the CRC32 hash. Is there a clean way of calculating it in .NET?

Update 2:

I'm using the CRC32 function from the link Joe provided. How can I convert the uInt into the characters defined above?

Comments (14)

北城挽邺 2024-08-02 17:48:17

.NET string objects have a GetHashCode() method that returns a 32-bit integer.
Format it as hexadecimal, padded to 8 digits, to get an 8-character string.

Like so:

string hashCode = String.Format("{0:X8}", sourceString.GetHashCode());

More on that: http://msdn.microsoft.com/en-us/library/system.string.gethashcode.aspx

UPDATE: Added the remarks from the link above to this answer:

The behavior of GetHashCode is dependent on its implementation, which
might change from one version of the common language runtime to
another. A reason why this might happen is to improve the performance
of GetHashCode.

If two string objects are equal, the GetHashCode method returns
identical values. However, there is not a unique hash code value for
each unique string value. Different strings can return the same hash
code.

Notes to Callers

The value returned by GetHashCode is platform-dependent. It differs on
the 32-bit and 64-bit versions of the .NET Framework.

何其悲哀 2024-08-02 17:48:17

Is your goal to create a URL shortener or to create a hash function?

If your goal is to create a URL shortener, then you don't need a hash function. In that case, you just want to pre-generate a sequence of cryptographically secure random numbers, and then assign each URL to be encoded a unique number from the sequence.

You can do this using code like:

using System.Security.Cryptography;

const int numberOfNumbersNeeded = 100;
const int numberOfBytesNeeded = 8;
var randomGen = RandomNumberGenerator.Create();
for (int i = 0; i < numberOfNumbersNeeded; ++i)
{
     var bytes = new Byte[numberOfBytesNeeded];
     randomGen.GetBytes(bytes);
}

Using the cryptographic number generator will make it very difficult for people to predict the strings you generate, which I assume is important to you.

You can then convert the 8 byte random number into a string using the chars in your alphabet. This is basically a change of base calculation (from base 256 to base 62).
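
The change of base described above could look roughly like the following. This is only a sketch: the class name RandomToken, the method names, and the order of the characters in the alphabet are assumptions made for illustration, not part of the answer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class RandomToken
{
    // Assumed alphabet; any fixed ordering of the 62 characters works.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Reads the 8 bytes as one unsigned 64-bit number (base 256 digits)
    // and rewrites that number in base 62.
    public static string ToBase62(byte[] bytes)
    {
        if (bytes == null || bytes.Length != 8)
            throw new ArgumentException("Expected exactly 8 bytes.", nameof(bytes));

        ulong value = 0;
        foreach (byte b in bytes)
            value = (value << 8) | b;

        if (value == 0)
            return Alphabet[0].ToString();

        var sb = new StringBuilder();
        while (value > 0)
        {
            // The remainder is the next least significant base-62 digit.
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }

    // Draws 8 cryptographically secure random bytes and encodes them.
    public static string NewToken()
    {
        var bytes = new byte[8];
        using (var rng = RandomNumberGenerator.Create())
            rng.GetBytes(bytes);
        return ToBase62(bytes);
    }
}
```

An 8-byte value needs at most 11 base-62 characters, so using fewer random bytes gives shorter strings at the cost of a smaller key space.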

愁杀 2024-08-02 17:48:17

I don't think URL shortening services use hashes. I think they just keep a running alphanumeric string that is incremented with every new URL and stored in a database.
If you really need to use a hash function, have a look at this link: some hash functions
Also, a bit off-topic, but depending on what you are working on this might be interesting: Coding Horror article

哆啦不做梦 2024-08-02 17:48:17

Just take a Base36 (case-insensitive) or Base64 of the ID of the entry.

So, let's say I wanted to use Base36:

(ID - Base36)
1 - 1
2 - 2
3 - 3
10 - A
11 - B
12 - C
...
10000 - 7PS
22000 - GZ4
34000 - Q8G
...
1000000 - LFLS
2345000 - 1E9EW
6000000 - 3KLMO

You could keep these even shorter if you went with base64, but then the URLs would be case-sensitive. You can see you still get your nice, neat alphanumeric key, with a guarantee that there will be no collisions!
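
A minimal sketch of the ID-to-Base36 conversion, assuming the ID is a non-negative integer such as a database identity column. The class and method names are made up for this example.

```csharp
using System;
using System.Text;

public static class Base36
{
    private const string Digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Converts a non-negative ID to its Base36 text form.
    public static string Encode(long id)
    {
        if (id < 0) throw new ArgumentOutOfRangeException(nameof(id));
        if (id == 0) return "0";

        var sb = new StringBuilder();
        while (id > 0)
        {
            sb.Insert(0, Digits[(int)(id % 36)]);
            id /= 36;
        }
        return sb.ToString();
    }

    // Converts a Base36 key back to the ID; lower-case input is accepted.
    public static long Decode(string key)
    {
        long id = 0;
        foreach (char c in key.ToUpperInvariant())
        {
            int digit = Digits.IndexOf(c);
            if (digit < 0) throw new FormatException("Not a Base36 character: " + c);
            id = checked(id * 36 + digit);
        }
        return id;
    }
}
```

With this sketch, Base36.Encode(10000) yields "7PS" and Base36.Decode("7ps") yields 10000, which is why no lookup by hash is needed: the key is the ID itself.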

若水般的淡然安静女子 2024-08-02 17:48:17

You cannot use a short hash, as you need a one-to-one mapping from the short version to the actual value. For a short hash, the chance of a collision would be far too high. Normal, long hashes would not be very user-friendly (and even though the chance of a collision would probably be small enough then, it still wouldn't feel "right" to me).

TinyURL.com seems to use an incremented number that is converted to Base 36 (0-9, A-Z).

羁拥 2024-08-02 17:48:17

First I get a list of random numbers (repeats are possible). Then I use each one to pick a char from the base string, append it, and return the result. I'm selecting 5 chars, which gives 62^5 = 916,132,832 possible strings over the base-62 alphabet. The second part is to check against the db to see whether the short URL already exists and, if not, save it.

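 // Requires: using System; using System.Text;
 const string BaseUrlChars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

 // One shared instance: a new Random() per call can repeat values when calls are close together.
 // Random is not thread-safe, so guard it if this is called concurrently.
 private static readonly Random Rnd = new Random();

 private static string ShortUrl
 {
     get
     {
         const int numberOfCharsToSelect = 5;
         var sb = new StringBuilder(numberOfCharsToSelect);

         // Pick each character independently; the same character may appear twice.
         for (int i = 0; i < numberOfCharsToSelect; i++)
             sb.Append(BaseUrlChars[Rnd.Next(BaseUrlChars.Length)]);

         return sb.ToString();
     }
 }
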
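
The "check against the db" step is not shown in the answer. A rough sketch of it follows, with an in-memory HashSet standing in for the database table; ShortUrlAllocator, UsedKeys and the retry limit are all hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ShortUrlAllocator
{
    // Hypothetical stand-in for the database table of keys already in use.
    private static readonly HashSet<string> UsedKeys = new HashSet<string>();

    // Calls the supplied generator until it produces a key that is not taken yet.
    public static string Allocate(Func<string> generate, int maxAttempts = 10)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string candidate = generate();

            // HashSet.Add returns false when the key is already present.
            if (UsedKeys.Add(candidate))
                return candidate;
        }
        throw new InvalidOperationException("No free key found; consider longer keys.");
    }
}
```

With a real database, a unique constraint on the key column plus a retry on violation would play the role of UsedKeys.Add, which also avoids a race between the check and the save.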
乱了心跳 2024-08-02 17:48:17

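You can decrease the number of characters in the MD5 hash by re-encoding it as alphanumerics. An MD5 hash is usually written in hex, so each character has 16 possible values (4 bits). [a-zA-Z0-9] offers 62 possible values per character (almost 6 bits), so three hex digits (12 bits) fit into two alphanumeric characters, and the 32-character hex string shrinks to about 22 characters.

EDIT:

Here's a function that takes a 6-bit number (0-63, i.e. one and a half hex digits) and returns a character in [0-9a-zA-Z]. Because only 62 characters are available, 62 and 63 reuse '0' and '1', so the mapping is not reversible. This should give you an idea of how to implement it. Note that there may be some issues with the types; I didn't test this code.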

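char num2char( unsigned int x ){
    x &= 63; /* keep only the low 6 bits so every path returns a value */
    if( x < 26 ) return (char)('a' + (int)x);
    if( x < 52 ) return (char)('A' + (int)x - 26);
    if( x < 62 ) return (char)('0' + (int)x - 52);
    /* 62 and 63 have no character of their own; reuse '0' and '1' */
    return ( x == 62 ) ? '0' : '1';
}
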
月朦胧 2024-08-02 17:48:17

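You can use CRC32. Its output is 4 bytes (8 hex characters) long and, like MD5, has a fixed length, although it is a checksum rather than a cryptographic hash. Appending a timestamp to the actual value makes repeated inputs produce different results, though it does not guarantee uniqueness.

So it will look like http://foo.bar/abcdefg12.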

惯饮孤独 2024-08-02 17:48:17

If you're looking for a library that generates tiny unique hashes from integers, I can highly recommend http://hashids.org/net/. I use it in many projects and it works fantastically. You can also specify your own character set for custom hashes.

优雅的叶子 2024-08-02 17:48:17

If you don't care about cryptographic strength, any of the CRC functions will do.

Wikipedia lists a bunch of different hash functions, including length of output. Converting their output to [a-z][A-Z][0-9] is trivial.
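
For a 32-bit value such as a CRC32 result, the conversion could be sketched as below. The class name Crc32Text and the alphabet ordering are assumptions. Six characters are used because 62^5 is smaller than 2^32 while 62^6 is larger.

```csharp
using System;

public static class Crc32Text
{
    // Assumed alphabet; any fixed ordering of the 62 characters works.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Writes a 32-bit value as exactly 6 base-62 characters,
    // left-padded with the zero digit so all keys have the same length.
    public static string ToBase62(uint value)
    {
        var chars = new char[6];
        for (int i = chars.Length - 1; i >= 0; i--)
        {
            chars[i] = Alphabet[(int)(value % 62)];
            value /= 62;
        }
        return new string(chars);
    }
}
```

The fixed width is a design choice; dropping the padding gives shorter strings for small values but variable-length keys.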

世界如花海般美丽 2024-08-02 17:48:17

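You could encode your MD5 hash with base64 instead of hexadecimal. This way you get a shorter URL (22 characters instead of 32) made up mostly of the characters [a-z][A-Z][0-9]. Note that standard base64 also emits '+', '/' and '=' padding, which have to be replaced or stripped before the result can be used in a URL.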
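
One way to do this in .NET, sketched under the assumption that the input is hashed as UTF-8 text and that '-' and '_' are acceptable substitutes for the two non-alphanumeric base64 characters (the common URL-safe variant). Md5Key and UrlSafeMd5 are names invented for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class Md5Key
{
    // Hashes the input with MD5 and returns the 16 bytes as URL-safe base64.
    public static string UrlSafeMd5(string input)
    {
        byte[] hash;
        using (var md5 = MD5.Create())
            hash = md5.ComputeHash(Encoding.UTF8.GetBytes(input));

        // Standard base64 may contain '+', '/' and trailing '=' padding;
        // strip the padding and swap the other two for URL-safe characters.
        return Convert.ToBase64String(hash)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }
}
```

The result is always 22 characters long, but it can still contain '-' and '_', so it does not strictly stay within [a-z][A-Z][0-9].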

遗忘曾经 2024-08-02 17:48:17

There's a wonderful but ancient program called btoa which converts binary to ASCII using upper- and lower-case letters, digits, and two additional characters. There's also the MIME base64 encoding; most Linux systems probably have a program called base64 or base64encode. Either one would give you a short, readable string from a 32-bit CRC.

超可爱的懒熊 2024-08-02 17:48:17

You could take the first 5-10 alphanumeric characters of the MD5 hash.

末骤雨初歇 2024-08-02 17:48:17

If you need the hash to change on every call, you can do something like:

string hash = String.Format("{0:X}", DateTime.Now.GetHashCode());