Representing an MD5 hash as an integer

Posted on 2024-08-04 06:52:56

In my user database table, I take the MD5 hash of the email address of a user as the id.

Example: email([email protected]) = id(d41d8cd98f00b204e9800998ecf8427e)

Unfortunately, I have to represent the ids as integer values now - in order to be able to use an API where the id can only be an integer.

Now I'm looking for a way to encode the id into an integer for sending and to decode it again when receiving. How could I do this?

My ideas so far:

  1. convert_uuencode() and convert_uudecode() for the MD5 hash
  2. replace every character of the MD5 hash by its ord() value

Which approach is better? Do you know even better ways to do this?

I hope you can help me. Thank you very much in advance!

Comments (9)

若水微香 2024-08-11 06:52:56

Be careful. Converting the MD5s to an integer will require support for big (128-bit) integers. Chances are the API you're using will only support 32-bit integers - or worse, might be dealing with the number in floating-point. Either way, your ID will get munged. If this is the case, just assigning a second ID arbitrarily is a much better way to deal with things than trying to convert the MD5 into an integer.

However, if you are sure that the API can deal with arbitrarily large integers without trouble, you can just convert the MD5 from hexadecimal to an integer. PHP's built-in conversions most likely cannot do this, however, because they will try to represent the result as either a 32-bit integer or a floating-point number; you'll probably need to use the PHP GMP library for it.
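
As a rough illustration of the GMP route mentioned above, here is a minimal sketch. It assumes the GMP extension is installed; the function names `md5_to_decimal` and `decimal_to_md5` and the sample address are made up for this example. The result is kept as a decimal string throughout, since a native PHP integer cannot hold 128 bits.

```php
<?php
// Sketch only: lossless hex <-> decimal conversion of an MD5 via ext-gmp.
// Function names and the sample email are hypothetical.

function md5_to_decimal(string $md5Hex): string
{
    // gmp_init() parses the string in the given base (16 = hexadecimal).
    return gmp_strval(gmp_init($md5Hex, 16), 10);
}

function decimal_to_md5(string $decimal): string
{
    $hex = gmp_strval(gmp_init($decimal, 10), 16);
    // The numeric conversion drops leading zeros, so pad back to 32 digits.
    return str_pad($hex, 32, '0', STR_PAD_LEFT);
}

$md5 = md5('user@example.com');   // placeholder address
$id  = md5_to_decimal($md5);      // decimal string of up to 39 digits
var_dump(decimal_to_md5($id) === $md5);
```

Whether the receiving API accepts a 39-digit number is a separate question that this sketch does not answer.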

永言不败 2024-08-11 06:52:56

There are good reasons, stated by others, for doing it a different way.

But if what you want to do is convert an md5 hash into a string of decimal digits (which is what I think you really mean by "represent by an integer", since an md5 is already an integer in string form), and transform it back into the same md5 string:

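// Convert a 32-character hex MD5 into a 40-digit decimal string.
function md5_hex_to_dec($hex_str)
{
    $dec = [];
    // Each group of 4 hex digits (0..65535) becomes exactly 5 decimal digits.
    foreach (str_split($hex_str, 4) as $grp) {
        $dec[] = str_pad(hexdec($grp), 5, '0', STR_PAD_LEFT);
    }
    return implode('', $dec);
}

// Convert the 40-digit decimal string back into the original hex MD5.
function md5_dec_to_hex($dec_str)
{
    $hex = [];
    // Each group of 5 decimal digits maps back to 4 zero-padded hex digits.
    foreach (str_split($dec_str, 5) as $grp) {
        $hex[] = str_pad(dechex((int) $grp), 4, '0', STR_PAD_LEFT);
    }
    return implode('', $hex);
}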

Demo:

$md5 = md5('[email protected]');
echo $md5 . '<br />';  // 23463b99b62a72f26ed677cc556c44e8
$dec = md5_hex_to_dec($md5);
echo $dec . '<br />';  // 0903015257466342942628374306682186817640
$hex = md5_dec_to_hex($dec);
echo $hex;             // 23463b99b62a72f26ed677cc556c44e8

Of course, you'd have to be careful with both strings: use them only as string type to avoid losing leading zeros, check that they have the correct lengths (32 hex characters, 40 decimal digits), etc.

左耳近心 2024-08-11 06:52:56

A simple solution is to use hexdec() to convert the hash piece by piece.

Systems that can accommodate 64-bit Ints can split the 128-bit/16-byte md5() hash into four 4-byte sections and then convert each into representations of unsigned 32-bit Ints. Each hex pair represents 1 byte, so use 8 character chunks:

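$hash = md5($value);

$int_hashes = [];
foreach (str_split($hash, 8) as $chunk) {
    $int_hashes[] = hexdec($chunk);
}

On the other end, use dechex() to convert the values back, padding each chunk to 8 hex digits so that leading zeros are not lost:

$original_hash = '';
foreach ($int_hashes as $ihash) {
    $original_hash .= str_pad(dechex($ihash), 8, '0', STR_PAD_LEFT);
}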

Caveat: Due to underlying deficiencies with how PHP handles integers and how it implements hexdec() and intval(), this strategy will not work with 32-bit systems.

Edit Takeaways:

  • Ints in PHP are always signed; there are no unsigned Ints.

  • Although intval() may be useful for certain cases, hexdec() is more performant and simpler to use for base-16.

  • hexdec() converts values above 7fffffffffffffff into Floats, making its use moot for splitting the hash into two 64-bit/8-byte chunks.

  • Similarly for intval($chunk, 16), it returns the same Int value for 7fffffffffffffff and above.
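
The points above can be checked with a short script. This is only a sketch and assumes a 64-bit PHP build; the helper names `md5_to_ints` and `ints_to_md5` are invented for the example.

```php
<?php
// Sketch assuming 64-bit PHP (PHP_INT_SIZE === 8). Helper names are hypothetical.

// Split an MD5 into four integers, each within the unsigned 32-bit range.
function md5_to_ints(string $md5Hex): array
{
    return array_map('hexdec', str_split($md5Hex, 8));
}

// Rebuild the hash; '%08x' zero-pads every chunk back to 8 hex digits.
function ints_to_md5(array $ints): string
{
    $hex = '';
    foreach ($ints as $int) {
        $hex .= sprintf('%08x', $int);
    }
    return $hex;
}

$md5  = md5('');   // d41d8cd98f00b204e9800998ecf8427e
$ints = md5_to_ints($md5);
var_dump(ints_to_md5($ints) === $md5);                     // bool(true)

// Why 64-bit chunks do not work:
var_dump(is_int(hexdec('7fffffffffffffff')));              // bool(true)
var_dump(is_float(hexdec('8000000000000000')));            // bool(true)
var_dump(intval('ffffffffffffffff', 16) === PHP_INT_MAX);  // bool(true)
```

The last three lines show the limit described in the takeaways: past 7fffffffffffffff, hexdec() switches to a float and intval() saturates, so neither can carry a full 64-bit half of the hash.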

凌乱心跳 2024-08-11 06:52:56

Why ord()? md5 produces an ordinary 16-byte value, which is presented to you in hex for better readability. So you can't convert the 16-byte value to a 4- or 8-byte integer without loss. You must change some part of your algorithm to use this as an id.

只为一人 2024-08-11 06:52:56

You could use hexdec to parse the hexadecimal string and store the number in the database.

攒眉千度 2024-08-11 06:52:56

Couldn't you just add another field that was an auto-increment int field?

凤舞天涯 2024-08-11 06:52:56

What about:

$float = hexdec(md5('string'));

or

$int = (integer) (substr(hexdec(md5('string')),0,9)*100000000);

There is definitely a bigger chance of collisions, but isn't it still good enough to use instead of the hash in the DB?
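
A somewhat cleaner variant of the same lossy idea, offered only as a sketch: keep the first 15 hex digits (60 bits) of the hash, which always fit in a signed 64-bit integer. It assumes 64-bit PHP, and `md5_to_lossy_int` is a made-up name. Like the snippets above, it is one-way: the MD5 cannot be recovered from the result.

```php
<?php
// Sketch assuming 64-bit PHP. The function name is hypothetical.
// Only the first 15 hex digits (60 bits) of the hash are kept, so hexdec()
// returns an exact int rather than a float. The mapping is not reversible.
function md5_to_lossy_int(string $value): int
{
    return hexdec(substr(md5($value), 0, 15));
}

$id = md5_to_lossy_int('string');
var_dump(is_int($id));   // bool(true)
```

With 60 bits left, collisions are far less likely than with a 9-digit prefix, but they remain possible.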

无法言说的痛 2024-08-11 06:52:56

Add these two columns to your table.

`email_md5_l` bigint(20) UNSIGNED GENERATED ALWAYS AS (conv(left(md5(`email`),16),16,10)) STORED,
`email_md5_r` bigint(20) UNSIGNED GENERATED ALWAYS AS (conv(right(md5(`email`),16),16,10)) STORED,

Creating a primary key on these two columns might or might not help, though, as it probably concatenates the two string representations and hashes the result. That would somewhat defeat your purpose, and a full scan might be quicker, but that depends on the number of columns and records. Don't try to read these bigints in PHP, as it has no unsigned integers; just stay in SQL and do something like:

select email 
into result 
from `address`
where email_md5_l = conv(left(md5(the_email), 16), 16, 10)
  and email_md5_r = conv(right(md5(the_email), 16), 16, 10) 
limit 1;

MD5 does collide, by the way.

楠木可依 2024-08-11 06:52:56

Use the email address as the file name of a blank, temporary file in a shared folder, like /var/myprocess/[email protected]

Then, call ftok on the file name. ftok will return an integer ID.

It is not guaranteed to be unique, but it will probably suffice for your API.
