How to generate a random SHA1 hash to use as an ID in Node.js?

Posted 2025-01-08 02:29:20 | Length 173 | Views 4 | Comments 0

I am using this line to generate a sha1 id for node.js:

crypto.createHash('sha1').digest('hex');

The problem is that it's returning the same id every time.

Is it possible to have it generate a random id each time so I can use it as a database document id?

Comments (6)

甜警司 2025-01-15 02:29:21

243,583,606,221,817,150,598,111,409x more entropy

I'd recommend using crypto.randomBytes. It's not sha1, but for id purposes, it's quicker, and just as "random".

var crypto = require('crypto');
var id = crypto.randomBytes(20).toString('hex');
//=> f26d60305dae929ef8640a75e70dd78ab809cfe9

The resulting string will be twice as long as the random bytes you generate; each byte encoded to hex is 2 characters. 20 bytes will be 40 characters of hex.

Using 20 bytes, we have 256^20 or 1,461,501,637,330,902,918,203,684,832,716,283,019,655,932,542,976 unique output values. This is identical to SHA1's 160-bit (20-byte) possible outputs.

Knowing this, it's not really meaningful for us to shasum our random bytes. It's like rolling a die twice but only accepting the second roll; no matter what, you have 6 possible outcomes each roll, so the first roll is sufficient.
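
As a minimal sketch of the byte-to-hex length relationship described above (the helper name `randomHexId` is an illustrative assumption, not something the crypto module provides):

```javascript
const crypto = require('crypto');

// Hypothetical helper: returns a hex ID built from `byteLength` random bytes.
// Each byte becomes 2 hex characters, so the string is twice as long.
function randomHexId(byteLength = 20) {
  return crypto.randomBytes(byteLength).toString('hex');
}

console.log(randomHexId().length);  // 40 (20 bytes)
console.log(randomHexId(8).length); // 16 (8 bytes)
```

Fewer bytes give shorter IDs at the cost of a smaller output space, so the default here stays at 20 bytes to match SHA1's 160 bits.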


Why is this better?

To understand why this is better, we first have to understand how hashing functions work. Hashing functions (including SHA1) will always generate the same output if the same input is given.

Say we want to generate IDs but our random input is generated by a coin toss. We have "heads" or "tails":

% echo -n "heads" | shasum
c25dda249cdece9d908cc33adcd16aa05e20290f  -

% echo -n "tails" | shasum
71ac9eed6a76a285ae035fe84a251d56ae9485a4  -

If "heads" comes up again, the SHA1 output will be the same as it was the first time:

% echo -n "heads" | shasum
c25dda249cdece9d908cc33adcd16aa05e20290f  -
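
The same determinism can be seen from Node.js. The sketch below (with `sha1Hex` as a hypothetical helper name) also suggests why the line in the question returns the same ID every time: without a call to update(), the hash input is empty, so the digest is always the SHA1 of empty input.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA1 of a string, hex-encoded.
function sha1Hex(input) {
  return crypto.createHash('sha1').update(input).digest('hex');
}

// Same input, same output, every time.
console.log(sha1Hex('heads') === sha1Hex('heads')); // true

// No update() call means nothing was hashed: this is SHA1 of empty input.
const emptyDigest = crypto.createHash('sha1').digest('hex');
console.log(emptyDigest === sha1Hex('')); // true
console.log(emptyDigest); // da39a3ee5e6b4b0d3255bfef95601890afd80709
```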

Ok, so a coin toss is not a great random ID generator because we only have 2 possible outputs.

If we use a standard 6-sided die, we have 6 possible inputs. Guess how many possible SHA1 outputs? 6!

input => (sha1) => output
1 => 356a192b7913b04c54574d18c28d46e6395428ab
2 => da4b9237bacccdf19c0760cab7aec4a8359010b0
3 => 77de68daecd823babbb58edb1c8e14d7106e83bb
4 => 1b6453892473a467d07372d45eb05abc2031647a
5 => ac3478d69a3c81fa62e60f5c3696165a4e5e6ac4
6 => c1dfd96eea8cc2b62785275bca38ac261256e278
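
A rough way to check this yourself, assuming a simulated die built on Math.random (the helper `sha1Hex` is again a hypothetical name): however many rolls are hashed, the set of distinct IDs never grows past 6.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA1 of a string, hex-encoded.
function sha1Hex(input) {
  return crypto.createHash('sha1').update(input).digest('hex');
}

// Hash 10,000 simulated die rolls and collect the distinct results.
const ids = new Set();
for (let i = 0; i < 10000; i++) {
  const roll = 1 + Math.floor(Math.random() * 6); // integer in 1..6
  ids.add(sha1Hex(String(roll)));
}

console.log(ids.size); // never more than 6
```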

It's easy to delude ourselves by thinking just because the output of our function looks very random, that it is very random.

We both agree that a coin toss or a 6-sided die would make a bad random id generator, because our possible SHA1 results (the value we use for the ID) are very few. But what if we use something that has a lot more outputs? Like a timestamp with milliseconds? Or JavaScript's Math.random? Or even a combination of those two?!

Let's compute just how many unique ids we would get ...


The uniqueness of a timestamp with milliseconds

When using (new Date()).valueOf().toString(), you're getting a 13-character number (e.g., 1375369309741). However, since this is a sequentially updating number (once per millisecond), calls made in quick succession often return the same value. Let's take a look:

for (var i=0; i<10; i++) {
  console.log((new Date()).valueOf().toString());
}
console.log("OMG so not random");

// 1375369431838
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431839
// 1375369431840
// 1375369431840
// OMG so not random

To be fair, for comparison purposes, in a given minute (a generous operation execution time), you will have 60*1000 or 60000 uniques.


The uniqueness of Math.random

Now, when using Math.random, because of the way JavaScript represents 64-bit floating point numbers, you'll get a number whose string form is anywhere between 13 and 24 characters long. A longer result means more digits, which means more entropy. First, we need to find out which length is the most probable.

The script below will determine which length is most probable. We do this by generating 1 million random numbers and incrementing a counter based on the .length of each number.

// get distribution
var counts = [], rand, len;
for (var i=0; i<1000000; i++) {
  rand = Math.random();
  len  = String(rand).length;
  if (counts[len] === undefined) counts[len] = 0;
  counts[len] += 1;
}

// calculate % frequency
var freq = counts.map(function(n) { return n/1000000 *100 });

By dividing each counter by 1 million, we get the probability of the length of number returned from Math.random.

len   frequency(%)
------------------
13    0.0004  
14    0.0066  
15    0.0654  
16    0.6768  
17    6.6703  
18    61.133  <- highest probability
19    28.089  <- second highest probability
20    3.0287  
21    0.2989  
22    0.0262
23    0.0040
24    0.0004

So, even though it's not entirely true, let's be generous and say you get a 19-character-long random output, e.g. 0.1234567890123456789. The first two characters will always be 0 and ., so really we're only getting 17 random characters. This leaves us with 10^17 + 1 (for a possible 0; see notes below) or 100,000,000,000,000,001 uniques.


So how many random inputs can we generate?

Ok, we calculated the number of results for a millisecond timestamp and Math.random

      100,000,000,000,000,001 (Math.random)
*                      60,000 (timestamp)
-----------------------------
6,000,000,000,000,000,060,000

That's a single 6,000,000,000,000,000,060,000-sided die. Or, to make this number more humanly digestible, this is roughly the same number as

input                                            outputs
------------------------------------------------------------------------------
( 1×) 6,000,000,000,000,000,060,000-sided die    6,000,000,000,000,000,060,000
(28×) 6-sided die                                6,140,942,214,464,815,497,216
(72×) 2-sided coins                              4,722,366,482,869,645,213,696
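
These figures are too large for ordinary JavaScript Numbers to hold exactly, but they can be checked with BigInt. A minimal sketch (the variable names are illustrative):

```javascript
// BigInt keeps every digit exact; regular Numbers would round at this size.
const mathRandomValues = 10n ** 17n + 1n; // 17 random digits, plus a possible 0
const timestampValues = 60n * 1000n;      // milliseconds in one minute
const combined = mathRandomValues * timestampValues;

console.log(combined);  // 6000000000000000060000n
console.log(6n ** 28n); // 6140942214464815497216n
console.log(2n ** 72n); // 4722366482869645213696n
```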

Sounds pretty good, right? Well, let's find out ...

SHA1 produces a 20-byte value, with a possible 256^20 outcomes. So we're really not using SHA1 to its full potential. Well, how much are we using?

node> 6000000000000000060000 / Math.pow(256,20) * 100

A millisecond timestamp and Math.random use only about 4.11e-25 percent of SHA1's 160-bit potential!

generator               sha1 potential used
-----------------------------------------------------------------------------
crypto.randomBytes(20)  100%
Date() + Math.random()    0.000000000000000000000000411%
6-sided die               0.000000000000000000000000000000000000000000000411%
A coin                    0.000000000000000000000000000000000000000000000137%

Holy cats, man! Look at all those zeroes. So how much better is crypto.randomBytes(20)? 243,583,606,221,817,150,598,111,409 times better.
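
The "times better" figure can be reproduced with BigInt integer division. This is only a sketch of the arithmetic, with illustrative variable names; the percentage uses ordinary floating point and is therefore approximate.

```javascript
const sha1Outputs = 2n ** 160n;             // same as 256^20
const weakInputs = 6000000000000000060000n; // timestamp * Math.random estimate

// Integer (floor) division: how many times larger the SHA1 output space is.
const timesBetter = sha1Outputs / weakInputs;
console.log(timesBetter); // 243583606221817150598111409n

// Share of the output space that is reachable, as a percentage (approximate).
const percentUsed = Number(weakInputs) / Number(sha1Outputs) * 100;
console.log(percentUsed.toExponential(2));
```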


Notes about the +1 and frequency of zeroes

If you're wondering about the +1, it's possible for Math.random to return a 0 which means there's 1 more possible unique result we have to account for.

Based on the discussion that happened below, I was curious about the frequency a 0 would come up. Here's a little script, random_zero.js, I made to get some data

#!/usr/bin/env node
var count = 0;
while (Math.random() !== 0) count++;
console.log(count);

Then, I ran it in 4 parallel processes (I have a 4-core processor), appending the output to a file:

$ yes | xargs -n 1 -P 4 node random_zero.js >> zeroes.txt

So it turns out that a 0 is not that hard to get. After 100 values were recorded, the average was

1 in 3,164,854,823 randoms is a 0

Cool! More research would be required to know if that number is on par with a uniform distribution of v8's Math.random implementation.

虚拟世界 2025-01-15 02:29:21

Have a look here: How do I use node.js Crypto to create a HMAC-SHA1 hash?
I'd create a hash of the current timestamp + a random number to ensure hash uniqueness:

var current_date = (new Date()).valueOf().toString();
var random = Math.random().toString();
crypto.createHash('sha1').update(current_date + random).digest('hex');

淡墨 2025-01-15 02:29:21

Do it in the browser, too!

EDIT: this didn't really fit into the flow of my previous answer. I'm leaving it here as a second answer for people who might be looking to do this in the browser.

You can do this client side in modern browsers, if you'd like:

// str byteToHex(uint8 byte)
//   converts a single byte to a hex string 
function byteToHex(byte) {
  return ('0' + byte.toString(16)).slice(-2);
}

// str generateId(int len);
//   len - must be an even number (default: 40)
function generateId(len = 40) {
  var arr = new Uint8Array(len / 2);
  window.crypto.getRandomValues(arr);
  return Array.from(arr, byteToHex).join("");
}

console.log(generateId())
// "1e6ef8d5c851a3b5c5ad78f96dd086e4a77da800"

console.log(generateId(20))
// "d2180620d8f781178840"

Browser requirements

Browser    Minimum Version
--------------------------
Chrome     11.0
Firefox    21.0
IE         11.0
Opera      15.0
Safari     5.1

总以为 2025-01-15 02:29:21

If you want unique identifiers, you should use a UUID (Universally Unique Identifier) / GUID (Globally Unique Identifier).

A hash is supposed to be deterministic, unique, and of fixed length for input of any size. So no matter how many times you run the hash function, the output will be the same if you use the same input.

UUIDs are unique and randomly generated!
There is a package called 'uuid'; you can install it via npm with

npm install uuid

and import the module in your code with

const { v4:uuidv4} = require('uuid');

// Call uuidv4 (or whatever name you gave it on import) and log, store, or assign the result. It returns the UUID as a string.

console.log(uuidv4());
// Example Output : '59594fc8-6a35-4f50-a966-4d735d8402ea'

Here is the npm link (if you need it) :
https://www.npmjs.com/package/uuid
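
As an alternative to the package, and not part of the answer above: Node.js's built-in crypto module has a randomUUID() function that generates a random version 4 UUID. A minimal sketch, assuming Node.js v14.17 or later:

```javascript
const crypto = require('crypto');

// Built into Node.js (v14.17+ assumed); no extra package needed.
// Returns a random version 4 UUID as a string.
const id = crypto.randomUUID();
console.log(id); // a different 36-character UUID on every call
```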

小梨窩很甜 2025-01-15 02:29:21

Using crypto is a good approach because it is a native, stable module,
but there are cases where you can use bcrypt if you want to create a really strong and secure hash. I use it for passwords; it offers several techniques for hashing, creating salts, and comparing passwords.

Technique 1 (generate a salt and hash on separate function calls)

const salt = bcrypt.genSaltSync(saltRounds);
const hash = bcrypt.hashSync(myPlaintextPassword, salt);

Technique 2 (auto-gen a salt and hash):

const hash = bcrypt.hashSync(myPlaintextPassword, saltRounds);

For more examples you can check here: https://www.npmjs.com/package/bcrypt

肤浅与狂妄 2025-01-15 02:29:21

Concise method that works in the browser:

// Returns a 256 bit string, or the equivalent Uint8Array if `false` is
// passed in.

function get256RandomBits(returnAsString = true) {
  const uint8Array = new Uint8Array(32); // 32 bytes = 256 bits
  const rng = crypto.getRandomValues(uint8Array);
  if (returnAsString) {
    return Array.from(rng).map(b => b.toString(16).padStart(2, '0')).join('');
  }
  else {
    return rng;
  }
}

Example outputs:
- c2fec24e465658aad1208d0a3f863585aa2e5fd30f4e0712e2c74239419700d3
- 8f9ff25c2948b8ee39f77303e0678b6eae1382e3d2517e15a1c4de6840f5673b
- d212ff805630edb41d22c9b6fdd8db6e7f910ea25483bad8b598249a6c73d950

I'm using 256 bits instead of 160 seeing as this question was asked over 10 years ago, but the idea is the same; just modify the second line as needed.
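
If the bit count needs to vary, the same idea can be parameterised. The sketch below is an assumption-laden variant rather than the code above: `randomBitsHex` is a hypothetical name, and it imports `webcrypto` from Node.js's crypto module (in a browser, the global `crypto` object would be used instead).

```javascript
const { webcrypto } = require('crypto'); // Node.js; browsers expose a global `crypto`

// Hypothetical generalisation: returns `bits` random bits as a hex string.
// `bits` must be a multiple of 8 because random values are generated per byte.
function randomBitsHex(bits = 256) {
  if (bits % 8 !== 0) throw new RangeError('bits must be a multiple of 8');
  const bytes = webcrypto.getRandomValues(new Uint8Array(bits / 8));
  return Array.from(bytes, b => b.toString(16).padStart(2, '0')).join('');
}

console.log(randomBitsHex(160).length); // 40 hex characters
console.log(randomBitsHex().length);    // 64 hex characters
```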
