在 PHP/MySQL 中生成唯一的代码?

发布于 2024-07-07 18:33:57 字数 275 浏览 14 评论 0原文

我正在与一位客户合作,该客户需要生成数百万个用于杂志刮刮卡、瓶盖奖品等的字母数字代码。 它们必须足够短才能打印在帽子上,他们希望确保不包含 1 和 I、0 和 O 等不明确的字符,并且必须显式存储它们以供将来使用 - 我们可以”当有人试图兑换时,不只是有一种算法可以确定“有效性”。 最后,他们希望确保代码随机分布在一个大的“代码空间”内,这样人们就不能通过浏览字母表来猜测其他代码。

是否有任何指向合理有效的算法来生成此类代码集的指针? 我在信封背面划了一些,但这个问题对于粗心的人来说就像是一个陷阱。

I'm working with a client that needs to generate millions of the alphanumeric codes used in magazine scratch-off cards, bottlecap prizes, and so on. They have to be short enough to print on a cap, they want to make sure that ambiguous characters like 1 and I, 0 and O, etc. are not included, and they have to be explicitly stored for future use -- we can't just have an algorithm that determines 'validity' when someone tries to redeem one. Finally, they want to make sure that the codes are randomly distributed inside of a large "code space" so that people can't just guess additional codes by walking through the alphabet.

Are there any pointers towards reasonably efficient algorithms for generating these kinds of code sets? I've scratched a few out on the back of an envelope, but this problem smells like a trap for the unwary.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

我偏爱纯白色 2024-07-14 18:33:57

例如,如果您需要大约 1000 万个唯一密钥,最好的方法是选择一个指数级更大的密钥空间,并开始随机生成。 了解生日悖论——这是您应该担心的主要事情。 如果您需要 2^n 个唯一且安全的密钥,请确保至少有 2^(2 * n) 个可能的值。 这是一个粗略的 O(n log n) 算法:

  • 使用至少 2^50 的键空间(因此,换句话说,允许 2^50 个可能的唯一值),并且整个数据集中几乎不会发生任何冲突 - - 任何暴力破解你的密钥的人,如果尝试 2^25 次,获得密钥的几率大约为偶数。
  • 生成所需数量的随机数
  • 在密钥上索引数据库(这是 O(n lg n) 步骤:排序)
  • 分页浏览数据库并迭代整个数据集以删除重复项(下面的伪代码)
  • 删除重复项行,就完成了。

伪代码:

$last = null;
while ($current = getnext()) {
    if ($last == $current) {
        push($toDelete, $current);
    }
    $last = $current;
}

If you need about 10 million unique keys (for example), the best approach is to pick a key-space that's exponentially bigger, and start randomly generating. Read about the Birthday Paradox -- it's the main thing you should be worried about. If you want 2^n unique and secure keys, make sure there are at least 2^(2 * n) possible values. Here's a rough O(n log n) algorithm:

  • Use a key space of at least 2^50 (so, in other words, allow 2^50 possible unique values), and you'll have barely any collisions in your entire dataset -- and anyone brute forcing your keys will have about even odds of getting a key if they try 2^25 of them.
  • generate as many random numbers as you need
  • index the database on your key (this is the O(n lg n) step: the sort)
  • page through the DB and iterate over the entire data set to trim duplicates (pseudocode below)
  • Delete the duplicate rows, and you're done.

Pseudocode:

$last = null;
while ($current = getnext()) {
    if ($last == $current) {
        push($toDelete, $current);
    }
    $last = $current;
}
一城柳絮吹成雪 2024-07-14 18:33:57

假设您可以使用包含 40 个明确的大写、小写和数字字符的字符集。

对于 n 个字符的序列,您有 40n 种组合

  • 404 = 2,560,000
  • 405 = 102,400,000
  • 406 = 4,096,000,000
  • 407 = 163,840,000,000
  • 408 = 6,553,600,000,000

因此,8 个字符提供了一个相当好的工作空间 - 如果您生成了 1000 万个代码,则必须尝试数十万种组合来暴力破解代码。

或者您从另一个方向来 - 给出可能代码的数量,您应该生成多少代码以避免他们称之为生日悖论

采用 8 个字符的代码,6,553,600,000,000 约为 242,因此您可以合理地从中生成 221 代码,即 2,097,152

Let's suppose you can use a character set of, say, 40 symbols of unambiguous upper,lower and numeric characters.

For a sequence of n chars, you've got 40n combinations

  • 404 = 2,560,000
  • 405 = 102,400,000
  • 406 = 4,096,000,000
  • 407 = 163,840,000,000
  • 408 = 6,553,600,000,000

Thus 8 chars gives a pretty good space to work in - if you generated 10 million codes, you'd have to try hundreds of thousands of combinations to brute force a code.

Or you come at from the other direction - give the number of possible codes, how many codes should you generate to avoid the trap they call the Birthday Paradox?

Taking the 8 char code, 6,553,600,000,000 is approx 242, thus you might reasonably generate 221 codes from it, or 2,097,152

红墙和绿瓦 2024-07-14 18:33:57

使用一次性密码算法?

RFC4225详细介绍了一种基于HMAC的算法。

http://www.ietf.org/rfc/rfc4226.txt

但不是使用0-9位base10编码,使用base32。

Use a one time password algorithm?

RFC4225 details one based on HMAC algorithm.

http://www.ietf.org/rfc/rfc4226.txt

but instead of using 0-9 digits base10 encoding, use base32.

北城孤痞 2024-07-14 18:33:57

无论您使用什么方法,我建议您添加一两个校验位,作为防止人们错误输入或试图发明数字的“第一线”防御。

Whatver method you use, I would suggest you add a check digit or two as a "first-line" defence against people mis-entering or trying to invent a number.

魔法唧唧 2024-07-14 18:33:57

奇怪的是,使用以下种子我只能生成 32 个唯一的字符串。

ABCDEFGHJKLMNPQRSTUVWXYZ23456789

使用更长的种子,我能够生成更多的字符串,成功生成了 40,000 个唯一的字符串。

ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789

Oddly enough, with the following seed I was only able to generate 32 unique strings.

ABCDEFGHJKLMNPQRSTUVWXYZ23456789

With a longer seed I was able to generate many more--generated 40,000 unique strings successfully.

ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文