在 PHP/MySQL 中生成唯一的代码?
我正在与一位客户合作,该客户需要生成数百万个用于杂志刮刮卡、瓶盖奖品等的字母数字代码。 它们必须足够短才能打印在帽子上,他们希望确保不包含 1 和 I、0 和 O 等不明确的字符,并且必须显式存储它们以供将来使用 - 我们可以”当有人试图兑换时,不只是有一种算法可以确定“有效性”。 最后,他们希望确保代码随机分布在一个大的“代码空间”内,这样人们就不能通过浏览字母表来猜测其他代码。
是否有任何指向合理有效的算法来生成此类代码集的指针? 我在信封背面划了一些,但这个问题对于粗心的人来说就像是一个陷阱。
I'm working with a client that needs to generate millions of the alphanumeric codes used in magazine scratch-off cards, bottlecap prizes, and so on. They have to be short enough to print on a cap, they want to make sure that ambiguous characters like 1 and I, 0 and O, etc. are not included, and they have to be explicitly stored for future use -- we can't just have an algorithm that determines 'validity' when someone tries to redeem one. Finally, they want to make sure that the codes are randomly distributed inside of a large "code space" so that people can't just guess additional codes by walking through the alphabet.
Are there any pointers towards reasonably efficient algorithms for generating these kinds of code sets? I've scratched a few out on the back of an envelope, but this problem smells like a trap for the unwary.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
例如,如果您需要大约 1000 万个唯一密钥,最好的方法是选择一个指数级更大的密钥空间,并开始随机生成。 了解生日悖论——这是您应该担心的主要事情。 如果您需要 2^n 个唯一且安全的密钥,请确保至少有 2^(2 * n) 个可能的值。 这是一个粗略的 O(n log n) 算法:
伪代码:
If you need about 10 million unique keys (for example), the best approach is to pick a key-space that's exponentially bigger, and start randomly generating. Read about the Birthday Paradox -- it's the main thing you should be worried about. If you want 2^n unique and secure keys, make sure there are at least 2^(2 * n) possible values. Here's a rough O(n log n) algorithm:
Pseudocode:
假设您可以使用包含 40 个明确的大写、小写和数字字符的字符集。
对于 n 个字符的序列,您有 40n 种组合
因此,8 个字符提供了一个相当好的工作空间 - 如果您生成了 1000 万个代码,则必须尝试数十万种组合来暴力破解代码。
或者您从另一个方向来 - 给出可能代码的数量,您应该生成多少代码以避免他们称之为生日悖论?
采用 8 个字符的代码,6,553,600,000,000 约为 242,因此您可以合理地从中生成 221 代码,即 2,097,152
Let's suppose you can use a character set of, say, 40 symbols of unambiguous upper,lower and numeric characters.
For a sequence of n chars, you've got 40n combinations
Thus 8 chars gives a pretty good space to work in - if you generated 10 million codes, you'd have to try hundreds of thousands of combinations to brute force a code.
Or you come at from the other direction - give the number of possible codes, how many codes should you generate to avoid the trap they call the Birthday Paradox?
Taking the 8 char code, 6,553,600,000,000 is approx 242, thus you might reasonably generate 221 codes from it, or 2,097,152
使用一次性密码算法?
RFC4225详细介绍了一种基于HMAC的算法。
http://www.ietf.org/rfc/rfc4226.txt
但不是使用0-9位base10编码,使用base32。
Use a one time password algorithm?
RFC4225 details one based on HMAC algorithm.
http://www.ietf.org/rfc/rfc4226.txt
but instead of using 0-9 digits base10 encoding, use base32.
无论您使用什么方法,我建议您添加一两个校验位,作为防止人们错误输入或试图发明数字的“第一线”防御。
Whatver method you use, I would suggest you add a check digit or two as a "first-line" defence against people mis-entering or trying to invent a number.
奇怪的是,使用以下种子我只能生成 32 个唯一的字符串。
ABCDEFGHJKLMNPQRSTUVWXYZ23456789
使用更长的种子,我能够生成更多的字符串,成功生成了 40,000 个唯一的字符串。
ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789
Oddly enough, with the following seed I was only able to generate 32 unique strings.
ABCDEFGHJKLMNPQRSTUVWXYZ23456789
With a longer seed I was able to generate many more--generated 40,000 unique strings successfully.
ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789ABCDEFGHJKLMNPQRSTUVWXYZ234567892345678923456789