Generating unique tokens that cannot be guessed
I have a system that needs to schedule some stuff and return identifiers to the scheduled tasks to some foreign objects. The user would basically do this:
identifier = MyLib.Schedule(something)
# Nah, let's unschedule it.
MyLib.Unschedule(identifier)
I use this kind of pattern a lot in internal code, and I always use plain integers as the identifier. But if the identifiers are used by untrusted code, a malicious user could break the entire system by doing a single Unschedule(randint()).
I need the users of the code to be able to only unschedule identifiers they have actually scheduled.
The only solution I can think of is to generate, e.g., 64-bit random numbers as identifiers, and keep track of which identifiers are currently handed out to avoid the ridiculously unlikely duplicates. Or 128-bit? When can I say "this is random enough, no duplicates could possibly occur", if ever?
Or better yet, is there a more sensible way to do this? Is there a way to generate identifier tokens that the generator can easily keep track of (avoiding duplicates) but is indistinguishable from random numbers to the recipient?
EDIT - Solution based on the accepted answer:
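from Crypto.Cipher import AES
import struct, os, itertools

class AES_UniqueIdentifier(object):
    def __init__(self):
        # Random 8-byte marker that every valid identifier decrypts to.
        self.salt = os.urandom(8)
        self.count = itertools.count(0)
        # Fresh random 128-bit key; each plaintext is exactly one AES block.
        self.cipher = AES.new(os.urandom(16), AES.MODE_ECB)

    def Generate(self):
        # salt (8 bytes) + counter (8 bytes) = one 16-byte block
        return self.cipher.encrypt(self.salt +
                                   struct.pack("Q", next(self.count)))

    def Verify(self, identifier):
        "Return true if identifier was generated by this object."
        # Reject wrong-sized input up front; decrypt() needs a whole block.
        return (len(identifier) == 16 and
                self.cipher.decrypt(identifier)[0:8] == self.salt)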
Comments (5)
Depending on how many active IDs you have, 64 bits can be too little. By the birthday paradox, you'd end up with essentially the level of protection against duplicates that you might expect from 32-bit identifiers.
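To put rough numbers on the birthday argument, here is a small sketch using the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n random values drawn from a space of N = 2^bits. The function name `collision_probability` is made up for illustration.

```python
import math

def collision_probability(num_ids: int, bits: int) -> float:
    """Approximate chance that at least two of num_ids random
    bits-bit values are equal (birthday approximation)."""
    space = 2.0 ** bits
    # p ~= 1 - exp(-n(n-1) / (2N)); expm1 keeps tiny results accurate.
    return -math.expm1(-num_ids * (num_ids - 1) / (2.0 * space))

if __name__ == "__main__":
    for bits in (64, 128):
        print(bits, collision_probability(2 ** 32, bits))
```

With about 2^32 identifiers handed out, a 64-bit space already gives a substantial chance of a duplicate, while a 128-bit space does not come anywhere close.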
Besides, probably the best way to create these is to use some salted hash function, such as SHA-1 or MD5 or whatever your framework already has, with a randomly chosen salt (kept secret), and those generate at least 128 bits anyway, exactly for the reason mentioned above. If you use something that creates longer hash values, I don't really see any reason to truncate them.
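As a rough sketch of the salted-hash idea: the version below swaps in HMAC-SHA256 from the standard library as the keyed hash (rather than a plain salted SHA-1 or MD5) and hashes a simple counter. The class name `HashedTokenSource` and its methods are hypothetical. The tokens still have to be stored to be checked later.

```python
import hashlib
import hmac
import itertools
import os

class HashedTokenSource:
    """Hands out tokens derived from a secret key and a counter."""

    def __init__(self):
        self._secret = os.urandom(32)        # the secret "salt", never shared
        self._counter = itertools.count()    # guarantees distinct inputs
        self._active = set()                 # tokens currently handed out

    def generate(self) -> bytes:
        n = next(self._counter)
        # Keyed hash of the counter: 256 bits, not truncated.
        token = hmac.new(self._secret, n.to_bytes(8, "big"),
                         hashlib.sha256).digest()
        self._active.add(token)
        return token

    def revoke(self, token: bytes) -> bool:
        """Return True only if the token was handed out and is still active."""
        if token in self._active:
            self._active.remove(token)
            return True
        return False
```

Without the secret, a recipient cannot derive the next token from the ones already seen, even though the generator itself only counts upward.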
To create identifiers you can check without storing them, take something easy to detect, such as the same 64-bit pattern repeated twice (giving a total of 128 bits), and encrypt that with some constant secret key, using AES or some other cipher with a block size of 128 bits (or whatever size you picked). If and when the user sends some alleged key, decrypt it and check for your easy-to-spot pattern.
It sounds to me like you might be overthinking this problem. This sounds 100% like an application for a GUID/UUID. Python even has a built-in way to generate them. The whole point of GUIDs/UUIDs is that the odds of a collision are astronomically low, and by using a string instead of an encrypted token you can skip the decrypting operation in the verify step. I think this would also eliminate a whole slew of problems you might encounter regarding key management, and increase the speed of the whole process.
EDIT:
With a UUID, your verify method would just be a comparison between the given UUID and the stored one. Since the odds of a collision between two UUIDs are incredibly low, you shouldn't have to worry about false positives. In your example, it appears that the same object is doing both encryption and decryption, without a third party reading the stored data. If this is the case, you aren't gaining anything by passing around encrypted data except that the bits you're passing around aren't easy to guess. I think a UUID would give you the same benefits, without the overhead of the encryption operations.
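A minimal sketch of what that could look like, assuming random (version 4) UUIDs from Python's `uuid` module and a plain dict as the store. The class name `UuidScheduler` and its methods are invented here and are not the asker's `MyLib` API.

```python
import uuid

class UuidScheduler:
    """Keeps scheduled tasks keyed by a random UUID string."""

    def __init__(self):
        self._tasks = {}

    def schedule(self, task) -> str:
        identifier = str(uuid.uuid4())   # random version-4 UUID
        self._tasks[identifier] = task
        return identifier

    def unschedule(self, identifier: str) -> bool:
        """Verification is just a lookup of the stored identifier."""
        if identifier in self._tasks:
            del self._tasks[identifier]
            return True
        return False
```

Only version 4 UUIDs are suitable for this; the time-based variants are unique but predictable.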
Make your identifier long enough that it can't reasonably be guessed. In addition, let Unschedule wait for 1 second if the token is not in use, so that a brute-force attack is no longer feasible. Like the other answer said, session IDs in web applications are exactly the same problem, and I have seen session IDs that were 64 random characters long.
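Roughly, that could look like the sketch below. It assumes tokens come from Python's `secrets` module and that a failed lookup simply sleeps. `ThrottledScheduler` and `penalty_seconds` are names made up for this example.

```python
import secrets
import time

class ThrottledScheduler:
    """Long random tokens plus a delay on every failed unschedule."""

    def __init__(self, penalty_seconds: float = 1.0):
        self._tasks = {}
        self._penalty = penalty_seconds

    def schedule(self, task) -> str:
        # 48 random bytes encode to a 64-character URL-safe string.
        token = secrets.token_urlsafe(48)
        self._tasks[token] = task
        return token

    def unschedule(self, token: str) -> bool:
        if token not in self._tasks:
            time.sleep(self._penalty)   # slow down guessing
            return False
        del self._tasks[token]
        return True
```

Note that a blocking sleep only limits an attacker who makes one call at a time; with concurrent callers, the delay would need to be applied per client.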
This is the same problem as dealing with session identifiers in ordinary web applications. Predictable session ids can easily lead to session hijacking.
Have a look at how session IDs are generated. Here is the content of a typical PHPSESSID cookie:
If you want to be dead sure no brute-force attack is feasible, do the calculations backward: How many attempts can a cracker make per second? How many different unique IDs are in use at a random point in time? How many IDs are there in total? How long would it take for the cracker to cover, say, 1% of the total space of IDs? Adjust the number of bits accordingly.
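The backward calculation can be sketched in a few lines. The attack rate and the number of active IDs below are made-up inputs for illustration, not measurements, and both function names are hypothetical.

```python
def seconds_to_cover(bits: int, fraction: float,
                     attempts_per_second: float) -> float:
    """Time needed to try the given fraction of all bits-bit ids."""
    return fraction * 2.0 ** bits / attempts_per_second

def seconds_to_first_hit(bits: int, active_ids: int,
                         attempts_per_second: float) -> float:
    """Expected time until a uniformly random guess hits any active id."""
    hit_probability = active_ids / 2.0 ** bits   # chance per guess
    return 1.0 / (hit_probability * attempts_per_second)

if __name__ == "__main__":
    year = 365 * 24 * 3600
    for bits in (32, 64, 128):
        # Assumed: one million guesses per second, one million ids in use.
        print(bits,
              seconds_to_cover(bits, 0.01, 1e6) / year,
              seconds_to_first_hit(bits, 10 ** 6, 1e6) / year)
```

If either number comes out uncomfortably small for your threat model, add bits and rerun.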
Do you need this pattern in a distributed or local environment?
If you're local, most OO languages should support the notion of object identity, so if you create an opaque handle - just create a new object.
No other client can fake this.
If you need to use this in a distributed environment, you may keep a pool of handles per session, so that a foreign session can never use a stolen handle.
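For the local case, a sketch of the opaque-handle idea in Python might look like this. `Handle` and `HandleScheduler` are invented names, and the sketch assumes callers cannot reach into the scheduler's internals.

```python
class Handle:
    """Opaque handle: carries no data, only its identity matters."""
    __slots__ = ()

class HandleScheduler:
    def __init__(self):
        # Default object hashing and equality are identity-based,
        # so only the very object we returned can match a key.
        self._tasks = {}

    def schedule(self, task) -> Handle:
        handle = Handle()
        self._tasks[handle] = task
        return handle

    def unschedule(self, handle) -> bool:
        if isinstance(handle, Handle) and handle in self._tasks:
            del self._tasks[handle]
            return True
        return False
```

A caller that builds its own `Handle()` gets a different object, so it never matches a scheduled task.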