How can I give users a unique ID without using the primary key from the database?
If I have 10,000 users and the primary key is a unique ID going from 1 to 10,000, is there a way to give each of them a unique ID such that the original primary key cannot be inferred from it?
For example, a link to your Facebook profile or similar would be http://site.com/profile?id=293852
Is it likely that the id there is the same as the primary key of that user in the database? I am struggling to think of a way to have two unrelated unique ID columns, because the randomly generated ones would also have to be unique. I imagine that if a numbers-only GUID were possible, it would be far too long.
Any ideas?
You generally have two options: a mapping from the primary key to the public ID might be simple, e.g. new Random(primaryKey).NextInt(), or it might be quite complicated but attack-proof, e.g. any kind of format-preserving encryption.
But then... why do you think you should protect the values of your primary keys? If the only reason is to prevent users from guessing other valid user IDs, you can just append a random string to the primary key (store it in the database and verify it on access).
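The "append a random string and verify it on access" idea could look roughly like the sketch below. It is only an illustration under assumptions: the dictionary `user_tokens` stands in for a database column, and the names `make_public_id` and `resolve_public_id`, the `-` separator, and the 64-bit token length are all made up for this example.

```python
import hmac
import secrets

# Hypothetical stand-in for a "token" column in the users table:
# primary key -> random token stored alongside the row.
user_tokens = {}

def make_public_id(primary_key):
    """Store a fresh random token for the row and return 'pk-token'."""
    token = secrets.token_hex(8)  # 64 random bits as 16 hex characters
    user_tokens[primary_key] = token
    return f"{primary_key}-{token}"

def resolve_public_id(public_id):
    """Return the primary key if the token matches the stored one, else None."""
    pk_text, sep, token = public_id.partition("-")
    if not sep or not (pk_text.isascii() and pk_text.isdigit()):
        return None
    stored = user_tokens.get(int(pk_text))
    if stored is None:
        return None
    # Constant-time comparison, so timing does not leak how much of the token matched.
    if not hmac.compare_digest(stored.encode(), token.encode()):
        return None
    return int(pk_text)
```

With this shape the primary key is still visible in the URL, but knowing that user 42 exists does not help with guessing a valid link for user 43.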
For security reasons it really is recommended to make IDs non-sequential, to prevent enumeration of the users in the system. But 4 billion values (I mean 2^32) is too small a range to keep valid IDs from being discovered. That is why a GUID is preferable. Depending on the database (from your spec it looks like MSSQL) you can store it in a GUID-type column, a byte column (for MySQL), or two separate int64 columns.
To reduce the URL size, base64 encoding can be applied so the GUID looks shorter.
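As a rough illustration of the base64 suggestion, the sketch below encodes the 16 raw bytes of a GUID with the URL-safe base64 alphabet and drops the padding, giving 22 characters instead of the 36 of the usual hyphenated form. The function names `guid_to_short` and `short_to_guid` are invented for this example.

```python
import base64
import uuid

def guid_to_short(guid):
    """Encode the 16 GUID bytes as URL-safe base64 without '=' padding."""
    return base64.urlsafe_b64encode(guid.bytes).rstrip(b"=").decode("ascii")

def short_to_guid(short):
    """Reverse guid_to_short by restoring the stripped padding first."""
    padded = short + "=" * (-len(short) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))

guid = uuid.uuid4()
short = guid_to_short(guid)
print(len(str(guid)), len(short))  # lengths of the canonical and shortened forms
```

The URL-safe alphabet uses `-` and `_` instead of `+` and `/`, so the result can go into a query string without extra escaping.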
How you generate the random and unique IDs is a useful question - but you seem to be making an assumption about when to generate them!
My point is that you do not need to generate these IDs at the time you create your rows, because they are essentially independent of the data being inserted.
What I do is pre-generate random IDs for future use. That way I can take my own sweet time and absolutely guarantee they are unique, and there is no processing to be done at the time of the insert.
For example, I have an orders table with order_id in it. This ID is generated on the fly when the user enters the order, incrementing 1, 2, 3 and so on forever. The user does not need to see this internal ID.
Then I have another table, random_ids, with (order_id, random_id). I have a routine that runs every night and pre-loads this table with enough rows to more than cover the orders that might be inserted in the next 24 hours. (If I ever get 10000 orders in one day I'll have a problem - but that would be a good problem to have!)
This approach guarantees uniqueness and moves any processing load out of the insert transaction and into the batch routine, where it does not affect the user.
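A nightly pre-load routine of this kind might be sketched as follows. This is an assumption-laden illustration, not the author's routine: a plain dictionary stands in for the random_ids table, and the function name `preload` and the 48-bit ID size are made up.

```python
import secrets

def preload(random_ids, next_order_id, count, bits=48):
    """Add `count` rows (order_id -> random_id) for upcoming order IDs.

    `random_ids` plays the role of the random_ids table. Every new random_id
    is checked against all values already present, so uniqueness is settled
    here in the batch job rather than at insert time.
    """
    used = set(random_ids.values())
    for order_id in range(next_order_id, next_order_id + count):
        candidate = secrets.randbits(bits)
        while candidate in used:  # retry on the (rare) collision
            candidate = secrets.randbits(bits)
        used.add(candidate)
        random_ids[order_id] = candidate

table = {}
preload(table, next_order_id=1, count=10_000)
```

When order 1 is later inserted, its public ID is simply read from the pre-loaded row, with no generation or uniqueness check on the user's request path.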
What's wrong with allowing the user to see the primary key?
You could generate the numbers randomly, from a range big enough that clashes are unlikely, then just run a select to check that the number doesn't already exist.
Or, you could pick a huge number, and then base some equation around that. Something like:
That means that the unique numbers will get further away from your starting number as the PK increases, and be above or below it depending on whether the PK is odd or even. The more complexity you add to the equation, the less likely it'll be discovered, but never ever 100% rely on this method, as there is always the possibility someone will work it out.
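The equation itself is missing from this copy of the answer. As a purely hypothetical stand-in with the properties described (the result moves further from the starting number as the PK grows, and lands above or below it depending on whether the PK is even or odd), something along these lines would fit. The constants `BASE` and `STEP` and both function names are invented for illustration.

```python
BASE = 5_000_000_000  # hypothetical "huge number" to start from
STEP = 7919           # hypothetical multiplier spreading the values apart

def public_id(pk):
    """Map a primary key to a number above BASE (even pk) or below it (odd pk)."""
    sign = 1 if pk % 2 == 0 else -1
    return BASE + sign * pk * STEP

def primary_key(pid):
    """Invert public_id: the distance from BASE is pk * STEP."""
    return abs(pid - BASE) // STEP
```

Anyone who sees a handful of these values can recover `BASE` and `STEP` with little effort, which is exactly the caveat above: this is obfuscation, not protection.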
What I do is use part of a GUID together with the actual ID.
In the table I have a column of type uniqueidentifier with a default value of newid().
I then take part of it and append the actual serial ID, with a known delimiter between them. I use the letter H because it never appears in a GUID.
So for row #8659 I would have:
IDcolumn=8659
GUIDcolumn='{200BAB55-C7D5-4456-AB57-CFF8B7E82A90}'
PROFILECODE='200BAB55H8659'
I can locate the correct row by splitting the code at the H and matching both the ID and the start of the GUID.
In theory the SQL engine should find all rows with IDcolumn 8659 first and then check the GUIDcolumn.
If people try to guess the ID of a profile, they can't just change one part of the code and succeed.
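The lookup query is not preserved in this copy of the answer. A minimal sketch of the same idea is shown below, using Python's sqlite3 module as a stand-in for SQL Server. The table name `profiles` and the function `find_profile` are assumptions; only IDcolumn, GUIDcolumn and the PROFILECODE layout come from the answer.

```python
import sqlite3

def find_profile(conn, profile_code):
    """Split '<8 GUID chars>H<id>', fetch the row by ID, then check the GUID prefix."""
    prefix, sep, id_text = profile_code.partition("H")
    if not sep or len(prefix) != 8:
        return None
    if not (id_text.isascii() and id_text.isdigit()) or len(id_text) > 18:
        return None
    row = conn.execute(
        "SELECT IDcolumn, GUIDcolumn FROM profiles WHERE IDcolumn = ?",
        (int(id_text),),
    ).fetchone()
    if row is None:
        return None
    # GUIDcolumn is stored as '{200BAB55-...}', so the first block is at [1:9].
    if row[1][1:9].upper() != prefix.upper():
        return None
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (IDcolumn INTEGER PRIMARY KEY, GUIDcolumn TEXT)")
conn.execute(
    "INSERT INTO profiles VALUES (?, ?)",
    (8659, "{200BAB55-C7D5-4456-AB57-CFF8B7E82A90}"),
)
print(find_profile(conn, "200BAB55H8659"))
```

Looking the row up by the indexed ID and only then comparing the GUID fragment keeps the query cheap, while a guessed ID with the wrong fragment simply returns nothing.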