Idiomatic "guaranteed-unique" identifiers in C++
Is there an idiomatic C++ way to reserve and recycle identifiers that are guaranteed to be unique? My requirements are:
- Assuming there exists an ID not currently reserved, reserve_id(void) returns me that ID.
- In an unbroken sequence of reserve_id() calls, no single identifier will be returned twice
- There exists a function recycle(id_type) that returns an identifier to the available pool.
I have, for instance, seen Boost::Uuid, but a) I see no documentation which asserts the guaranteed uniqueness of two UUIDs and b) I'm constrained to an earlier version of Boost (1.40), for the time being. I could push to upgrade, if this were particularly perfect for the task.
I think you've already solved this problem for most practical purposes by finding Boost::Uuid, with the exception of your requirement to recycle identifiers already generated.
From the documentation you linked to in the question:
If you're hell-bent on recycling and re-using existing identifiers, I suppose you could build up a pool of UUIDs over time, generating a new one only when you need one and find that the pool is empty. But I can't imagine a scenario where that would be preferable to generating a new UUID.
EDIT: You've commented that you need a guarantee as to uniqueness. Realistically, you're never going to get one when programmatically generating a unique identifier. In practice, you're going to store the generated ID in a data type which has finite size, and so the possible set of IDs you can generate is finite too. IMHO, the best you can achieve then is to simulate uniqueness within a tolerance threshold.
You could do this by:
- Using a technique that makes the chances of getting a duplicate UUID very remote (this is what Boost::Uuid will do);
- Wrapping the generation of the highly-likely-to-be-unique UUID in some other logic that looks up the newly generated UUID in a list of already-generated UUIDs, to eliminate the tiny chance that the new one is a duplicate. Obviously, the practicality of doing this decreases as you approach very large quantities of UUIDs in your list. How many do you anticipate generating?
If you want truly huge quantities of unique IDs, bigger than would fit in a native type, you could implement a type that manages the memory and does the necessary maths, and just produce sequential IDs, or you could perhaps use something like the GNU Bignum Library to do it for you.
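The second option above (checking each newly generated ID against those already issued) could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the C++11 standard library rather than Boost.Uuid, the 128-bit value is a plain pair of random 64-bit numbers rather than a standards-conformant UUID, and the names Id128, CheckedIdGenerator and in_use are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <utility>

// Hypothetical 128-bit identifier: two random 64-bit halves (a stand-in, not a real UUID).
typedef std::pair<std::uint64_t, std::uint64_t> Id128;

class CheckedIdGenerator {
public:
    CheckedIdGenerator() : rng_(std::random_device{}()) {}

    // Draws random candidates until one is found that is not currently issued.
    Id128 reserve_id() {
        for (;;) {
            Id128 candidate(rng_(), rng_());
            // insert() reports whether the value was newly added to the set.
            if (issued_.insert(candidate).second) {
                return candidate;
            }
        }
    }

    // Makes an ID available again; returns false if it was not issued.
    bool recycle(const Id128& id) { return issued_.erase(id) == 1; }

    // Number of IDs currently issued.
    std::size_t in_use() const { return issued_.size(); }

private:
    std::mt19937_64 rng_;
    std::set<Id128> issued_;
};

// Usage: two reserved IDs are distinct while both are issued.
inline void checked_id_example() {
    CheckedIdGenerator gen;
    Id128 a = gen.reserve_id();
    Id128 b = gen.reserve_id();
    assert(a != b);
    gen.recycle(a);
    gen.recycle(b);
}
```

The lookup turns "very probably unique" into "unique among the IDs currently issued", at the cost of remembering every live ID, which is the practicality concern raised above.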
What sort of uniqueness do you require?
Just unique for the lifetime of the program or unique across multiple runs/cross-process?
If it is the former then you could just new a byte of memory and use the address of that memory as your identifier. This would be guaranteed to be unique until you delete the memory, at which point it may be recycled. This could easily be wrapped in a class like this:
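A minimal sketch of such a wrapper, assuming C++11: the class name AddressId and the member value() are hypothetical, and std::uintptr_t is assumed to be available on the target platform.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical wrapper: the address of a privately owned byte serves as the ID.
class AddressId {
public:
    AddressId() : byte_(new char(0)) {}

    // The address differs from that of every other live allocation.
    // A moved-from AddressId owns nothing and reports 0.
    std::uintptr_t value() const {
        return reinterpret_cast<std::uintptr_t>(byte_.get());
    }

private:
    // unique_ptr makes the class move-only, so an ID cannot be duplicated by copying.
    // Its destructor deletes the byte, which is what recycles the ID.
    std::unique_ptr<char> byte_;
};

// Usage: two live objects always report different values.
inline void address_id_example() {
    AddressId a;
    AddressId b;
    assert(a.value() != b.value());
}
```

Recycling is implicit here: destroying an AddressId frees its byte, after which the allocator may hand the same address out again.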
Possibly a bit unusual, but it meets your requirements if you only need per-process uniqueness :)
How long do the IDs live? Do you REALLY need to recycle them, or can you live with them being unique forever? How many do you need to generate all at once? How many bits can you devote to the id?
Here's a simple recipe: take your Ethernet card's MAC address (which is globally unique, barring hardware issues), mix in the time/date (to millisecond resolution) and an incrementing integer counter (incremented once per ID generated), and you have an ID that's unique within the span of your time/date range, as long as you don't generate MAXINT of them in a single millisecond on this machine. Now, it's NOT random-looking, and it's EASY for an attacker to predict, so don't use it for security, and it's sure not the most efficient use of bits out there, but it IS globally unique.
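The recipe can be sketched as follows, assuming C++11. The names CompositeId and CompositeIdGenerator are hypothetical, and the MAC address is passed in as a plain 48-bit number because reading it from the network card is platform-specific and not shown. The sketch is not thread-safe.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical ID layout: MAC address, millisecond timestamp, running counter.
struct CompositeId {
    std::uint64_t mac;      // lower 48 bits hold the MAC address
    std::uint64_t millis;   // milliseconds since the system clock's epoch
    std::uint32_t counter;  // incremented once per generated ID
};

class CompositeIdGenerator {
public:
    // The caller supplies the MAC address; obtaining it is outside this sketch.
    explicit CompositeIdGenerator(std::uint64_t mac)
        : mac_(mac & 0xFFFFFFFFFFFFULL), counter_(0) {}

    CompositeId next() {
        using namespace std::chrono;
        const std::uint64_t now = static_cast<std::uint64_t>(
            duration_cast<milliseconds>(
                system_clock::now().time_since_epoch()).count());
        // Distinct as long as the counter does not wrap within one millisecond.
        CompositeId id = { mac_, now, counter_++ };
        return id;
    }

private:
    std::uint64_t mac_;
    std::uint32_t counter_;
};

// Usage: consecutive IDs from one generator differ at least in the counter.
inline void composite_id_example() {
    CompositeIdGenerator gen(0x001122334455ULL);
    CompositeId a = gen.next();
    CompositeId b = gen.next();
    assert(a.counter != b.counter);
}
```

Keeping the three parts as separate fields avoids having to decide how many bits each one gets; packing them into a fixed-width value is a further design choice that depends on how many bits the ID may occupy.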
Yes, this is simple. The reserve_id function is operator new(0). The recycle function is of course operator delete.
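Spelled out as code, this could look like the sketch below. The typedef id_type is taken from the question's recycle(id_type) signature; making it void* is an assumption. A successful zero-size allocation returns a non-null pointer that differs from every other allocation that has not yet been freed, which is what provides the uniqueness.

```cpp
#include <cassert>
#include <new>

// Assumption: the identifier type is simply an opaque pointer.
typedef void* id_type;

// Throws std::bad_alloc if the allocation fails.
inline id_type reserve_id() { return ::operator new(0); }

// After this call the same pointer value may be handed out again.
inline void recycle(id_type id) { ::operator delete(id); }

// Usage: IDs reserved at the same time never compare equal.
inline void new_delete_example() {
    id_type a = reserve_id();
    id_type b = reserve_id();
    assert(a != b);
    recycle(a);
    recycle(b);
}
```

The IDs are only meaningful within one process, and an ID that is never recycled is a memory leak.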
The problem does not seem connected to C++; it is more of a fundamental issue. How many IDs are expected to be valid at any given time? If you expect to have few valid IDs at any given time, just put them in a container such as a linked list, vector or set, depending on your performance requirements and relative recycle/reserve frequency. A sorted linked list is probably the best option as you will have both recycle and reserve operations in O(n). A vector has O(n), O(n log n) and a set has O(n log n), O(n) respectively (might be wrong, I did the thinking very quickly).
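A rough sketch of the container approach, using std::set to hold the IDs currently in use. The class name UsedIdSet and the choice of unsigned as the ID type are assumptions. With this layout, reserving scans for the smallest unused value (linear in the number of live IDs), while recycling is a single erase, which std::set performs in logarithmic time.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical pool that remembers which IDs are currently in use.
class UsedIdSet {
public:
    // Returns the smallest ID that is not in use.
    // Exhaustion of the unsigned range is not handled in this sketch.
    unsigned reserve_id() {
        unsigned candidate = 0;
        // The set iterates in ascending order, so the first gap is the answer.
        for (std::set<unsigned>::const_iterator it = used_.begin();
             it != used_.end() && *it == candidate; ++it) {
            ++candidate;
        }
        used_.insert(candidate);
        return candidate;
    }

    // Returns false if the ID was not in use.
    bool recycle(unsigned id) { return used_.erase(id) == 1; }

    std::size_t in_use() const { return used_.size(); }

private:
    std::set<unsigned> used_;
};

// Usage: a recycled ID becomes available to the next reservation.
inline void used_id_set_example() {
    UsedIdSet pool;
    unsigned first = pool.reserve_id();
    pool.recycle(first);
    assert(pool.reserve_id() == first);
}
```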
Here's a simplified implementation I use in .NET projects, quickly translated to C++ (you may want to add some bounds validation, which was implicit in C#):
It lets you take (T acquire()) and return (void release(T)) successive integral numbers, with an optional start (offset). You can use these as an index or seed to more complex identifiers/structures if needed. It also supports counting them (T size()) or validating them (bool validate(T)).
It can be used as follows (feel free to make proper asserts in a test framework):
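A minimal sketch of a pool with the interface described above. This is not the author's code: the class name IdPool, the use of std::set for released IDs, and the reading of size() as "number of IDs currently acquired" are all assumptions, and overflow of the counter is not checked.

```cpp
#include <cassert>
#include <set>

// Hypothetical pool handing out successive integral IDs starting at `offset`.
template <typename T>
class IdPool {
public:
    explicit IdPool(T offset = T()) : offset_(offset), next_(offset) {}

    // Reuses the smallest released ID if there is one, else takes the next new one.
    T acquire() {
        if (!released_.empty()) {
            T id = *released_.begin();
            released_.erase(released_.begin());
            return id;
        }
        return next_++;
    }

    // Returns an ID to the pool; IDs that are not currently acquired are ignored.
    void release(T id) {
        if (validate(id)) {
            released_.insert(id);
        }
    }

    // Number of IDs currently acquired (an assumed meaning of size()).
    T size() const {
        return static_cast<T>((next_ - offset_) - static_cast<T>(released_.size()));
    }

    // True if the ID has been handed out and not yet released.
    bool validate(T id) const {
        return id >= offset_ && id < next_ && released_.count(id) == 0;
    }

private:
    T offset_;
    T next_;
    std::set<T> released_;
};

// Usage: released IDs are handed out again before new ones are created.
inline void id_pool_example() {
    IdPool<int> pool(100);
    int a = pool.acquire();   // 100
    int b = pool.acquire();   // 101
    pool.release(a);
    assert(!pool.validate(a));
    assert(pool.validate(b));
    assert(pool.acquire() == a);
}
```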