How can I cleverly encode arbitrary metadata into a UUID?
(This algorithm is for an iPhone app I am working on, if that helps the context at all.)
We need to make UUIDs to uniquely identify some products. Usually this is as simple as assigning unique numbers, but we also want to encode metadata INTO our UUID. Our API only allows us ONE field, so we want to use the UUID field as both a unique identifier and a metadata carrier.
Usually, you can just hodge-podge the data together with underscores, but we have one requirement that makes this difficult: one of the metadata items can be a list of n items.
Here is the metadata:
- Device type (up to 16 discrete types)
- Min OS version supported (x.x.x format, where each x is a number from 0 to 99)
- Min binary (app) version supported (x.x.x format, where each x is a number from 0 to 99)
- Any products that this product supersedes (list of n IDs, the format of which is part of this design problem)
Limitations
Our only technical limitation is that the UUID can be at most 128 characters long and may contain only alphanumeric characters (a-zA-Z0-9), underscores, periods, and hyphens (it's an API).
Use Cases
Here are a few use cases to explain what this algorithm will help solve:
A user buys product A and product B. We later release product C, which is a package of products A+B together. Via C's UUID, we want our application code to be able to determine that C is really A+B, and since the user already owns A+B, C will not appear on a list of available products.
A user has 2 devices, A and B. Product C is not supported on device B, so when the user views products on device B, C should not be available to them, but it should be on device A.
What I've Done Thus Far
Device type should be easy: with 16 discrete types, I can bitmask that. 16 bits = 4 hex characters. Simple enough.
Versioning is the same - I can pad each version segment (x.y.z) to 2 digits, and then just have 2 runs of 6 digits as the version information.
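As a minimal sketch of the fixed-width layout described so far (4 hex characters for the device bitmask, then two runs of 6 decimal digits), assuming Python for illustration; the function names `encode_baseline` and `decode_baseline` are hypothetical:

```python
def encode_baseline(device_mask, min_os, min_app):
    """Build the 16-character prefix: 4 hex chars + 6 digits + 6 digits.

    device_mask is a 16-bit integer with one bit per device type;
    min_os and min_app are (x, y, z) tuples with each part in 0..99.
    """
    if not 0 <= device_mask <= 0xFFFF:
        raise ValueError("device mask must fit in 16 bits")
    for version in (min_os, min_app):
        if len(version) != 3 or any(not 0 <= part <= 99 for part in version):
            raise ValueError("version must be three numbers in 0..99")
    os_digits = "".join(f"{part:02d}" for part in min_os)
    app_digits = "".join(f"{part:02d}" for part in min_app)
    return f"{device_mask:04X}{os_digits}{app_digits}"


def decode_baseline(prefix):
    """Split the 16-character prefix back into its three fields."""
    device_mask = int(prefix[0:4], 16)
    min_os = tuple(int(prefix[i:i + 2]) for i in (4, 6, 8))
    min_app = tuple(int(prefix[i:i + 2]) for i in (10, 12, 14))
    return device_mask, min_os, min_app
```

This prefix always takes 16 of the 128 available characters, which is where the figure of 112 remaining characters comes from.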
Where it is non-trivial is how to refer to previous product IDs. Clearly, my memory space is limited - I only have 128 characters (and using the approaches above, I'd have only 112 characters left). If I need a list of n items, I will run out of space.
Realistically n<=5 is reasonable. Any given product would supersede no more than 5 other products.
A fixed-length UUID is NOT a requirement. And yes, one "cheap" solution is to daisy-chain the ID list together with underscores, but since many of the IDs will have to be hand-entered in the first place, we'd like to avoid using 128 bytes if we can avoid it. Minimizing the UUID length should be a priority after correctness in the algorithm.
Another part that may make this difficult (although the implementation of this isn't in the UUID itself but rather in the code) is that if one of the superseded products was itself superseding something else, that needs to cascade down.
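The cascade can live entirely in application code once each product's direct supersede list has been parsed out of its UUID. A rough Python sketch follows. It assumes a hypothetical `supersedes` dictionary mapping a product ID to the IDs it directly replaces, and it assumes that a product counts as already covered when the user owns it or owns everything it replaces; both are illustrative choices, not something the API dictates.

```python
def expand_superseded(product_id, supersedes):
    """Return every product that product_id replaces, directly or indirectly."""
    seen = set()
    stack = list(supersedes.get(product_id, ()))
    while stack:
        current = stack.pop()
        if current in seen or current == product_id:
            continue  # already handled, or a cycle back to the start
        seen.add(current)
        stack.extend(supersedes.get(current, ()))
    return seen


def is_covered(product_id, owned, supersedes, _visiting=None):
    """True if the user owns the product or owns everything it replaces."""
    if product_id in owned:
        return True
    visiting = _visiting if _visiting is not None else set()
    if product_id in visiting:
        return False  # guard against accidental cycles in the data
    parts = supersedes.get(product_id, ())
    if not parts:
        return False
    visiting.add(product_id)
    try:
        return all(is_covered(p, owned, supersedes, visiting) for p in parts)
    finally:
        visiting.discard(product_id)


# Example data: C bundles A and B, and D bundles C and E.
supersedes = {"C": ["A", "B"], "D": ["C", "E"]}
print(is_covered("C", {"A", "B"}, supersedes))  # user owns A and B
print(sorted(expand_superseded("D", supersedes)))
```

With this split, the UUID only has to carry the direct list (n <= 5), and the indirect relationships are recomputed on the device.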
Any pointers on where I can start on this?
Comments (2)
Thinking in terms of decimal or hex digits is a bad idea; it just wastes far too much space.
Your UUID alphabet has 65 (2*26+10+3) characters. So with n characters you can encode 65^n different values.
For example, the x.x.x format (where each x is a number from 0 to 99) really has just 100^3 different values, so it needs log65(100^3) ≈ 3.31 characters, which rounds up to 4.
So for the first three metadata fields you need 1+4+4=9 characters (1 for a device type that is one of 16 values, 4 for each version), or, if you combine the three fields into one number, log65(16*100^3*100^3) ≈ 7.28 characters, which rounds up to 8.
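To make the arithmetic concrete, here is a small Python sketch that folds the three fields into one integer and writes it as 8 base-65 digits. The ordering of characters in `ALPHABET` and the names `pack` and `unpack` are arbitrary choices for illustration; the sketch also assumes the device type is a single value from 0 to 15, as in the count above, rather than a 16-bit mask.

```python
ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "_.-"
)
BASE = len(ALPHABET)  # 65 allowed characters
WIDTH = 8             # 65**8 is larger than 16 * 100**6, but 65**7 is not


def pack(device_type, min_os, min_app):
    """Fold device type and two x.y.z versions into 8 base-65 characters."""
    if not 0 <= device_type <= 15:
        raise ValueError("device type must be in 0..15")
    value = device_type
    for part in (*min_os, *min_app):
        if not 0 <= part <= 99:
            raise ValueError("version parts must be in 0..99")
        value = value * 100 + part  # mixed-radix accumulation
    chars = []
    for _ in range(WIDTH):
        value, digit = divmod(value, BASE)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def unpack(text):
    """Reverse pack(): return (device_type, min_os, min_app)."""
    if len(text) != WIDTH:
        raise ValueError("expected exactly 8 characters")
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET.index(ch)
    parts = []
    for _ in range(6):
        value, part = divmod(value, 100)
        parts.append(part)
    parts.reverse()
    return value, tuple(parts[:3]), tuple(parts[3:])
```

Compared with the 16 characters of hex plus decimal digits, the same information fits in 8.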
For the product supersede cascading problem, I would suggest splitting the UUID into two parts: the first part contains a short UUID and the second part contains the metadata. When you refer to a superseded product, use the short UUID.
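One possible way to lay this out, sketched in Python under assumptions that are not part of the suggestion itself: short IDs are exactly 3 letters or digits, the metadata block is an opaque 8 characters (the combined size computed above), and the superseded short IDs are simply appended. Because every field has a fixed width, no separator characters are needed.

```python
SHORT_LEN = 3   # assumed width of a short ID (62**3 possible products)
META_LEN = 8    # assumed width of the packed metadata block
MAX_LEN = 128   # limit imposed by the API


def build_id(short_id, metadata, superseded=()):
    """Concatenate short ID, metadata block, and superseded short IDs."""
    for sid in (short_id, *superseded):
        if len(sid) != SHORT_LEN or not sid.isalnum() or not sid.isascii():
            raise ValueError(f"bad short ID: {sid!r}")
    if len(metadata) != META_LEN:
        raise ValueError("metadata block must be 8 characters")
    full_id = short_id + metadata + "".join(superseded)
    if len(full_id) > MAX_LEN:
        raise ValueError("identifier exceeds 128 characters")
    return full_id


def parse_id(full_id):
    """Split an identifier into (short_id, metadata, superseded list)."""
    head = SHORT_LEN + META_LEN
    tail = full_id[head:]
    if len(full_id) < head or len(tail) % SHORT_LEN != 0:
        raise ValueError("identifier has an unexpected length")
    short_id = full_id[:SHORT_LEN]
    metadata = full_id[SHORT_LEN:head]
    superseded = [tail[i:i + SHORT_LEN] for i in range(0, len(tail), SHORT_LEN)]
    return short_id, metadata, superseded
```

Under these assumed widths, a product that supersedes 5 others needs 3 + 8 + 5*3 = 26 characters, well under the 128-character limit.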
Does this data need to be human-readable?
If not then maybe you should look at this as a problem of serializing a struct/object into a byte array with a max length of 128.
For example you could use the format (UID[int],Device[byte],ArrayLength[byte],ProductID01[int16],...), then take the resulting byte array and base64 it. Or if you are posting this data and it doesn't need to be URL safe, then just send it as a char array (basically base256). I don't know any of your limits, but you may adjust the datatypes based on the max ranges. For example, if you think the array length will never be larger than 16, then you could split a byte for DeviceType and ArrayLength.
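A minimal Python sketch of that serialization idea, using the field layout from the example format (a 32-bit UID, one byte each for device and array length, and 16-bit product IDs). The function names are hypothetical. One assumption worth flagging: standard base64 emits `+`, `/`, and `=`, which fall outside the allowed character set in the question, so this sketch uses the URL-safe variant (which uses `-` and `_`) and strips the `=` padding.

```python
import base64
import struct


def serialize(uid, device, product_ids):
    """Pack (UID[uint32], Device[byte], ArrayLength[byte], IDs[uint16]...)."""
    raw = struct.pack(
        f">IBB{len(product_ids)}H",  # big-endian, no alignment padding
        uid, device, len(product_ids), *product_ids,
    )
    # URL-safe base64 uses only A-Z, a-z, 0-9, '-' and '_'; drop '=' padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def deserialize(text):
    """Reverse serialize(): return (uid, device, product_ids)."""
    padded = text + "=" * (-len(text) % 4)  # restore the stripped padding
    raw = base64.urlsafe_b64decode(padded)
    uid, device, count = struct.unpack(">IBB", raw[:6])
    product_ids = struct.unpack(f">{count}H", raw[6:6 + 2 * count])
    return uid, device, list(product_ids)
```

With two superseded products this comes to 10 bytes, or 14 characters after encoding.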
Even better would be to use a serialization framework like Protocol Buffers (protobuf). But I don't know if it's been ported to iOS yet.