What are the options for generating user-friendly alphanumeric IDs (such as business IDs or SKUs)?

Posted 2024-07-07 04:45:01

Here are the requirements:

IDs must be alphanumeric and 8-10 characters long so that they are user friendly. They will be stored as unique keys in the database. I am using GUIDs as primary keys, so an option that uses GUIDs to generate these unique IDs would be preferable.

I am thinking along the lines of a base-n converter that takes a GUID and converts it to an 8-character unique string.

A short, lightweight algorithm is preferred, as it would be called quite often.

Comments (5)

安静被遗忘 2024-07-14 04:45:01
8 characters - perfectly random - 36^8 = 2,821,109,907,456 combinations
10 characters - perfectly random - 36^10 = 3,656,158,440,062,976 combinations
GUIDs - statistically unique* - 2^128 = 340,000,000,000,000,000,000,000,000,000,000,000,000 combinations

* Is a GUID unique 100% of the time? [stackoverflow]

The problem with your GUID -> character conversion: while a GUID is statistically unique, taking any subset of it decreases randomness and increases the chance of collisions. You certainly don't want to create non-unique SKUs.
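
To put rough numbers on that collision risk, the birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) estimates the chance of at least one duplicate among n IDs drawn uniformly from N possibilities. The sketch below assumes ideal uniform randomness (which a truncated GUID does not guarantee), and the function name is only illustrative.

```python
import math

def collision_probability(n_ids: int, space: int) -> float:
    """Birthday approximation of P(at least one duplicate) among n_ids uniform draws."""
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2 * space))

# Compare the 8-character and 10-character base-36 spaces from the table above.
for n in (10_000, 1_000_000, 10_000_000):
    print(n, collision_probability(n, 36 ** 8), collision_probability(n, 36 ** 10))
```

Under these assumptions, one million random 8-character IDs already carry roughly a 16% chance of containing a duplicate, which is why a unique constraint (or a non-random scheme) is needed rather than trust in randomness alone.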


Solution 1:

Create SKU using data relevant to the object and business rules.

That is, there is likely to be a small combination of attributes that makes an object unique (a natural key). Combine the elements of the natural key, then encode and compress them to create a SKU. Often all you need is a date-time field (e.g. CreationDate) and a few other properties to achieve this. You're likely to have a lot of holes in SKU creation, but the SKUs are more relevant to your users.

Hypothetically:

Wholesaler, product name, product version, sku
Amazon,     IPod Nano,    2.2,             AMIPDNN22
BestBuy,    Vaio,         3.2,             BEVAIO32

Solution 2:

Use a method that reserves a range of numbers, then releases them sequentially and never returns the same number twice. You can still end up with holes in the range. You likely won't generate enough SKUs for the holes to matter, but make sure your requirements allow for them.

One implementation is a key table in the database that holds a counter. The counter is incremented in a transaction. An important point is that rather than incrementing by 1, the method in software grabs a block. Pseudo-C# code follows.

-- what the key table may look like
CREATE TABLE Keys(Name VARCHAR(10) primary key, NextID INT)
INSERT INTO Keys Values('sku',1)

// some elements of the class
public static class SkuKeyGenerator
{
    private static readonly object syncObject = new object();
    private static int nextID = 0;
    private static int maxID = 0;   // exclusive upper bound of the reserved block
    private const int amountToReserve = 100;

    public static int NextKey()
    {
        lock( syncObject )
        {
            // Both fields start at 0, so the first call always reserves a block.
            if( nextID == maxID )
            {
                ReserveIds();
            }
            return nextID++;
        }
    }

    private static void ReserveIds()
    {
        // pseudocode - in reality I'd do this with a stored procedure inside a transaction.
        // We reserve some predefined number of keys from Keys where Name = 'sku'.
        // The select and update need to run in the same transaction because this isn't the only
        // method that can use this table.
        using( Transaction trans = new Transaction() ) // pseudocode.
        {
             int currentTableValue = db.Execute(trans, "SELECT NextID FROM Keys WHERE Name = 'sku'");
             int newMaxID = currentTableValue + amountToReserve;
             db.Execute(trans, "UPDATE Keys SET NextID = @1 WHERE Name = 'sku'", newMaxID);

             trans.Commit();

             nextID = currentTableValue;
             maxID = newMaxID;
        }
    }
}

The idea here is that you reserve enough keys so that your code doesn't go to the database often, as getting the key range is an expensive operation. You need a good idea of how many keys to reserve in order to balance key loss (on application restart) against exhausting keys too quickly and going back to the database. This simple implementation has no way to reuse lost keys.

Because this implementation relies on a database and transactions, you can have applications running concurrently, all generating unique keys without needing to go to the database often.

Note that the above is loosely based on the Key Table pattern, page 222 of Patterns of Enterprise Application Architecture (Fowler). The method is usually used to generate primary keys without the need for a database identity column, but you can see how it can be adapted for your purpose.
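
For a runnable illustration of the block reservation described in this answer, here is a minimal Python sketch using an in-memory SQLite database. The class and parameter names are made up for the example, SQLite stands in for whatever database you actually use, and `BEGIN IMMEDIATE` is just one way to make the read and the update atomic.

```python
import sqlite3
import threading

class SkuKeyGenerator:
    """Hands out integers from blocks reserved in a key table (illustrative names)."""

    def __init__(self, conn: sqlite3.Connection, amount_to_reserve: int = 100):
        self._conn = conn
        self._amount = amount_to_reserve
        self._lock = threading.Lock()
        self._next_id = 0
        self._max_id = 0  # exclusive upper bound of the current block

    def next_key(self) -> int:
        with self._lock:
            if self._next_id == self._max_id:
                self._reserve_ids()
            key = self._next_id
            self._next_id += 1
            return key

    def _reserve_ids(self) -> None:
        # BEGIN IMMEDIATE takes SQLite's write lock, so the read and the update
        # cannot interleave with another connection reserving a block.
        self._conn.execute("BEGIN IMMEDIATE")
        try:
            (current,) = self._conn.execute(
                "SELECT NextID FROM Keys WHERE Name = 'sku'").fetchone()
            self._conn.execute(
                "UPDATE Keys SET NextID = ? WHERE Name = 'sku'",
                (current + self._amount,))
            self._conn.execute("COMMIT")
        except Exception:
            self._conn.execute("ROLLBACK")
            raise
        self._next_id, self._max_id = current, current + self._amount


# isolation_level=None puts sqlite3 in autocommit mode so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Keys(Name VARCHAR(10) PRIMARY KEY, NextID INT)")
conn.execute("INSERT INTO Keys VALUES ('sku', 1)")

# Two generators stand in for two application instances sharing the table.
gen_a = SkuKeyGenerator(conn, amount_to_reserve=3)
gen_b = SkuKeyGenerator(conn, amount_to_reserve=3)
keys = [gen_a.next_key(), gen_b.next_key(),
        gen_a.next_key(), gen_a.next_key(), gen_a.next_key()]
print(keys)
```

Each generator only touches the database once per block, and the two never hand out the same number. Any keys left in a block when a generator is discarded are simply lost, as noted above.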

许仙没带伞 2024-07-14 04:45:01

You might consider base 36, since it covers both letters and digits.
Consider removing I (eye) and O (oh) from your set so they don't get mixed up with 1 (one) and 0 (zero). Some people might complain about 2 and Z as well.
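
As a sketch of that idea, the following encodes an integer (for example a counter value) into a fixed-length string over base 36 minus I and O, which leaves 34 symbols. The alphabet, the function names, and the choice to map a typed I or O back to 1 or 0 when decoding are all assumptions made for this example.

```python
# Digits and uppercase letters without I and O (34 symbols); an assumed alphabet.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ"

def encode(number: int, length: int = 8) -> str:
    """Encode a non-negative integer as a zero-padded string of the given length."""
    base = len(ALPHABET)
    if not 0 <= number < base ** length:
        raise ValueError("number does not fit in the requested length")
    chars = []
    for _ in range(length):
        number, digit = divmod(number, base)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))

def decode(text: str) -> int:
    """Decode a string produced by encode(), forgiving the I/1 and O/0 mix-ups."""
    number = 0
    for ch in text.upper().replace("I", "1").replace("O", "0"):
        number = number * len(ALPHABET) + ALPHABET.index(ch)
    return number

print(encode(0), encode(34), encode(123456789))
```

Dropping two symbols shrinks the 8-character space from 36^8 to 34^8, so check that the smaller range still covers the number of IDs you expect to issue.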

剩余の解释 2024-07-14 04:45:01

If you're looking for "user friendly", you might want to try using entire words rather than simply making the ID short and alphanumeric. Something like:

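import hashlib
import sys

# The word list path is system-dependent; entries containing apostrophes are skipped.
with open('/usr/share/dict/canadian-english') as f:
    words = [s.strip().lower() for s in f if "'" not in s]
mod = len(words)

def main(script, guid):
    # The built-in hash() of a str varies between Python 3 processes,
    # so derive a stable integer from the GUID text instead.
    n = int(hashlib.sha1(guid.encode('utf-8')).hexdigest(), 16)

    # Three-argument pow() does modular exponentiation without a huge intermediate value.
    print("+".join(words[pow(n, e, mod)] for e in (53, 61, 71)))

if __name__ == "__main__":
    main(*sys.argv)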

Which produces output like:

oranjestad+compressing+wellspring
padlock+discommoded+blazons
pt+olenek+renews

Which is amusing. Otherwise, simply taking the first 8-10 characters of the guid or sha1/md5 hash of the guid is probably your best bet.
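
The truncation suggested in the last sentence can be sketched as follows. The helper name and the sample GUID are made up for the example, and nothing here guarantees uniqueness: 8 hexadecimal characters allow only 16^8 = 4,294,967,296 values, so the database's unique constraint still has to catch collisions.

```python
import hashlib
import uuid

def short_id(guid: uuid.UUID, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-1 hash of the GUID."""
    digest = hashlib.sha1(guid.bytes).hexdigest()
    return digest[:length].upper()

guid = uuid.UUID("12345678-1234-5678-1234-567812345678")  # sample value
print(str(guid)[:8])    # plain prefix of the GUID itself
print(short_id(guid))   # prefix of its SHA-1 hash
```

Hashing before truncating spreads out any structure in the GUID (version bits, timestamps in some GUID variants), whereas the plain prefix keeps whatever structure the GUID generator put there.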

彡翼 2024-07-14 04:45:01

The simplest thing that could possibly work is a counter that is incremented every time a value is required. Eight (left-zero-padded) digits give you 100 million possible values, 00000000 through 99999999 (although you might insert spaces or hyphens for human readability, as in 000-000-00).

If you need more than 100 million values, you could either increase the length or use letters in alternate positions. Using A0A0A0A0 through Z9Z9Z9Z9 gives you over four and a half billion possible values (4,569,760,000). It is a trivial bit of code to take a long integer and produce such an encoding (mod 10 for the rightmost digit, divide by 10, then mod 26 for the rightmost letter, and so on). If you have the memory to burn, the fastest way is to split the counter into base-260 digits and use each one as an index into an array of two-character strings ("A0", "A1", "A2", and so on through "A9", "B0", "B1", etc. through "Z9").
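
The lookup-table variant described above might look like this in Python. The names `PAIRS` and `encode_counter` are invented for the sketch, and it assumes the counter itself comes from somewhere that guarantees it never repeats.

```python
import string

# Lookup table of the 260 letter+digit pairs: "A0", "A1", ..., "A9", "B0", ..., "Z9".
PAIRS = [letter + digit
         for letter in string.ascii_uppercase
         for digit in string.digits]

def encode_counter(counter: int, pairs: int = 4) -> str:
    """Encode a counter as alternating letter/digit pairs, e.g. A0A0A0A0."""
    if not 0 <= counter < 260 ** pairs:
        raise ValueError("counter does not fit in the requested number of pairs")
    out = []
    for _ in range(pairs):
        # Peel off one base-260 digit at a time, least significant first.
        counter, index = divmod(counter, 260)
        out.append(PAIRS[index])
    return "".join(reversed(out))

print(encode_counter(0), encode_counter(10), encode_counter(260 ** 4 - 1))
```

Because letters and digits strictly alternate, two letters never sit next to each other, so the scheme cannot spell words.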

The problem with base 36 (mentioned in another reply) is that you not only have to worry about reader confusion of similar characters (one vs. I, zero vs. O, two vs. Z, five vs. S) but also about combinations of adjacent letters that might be perceived by readers as spelling distasteful or obscene words or abbreviations.

清风无影 2024-07-14 04:45:01

You may want to try a CRC32 hashing algorithm. CRC32 produces a 32-bit checksum, which is conventionally written as an 8-character hexadecimal string.

http://en.wikipedia.org/wiki/Cyclic_redundancy_check

http://textop.us/Hashing/CRC
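
A minimal sketch of the CRC32 suggestion, using Python's standard `zlib.crc32`. The helper name is made up, and the input could equally be the GUID's string form. Note that a 32-bit checksum has only about 4.3 billion possible values and is not designed to avoid collisions, so a unique constraint is still needed.

```python
import zlib

def crc32_id(text: str) -> str:
    """Return the CRC-32 of the UTF-8 encoded text as 8 uppercase hex characters."""
    checksum = zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF
    return format(checksum, "08X")

print(crc32_id("123456789"))  # CBF43926, the standard CRC-32 check value
```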
