Python: representing a UUID with a special character set

Posted 2024-08-21 09:49:01

When creating a UUID in Python, like so:

>>> uuid.uuid1()
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

How could one map that UUID onto a string made up of the capital letters A-Z minus the characters D, F, I, O, Q, and U, plus the numerical digits, plus the characters "+" and "="? That is, map it from an integer or string onto this set of 32 (relatively OCR-friendly) characters:

[ABCEGHJKLMNPRSTVWXYZ1234567890+=]

I'll call this the OCRf set (for OCR friendly).

I'd like to have an isomorphic function:

def uuid_to_ocr_friendly_chars(uid):
    """takes uid, an integer, and transposes it into a string made
       of the OCRf set
    """
    ...

My first thought is to go through the process of changing the uuid to base 32. e.g.

OCRf = "ABCEGHJKLMNPRSTVWXYZ1234567890+="

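def uuid_to_ocr_friendly_chars(uid):
    # uid is the UUID as an integer, e.g. uuid.uuid1().int
    ocfstr = ''
    while uid > 0:  # "> 1" would drop a most significant digit equal to 1
        ocfstr += OCRf[uid % 32]
        uid //= 32  # floor division keeps uid an int on Python 2 and 3
    return ocfstr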

However, I'd like to know if this method is the best and fastest way to go about this conversion - or if there's a simpler and faster method (e.g. a builtin, a smarter algorithm, or just a better method).

I'm grateful for your input. Thank you.

将军与妓 2024-08-28 09:49:01

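How important is it to you to "squeeze" the representation by 18.75%, i.e. from 32 characters down to 26? If saving that small percentage of bytes isn't absolutely crucial, something like uid.hex.upper().replace('D', 'Z').replace('F', 'Y') will do what you ask. Uppercase hex uses only 0-9 and A-F, and of those only D and F fall outside your alphabet, so each needs a substitute. This doesn't use the whole alphabet you make available, but the only cost is missing that 18.75% "squeeze".

If squeezing down every last byte is crucial, I'd work on substrings of 20 bits each: that's 5 hex characters, or 4 characters in your funky alphabet. There are 6 of those, plus 8 bits left over, for which you can use the hex.upper().replace approach above, since there's nothing to gain from doing anything fancier. You can easily get the substrings by slicing .hex and turn each one into an int with int(theslice, 16). Then you can basically apply the same algorithm you're using above, but the arithmetic is all done on much smaller numbers, so the speed gain should be material. Also, don't build a string by looping on +=. Make a list of all the "digits" and ''.join them at the end; that's a performance improvement too.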
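The 20-bit slicing described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the function name encode_chunked and the table HEX_TO_OCRF are made up for the example, uid is taken to be a uuid.UUID object, and Z and Y are arbitrary stand-ins for the hex digits D and F in the two leftover characters.

```python
import uuid

OCRF = "ABCEGHJKLMNPRSTVWXYZ1234567890+="
# Hex digits D and F are not in the OCRf set; Z and Y are arbitrary stand-ins.
HEX_TO_OCRF = str.maketrans("DF", "ZY")


def encode_chunked(uid):
    """Encode a uuid.UUID as 26 OCRf characters, 20 bits at a time."""
    hex_digits = uid.hex  # 32 lowercase hex characters
    out = []
    for start in range(0, 30, 5):  # six slices of 5 hex chars (20 bits each)
        value = int(hex_digits[start:start + 5], 16)
        # Four base-32 digits per slice, most significant first.
        out.extend(OCRF[(value >> shift) & 31] for shift in (15, 10, 5, 0))
    # The remaining 8 bits stay as two hex characters mapped into the OCRf set.
    out.append(hex_digits[30:].upper().translate(HEX_TO_OCRF))
    return "".join(out)


print(encode_chunked(uuid.UUID("a8098c1a-f86e-11da-bd1a-00112444be1e")))
```

The pieces are collected in a list and joined once at the end, as suggested above. Whether this is actually faster than arithmetic on the full 128-bit integer would need to be measured.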

爱本泡沫多脆弱 2024-08-28 09:49:01

>>> OCRf = 'ABCEGHJKLMNPRSTVWXYZ1234567890+='
>>> uuid = 'a8098c1a-f86e-11da-bd1a-00112444be1e'
>>> binstr = bin(int(uuid.replace("-",""),16))[2:].zfill(130)
>>> ocfstr = "".join(OCRf[int(binstr[i:i+5],2)] for i in range(0,130,5))
>>> ocfstr
'HLBJJB2+ETCKSP7JWACGYGMVW+'

To convert back again (the %032x width keeps any leading zeros, which a bare %x would drop):

>>> "%032x"%(int("".join(bin(OCRf.index(i))[2:].zfill(5) for i in ocfstr),2))
'a8098c1af86e11dabd1a00112444be1e'
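For reuse, the same fixed-width mapping could be wrapped in a pair of functions that use integer arithmetic instead of binary strings. The following is a minimal sketch, not a tuned implementation; the names uuid_to_ocrf, ocrf_to_uuid and WIDTH are hypothetical, and the input is assumed to be a uuid.UUID object.

```python
import uuid

OCRF = "ABCEGHJKLMNPRSTVWXYZ1234567890+="
WIDTH = 26  # 128 bits at 5 bits per character, rounded up


def uuid_to_ocrf(uid):
    """Return a fixed-width, 26-character OCRf string for a uuid.UUID."""
    n = uid.int
    digits = []
    for _ in range(WIDTH):
        n, rem = divmod(n, 32)
        digits.append(OCRF[rem])  # least significant digit first
    return "".join(reversed(digits))


def ocrf_to_uuid(text):
    """Inverse of uuid_to_ocrf; raises ValueError on characters outside OCRf."""
    n = 0
    for ch in text:
        n = n * 32 + OCRF.index(ch)
    return uuid.UUID(int=n)


original = uuid.UUID("a8098c1a-f86e-11da-bd1a-00112444be1e")
encoded = uuid_to_ocrf(original)
print(encoded, ocrf_to_uuid(encoded) == original)
```

Because every UUID is padded to the same 26 characters, leading zeros survive the round trip. Strings whose value exceeds 128 bits are rejected by the uuid.UUID constructor.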

病女 2024-08-28 09:49:01

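import base64
import uuid

# Map the standard base32 alphabet onto the OCRf alphabet (Python 3 spelling).
transtbl = str.maketrans(
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567',
  'ABCEGHJKLMNPRSTVWXYZ1234567890+='
)

uid = uuid.uuid1()

# Strip the base32 '=' padding before translating, since '=' is also an OCRf symbol.
print(base64.b32encode(uid.bytes).decode('ascii').rstrip('=').translate(transtbl))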

Yes, this method does make me a bit ill, thanks for asking.
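Going back the other way with this approach means undoing the translation and restoring the padding that was stripped. A rough sketch, assuming Python 3 and using made-up helper names (uuid_to_ocrf_b32, ocrf_to_uuid_b32):

```python
import base64
import uuid

B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
OCRF = "ABCEGHJKLMNPRSTVWXYZ1234567890+="
TO_OCRF = str.maketrans(B32, OCRF)
TO_B32 = str.maketrans(OCRF, B32)


def uuid_to_ocrf_b32(uid):
    """Base32-encode the 16 UUID bytes, drop padding, switch alphabets."""
    b32 = base64.b32encode(uid.bytes).decode("ascii").rstrip("=")
    return b32.translate(TO_OCRF)


def ocrf_to_uuid_b32(text):
    """Switch back to the base32 alphabet, re-pad, and decode."""
    b32 = text.translate(TO_B32)
    b32 += "=" * (-len(b32) % 8)  # b32decode expects a multiple of 8 chars
    return uuid.UUID(bytes=base64.b32decode(b32))


original = uuid.UUID("a8098c1a-f86e-11da-bd1a-00112444be1e")
encoded = uuid_to_ocrf_b32(original)
print(encoded, ocrf_to_uuid_b32(encoded) == original)
```

Note that the order matters in the decoder: an "=" in the OCRf string is data and is translated first, and only afterwards is real base32 padding appended.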
