Creating URLs like those of Instagram posts or YouTube videos
I'm building a social network and want to generate my own random URLs. A question that has been on my mind for a long time is how to create URLs in Django like those of Instagram posts or YouTube videos, for example:
https://www.instagram.com/p/CeqcZdeoNaP/
or
https://www.youtube.com/watch?v=MhRaaU9-Jg4
My problem is that, on the one hand, these URLs have to be unique. On the other hand, setting unique=True does not seem to make sense at large scale, once users have uploaded more than 100,000 posts, because database performance decreases.
Another option is to use UUIDs, which largely solve the uniqueness problem, but the strings a UUID produces are very long. If I shorten them by reducing the number of characters, collisions become possible and several identical strings may be produced.
I would like to know whether there is a way to generate URLs that are both short and unique while maintaining database performance.
Thank you for your time.
Comments (1)
You might choose to design around ULIDs. https://github.com/ulid/spec
A ULID is still 128 bits, the same size as a UUID.
The engineering tradeoff they made was 48 bits of predictable low-entropy clock
concatenated with an 80-bit nonce.
Starting with a timestamp makes it play very nicely with PostgreSQL B-trees, since newer IDs sort after older ones.
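To make the layout concrete, here is a minimal sketch of a ULID-style 128-bit value as a plain integer. The function name `ulid_like_int` and the `now_ms` parameter are made up for illustration; this is not the reference implementation from the spec, only the same bit layout.

```python
import secrets
import time

TIMESTAMP_BITS = 48   # milliseconds since the Unix epoch, as in the ULID spec
NONCE_BITS = 80       # random bits, as in the ULID spec


def ulid_like_int(now_ms=None):
    """Return a 128-bit integer: 48-bit timestamp in the high bits, 80-bit nonce below."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    timestamp = now_ms & ((1 << TIMESTAMP_BITS) - 1)
    nonce = secrets.randbits(NONCE_BITS)
    return (timestamp << NONCE_BITS) | nonce


earlier = ulid_like_int(now_ms=1_700_000_000_000)
later = ulid_like_int(now_ms=1_700_000_000_001)
assert earlier < later            # a later tick always sorts after an earlier one
assert later.bit_length() <= 128
```

Because the timestamp occupies the high bits, values from a later tick compare greater than values from an earlier one, so inserts tend to land at one end of the index rather than being scattered across it the way fully random keys are.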
They serialize 5 bits per character instead of the 4 bits offered by hex.
You could choose to go for 6 bits per character, with a base64-style alphabet, for the sake of brevity.
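As a rough illustration of what bits per character buys you, the sketch below encodes the same 128-bit value three ways with Python's standard library. Note that `base64.b32encode` uses the RFC 4648 alphabet, whereas the ULID spec uses Crockford's Base32; the output length is the same, so it serves for a length comparison only.

```python
import base64
import secrets

value = secrets.token_bytes(16)  # any 128-bit value

as_hex = value.hex()                                               # 4 bits per character
as_base32 = base64.b32encode(value).decode().rstrip("=")           # 5 bits per character
as_base64 = base64.urlsafe_b64encode(value).decode().rstrip("=")   # 6 bits per character

print(len(as_hex), len(as_base32), len(as_base64))  # 32 26 22

# A 64-bit value at 6 bits per character needs 11 characters.
short = base64.urlsafe_b64encode(secrets.token_bytes(8)).decode().rstrip("=")
print(len(short))  # 11
```

The 11-character YouTube ID in the question has the length you would get from 64 bits at 6 bits per character, although that is an inference from its length, not something the answer states.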
Similarly, you could also choose to adjust the clock tick granularity,
and reduce its range.
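One way these knobs might be combined is sketched below: a 32-bit timestamp in whole seconds counted from a custom epoch, followed by a 32-bit nonce, encoded at 6 bits per character. Every constant here (`CUSTOM_EPOCH`, the bit widths) and the name `short_id` are assumptions chosen for illustration, not values taken from the ULID spec.

```python
import base64
import secrets
import time

CUSTOM_EPOCH = 1_700_000_000   # hypothetical service start, in Unix seconds
TIMESTAMP_BITS = 32            # one-second ticks: roughly 136 years of range
NONCE_BITS = 32                # assumed width; size it from your expected write rate


def short_id(now=None):
    """Return an 11-character URL-safe ID: 32-bit timestamp followed by a 32-bit nonce."""
    if now is None:
        now = int(time.time())
    ticks = (now - CUSTOM_EPOCH) & ((1 << TIMESTAMP_BITS) - 1)
    raw = (ticks << NONCE_BITS) | secrets.randbits(NONCE_BITS)
    data = raw.to_bytes((TIMESTAMP_BITS + NONCE_BITS) // 8, "big")
    return base64.urlsafe_b64encode(data).decode().rstrip("=")


print(short_id())  # an 11-character string
```

The base64 text does not sort in chronological order, because the alphabet's order differs from ASCII order. To keep the index-friendly ordering you would store the raw integer or bytes as the indexed column and encode only when building the URL. A nonce this small can collide within a tick, so the sketch assumes a uniqueness check with a retry remains in place.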
Keeping the Birthday Paradox in mind,
you might choose to use a smaller nonce as well.
The current design offers good collision resistance
up to around 2^40 identifiers per clock tick,
which might be overkill for your needs.
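To get a feel for how small the nonce can be, the usual birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be evaluated directly, where n is the number of IDs generated in one tick and N is the number of possible nonce values. The function name `collision_probability` is made up for this sketch, and the formula is an approximation that assumes uniformly random nonces.

```python
import math


def collision_probability(ids_per_tick, nonce_bits):
    """Approximate chance that at least two IDs in the same tick share a nonce."""
    space = 2 ** nonce_bits
    exponent = ids_per_tick * (ids_per_tick - 1) / (2 * space)
    return -math.expm1(-exponent)   # 1 - exp(-exponent), accurate for tiny values


# Compare a few nonce widths at 1,000 IDs per tick.
for bits in (32, 48, 80):
    print(bits, collision_probability(1_000, bits))
```

With an 80-bit nonce the estimate reaches about 0.39 at 2^40 IDs in a single tick, which is where the "around 2^40" figure comes from. With a 32-bit nonce and 1,000 IDs per tick it is on the order of 1 in 10,000 per tick, which may or may not be acceptable depending on how many ticks you run and whether you retry on conflict.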