How to programmatically generate Heroku-like subdomain names?
We've all seen the interesting subdomains that you get automatically assigned when you deploy an app to Heroku with a bare "heroku create".
Some examples: blazing-mist-4652, electric-night-4641, morning-frost-5543, radiant-river-7322, and so on.
It seems they all follow an adjective-noun-4digitnumber pattern (for the most part). Did they simply type out a dictionary of some adjectives and nouns, then choose combinations from them at random when you push an app? Is there a Ruby gem that accomplishes this, perhaps one that provides a dictionary which can be searched by part of speech, or is this something to be done manually?
Comments (4)
Engineer at the Heroku API team here: we went with the simplest approach to generate app names, which is basically what you suggested: keep arrays of adjectives and nouns in memory, pick an element from each at random and combine it with a random number from 1000 to 9999.
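A minimal Ruby sketch of that approach, for illustration only. The word lists are short placeholders (the real lists of 61 adjectives and 74 nouns are not given here), and `friendly_name` is a hypothetical helper name. It draws from SecureRandom rather than plain rand because, as described further down, rand produced far more collisions than expected.

```ruby
require "securerandom"

# Placeholder word lists for illustration; not Heroku's actual lists.
ADJECTIVES = %w[blazing electric morning radiant].freeze
NOUNS      = %w[mist night frost river].freeze

# Hypothetical helper: one random adjective, one random noun,
# and a random number between 1000 and 9999 (inclusive).
def friendly_name
  adjective = ADJECTIVES[SecureRandom.random_number(ADJECTIVES.size)]
  noun      = NOUNS[SecureRandom.random_number(NOUNS.size)]
  number    = 1000 + SecureRandom.random_number(9000)
  "#{adjective}-#{noun}-#{number}"
end

puts friendly_name
```

Each call returns a name shaped like the examples in the question, such as radiant-river-7322.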
Not the most thrilling code I've written, but it's interesting to see what we had to do in order to scale this:
At first we were picking a name, trying to INSERT and then rescuing the uniqueness constraint error to pick a different name. This worked fine while we had a large pool of names (and a not-so-large set of apps using them), but at a certain scale we started to notice a lot of collisions during name generation.
To make it more resilient we decided to pick several names and check which ones are still available with a single query. We obviously still need to check for errors and retry because of race conditions, but with so many apps in the table this is clearly more effective.
It also has the added benefit of providing an easy hook for us to get an alert if our name pool is low (e.g. if 1/3 of the random names are taken, send an alert).
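The batch check and the alert hook could be sketched roughly as follows. This is an assumption-laden illustration, not the actual Heroku code: an in-memory Set stands in for the apps table, and `random_name`, `taken_names`, `pick_available`, the batch size and the word lists are all hypothetical.

```ruby
require "securerandom"
require "set"

# Placeholder word lists for illustration; not Heroku's actual lists.
ADJECTIVES = %w[blazing electric morning radiant].freeze
NOUNS      = %w[mist night frost river].freeze

def random_name
  adjective = ADJECTIVES[SecureRandom.random_number(ADJECTIVES.size)]
  noun      = NOUNS[SecureRandom.random_number(NOUNS.size)]
  "#{adjective}-#{noun}-#{1000 + SecureRandom.random_number(9000)}"
end

# Stand-in for the single availability query. Against a real database this
# would be one lookup of all candidates at once (schema assumed, not shown).
def taken_names(candidates, existing)
  candidates.select { |name| existing.include?(name) }
end

# Returns [available_names, alert]. alert is true when at least
# alert_ratio of the candidates were already taken.
def pick_available(existing, batch_size: 6, alert_ratio: 1.0 / 3)
  candidates = Array.new(batch_size) { random_name }.uniq
  taken      = taken_names(candidates, existing)
  alert      = taken.size >= candidates.size * alert_ratio
  [candidates - taken, alert]
end

existing = Set.new(["blazing-mist-4652", "electric-night-4641"])
available, alert = pick_available(existing)
warn "name pool is running low" if alert
puts available.first
```

The caller would still INSERT the chosen name and rescue the uniqueness error, since another request can claim the same name between the check and the insert.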
The first time we had issues with collisions we just radically increased the size of our name pool by going from 2 digits to 4. With 61 adjectives and 74 nouns this took us from ~400k to ~40 million names (61 * 74 * 8999).
But by the time we were running 2 million apps we started receiving collision alerts again, and at a much higher rate than expected: about half of the names were colliding, which made no sense considering our pool size and the number of apps running.
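To see why a 50% collision rate looked wrong, the arithmetic can be worked through as below. It assumes the 2-digit suffix ran from 10 to 99 and that names are drawn uniformly, neither of which is stated above; the helper names are hypothetical. Note that an inclusive range of 1000 to 9999 contains 9000 values rather than 8999, though either figure gives a pool of roughly 40 million.

```ruby
ADJECTIVE_COUNT = 61
NOUN_COUNT      = 74

# Number of distinct names for a given inclusive range of numeric suffixes.
def pool_size(number_range)
  ADJECTIVE_COUNT * NOUN_COUNT * number_range.size
end

# With uniform draws, the chance that a fresh random name is already taken
# is simply the fraction of the pool that is in use.
def expected_collision_rate(apps, pool)
  apps.to_f / pool
end

two_digit_pool  = pool_size(10..99)      # assumed 2-digit range
four_digit_pool = pool_size(1000..9999)

puts two_digit_pool
puts four_digit_pool
puts format("%.1f%%", expected_collision_rate(2_000_000, four_digit_pool) * 100)
```

Under these assumptions the pools come to 406,260 and 40,626,000 names, and 2 million apps should collide on only about 4.9% of draws, an order of magnitude below the roughly 50% observed.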
The culprit, as you might have guessed, is that rand is a pretty bad pseudorandom number generator. Picking random elements and numbers with SecureRandom instead radically lowered the number of collisions, making it match what we expected in the first place.
With so much work going into scaling this approach, we had to ask whether there's a better way to generate names in the first place. Some of the ideas discussed were:
Make the name generation a function of the application id. This would be much faster and avoid the issue with collisions entirely, but on the downside it would waste a lot of names with deleted apps (and damn, we have A LOT of apps being created and deleted shortly after as part of different integration tests).
Another option to make name generation deterministic is to have the pool of available names in the database. This would make it easy to do things like only reusing a name 2 weeks after the app was deleted.
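The first of these ideas, making the name a function of the application id, might look something like the sketch below. This is one possible mapping and not something Heroku is known to use: it treats the id as a mixed-radix number whose digits select the adjective, noun and numeric suffix. The word lists are placeholders and `name_for_id` is a hypothetical helper.

```ruby
# Placeholder word lists for illustration; not Heroku's actual lists.
ADJECTIVES   = %w[blazing electric morning radiant].freeze
NOUNS        = %w[mist night frost river].freeze
NUMBER_COUNT = 9000 # suffixes 1000..9999
POOL_SIZE    = ADJECTIVES.size * NOUNS.size * NUMBER_COUNT

# Maps each id in 0...POOL_SIZE to exactly one name, so two different ids
# can never collide and no uniqueness check is needed.
def name_for_id(id)
  raise ArgumentError, "id outside the name pool" unless (0...POOL_SIZE).cover?(id)

  rest, number_offset         = id.divmod(NUMBER_COUNT)
  adjective_index, noun_index = rest.divmod(NOUNS.size)
  "#{ADJECTIVES[adjective_index]}-#{NOUNS[noun_index]}-#{1000 + number_offset}"
end

puts name_for_id(0)    # blazing-mist-1000
puts name_for_id(9000) # blazing-night-1000
```

Sequential ids give predictable, sequential names here, so a real version would probably scramble the id with a reversible permutation first. As noted above, names belonging to deleted apps are never reused under this scheme.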
Excited to see what we'll do next time the collision alert triggers!
Hope this helps anyone working on friendly name generation out there.
I've created a gem to do this: RandomUsername
Similar gems are Bazaar and Faker.
There are several possibilities.
You can generate a random string.
If you want to use real words, you need a dictionary. Then you can build the name from a combination of words and digits.
Another nice alternative is the one adopted by Ruote. Ruote relies on rufus-mnemo to generate a unique name for each process. rufus-mnemo provides methods for turning integers into easier-to-remember ‘words’ and vice versa.
You can generate a unique id for the record, then convert it into a word.
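To illustrate the idea of converting an id into a pronounceable word and back, here is a self-contained sketch. It does not use rufus-mnemo and is not its algorithm or syllable table; the syllable set and the helper names `int_to_word` and `word_to_int` are made up for this example.

```ruby
# Illustrative syllable alphabet (13 consonants x 5 vowels = 65 syllables).
CONSONANTS = %w[b d g h j k m n p r s t z].freeze
VOWELS     = %w[a e i o u].freeze
SYLLABLES  = CONSONANTS.product(VOWELS).map(&:join).freeze

# Writes the integer in base 65, using one two-letter syllable per digit.
def int_to_word(number)
  unless number.is_a?(Integer) && number >= 0
    raise ArgumentError, "expected a non-negative integer"
  end

  number.digits(SYLLABLES.size).reverse.map { |digit| SYLLABLES[digit] }.join
end

# Reverses int_to_word by reading the word two letters at a time.
def word_to_int(word)
  raise ArgumentError, "word must have an even length" unless word.length.even?

  word.scan(/../).reduce(0) do |total, syllable|
    index = SYLLABLES.index(syllable)
    raise ArgumentError, "unknown syllable #{syllable.inspect}" if index.nil?

    total * SYLLABLES.size + index
  end
end

word = int_to_word(125_704)
puts word
puts word_to_int(word)
```

Because the mapping is reversible, the word can serve as the public name while the integer id stays the primary key.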
Thanks to @m1foley
I am using your gem https://github.com/polleverywhere/random_username
Generate random name:
Explanation:
Some generated names: