Domize algorithm
I'm trying to figure out how http://domize.com builds its site to be so fast (I'm using C#).
For example, if I search for
[cns][vwl][cns][cns][vwl][cns]
it searches for 6-letter domain names whose letters follow this order:
consonant, vowel, consonant, consonant, vowel, consonant
resulting in:
babbab
babbac
babbad
babbaf
...
zuzzux
zuzzuy
zuzzuz
Since there are 21 consonants and 5 vowels (counting "y" as a consonant), there are 21 * 5 * 21 * 21 * 5 * 21 = 4,862,025 possible combinations.
Their results come back relatively fast, so they can't possibly be looping through all of those combinations in that short a time.
Now, I understand that they only show the first 100 at a time, but in order to get those first 100, they have to build at least SOME of the results.
My question is: how do they do it so fast?
My thought was to create an array of arrays. In this case, there would be 6 arrays (one per group), each holding the consonant or vowel possibilities for that position. But I don't know how to loop through those correctly to build the domains.
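One way to loop over such an array of arrays is to treat the positions like the wheels of an odometer and yield names lazily, so that only the names actually requested are built. The following is a minimal sketch under assumptions, not Domize's actual code: the class `PatternEnumerator`, its methods, and the token names `cns` and `vwl` as parsed here are hypothetical illustrations.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not taken from Domize.
public static class PatternEnumerator
{
    public const string Consonants = "bcdfghjklmnpqrstvwxyz"; // 21 letters, 'y' counted as a consonant
    public const string Vowels = "aeiou";                     // 5 letters

    // Turns "[cns][vwl]..." into one letter group per position.
    public static string[] ParseGroups(string pattern)
    {
        var groups = new List<string>();
        var tokens = pattern.Split(new[] { '[', ']' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var token in tokens)
        {
            if (token == "cns") groups.Add(Consonants);
            else if (token == "vwl") groups.Add(Vowels);
            else throw new ArgumentException("Unknown token: " + token);
        }
        return groups.ToArray();
    }

    // Lazily yields every combination in alphabetical order.
    // The rightmost position advances fastest, like an odometer.
    public static IEnumerable<string> Enumerate(string[] groups)
    {
        var indices = new int[groups.Length]; // current letter index per position
        var buffer = new char[groups.Length];
        while (true)
        {
            for (int i = 0; i < groups.Length; i++)
                buffer[i] = groups[i][indices[i]];
            yield return new string(buffer);

            // Advance the odometer, carrying to the left when a position wraps.
            int pos = groups.Length - 1;
            while (pos >= 0 && ++indices[pos] == groups[pos].Length)
            {
                indices[pos] = 0;
                pos--;
            }
            if (pos < 0) yield break; // every position wrapped, so all names were produced
        }
    }
}
```

Because `Enumerate` is an iterator, calling `Enumerate(groups).Take(100)` (with `System.Linq`) builds only the first 100 names, starting at `babbab`, rather than all 4,862,025.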
I'm sure there is a faster/better way, like maybe hash tables or matrices or something, but I don't know enough about those to figure it out on my own.
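For paging, a lookup structure may not be needed at all: the n-th name can be computed directly by reading n as a mixed-radix number whose digits pick one letter per position. The sketch below assumes the same letter groups as the example search; `PatternIndexer`, `Count`, and `NameAt` are hypothetical names, and nothing here is confirmed to be how Domize works.

```csharp
using System;

// Hypothetical helper for illustration; not taken from Domize.
public static class PatternIndexer
{
    // Total number of combinations: the product of the group sizes.
    public static long Count(string[] groups)
    {
        long total = 1;
        foreach (var group in groups) total *= group.Length;
        return total;
    }

    // Builds the index-th combination (0-based, alphabetical order) directly,
    // without generating any of the combinations before it.
    public static string NameAt(string[] groups, long index)
    {
        if (index < 0 || index >= Count(groups))
            throw new ArgumentOutOfRangeException(nameof(index));

        var buffer = new char[groups.Length];
        // Peel off one "digit" per position, starting from the rightmost,
        // which is the position that changes fastest.
        for (int pos = groups.Length - 1; pos >= 0; pos--)
        {
            int radix = groups[pos].Length;
            buffer[pos] = groups[pos][(int)(index % radix)];
            index /= radix;
        }
        return new string(buffer);
    }
}
```

With `var c = "bcdfghjklmnpqrstvwxyz"; var v = "aeiou"; var groups = new[] { c, v, c, c, v, c };`, `NameAt(groups, 0)` gives `babbab` and `NameAt(groups, Count(groups) - 1)` gives `zuzzuz`. A page of 100 results starting at any offset then costs 100 calls, regardless of how many combinations precede it.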
If anybody could provide C# code, I would really appreciate it!
Thanks!
Comments (1)
I would guess that it's backed by some variant of a trie.
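To illustrate what a trie-backed pattern search could look like, here is a rough sketch. It rests on an assumption the answer does not make: that the trie holds a set of known names (for example, names already registered) and that the search should return the stored names fitting the pattern. `NameTrie`, `Add`, and `Match` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; the answer does not say what the trie would store.
public sealed class NameTrie
{
    private sealed class Node
    {
        // Sorted so that matches come out in alphabetical order.
        public readonly SortedDictionary<char, Node> Children = new SortedDictionary<char, Node>();
        public bool IsEnd; // true if a stored name ends at this node
    }

    private readonly Node root = new Node();

    public void Add(string name)
    {
        var node = root;
        foreach (char c in name)
        {
            if (!node.Children.TryGetValue(c, out var next))
            {
                next = new Node();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsEnd = true;
    }

    // Lazily yields stored names whose i-th letter belongs to groups[i],
    // skipping every subtree whose letter is not allowed at that depth.
    public IEnumerable<string> Match(string[] groups)
    {
        return Walk(root, groups, 0, new char[groups.Length]);
    }

    private static IEnumerable<string> Walk(Node node, string[] groups, int depth, char[] buffer)
    {
        if (depth == groups.Length)
        {
            if (node.IsEnd) yield return new string(buffer);
            yield break;
        }
        foreach (var pair in node.Children)
        {
            if (groups[depth].IndexOf(pair.Key) < 0) continue; // letter not allowed here
            buffer[depth] = pair.Key;
            foreach (var name in Walk(pair.Value, groups, depth + 1, buffer))
                yield return name;
        }
    }
}
```

The walk only visits branches that can still match, so its cost depends on how many stored names share prefixes with the pattern rather than on the 4,862,025 theoretical combinations.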