How does the Google short URL work?
How can Google's short URLs cover so many URLs on the web with just four characters, even if the letters are case-sensitive?
Say fn(some url) -> four letters for the URL. How can they later use the same function to give five letters for a URL? How will they know from the URL whether it is a four-letter or five-letter short URL?
Comments (6)
26 letters * 2 (upper/lowercase) = 52 characters, and 52^4 (52 to the power of 4) = 7,311,616 URLs.
If they add digits, it would be 62^4 = 14,776,336 URLs.
So they have some time to go before adding a 5th letter/digit.
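The arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting argument; the function name `capacity` is made up for illustration and says nothing about how Google actually allocates codes.

```python
def capacity(alphabet_size: int, length: int) -> int:
    """Number of distinct codes of `length` characters over an alphabet."""
    return alphabet_size ** length


# Compare letters-only with letters plus digits, for 4 and 5 characters.
for size, label in [(52, "letters only"), (62, "letters + digits")]:
    for length in (4, 5):
        print(f"{label}, {length} chars: {capacity(size, length):,}")
```

Going from four to five characters multiplies the available codes by the alphabet size, so each extra character buys a lot of headroom.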
It works the same way as all the other shorteners: the characters are a unique ID for the URL that has been shortened. With 52 letters (upper and lower case) plus numbers and special characters, there are plenty of combinations to work with.
This is how:
I don't believe even Google has 7 million pages worth shortening.
Edit:
Apparently you can shorten URLs using the Google Toolbar:
Still, that is not "broad" consumer use. If they run out, they'll add more letters.
Response to updated question:
Say fn(some url) -> four letters for the URL. How can they later use the same function to give five letters for a URL?
Google is not simply hashing the URL and using the result directly (remember, hashes are one-way, so you couldn't get the original URL back out of them anyway; it must be stored in a database). They may start with a hash, then perform a lookup in a database to see if that key already exists. If it does not, it will be used as the key. If it already exists, they'll use some other method to perform the hash, or manipulate the hash in a way that makes it unique.
How will they know from the URL whether it is a four-letter or five-letter short URL?
If the end of the URL has 4 letters, then that's how they know...
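The hash-then-lookup idea in this answer can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Shortener` class, the use of SHA-256, the retry counter for collisions, and the dict standing in for the database. It is not a description of Google's actual system.

```python
import hashlib
from typing import Dict, Optional

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


class Shortener:
    """Hypothetical in-memory shortener; a dict stands in for the database."""

    def __init__(self, length: int = 4) -> None:
        self.length = length
        self.by_code: Dict[str, str] = {}

    def _candidate(self, url: str, attempt: int) -> str:
        # Hash the URL together with an attempt counter, then map the
        # digest onto `length` characters of the alphabet.
        digest = hashlib.sha256(f"{attempt}:{url}".encode("utf-8")).digest()
        n = int.from_bytes(digest, "big")
        chars = []
        for _ in range(self.length):
            n, r = divmod(n, len(ALPHABET))
            chars.append(ALPHABET[r])
        return "".join(chars)

    def shorten(self, url: str) -> str:
        # Note: this loops forever if every code of this length is taken;
        # a real system would grow the length before that point.
        attempt = 0
        while True:
            code = self._candidate(url, attempt)
            existing = self.by_code.get(code)
            if existing is None:
                self.by_code[code] = url  # key was free: use it
                return code
            if existing == url:
                return code  # this URL was already shortened
            attempt += 1  # taken by a different URL: re-hash and retry

    def resolve(self, code: str) -> Optional[str]:
        # The code is just a lookup key, so its length needs no special handling.
        return self.by_code.get(code)


s = Shortener(length=4)
four = s.shorten("https://example.com/a")
s.length = 5  # later on, start handing out five-character codes
five = s.shorten("https://example.com/b")
print(four, s.resolve(four))
print(five, s.resolve(five))
```

Because resolving is a plain key lookup, four-character and five-character codes can coexist in the same table.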
There are 26 letters in the English alphabet. Lowercase plus uppercase makes 52.
52 * 52 * 52 * 52 = 7,311,616. They are limited to this number. If they run out of 4-letter URLs, they can upgrade to 5 without any problem, can't they?
I don't think adding digits is a good idea, since 0 (zero) and O, 1 (one) and l (lowercase L), and I (uppercase i) and l (lowercase L) look very similar.
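To put a number on the look-alike concern, here is a small sketch that drops only the confusable characters named above from the 62-character set instead of dropping all digits. This is a different trade-off from the one this answer proposes, and the set `CONFUSABLE` is an assumption chosen for illustration.

```python
import string

# Characters named above as easy to mix up (assumed set, for illustration).
CONFUSABLE = set("0O1lI")

FULL = string.digits + string.ascii_lowercase + string.ascii_uppercase
SAFE = "".join(ch for ch in FULL if ch not in CONFUSABLE)

print(len(FULL), len(FULL) ** 4)  # all letters and digits
print(len(SAFE), len(SAFE) ** 4)  # with look-alikes removed
```

Removing five characters leaves 57, which still gives more four-character codes than the 52 letters alone.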
(26 + 26 + 10) * (26 + 26 + 10) * (26 + 26 + 10) * (26 + 26 + 10) = 14776336
That's 26 lowercase letters, 26 uppercase letters, and 10 digits, for 62 possible characters. Actually, I think it's likely a base-64 encoded representation of some other value, so the number is probably more like 64^4 = 16,777,216.
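The base-64 guess lines up neatly with byte sizes: three bytes are 24 bits, which encode to exactly four base-64 characters with no padding, and 2^24 = 16,777,216. A minimal sketch, assuming the ID is a plain 24-bit integer and the URL-safe base-64 alphabet is used (the helper names are hypothetical):

```python
import base64


def id_to_code(n: int) -> str:
    """Encode a 24-bit integer as four URL-safe base-64 characters."""
    if not 0 <= n < 2 ** 24:
        raise ValueError("id must fit in 24 bits")
    return base64.urlsafe_b64encode(n.to_bytes(3, "big")).decode("ascii")


def code_to_id(code: str) -> int:
    """Decode a four-character code back to its integer id."""
    return int.from_bytes(base64.urlsafe_b64decode(code), "big")


print(id_to_code(0))            # AAAA
print(id_to_code(1))            # AAAB
print(id_to_code(2 ** 24 - 1))  # ____
```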
I don't know how Google does it. But I imagine one way you could implement a short URL is by incrementing a value written with the characters 0-9a-zA-Z, essentially using a base-62 number system. So, the method that generates the value might look up the most recently used value, then increment it by one. For example: abcz + 1 = abcA. Or: ZZZZ + 1 = 10000 (or 00000, if codes are treated as fixed-width strings and the length simply grows once every four-character code is used).
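The counter idea in this answer can be sketched as a pair of base-62 conversion functions. The names `encode`, `decode`, and `next_code` are made up for illustration, and the digit order 0-9a-zA-Z follows the answer. Note that in strict positional notation the carry out of ZZZZ produces 10000.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode(n: int, min_width: int = 4) -> str:
    """Write a non-negative integer in base 62, left-padded with '0'."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n:
        n, r = divmod(n, BASE)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits)).rjust(min_width, ALPHABET[0])


def decode(code: str) -> int:
    """Read a base-62 string back into an integer."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


def next_code(code: str) -> str:
    """Return the code that follows `code`, growing by one character on overflow."""
    return encode(decode(code) + 1, min_width=len(code))


print(next_code("abcz"))  # abcA
print(next_code("ZZZZ"))  # 10000
```

With this scheme the code length grows on its own once the counter passes 62^4 - 1, and `decode` handles four-character and five-character codes the same way.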