How do URL shortening services work?

Posted on 2024-08-07 20:39:28

How do services like TinyURL or Metamark work?
Do they simply associate the tiny URL key with a [virtual?] web page that merely provides an "HTTP redirect" to the original URL? Or is there more "magic" to it?

[original wording]
I often use URL shortening services like TinyURL, Metamark, and others, but every time I do, I wonder how these services work. Do they create a new file that will redirect to another page or do they use subdomains?

Comments (4)

梦醒灬来后我 2024-08-14 20:39:28

No, they don't use files. When you click on a link like that, an HTTP request is sent to their server with the full URL, like http://bit.ly/duSk8wK (links to this question). They read the path part (here duSk8wK), which maps to a record in their database. In the database, they find a description (sometimes), your name (sometimes) and the real URL. Then they issue a redirect, which is an HTTP 302 response with the target URL in a header.

This direct redirect is important. If you were to use files, or first load HTML and then redirect, the browser would add the TinyUrl page to the history, which is not what you want. Also, the site that is redirected to will see the referrer (the site that you originally came from) as being the site the TinyUrl link is on (e.g., twitter.com, your own site, wherever the link is). This is just as important, so that site owners can see where people are coming from. This, too, would not work if a page that redirects were loaded first.

PS: There are more types of redirect. HTTP 301 means a permanent redirect. If that were used, the browser could cache it and stop requesting the bit.ly or TinyUrl site, but those sites want to count the hits. That's why HTTP 302, a temporary redirect, is used. The browser asks TinyUrl.com or bit.ly again each time, which makes it possible to count the hits for you (some tiny URL services offer this).
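
To make the redirect step concrete, here is a minimal sketch using Python's standard `http.server` module. It is not how any particular service is implemented: the `LINKS` dictionary stands in for the database, and the names `resolve`, `RedirectHandler` and `serve`, as well as the example.com target, are made up for illustration.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the service's database: short key -> original URL.
LINKS = {"duSk8wK": "https://example.com/some/very/long/path?with=query"}


def resolve(path, links=LINKS):
    """Map a request path such as '/duSk8wK' to (status code, headers)."""
    key = path.lstrip("/").split("?", 1)[0]
    target = links.get(key)
    if target is None:
        return HTTPStatus.NOT_FOUND, {}
    # 302 is a temporary redirect, so the browser asks again on the next click.
    return HTTPStatus.FOUND, {"Location": target}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = resolve(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()  # no HTML body: the redirect lives in the headers


def serve(port=8000):
    """Run the sketch locally; this is not called on import."""
    HTTPServer(("127.0.0.1", port), RedirectHandler).serve_forever()
```

Calling `serve()` and opening http://127.0.0.1:8000/duSk8wK in a browser should send you straight to the target. Returning `HTTPStatus.MOVED_PERMANENTLY` instead would give the 301 behaviour described in the PS.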

不离久伴 2024-08-14 20:39:28

Others have answered how the redirects work, but you should also know how they generate their tiny URLs. You'll often hear, mistakenly, that they create a hash of the URL in order to generate the unique code for the shortened URL. This is incorrect in most cases; they aren't using a hashing algorithm (where you could potentially have collisions).

Most of the popular URL shortening services simply take the ID of the URL in their database and convert it to either Base 36 [a-z0-9] (case insensitive) or Base 62 [a-zA-Z0-9] (case sensitive).
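
The ID-to-code conversion can be sketched in a few lines of Python. The function names and the ordering of the alphabets (digits first, then lowercase, then uppercase) are assumptions made for this example; a real service may order or restrict its characters differently.

```python
import string

BASE36 = string.digits + string.ascii_lowercase   # 0-9 then a-z
BASE62 = BASE36 + string.ascii_uppercase          # 0-9, a-z, then A-Z


def encode(number, alphabet=BASE62):
    """Convert a non-negative database ID into a short code."""
    if number < 0:
        raise ValueError("IDs must be non-negative")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode(code, alphabet=BASE62):
    """Convert a short code back into the database ID."""
    base = len(alphabet)
    number = 0
    for char in code:
        number = number * base + alphabet.index(char)  # ValueError on bad input
    return number
```

With the Base 36 alphabet, `encode(20103, BASE36)` gives `"fif"` and `decode("fif", BASE36)` gives `20103`. Because every ID maps to exactly one code, there are no collisions to handle.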

A simplified example of a TinyURL Database Table:

ID       URL                           VisitCount
 1       www.google.com                        26
 2       www.stackoverflow.com               2048
 3       www.reddit.com                        64
...
 20103   www.digg.com                         201
 20104   www.4chan.com                         20

Web frameworks that allow flexible routing make handling the incoming URLs really easy (Ruby, ASP.NET MVC, etc).

So, on your web server you might have a route action that looks like this (pseudocode):

Route: www.mytinyurl.com/{UrlID}
Route Action: RouteURL(UrlID);

This routes any incoming request that has text after your domain, www.mytinyurl.com, to the associated method, RouteURL, and passes that method the text that follows the forward slash in the URL.

So, let's say you requested: www.mytinyurl.com/fif

"fif" would then be passed to your method, RouteURL(String UrlID). RouteURL would convert "fif" from Base 36 to its base-10 equivalent, 20103, and a database request would be made to look up whatever URL is stored under the ID 20103 (in this case, www.digg.com). You would also increase the visit count for Digg by one before redirecting to that URL.

This is a really simplified example but you should be able to get the general idea.
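
As a rough Python rendering of the RouteURL idea, the sketch below uses an in-memory dictionary in place of the database table above. The names `Link`, `TABLE` and `route_url` are invented for this example, and a real handler would issue the HTTP redirect rather than just return the URL.

```python
from dataclasses import dataclass


@dataclass
class Link:
    url: str
    visit_count: int


# In-memory stand-in for the example table, keyed by numeric ID.
TABLE = {
    1: Link("www.google.com", 26),
    2: Link("www.stackoverflow.com", 2048),
    3: Link("www.reddit.com", 64),
    20103: Link("www.digg.com", 201),
    20104: Link("www.4chan.com", 20),
}


def route_url(url_id, table=TABLE):
    """Return the URL to redirect to, or None if the code is unknown."""
    try:
        row_id = int(url_id, 36)  # Base 36 text -> base-10 ID, either letter case
    except ValueError:
        return None
    link = table.get(row_id)
    if link is None:
        return None
    link.visit_count += 1  # count the hit before redirecting
    return link.url
```

`route_url("fif")` returns `"www.digg.com"` and bumps that row's visit count from 201 to 202.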

风蛊 2024-08-14 20:39:28

As an extension to @A Salcedo's answer:

Some URL shortening services (Tinyarro.ws) go to the extreme of using Unicode characters (UTF-8 encoded) in the shortened URL, which allows a larger number of sites before an additional symbol has to be added. Since most of Unicode is accepted (IRIs, RFC 3987, are handled by most browsers), that bumps the count from 62 sites per symbol to ~1,112,064.

To put that in perspective, one can encode 1.2366863e+12 sites with 2 symbols (1,112,064 * 1,112,064). In November 2009, shortened links on bit.ly were accessed 2.1 billion times (around that time, bit.ly and TinyURL were the most widely used URL-shortening services), which is ~600 times less than what fits in just 2 symbols. So, judged against the whole lifetime of URL shortening services so far, 2 symbols should last at least another 20 years before a third has to be added.
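
The arithmetic can be checked with a short Python sketch. The figure 1,112,064 equals the number of Unicode scalar values (all code points minus the surrogate range), so it is best read as an upper bound: many of those code points are unassigned or awkward to type. The helper names are made up for this example, and the 2.1 billion figure is the access count quoted above, used only as a rough yardstick.

```python
# All Unicode code points (0x110000) minus the 2,048 surrogates.
UNICODE_SCALARS = 0x110000 - 0x800


def capacity(alphabet_size, symbols):
    """Number of distinct codes of exactly `symbols` characters."""
    return alphabet_size ** symbols


def symbols_needed(alphabet_size, links):
    """Smallest code length whose capacity covers `links` entries."""
    length = 1
    while capacity(alphabet_size, length) < links:
        length += 1
    return length


ACCESSES = 2_100_000_000  # bit.ly accesses in November 2009, as quoted above

print(UNICODE_SCALARS)                        # 1112064
print(capacity(UNICODE_SCALARS, 2))           # 1236686340096
print(symbols_needed(62, ACCESSES))           # 6
print(symbols_needed(UNICODE_SCALARS, ACCESSES))  # 2
```

So a count that needs six Base 62 characters fits in two Unicode symbols, with a factor of roughly 590 to spare.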

独自唱情﹋歌 2024-08-14 20:39:28

In simple words, a URL shortener maps an arbitrarily long sequence of characters (the original, long, crappy URL) to a short and slick sequence of characters. This is nothing but hashing, which is most commonly used to create lookup tables, HashMaps, MD5 hashes for cryptographic purposes, etc.

To understand the URL-shortening process, I have created a demo project on GitHub and also a blog post. Do refer to it and let me know if it was helpful.
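
For illustration only, a hash-based shortener could look like the Python sketch below. This is not the code of the demo project mentioned in this answer: the `shorten` function, the use of SHA-256, the 7-character default and the dictionary used as storage are all assumptions. Because a truncated hash can collide, the sketch lengthens the code until it finds a free slot.

```python
import hashlib


def shorten(url, store, length=7):
    """Store `url` under a truncated-hash key and return that key.

    `store` is a dict standing in for the database (code -> URL).
    On a collision with a different URL, one more hex digit is used.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    for size in range(length, len(digest) + 1):
        code = digest[:size]
        existing = store.setdefault(code, url)  # claim the slot if it is free
        if existing == url:
            return code
    # Only reachable if two different URLs share a full SHA-256 digest.
    raise RuntimeError("no free code for this URL")
```

Shortening the same URL twice returns the same code, and the original URL is recovered with a plain `store[code]` lookup. The extra collision handling is the cost of this approach compared with converting a sequential ID.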

Blog Post : URL Shortening
