Why move your Javascript files to a different main domain that you also own?
I've noticed that just in the last year or so, many major websites have made the same change to the way their pages are structured. Each has moved their Javascript files from being hosted on the same domain as the page itself (or a subdomain of that), to being hosted on a differently named domain.
It's not simply parallelization
Now, there is a well known technique of spreading the components of your page across multiple domains to parallelize downloading. Yahoo recommends it as do many others. For instance, www.example.com is where your HTML is hosted, then you put images on images.example.com and javascripts on scripts.example.com. This gets around the fact that most browsers limit the number of simultaneous connections per server in order to be good net citizens.
The above is not what I am talking about.
It's not simply redirection to a content delivery network (or maybe it is--see bottom of question)
What I am talking about is hosting Javascripts specifically on an entirely different domain. Let me be specific. Just in the last year or so I've noticed that:
youtube.com has moved its .JS files to ytimg.com
cnn.com has moved its .JS files to cdn.turner.com
weather.com has moved its .JS files to j.imwx.com
Now, I know about content delivery networks like Akamai who specialize in outsourcing this for large websites. (The name "cdn" in Turner's special domain clues us in to the importance of this concept here).
But note with these examples, each site has its own specifically registered domain for this purpose, and it's not the domain of a content delivery network or other infrastructure provider. In fact, if you try to load the home page off most of these script domains, they usually redirect back to the main domain of the company. And if you reverse lookup the IPs involved, they sometimes appear to point to a CDN company's servers, sometimes not.
Why do I care?
Having formerly worked at two different security companies, I have been made paranoid of malicious Javascripts.
As a result, I follow the practice of whitelisting sites that I will allow Javascript (and other active content such as Java) to run on. As a result, to make a site like cnn.com work properly, I have to manually put cnn.com into a list. It's a pain in the behind, but I prefer it over the alternative.
When folks used things like scripts.cnn.com to parallelize, that worked fine with appropriate wildcarding. And when folks used subdomains off the CDN company domains, I could just permit the CDN company's main domain with a wildcard in front as well and kill many birds with one stone (such as *.edgesuite.net and *.akamai.com).
Now I have discovered that (as of 2008) this is not enough. Now I have to poke around in the source code of a page I want to whitelist, and figure out what "secret" domain (or domains) that site is using to store their Javascripts on. In some cases I've found I have to permit three different domains to make a site work.
Why did all these major sites start doing this?
EDIT: OK as "onebyone" pointed out, it does appear to be related to CDN delivery of content. So let me modify the question slightly based on his research...
Why is weather.com using j.imwx.com instead of twc.vo.llnwd.net?
Why is youtube.com using s.ytimg.com instead of static.cache.l.google.com?
There has to be a reason behind this.
Comments (10)
I think there's something in the CDN theory:
For example, j.imwx.com resolves to hosts under twc.vo.llnwd.net. Limelight is a CDN.
Meanwhile, s.ytimg.com resolves to static.cache.l.google.com. I'm guessing that this is a CDN for static content run internally by Google.
Ah well, can't win 'em all.
By the way, if you use Firefox with the NoScript add-on then it will automate the process of hunting through source, and GUI-fy the process of whitelisting. Basically, click on the NoScript icon in the status bar, you're given a list of domains with options to temporarily or permanently whitelist, including "all on this page".
Lots of reasons:
CDN - a different dns name makes it easier to shift static assets to a content distribution network
Parallelism - images, stylesheets, and static javascript are using two other connections which are not going to block other requests, like ajax callbacks or dynamic images
Cookie traffic - exactly correct - especially with sites that have a habit of storing far more than a simple session id in cookies
Load shaping - even without a CDN there are still good reasons to host the static assets on fewer web servers optimized to respond extremely quickly to a huge number of file url requests, while the rest of the site is hosted on a larger number of servers responding to more processor intensive dynamic requests
Update - two reasons why you wouldn't use the CDN's dns name. The client dns name acts as a key to the proper "hive" of assets the CDN is caching. Also, since your CDN is a commodity service, you can change providers by altering the dns record - so you avoid any page changes, reconfiguration, or redeployment on your site.
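To illustrate the second point, here is a minimal sketch of how pages can reference a single site-owned asset host, so that switching CDN providers becomes a DNS change rather than a page edit. The helper function and the CNAME records in the comments are assumptions for illustration, not anything the sites above actually use.

```python
# Hypothetical sketch: every page builds asset URLs from one host the site
# owns. Swapping CDN providers then means re-pointing a DNS record (e.g. a
# CNAME), not editing thousands of pages.

ASSET_HOST = "j.imwx.com"  # domain the site controls; its DNS points at the CDN

def asset_url(path: str) -> str:
    """Build an absolute URL for a static asset on the site-owned host."""
    return f"https://{ASSET_HOST}/{path.lstrip('/')}"

# No page hard-codes the CDN's own hostname:
print(asset_url("/js/site.js"))  # https://j.imwx.com/js/site.js

# Moving between CDNs is then a DNS-only change, conceptually:
#   j.imwx.com.  CNAME  twc.vo.llnwd.net.
# becomes
#   j.imwx.com.  CNAME  some-other-cdn.example.net.
```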
Limit cookie traffic?
After a cookie is set on a specific domain, every request to that domain will have the cookie sent back to the server. Every request!
That can add up quickly.
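A back-of-the-envelope calculation shows how quickly. All three numbers below are assumptions picked only to make the arithmetic concrete:

```python
# Illustrative only: a browser attaches the domain's cookies to *every*
# request to that domain, including requests for images, CSS, and JS.

cookie_header_bytes = 800      # assumed total size of the site's cookies
assets_per_page = 50           # assumed images + scripts + stylesheets per page
page_views = 1_000_000         # assumed daily page views

wasted = cookie_header_bytes * assets_per_page * page_views
print(f"~{wasted / 1e9:.1f} GB/day of upstream cookie headers")  # ~40.0 GB/day

# Serving static assets from a separate, cookie-free domain drops this
# overhead to zero for those requests.
```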
Your follow-up question is essentially: Assuming a popular website is using a CDN, why would they use their own TLD like imwx.com instead of a subdomain (static.weather.com) or the CDN's domain?
Well, the reason for using a domain they control versus the CDN's domain is that they retain control -- they could potentially even change CDNs entirely and only have to change a DNS record, versus having to update links in 1000s of pages/applications.
So, why use nonsense domain names? Well, a big thing with helper files like .js and .css is that you want them to be cached downstream by proxies and people's browsers as much as possible. If a person hits gmail.com and all the .js is loaded out of their browser cache, the site appears much snappier to them, and it also saves bandwidth on the server end (everybody wins). The problem is that once you send HTTP headers for really aggressive caching (i.e. cache me for a week or a year or forever), these files aren't ever reliably loaded from the server any more and you can't make changes/fixes to them because things will break in people's browsers.
So, what companies have to do is stage these changes and actually change the URLs of all of these files to force people's browsers to reload them. Cycling through domains like "a.imwx.com", "b.imwx.com" etc. is how this gets done.
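A common way to implement that staging is to embed a content hash in the asset URL, so the URL changes exactly when the file does. This is a sketch of the general technique, not how any of the sites above actually do it; the hostnames are hypothetical:

```python
# Cache-busting sketch: serve assets with far-future caching headers
# (e.g. "Cache-Control: max-age=31536000") and change the URL, not the
# file, whenever the content changes.

import hashlib

def versioned_url(host: str, path: str, content: bytes) -> str:
    """Embed a short content hash in the URL so any change forces a re-fetch."""
    digest = hashlib.md5(content).hexdigest()[:8]
    return f"https://{host}/{digest}/{path}"

js_v1 = b"function greet() { return 'hello'; }"
js_v2 = b"function greet() { return 'hello, world'; }"

print(versioned_url("a.imwx.com", "js/site.js", js_v1))
print(versioned_url("a.imwx.com", "js/site.js", js_v2))
# The two URLs differ, so browsers holding the old copy under an
# aggressive caching policy will still fetch the new version.
```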
By using a nonsense domain name, the Javascript developers and their Javascript sysadmin/CDN liaison counterparts can have their own domain name/DNS that they're pushing these changes through, that they're accountable/autonomous for.
Then, if any sort of cookie-blocking or script-blocking starts happening on the TLD, they just change from one nonsense TLD to kyxmlek.com or whatever. They don't have to worry about accidentally doing something evil that has countermeasure side effects on all of *.google.com.
I implemented this solution about two to three years ago at a previous employer, when the website started getting overloaded due to a legacy web server implementation. By moving the CSS and layout images off to an Apache server, we reduced the load on the main server and increased the speed no end.
However, I've always been under the impression that Javascript functions can only be accessed from within the same domain as the page itself. Newer websites don't seem to have this limitation: as you mention, many have Javascript files on separate sub-domains or even completely detached domains altogether.
Can anyone give me a pointer on why this is now possible, when it wasn't a couple of years ago?
It's not just Javascript that you can move to a different domain; moving as many assets as possible will yield performance improvements.

Most browsers limit the number of simultaneous connections you can make to a single domain (I think it's around 4), so when you have a lot of images, js, css, etc., there's often a hold-up in downloading each file.
You can use something like YSlow and FireBug to view when each file is downloaded from the server.
By having assets on separate domains you lessen the load on your primary server, can have more simultaneous connections, and can download more files at any given time.
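One detail worth getting right when sharding assets across domains is keeping the path-to-domain mapping stable, so a returning visitor's browser cache still matches. A minimal sketch, with made-up hostnames:

```python
# Domain-sharding sketch: assign each asset to a shard by a stable hash
# of its path, so the same asset always loads from the same hostname and
# stays cacheable across visits.

import zlib

SHARDS = ["a.static.example.com", "b.static.example.com"]

def shard_for(path: str) -> str:
    """Deterministically pick a shard hostname for an asset path."""
    return SHARDS[zlib.crc32(path.encode()) % len(SHARDS)]

for p in ["img/logo.png", "css/main.css", "js/app.js"]:
    print(p, "->", shard_for(p))
```

A random assignment would also spread the load, but it would scatter the same asset across shards on different page views and defeat the browser cache.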
We recently launched a real estate website which has a lot of images (of the houses, duh :P) which uses this principle for the images, so it's a lot faster to list the data.

We've also used this on many other websites which have high asset volume.
I have worked with a company that does this. They're in a datacenter w/ fairly good peering, so the CDN reasoning isn't as big for them (maybe it would help, but they don't do it for that reason). Their reason is that they run several webservers in parallel which collectively handle their dynamic pages (PHP scripts), and they serve images and some javascript off of a separate domain on which they use a fast, lightweight webserver such as lighttpd or thttpd to serve up images and static javascript.
PHP requires PHP. Static Javascript and images do not. A lot can be stripped out of a full featured webserver when all you need to do is the absolute minimum.
Sure, they could probably use a proxy that redirects requests to a specific subdirectory to a different server, but it's easier to just handle all the static content with a different server.
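The idea of a stripped-down server for static files can be sketched with Python's standard library standing in for lighttpd/thttpd, purely for illustration; a real deployment would use one of those dedicated servers:

```python
# Sketch of the static/dynamic split described above: a bare-bones server
# that only hands out files, while dynamic pages live on other machines.
# http.server is a stand-in here, not a production recommendation.

import http.server
import threading
import urllib.request

class StaticOnlyHandler(http.server.SimpleHTTPRequestHandler):
    """Serve files from the current directory; nothing dynamic runs here."""
    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

# Port 0 lets the OS pick a free port.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), StaticOnlyHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Fetch the directory listing to show the server is up.
with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
    status = resp.status

server.shutdown()
print("static server responded with status", status)  # 200
```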
If I were a big name, multi-brand company, I think this approach would make sense because you want to make the javascript code available as a library. I would want to make as many pages be as consistent as possible in handling things like addresses, state names, zip codes. AJAX probably makes this concern prominent.
In the current internet business model, domains are brands, not network names. If you get bought or spin-off brands, you end up with a lot of domain changes. This is a problem for even the most prominent sites.
There are still links that point to useful documents in *.netscape.com and *.mcom.com that are long gone.
Wikipedia for Netscape says:
So, that would be, in less than a 10 year period:
If you put the code in a domain that is NOT a brand name, you retain a lot of flexibility and you don't have to refactor all the entry points, access control, and code references when the web sites are re-named.
I think you answered your own question.
I believe your issue is security-related, rather than WHY.
Perhaps a new META tag is in order that would describe valid CDNs for the page in question, then all we need is a browser add-on to read them and behave accordingly.
Would it be because of blocking done by spam and content filters? If they use weird domains then it's harder to figure out and/or you'll end up blocking something you want.
Dunno, just a thought.