An architect at my work recently read Yahoo!'s Exceptional Performance Best Practices guide, which says to use a far-future Expires header for resources used by a page, such as JavaScript, CSS, and images. The idea is to set an Expires header years into the future so the browser always caches these resources. Whenever we change a file and need the browser to request it again instead of using its cache, we change the filename by adding a version number.
Rather than incorporating this into our build process, though, he has another idea. Instead of changing file names in source and on the server disk for each build (granted, that would be tedious), we're going to fake it. His plan is to set far-future Expires headers on those resources, then implement two HttpModules.
One module will intercept the Response streams of all our ASPX and HTML pages before they go out, look for resource links, and tack on a version parameter that is the file's last-modified date. The other HttpModule will handle all requests for resources and simply ignore the version portion of the address. That way, the browser requests a fresh copy of a resource whenever it has changed on disk, without us ever having to change the name of the file on disk.
Make sense?
My concern relates to the module that rewrites the ASPX/HTML page Response stream. He's simply going to apply a bunch of Regex.Replace() calls to the "src" attributes of <script> and <img> tags and the "href" attributes of <link> tags. This is going to happen for every single request on the server whose content type is "text/html", potentially hundreds or thousands a minute.
I understand that HttpModules are hooked into the IIS pipeline, but this has got to add a prohibitive delay in the time it takes IIS to send out HTTP responses. No? What do you think?
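For concreteness, here is a minimal sketch of the two steps described above, written in Python purely to illustrate the logic (it is not the architect's code, and not ASP.NET). The pattern, the `v` parameter name, and the `version_for` callback are all assumptions made for the example.

```python
import re

# Matches src="..." or href="..." inside <script>, <img>, and <link> tags.
# Hypothetical pattern for illustration; real HTML has many more corner
# cases (single quotes, unquoted attributes, URLs that already carry a
# query string, which this pattern deliberately skips).
RESOURCE_ATTR = re.compile(
    r'(<(?:script|img|link)\b[^>]*?\b(?:src|href)=")([^"?#]+)(")',
    re.IGNORECASE,
)


def add_version(html, version_for):
    """Append ?v=<version> to each matched resource URL.

    version_for is a caller-supplied function mapping a URL path to a
    version string, for example the file's last-modified timestamp.
    """
    def repl(match):
        prefix, url, quote = match.groups()
        return f"{prefix}{url}?v={version_for(url)}{quote}"

    return RESOURCE_ATTR.sub(repl, html)


def strip_version(url):
    """Second module's job: ignore the version portion of the address."""
    return url.split("?", 1)[0]
```

Every HTML response would pass through something like `add_version`, which is the per-request cost I'm asking about.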
A few things to be aware of:
If the idea is to add a query string to the static file names to indicate their version, unfortunately that will also prevent caching by the kernel-mode HTTP driver (http.sys).
Scanning each entire response with a bunch of regular expressions will be slow, slow, slow. It's also likely to be unreliable, with hard-to-predict corner cases.
A few alternatives:
Use control adapters to explicitly replace certain URLs or paths with the current version. That allows you to focus specifically on images, CSS, etc.
Change folder names instead of file names when you version static files.
Consider using ASP.NET skins to help centralize file names. That will help simplify maintenance.
In case it's helpful, I cover this subject in my book (Ultra-Fast ASP.NET), including code examples.
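To make the folder-name alternative concrete, here is a minimal sketch in Python (chosen only to show the logic; it is not code from the book). It assumes the version lives in a URL folder such as `/static/v42/` and that something on the server maps it back to the unversioned path; `STATIC_ROOT`, `CURRENT_VERSION`, and both function names are hypothetical. Renaming the real folder at build time would work as well. Because the version sits in the path rather than in a query string, the URL stays free of the query-string problem noted above.

```python
import posixpath

STATIC_ROOT = "/static"  # hypothetical URL prefix for static files
CURRENT_VERSION = "v42"  # hypothetical build identifier


def versioned_url(path, version=CURRENT_VERSION):
    """Build a URL with the version as a folder: /static/v42/css/site.css."""
    return posixpath.join(STATIC_ROOT, version, path.lstrip("/"))


def physical_path(url):
    """Map a versioned URL back to the unversioned file path.

    /static/v42/css/site.css -> /static/css/site.css
    URLs without a version folder are returned unchanged.
    """
    prefix = STATIC_ROOT + "/"
    if not url.startswith(prefix):
        return url
    first, sep, rest = url[len(prefix):].partition("/")
    # Treat the first folder as a version only if it looks like "v<digits>".
    if sep and first.startswith("v") and first[1:].isdigit():
        return prefix + rest
    return url
```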
He's worried about resources not being cached on the client. That obviously depends somewhat on how your users have their browsers configured. If it's the default configuration, I doubt you need to worry about second-guessing client caching: it's too hard, the results aren't guaranteed, and it won't help new users anyway.
As far as the HTTP modules go, in principle I would say they are fine, but you'll want them to be blindingly fast and efficient if you take that route; it's probably worth trying out. I can't speak to the appropriateness of using Regex for what you want done inside them, though.
If you're looking for high performance, I suggest you (or your architect) do some reading (and I don't mean that in a nasty way). I learnt something recently which I think will help, so let me explain (and maybe you guys know this already).
Browsers only hold a limited number of simultaneous connections open to a specific hostname at any one time. For example, IE6 and IE7 will only open 2 connections to, say, www.foo.net (IE8 raised the limit to 6).
If you call your images from, say, images.foo.net, you get a new set of connections straight away.
The idea is to separate different content types onto different hostnames (css.foo.net, scripts.foo.net, ajaxcalls.foo.net) so that the browser downloads more in parallel on your behalf.
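As a rough sketch of that idea (Python, just to show the logic), a helper could pick the hostname from the file extension when building resource URLs. The mapping, the default host, and the function name are assumptions for illustration; the hostnames are the example ones used above.

```python
import posixpath

DEFAULT_HOST = "www.foo.net"

# Hypothetical mapping from file extension to a dedicated hostname.
HOST_BY_EXTENSION = {
    ".css": "css.foo.net",
    ".js": "scripts.foo.net",
    ".png": "images.foo.net",
    ".jpg": "images.foo.net",
    ".gif": "images.foo.net",
}


def resource_url(path):
    """Return an absolute URL for path on the hostname for its content type.

    A given path always maps to the same hostname, so the browser never
    caches the same file under two different URLs.
    """
    ext = posixpath.splitext(path)[1].lower()
    host = HOST_BY_EXTENSION.get(ext, DEFAULT_HOST)
    return f"http://{host}{path}"
```

Each extra hostname costs the browser a DNS lookup, so it is probably worth keeping the number of hostnames small.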
http://code.google.com/p/talifun-web
StaticFileHandler - Serves static files in a cacheable, resumable way.
CrusherModule - Serves compressed, versioned JS and CSS in a cacheable way.
You don't quite get kernel-mode caching speed, but serving from HttpRuntime.Cache has its advantages. The kernel-mode cache can't cache partial responses, and you don't have fine-grained control of the cache. The most important thing to implement is a consistent ETag header and an Expires header. This will improve your site performance more than anything else.
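Here is a minimal sketch of what "consistent ETag plus Expires" can look like, in Python for illustration only; it is not how StaticFileHandler is implemented. It assumes the ETag is derived from a hash of the content, so the same file yields the same ETag on every server, and uses a one-year lifetime as a hypothetical default. The function names are made up for the example.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(content, now=None, lifetime_days=365):
    """Return ETag and Expires headers for a static response body (bytes)."""
    now = now or datetime.now(timezone.utc)
    # Content-derived ETag: identical bytes give an identical tag.
    etag = '"' + hashlib.sha1(content).hexdigest() + '"'
    # usegmt=True produces an HTTP-style date ending in "GMT".
    expires = format_datetime(now + timedelta(days=lifetime_days), usegmt=True)
    return {"ETag": etag, "Expires": expires}


def respond(content, if_none_match=None):
    """Return (status, headers, body), answering 304 when the ETag matches."""
    headers = cache_headers(content)
    if if_none_match == headers["ETag"]:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, content
```

The first request gets a 200 with both headers; a later revalidation that sends the same ETag back in If-None-Match gets an empty 304.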
Reducing the number of files served is probably one of the best ways to improve the speed of your website. The CrusherModule combines all the CSS on your site into one file and all the JS into another file.
Memory is cheap, hard drives are slow, so use it!
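The combining step can be sketched roughly like this (Python, illustration only; this is not CrusherModule's actual code). It assumes the bundle is named after a hash of its contents, so the URL changes whenever any source file changes, and keeps the result in an in-memory dictionary. All names here are hypothetical.

```python
import hashlib

_BUNDLES = {}  # in-memory cache: bundle name -> combined text


def build_bundle(sources, extension=".css"):
    """Concatenate sources in order and cache them under a versioned name.

    sources is a list of (file_name, text) pairs. The returned name embeds
    a short content hash, so any change produces a new URL.
    """
    combined = "\n".join(text for _, text in sources)
    digest = hashlib.md5(combined.encode("utf-8")).hexdigest()[:8]
    name = f"bundle-{digest}{extension}"
    _BUNDLES[name] = combined
    return name, combined


def get_bundle(name):
    """Serve a previously built bundle from memory, or None if unknown."""
    return _BUNDLES.get(name)
```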