What is the best way to create an ETag?
What's a good method of programmatically generating ETags for web pages, and is this practice recommended? Some sites recommend turning ETags off, others recommend producing them manually, and some recommend leaving the default settings active. What's the best way here?
Mufasa,
Yahoo (and YSlow) actually encourage their use, but with the caveat that auto-generated ETags will differ from server to server.
I can't yet vote so I'll just say I agree with the suggestion of a hash of the file path and timestamp (or the table name + primary field value + timestamp if being represented by db content).
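As a rough illustration of that suggestion, here is a minimal Python sketch. The function names `file_etag` and `row_etag` and the `updated_at` parameter are made up for this example, and it assumes every server sees the same path string and the same modification time for a given file (for instance, because the deploy preserves mtimes).

```python
import hashlib
import os


def file_etag(path: str) -> str:
    """Build a quoted ETag from a file's path and modification time."""
    # st_mtime_ns is an integer, so there is no float rounding to worry about.
    mtime_ns = os.stat(path).st_mtime_ns
    raw = f"{path}:{mtime_ns}".encode("utf-8")
    # ETag values are sent as quoted strings.
    return '"' + hashlib.md5(raw).hexdigest() + '"'


def row_etag(table: str, primary_key: object, updated_at: str) -> str:
    """Same idea for database-backed content (hypothetical fields)."""
    raw = f"{table}:{primary_key}:{updated_at}".encode("utf-8")
    return '"' + hashlib.md5(raw).hexdigest() + '"'
```

Neither function reads the content itself, so both stay cheap; the trade-off is that the tag is only as trustworthy as the timestamp.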
I just fired up YSlow and it complained about ETags, so I did a little research. The issue, as per the Yahoo blog (see the comments too), is that the default ETag implementations use the file inode number, the NTFS revision number, or something else equally server-specific as part of the hash. This, while fast, basically prevents the same file served by two different servers from having the same ETag, which confuses browsers as well as downstream caches and load balancers.
The previous suggestion to use an MD5 hash is a good one, although you have to keep that from becoming a performance problem in and of itself. The implementation of that suggestion remains up to the reader, although offhand it seems to me like the sort of thing your framework might be able to handle for you.
For myself, since I'm in a simple environment where the file timestamp will be more than adequate, I just turned them off in Apache using
FileETag none
in my .htaccess file. This shuts up YSlow and should make things fall back to the last-modified date on the file.
I recommend generating a hash of the content, e.g.
md5($content)
Additionally, to prevent hash collisions, you might want to add something like the ID of the content element to it (if this is appropriate).
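A Python equivalent of that idea might look like the sketch below. The name `content_etag` and the `content_id` parameter are assumptions for illustration, and the ID is assumed not to contain a double quote, since that would break the quoted ETag value.

```python
import hashlib


def content_etag(content: bytes, content_id: str = "") -> str:
    """Hash the response body; optionally prefix a content ID."""
    digest = hashlib.md5(content).hexdigest()
    # Prefixing the ID keeps two different elements apart even if
    # their bodies happened to hash to the same value.
    tag = f"{content_id}-{digest}" if content_id else digest
    return f'"{tag}"'


print(content_etag(b"hello"))        # "5d41402abc4b2a76b9719d911017c592"
print(content_etag(b"hello", "17"))  # "17-5d41402abc4b2a76b9719d911017c592"
```

Because the tag depends only on the bytes served, two servers returning the same content produce the same ETag, at the cost of hashing the whole body on each response unless the result is cached.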
Generally, the "site" that discourages their use is Yahoo, and that's because some web servers, in their default configuration, do not automatically create ETags that work in server farms. (Which is correct and accurate of Yahoo to claim.)
But if you have a single web server, then you're fine. If not, you'll want to check how your web server handles this and act appropriately.
Well, ETags make sense when you rely heavily on caching. They are a great indicator of the state of a resource (e.g. a URL).
For example, let's say you use an Ajax request to pull a user's latest comments and you want to know if there are any new ones. Changing the ETag to alert your application to new content is a less expensive way to check for that.
If the ETag is the same, you can keep your cache; otherwise you rebuild it.
ETags also make a lot of sense with RESTful APIs.
As for generating it, looking at the spec, I think you can do almost anything you want. A timestamp, a hash, whatever makes sense to you/your application.
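The "keep your cache or rebuild it" check can be sketched on the server side as follows. This is a simplified illustration, not a full implementation of the HTTP rules: the function name `respond` is made up, headers are assumed to arrive as a plain dict with the exact key `If-None-Match`, and splitting on commas assumes the tags themselves contain no commas.

```python
def respond(request_headers: dict, body: bytes, etag: str):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""

    def strip_weak(tag: str) -> str:
        # If-None-Match uses weak comparison, so W/"x" matches "x".
        return tag[2:] if tag.startswith("W/") else tag

    header = request_headers.get("If-None-Match", "").strip()
    candidates = {strip_weak(t.strip()) for t in header.split(",")}

    if header == "*" or strip_weak(etag) in candidates:
        # The client's cached copy is current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

A client that stored `"17-2"` would send `If-None-Match: "17-2"` on its next request and get back an empty 304 as long as the server still computes the same tag.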
ETags do help when you use some kind of caching mechanism in front of your website generator. Browsers themselves do not use them; afaik they rely on the "If-Modified-Since" or "Age" headers.
Anyway, because an ETag is so simple, it is no problem to provide one in an HTTP header. I have heard that many web servers simply take the location of the file and its timestamp and compute an MD5 hash over this data.
We, as an example, built a simple but effective ETag into our software. Every "content unit" (e.g. HTML, JPEGs, GIFs...) in our software has a unique ID and a version number (e.g. a JPEG with the ID "17" and version "2", meaning it was changed once). So the ETag is simply the string "id-version", here: "17-2". After the next change it would be "17-3", so that the cache recognizes the change, loads the new content (once) completely, and stores it in its own cache.
But you could probably use the URL and a timestamp (i.e. the timestamp of the file), too.
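The "id-version" scheme described above could be sketched like this in Python. The class name `ContentUnit` and its attributes are invented for the example and are not taken from the software mentioned in the answer.

```python
class ContentUnit:
    """A piece of content with a stable ID and a change counter."""

    def __init__(self, unit_id: int, version: int = 1, body: bytes = b""):
        self.unit_id = unit_id
        self.version = version
        self.body = body

    def update(self, new_body: bytes) -> None:
        # Every change bumps the version, which changes the ETag.
        self.body = new_body
        self.version += 1

    @property
    def etag(self) -> str:
        return f'"{self.unit_id}-{self.version}"'


jpeg = ContentUnit(17, version=2)
print(jpeg.etag)  # "17-2"
jpeg.update(b"new image bytes")
print(jpeg.etag)  # "17-3"
```

No hashing is needed here, but the scheme only works if every write path goes through something like `update`, so the version can never lag behind the content.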