Cache expiration control with last modification

Posted on 2024-07-14 00:50:40

In Apache's mod_expires module, the Expires* directives accept two base times: access and modification.

ExpiresByType text/html "access plus 30 days"

understandably means that the cache will request fresh content 30 days after it last fetched the resource.

However,

ExpiresByType text/html "modification plus 2 hours"

doesn't make intuitive sense.

How does the browser cache know that the file has been modified unless it makes a request to the server? And if it is making a call to the server, what is the use of this caching directive? It seems to me that I am not understanding some crucial part of caching. Please enlighten me.

Comments (4)

年华零落成诗 2024-07-21 00:50:40

An Expires* directive with "modification" as its base refers to the modification time of the file on the server. So if you set, say, "modification plus 2 hours", any browser that requests content within 2 hours after the file is modified (on the server) will cache that content until 2 hours after the file's modification time. And the browser knows when that time is because the server sends an Expires header with the proper expiration time.

Let me explain with an example: say your Apache configuration includes the line

ExpiresDefault "modification plus 2 hours"

and you have a file index.html, which the ExpiresDefault directive applies to, on the server. Suppose you upload a version of index.html at 9:53 GMT, overwriting the previous existing index.html (if there was one). So now the modification time of index.html is 9:53 GMT. If you were running ls -l on the server (or dir on Windows), you would see it in the listing:

-rw-r--r--  1 apache apache    4096  Feb 18 09:53 index.html

Now, with every request, Apache sends the Last-Modified header with the last modification time of the file. Since you have that ExpiresDefault directive, it will also send the Expires header with a time equal to the modification time of the file (9:53) plus two hours. So here is part of what the browser sees:

Last-Modified: Wed, 18 Feb 2009 09:53:00 GMT
Expires: Wed, 18 Feb 2009 11:53:00 GMT

If the time at which the browser makes this request is before 11:53 GMT, the browser will cache the page, because it has not yet expired. So if the user first visits the page at 11:00 GMT, and then goes to the same page again at 11:30 GMT, the browser will see that its cached version is still valid and will not (or rather, is allowed not to) make a new HTTP request.

If the user goes to the page a third time at 12:00 GMT, the browser sees that its cached version has now expired (it's after 11:53), so it attempts to validate the page by sending a request to the server with an If-Modified-Since header. A 304 (Not Modified) response with no body will be returned, since the file has not been modified since it was first served. Because the expiry date has passed (the page is 'stale'), a validation request will be made on every subsequent visit to the page until validation fails.

Now, let's pretend instead that you uploaded a new version of the page at 11:57. In this case, the browser's attempt to validate the old version of the page at 12:00 fails and it receives in the response, along with the new page, these two new headers:

Last-Modified: Wed, 18 Feb 2009 11:57:00 GMT
Expires: Wed, 18 Feb 2009 13:57:00 GMT

(The last modification time of the file becomes 11:57 upon upload of the new version, and Apache calculates the expiration time as 11:57 + 2:00 = 13:57 GMT.)

Validation (using the more recent date) will not be required now until 13:57.

(Note, of course, that many other headers are sent along with the two I listed above; I trimmed out the rest for simplicity.)
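
The timeline above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not how Apache implements it: the names TTL, expiry_headers, is_fresh and at are made up for illustration, and the two-hour value is assumed to mirror the ExpiresDefault line in the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumption: mirrors the "modification plus 2 hours" setting in the example.
TTL = timedelta(hours=2)


def expiry_headers(mtime: datetime) -> dict:
    """Headers a server could send for a file last modified at `mtime` (UTC)."""
    return {
        "Last-Modified": format_datetime(mtime, usegmt=True),
        "Expires": format_datetime(mtime + TTL, usegmt=True),
    }


def is_fresh(now: datetime, mtime: datetime) -> bool:
    """True while a cached copy may be reused without asking the server."""
    return now < mtime + TTL


def at(hour: int, minute: int) -> datetime:
    """Helper: a time on 18 Feb 2009, GMT."""
    return datetime(2009, 2, 18, hour, minute, tzinfo=timezone.utc)


uploaded = at(9, 53)
for name, value in expiry_headers(uploaded).items():
    print(f"{name}: {value}")

# Visits at 11:00 and 11:30 hit the cache; the 12:00 visit must revalidate.
for visit in (at(11, 0), at(11, 30), at(12, 0)):
    state = "fresh, use cache" if is_fresh(visit, uploaded) else "stale, revalidate"
    print(visit.strftime("%H:%M"), state)
```

Calling expiry_headers(at(11, 57)) gives the second pair of headers, with the expiry moved to 13:57 GMT.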

就此别过 2024-07-21 00:50:40

The server sends a header such as: "Last-Modified: Wed, 18 Feb 2009 00:00:00 GMT". The expiry time is calculated from either this modification time or the access time.

Say the content is expected to be refreshed every day; then you want it to expire at "modification plus 24 hours".

If you don't know when the content will be refreshed, then it's better to base it on the access time.
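
To make the difference between the two bases concrete, here is a small sketch. The function expires_at and the example times are hypothetical; it only illustrates the arithmetic for content that is refreshed once a day at midnight.

```python
from datetime import datetime, timedelta, timezone


def expires_at(base: str, mtime: datetime, request_time: datetime,
               ttl: timedelta) -> datetime:
    """Expiry time for the given base ("modification" or "access")."""
    if base == "modification":
        return mtime + ttl         # counted from the file's last change
    if base == "access":
        return request_time + ttl  # counted from when the client fetched it
    raise ValueError(f"unknown base: {base!r}")


utc = timezone.utc
mtime = datetime(2009, 2, 18, 0, 0, tzinfo=utc)       # refreshed at midnight
requested = datetime(2009, 2, 18, 18, 0, tzinfo=utc)  # fetched at 18:00
day = timedelta(hours=24)

# Expires exactly when the next daily refresh is due.
print(expires_at("modification", mtime, requested, day))
# Expires 18 hours after the next refresh, so the copy may be served stale.
print(expires_at("access", mtime, requested, day))
```

With a known refresh schedule, the modification base lines the expiry up with the next refresh; the access base simply gives every client the same cache lifetime from the moment it fetched the file.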

听风念你 2024-07-21 00:50:40

My understanding is that modification asks the browser to base the cache time on the Last-Modified HTTP header's value. So, modification plus 2 hours would be the Last-Modified time + 2 hours.

揽月 2024-07-21 00:50:40

First of all, thanks to David Z for the detailed explanation above. In answer to bushman's question about why caching makes sense if the browser still has to make a request to the server: the saving is in what the server returns. If validation shows that the cached copy is still current, the server returns a 304 code with an empty response body instead of the content. That is where the time is saved.

A better explanation than I've given is here, from https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers :

Though conditional requests do invoke a call across the network, unmodified resources result in an empty response body – saving the cost of transferring the resource back to the end client. The backend service is also often able to very quickly determine a resource’s last modified date without accessing the resource which itself saves non-trivial processing time.

Time-based

A time-based conditional request ensures that only if the requested resource has changed since the browser’s copy was cached will the contents be transferred. If the cached copy is the most up-to-date then the server returns the 304 response code.

To enable conditional requests the application specifies the last modified time of a resource via the Last-Modified response header.

Cache-Control: public, max-age=31536000
Last-Modified: Mon, 03 Jan 2011 17:45:57 GMT

The next time the browser requests this resource, it will use the If-Modified-Since request header to ask for the contents of the resource only if they have changed since this date:

If-Modified-Since: Mon, 03 Jan 2011 17:45:57 GMT

If the resource hasn’t changed since Mon, 03 Jan 2011 17:45:57 GMT, the server will return an empty body with the 304 response code.
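
As a rough illustration of the server side of that exchange, the check can be sketched as below. The function respond is hypothetical, it assumes a well-formed If-Modified-Since value, and real servers handle more cases (ETags, malformed dates and so on).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(resource_mtime: datetime, body: bytes, if_modified_since=None):
    """Return (status, headers, body) for a time-based conditional request."""
    headers = {"Last-Modified": format_datetime(resource_mtime, usegmt=True)}
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if resource_mtime <= since:
            # Unchanged since the client's copy: send headers only, no body.
            return 304, headers, b""
    # First request, or the resource changed: send the full content.
    return 200, headers, body


mtime = datetime(2011, 1, 3, 17, 45, 57, tzinfo=timezone.utc)
page = b"<html>...</html>"

status, headers, body = respond(mtime, page, "Mon, 03 Jan 2011 17:45:57 GMT")
print(status, len(body))
```

The 304 path is where the saving comes from: the request still crosses the network, but the response carries no content.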
