Mod rewrite rules for cached pages

Posted on 2024-10-07 19:17:14

I am caching pages in my (Rails) application per subdomain. The pages for certain actions are cached to /public/cache/(subdomain)/. The application runs under Apache with Phusion Passenger. The caching itself works fine. The problem is that Apache is not picking up the cached pages and bypassing Rails as it should. My rewrite rules are wrong and I need help fixing them.

I have used, as one example of many, the suggestion located at: https://github.com/yeah/page_cache_fu#readme, which is as follows:

RewriteMap uri_escape int:escape
<Directory /var/www/example.com/current/public>

  RewriteEngine On
  RewriteCond %{REQUEST_METHOD} GET [NC]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}%{REQUEST_URI}%{QUERY_STRING}.html -f
  RewriteRule ^([^.]+)$ cache/%{HTTP_HOST}/$1${uri_escape:%{QUERY_STRING}}.html [L]

  RewriteCond %{REQUEST_METHOD} GET [NC]
  RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/index.html -f
  RewriteRule ^$ cache/%{HTTP_HOST}/index.html
</Directory>

The problem with this is it seems to be expecting the directory to be the full http host (i.e. it's looking in cache/subdomain.example.com rather than just cache/subdomain).

Edit: Even when I change the Rails app to cache to cache/subdomain.example.com, Apache still does not use the cached pages, so there seems to be more wrong than just the subdomain aspect.

Could someone please help me come up with the correct rule?

Edit(2):

I have simplified my rewrite to the following (just to try to get to a working starting point):

RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteRule ^stats$ cache/%1/stats.html [L]

I would think this would cause http://abc.example.com/stats to be rewritten to http://abc.example.com/cache/abc/stats.html

It is not. I also added a RewriteLog entry, and what I see there makes me think it is trying to redirect to http://abc.example.com/var/www/example.com/current/public/cache/abc/stats.html. This is further confirmed when I add an 'R' flag along with the 'L': the browser then shows http://abc.example.com/var/www/....etc. In other words, it seems to be putting the full filesystem document root into the URL instead of just the public-facing part.

Of course the result of the above is that I get a 404 error returned to the browser.

Can you see what is still wrong with my rule?

Edit: It's actually a bug.

http://code.google.com/p/phusion-passenger/issues/detail?id=563

Comments (1)

゛清羽墨安 2024-10-14 19:17:14

Alright, this looks like it should work, but it doesn't. I've done a lot of testing with this, and it seems like the problem is the ^([^.]+)$ in the RewriteRule. Now, I did Google this, and it seems like it's a common enough pattern, so I don't understand what the issue could be. I just know that when I use that pattern in a RewriteRule, the rule fails. If I change it to ^([^.]+), it seems to work.

Hopefully someone with more experience with mod_rewrite can come along and explain to us what the problem with that pattern might be.

Edit: I just realized the problem with ^([^.]+)$:

Since you're building a cache, the "normal" file will exist in its usual place. The implication is that if you ask the server for /file then, depending on your configuration (for example, with MultiViews content negotiation enabled), it will say "hey, file doesn't exist, so let's try the default extension of .html!" and so it goes off and finds file.html. Now when you get to the RewriteRule, the ^([^.]+)$ regex will be matched against file.html, NOT file.

The pattern ^([^.]+)$ says "the start of the string, followed by as many non-period characters as you can grab, followed by the end of the string", which works fine against file because it contains no periods. It fails against file.html because ^[^.]+ will match file, but where the regex then expects to find the end of the string (i.e. $), it instead finds .html and fails.

A pattern like ^(.*)$ works because .* matches any run of characters, periods included, so it can always consume the whole string and leave nothing between the (.*) and $ portions of the regex. That's not the case with [^.]+, which stops at the first period. Dropping the $, as in ^([^.]+), also lets the rule match, because the pattern then only has to match the start of the string.
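
The anchoring behaviour described above can be checked outside Apache. The following is a minimal sketch in Python, on the assumption that Python's `re` module handles these three simple patterns the same way as the regex engine behind mod_rewrite. The pattern labels and the sample paths `file` and `file.html` are illustrative only.

```python
import re

# Candidate RewriteRule patterns discussed above (labels are made up for this sketch).
PATTERNS = {
    "anchored_no_period": r"^([^.]+)$",
    "prefix_no_period": r"^([^.]+)",
    "anchored_anything": r"^(.*)$",
}

def captured(pattern_name, path):
    """Return what group 1 captures, or None when the pattern does not match."""
    match = re.search(PATTERNS[pattern_name], path)
    return match.group(1) if match else None

# Try every pattern against a path with and without an extension.
for name in PATTERNS:
    for path in ("file", "file.html"):
        print(f"{name:20} {path:10} -> {captured(name, path)!r}")
```

Only the fully anchored `^([^.]+)$` rejects `file.html`. The unanchored version matches it but captures just `file`, so the rewritten path is still built from the extension-less name.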


In order to extract the subdomain, you're going to need to backreference a RewriteCond. Basically, if you capture a reference (i.e. encapsulate something inside parens) in a RewriteCond, those references are available to a RewriteRule which immediately follows it.

For example, if I wrote this:

 RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com

Then the parentheses would capture the subdomain - note the () around [^.]+

If I were then to write a RewriteRule on the next line, the text captured above would become accessible as %1.

So your RewriteRule would look like this:

 RewriteRule ^([^.]+) cache/%1/$1${uri_escape:%{QUERY_STRING}}.html [L]
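
To see which path the rule above would produce, here is a rough simulation in Python. It is not how mod_rewrite is implemented: `expand_rule` is a hypothetical helper that only mimics the `%1` (RewriteCond capture) and `$1` (RewriteRule capture) substitutions, and it ignores the query string and the `uri_escape` map. The host pattern is assumed to be anchored and case-insensitive, as in the asker's simplified condition.

```python
import re

# Stand-ins for the RewriteCond and RewriteRule patterns (assumptions of this sketch).
HOST_PATTERN = re.compile(r"^([^.]+)\.example\.com$", re.IGNORECASE)
PATH_PATTERN = re.compile(r"^([^.]+)")

def expand_rule(host, path):
    """Hypothetical helper: mimic cache/%1/$1.html for a host and a per-directory path."""
    cond = HOST_PATTERN.search(host)
    if cond is None:
        return None  # the condition fails, so the rule is skipped
    rule = PATH_PATTERN.search(path)
    if rule is None:
        return None  # the rule pattern itself does not match
    # %1 comes from the RewriteCond capture, $1 from the RewriteRule capture.
    return f"cache/{cond.group(1)}/{rule.group(1)}.html"

print(expand_rule("abc.example.com", "stats"))
print(expand_rule("example.org", "stats"))
```

The path is passed without a leading slash because, in a per-directory context, mod_rewrite strips the directory prefix before matching the rule pattern.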

Hope that helps.
