How to prevent multiple mod_rewrite passes (or infinite loops) in a .htaccess context

Posted on 2024-12-10 06:50:05

I'm working on a website running on a shared Apache v2.2 server, so all configuration is via .htaccess files, and I wanted to use mod_rewrite to map URLs to the filesystem in a less-than-completely-straightforward way. Just for example's sake, let's say that what I wanted to do was this:

  • Map URL www.mysite.com/Alice to filesystem folder /public_html/Bob
  • Map URL www.mysite.com/Bob to filesystem folder /public_html/Alice

Now, after several hours' work carefully designing the ruleset (the real one, not the Alice/Bob one!) I put all my carefully crafted rewriting rules in a .htaccess file in /public_html, and tested it out... only to get a 500 server error!

I'd been caught out by a well-documented "gotcha!" in Apache: when mod_rewrite rules are used inside a .htaccess file, a rewritten URL is resubmitted for another round of processing (as if it were an external request). That happens so that any rules in the target folder of the rewritten request can be applied, but it can result in some very counter-intuitive behaviour by the webserver!

In the above example, that means that a request for www.mysite.com/Alice/foo.html gets rewritten to /Bob/foo.html, and then resubmitted (internally) to the server as a request for www.mysite.com/Bob/foo.html. This is then re-rewritten back to /Alice/foo.html and resubmitted, which causes it to get re-re-rewritten to /Bob/foo.html, and so on; an infinite loop ensues... broken only by a server timeout error.
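The naive ruleset is not shown in the post, so the following is only an assumed reconstruction of what triggers the loop (`RewriteEngine On` is likewise assumed). The comments trace the passes described above.

```apache
# Assumed reconstruction of the looping ruleset, placed in /public_html/.htaccess
RewriteEngine On

RewriteRule ^Alice(/.*)?$ Bob$1 [L]
RewriteRule ^Bob(/.*)?$ Alice$1 [L]

# Request: /Alice/foo.html
#   pass 1: first rule matches  -> Bob/foo.html,   [L] ends the pass, URL is resubmitted
#   pass 2: second rule matches -> Alice/foo.html, [L] ends the pass, URL is resubmitted
#   pass 3: first rule matches  -> Bob/foo.html,   ...and so on
```

On a default configuration Apache normally gives up after a fixed number of internal redirects (governed by LimitInternalRecursion, 10 by default) and answers with a 500 error, which would be consistent with the 500 error mentioned earlier.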


The question is, how to ensure that a .htaccess mod_rewrite ruleset only gets applied ONCE?


The [L] flag in a RewriteRule stops all further rewriting during a single pass through the ruleset, but doesn't stop the entire ruleset from being re-applied after the re-written URL is resubmitted to the server. According to the documentation, Apache v2.3.9+ (currently in Beta) contains an [END] flag that provides precisely this functionality. Unfortunately, the web host is still using Apache 2.2, and they declined my polite request to upgrade to the beta version!
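For comparison, on a server where [END] is available, the Alice/Bob mapping could probably be written as below. This is a minimal sketch that assumes Apache 2.3.9 or later and a .htaccess file in /public_html; `RewriteEngine On` is assumed as well.

```apache
# Sketch for Apache 2.3.9+ only: [END] stops this pass and any later pass
RewriteEngine On

RewriteRule ^Alice(/.*)?$ Bob$1 [END]
RewriteRule ^Bob(/.*)?$ Alice$1 [END]
```

With [END], the rewritten URL is not run through the per-directory rules again, so a request for /Alice/foo.html ends up at /Bob/foo.html and stays there.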

What's needed is a workaround that provides similar functionality to the [END] flag. My first thought was that I could use an environment variable: Set a flag during the first rewriting pass that would tell subsequent passes to do no further rewriting. If I called my flag variable 'END', the code might look like this:

#  Prevent further rewriting if 'END' is flagged
RewriteCond %{ENV:END} =1
RewriteRule .* - [L]

#  Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done
RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]

Unfortunately this code doesn't work: after a bit of experimentation, I discovered that environment variables don't survive the process of resubmitting the rewritten URL to the server. The last line on this Apache documentation page suggests that environment variables ought to survive internal redirects, but I found that not to be the case.

[EDIT: On some servers, it does work. If so, it's a better solution than what follows below. You'll have to try it for yourself on your own server to see.]

Still, the general idea can be salvaged. After many hours of hair-pulling, and some advice from a colleague, I realised that HTTP request headers are preserved across internal redirects, so if I could store my flag in one of those, it might work!


Here's my solution:


# This header flags that there's no more rewriting to be done.
# It's a kludge until use of the END flag becomes possible in Apache v2.3.9+
# ######## REMOVE this directive for Apache 2.3.9+, and change all [...,L,E=END:1]
# ######## to just [...,END] in all the rules below!

RequestHeader set SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj 1 env=END


# If our special end-of-rewriting header is set this rule blocks all further rewrites.
# ######## REMOVE this directive for Apache 2.3.9+, and change all [...,L,E=END:1]
# ######## to just [...,END] in all the rules below!

RewriteCond %{HTTP:SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj} =1 [NV]
RewriteRule .* - [L]


#  Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done

RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]

...and, it worked! Here's why: inside a .htaccess file, directives associated with various Apache modules execute in the module order defined in the main Apache configuration (or, that's my understanding, anyway...). In this case (and critically for the success of this solution) mod_headers was set to execute after mod_rewrite, so the RequestHeader directive gets executed after the rewrite rules. That means the SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj header gets added to the HTTP request if and only if a RewriteRule with [E=END:1] in its flag list gets matched. On the next pass (after the rewritten request is resubmitted to the server) the first RewriteRule detects this header, and aborts any further rewriting.

Some things to note about this solution are:

  1. It won't work if Apache is configured to run mod_headers before mod_rewrite. (I'm not sure if that's even possible, or if so, how unusual it'd be).

  2. If an external user includes a SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj header in their HTTP request to the server, it'll disable all URL rewriting rules, and that user will see the filesystem directory structure "as-is". That's the reason for the random string of ASCII characters at the end of the header name: it makes the header hard to guess. Whether this is a feature or a security vulnerability depends on your point of view!

  3. The idea here was a workaround to mimic the use of the [END] flag in Apache versions that don't yet have it. If all you wanted was to ensure your ruleset only runs once, regardless of which rules are triggered, then you could probably drop the use of the 'END' environment variable and just do this:

    RewriteCond %{HTTP:SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj} =1 [NV]
    RewriteRule .* - [L]
    
    RequestHeader set SPECIAL-HEADER-STOP-FURTHER-REWRITES-kjhsdf87653vasj 1
    
    #  Map /Alice to /Bob, and /Bob to /Alice
    RewriteRule ^Alice(/.*)?$ Bob$1 [L]
    RewriteRule ^Bob(/.*)?$ Alice$1 [L]
    

    Or even better, this (though the REDIRECT_* variables are poorly documented in the Apache v2.2 documentation, and seem to be mentioned only here, so I can't guarantee it'd work on all versions of Apache):

    RewriteCond %{ENV:REDIRECT_STATUS} !^$
    RewriteRule .* - [L]
    
    #  Map /Alice to /Bob, and /Bob to /Alice
    RewriteRule ^Alice(/.*)?$ Bob$1 [L]
    RewriteRule ^Bob(/.*)?$ Alice$1 [L]
    

    However, once you're running Apache v2.3.9+, I expect that using the [END] flag would be more efficient than the above solution, because (presumably) it altogether avoids the rewritten URL being re-submitted to the server for another rewriting pass.

    Note that you may also want to block rewriting of subrequests, in which case you can add a RewriteCond to the don't-do-any-more-rewriting rule, like this:

    RewriteCond %{ENV:REDIRECT_STATUS} !^$ [OR]
    RewriteCond %{IS_SUBREQ} =true
    RewriteRule .* - [L]
    
  4. The idea here was a workaround to replace the use of the [END] flag in Apache versions that don't yet have it. But in fact you can use this general approach to store more than just a single flag: you could store arbitrary strings or numbers that would persist across an internal server redirect, and design your rewrite rules to depend on them using any of the test conditions RewriteCond provides. (I can't, off the top of my head, think of a reason why you'd want to do that... but hey, the more flexibility and control you have, the better, right?)


I guess anyone who's read this far has figured out that I'm not really asking a question here. It's more a matter of my having found my own solution to a problem I had, and wanting to post it up here for reference in case anyone else has run into the same problem. That's a big part of what this website is for, right?

...

But since this is supposed to be a question-and-answer forum, I'll ask:

  • Can anyone see any potential problems with this solution (other than those I've already mentioned)?
  • Or does anyone have a better way of achieving the same thing?

Comments (3)

寒冷纷飞旳雪 2024-12-17 06:50:05

Depending on your Apache build, this condition may work. Add it as written to a specific problematic rule, so that the rule only fires on the first pass; for a "stop-rewriting" rule (i.e. RewriteRule .* - [L]) the pattern has to be negated to !^$:

RewriteCond %{ENV:REDIRECT_STATUS} ^$

REDIRECT_STATUS will be empty on the very first / initial rewrite and will have a value of 200 (or maybe another value as well -- I have not checked that deeply) on any subsequent cycle.

Unfortunately it works on some systems and not on others, and I personally have no idea what is responsible for making it work.

Other than this, the most common approach is to add a rewrite condition that checks the original URL, for example by parsing the %{THE_REQUEST} variable, e.g. RewriteCond %{THE_REQUEST} ^[A-Z]+\s.+\.php\sHTTP/.+ -- but this only makes sense for individual problematic rules.
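Applied to the Alice/Bob example from the question, the %{THE_REQUEST} approach might look roughly like this. It is a sketch that assumes the .htaccess file sits in the document root, so the original request line starts with something like "GET /Alice/..."; the exact patterns are illustrative.

```apache
RewriteEngine On

# Only rewrite /Alice when the browser originally asked for /Alice
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/Alice(/|\s|\?)
RewriteRule ^Alice(/.*)?$ Bob$1 [L]

# Only rewrite /Bob when the browser originally asked for /Bob
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/Bob(/|\s|\?)
RewriteRule ^Bob(/.*)?$ Alice$1 [L]
```

Because %{THE_REQUEST} keeps the original request line across internal redirects, the second pass for /Alice/foo.html sees the path Bob/foo.html but a request line that still names /Alice, so neither rule fires again.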

In general -- you should avoid such "rewrite A -> B and then B -> A" situations (I'm pretty sure you are aware of that).

As for your own solution -- "don't fix if it ain't broken" -- if it works then it's great as I do not see any major problems with such approach.

万劫不复 2024-12-17 06:50:05

I'm not too sure why you need to do that, but I'd suggest a couple of things for users who run into such a situation:

  1. How about renaming the folder Bob to Alice and vice versa? Then Apache doesn't need to do anything about them.

  2. If that's important for your application, could you change the app to detect Bob and Alice and swap them there instead?

In PHP it would be something like this:

if($path == "Bob") {
  $path = "Alice";
}
else if($path == "Alice") {
  $path = "Bob";
}

Done.

Otherwise adding another sub-folder could be useful. So /Bob becomes /a/Alice and /Alice becomes /b/Bob. Then you remove the confusion. That could also be done with another parameter (query string), which is more or less what you're doing by setting an environment variable that you test in your .htaccess.
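A minimal sketch of the sub-folder idea, assuming the two folders have been moved to /public_html/a/Alice and /public_html/b/Bob (the folder names a and b come from the suggestion above; `RewriteEngine On` is assumed):

```apache
RewriteEngine On

# /Alice/... is served from /public_html/b/Bob/...
RewriteRule ^Alice(/.*)?$ b/Bob$1 [L]

# /Bob/... is served from /public_html/a/Alice/...
RewriteRule ^Bob(/.*)?$ a/Alice$1 [L]
```

On the second pass the path starts with a/ or b/, so neither pattern matches and the rewriting stops by itself.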

囍孤女 2024-12-17 06:50:05

Variables set by a RewriteRule (the one that modified the path) are available on the "next round" (the "internal redirect") with the prefix REDIRECT_ prepended.
So your first code snippet should look like this:

RewriteCond %{ENV:REDIRECT_END} =1
RewriteRule .* - [L]

This works for me with Apache 2.4.10.
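Put together with the mapping rules from the question, the complete ruleset would presumably look like the sketch below (`RewriteEngine On` is assumed, and as noted in the question this behaviour may differ between servers):

```apache
RewriteEngine On

# After an internal redirect, the END variable set below shows up as REDIRECT_END
RewriteCond %{ENV:REDIRECT_END} =1
RewriteRule .* - [L]

# Map /Alice to /Bob, and /Bob to /Alice, and flag 'END' when done
RewriteRule ^Alice(/.*)?$ Bob$1 [L,E=END:1]
RewriteRule ^Bob(/.*)?$ Alice$1 [L,E=END:1]
```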
