Strange 401 error for some URLs when redirecting http to https with .htaccess

Posted on 2025-01-04 17:52:28

OK, this is the seventh day of unsuccessful attempts to find out why a 401 error appears...

Right now, the .htaccess in the root folder contains only 3 lines (it was simplified), and there are NO other .htaccess files in the project:

RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}

So, it redirects all requests to https. It works fine for any URL, even for the /administration directory.

So,

http://mydomain.com

becomes

https://mydomain.com

If https://mydomain.com is entered, no redirect happens.

http://mydomain.com/administration/index.php

becomes

https://mydomain.com/administration/index.php

If https://mydomain.com/administration/index.php is entered, no redirect happens.

That much is clear. The problem is described below.

I want the /administration directory to be password protected. My shared hosting control panel can protect directories without creating .htaccess and .htpasswd by hand (you choose a directory to protect, create a username and password, and .htaccess and .htpasswd are created automatically). So an .htaccess appears in the /administration folder. The .htpasswd appears somewhere else; the path to it is correct, and everything looks correct (it works the same way as creating the files manually). So there are now 2 .htaccess files in the project, one in the root directory and one in the /administration directory (with .htpasswd in a location that this .htaccess points to).

Once the password is created, the results are as follows.

You enter:

https://mydomain.com/administration/index.php

Then it asks for a password. If you enter it correctly, https://mydomain.com/administration/index.php is displayed. The result: works perfectly.

But if you enter http://mydomain.com/administration/index.php (yes, http, without the S), then instead of redirecting to the same page over https, it redirects to

https://mydomain.com/401.shtml (starts with httpS)

for some unknown reason and does NOT even ask for a password. Why?

I've contacted customer support about this. They are sure the problem is in the .htaccess file, and they do not fix .htaccess files (fair enough, I don't mind).

Why does this happen? Did I forget some flags or options that change the default settings in the .htaccess file?

P.S. Creating .htaccess and .htpasswd manually (not from the hosting control panel) for the /administration folder causes the same 401 error when http rather than https is entered.

And the problem appears only with URLs under the /administration directory.

Thank you.

Comments (4)

一杯敬自由 2025-01-11 17:52:28

Try using this instead. Note the L and R flags.

RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Also clear your browser's cache first, to remove the old incorrect redirect.

If that doesn't work, try using this.

RewriteCond %{HTTPS} !on
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ ([^\ ]+)
RewriteRule ^ https://%{HTTP_HOST}%2 [L,R=301]

I feel a bit bad about writing it, as it seems kind of hackish in my view.

EDIT
It seems the 2nd option fixed the problem, so here is the explanation of why it works.

The authentication module is executed before the rewrite module. Because the username and password are not sent when the page is first requested, the authentication module internally 'rewrites' the request URL to the 401 page's URL. When mod_rewrite then runs, %{REQUEST_URI} contains 401.shtml instead of the original URL. So the resulting redirect contains 401.shtml, and not the URL you want.

To get to the original (not 'rewritten') URL, you need to extract it from %{THE_REQUEST}. THE_REQUEST is in the form [requestmethod] [url] HTTP[versionnumber]. The RewriteCond extracts just the middle part ([url]).

For completeness I added the [L,R=301] flags to the second solution.
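
As a sketch of how the pieces above fit together, the second variant might look like this in the root .htaccess, with comments added. The RewriteEngine On line is carried over from the first snippet on the assumption that it is still needed; the request line shown in the comment is only an illustration.

```apache
# Root .htaccess: annotated sketch of the THE_REQUEST-based redirect.
RewriteEngine On

# Only act on requests that arrived over plain HTTP.
RewriteCond %{HTTPS} !on

# THE_REQUEST is the request line as the client sent it, for example:
#   GET /administration/index.php HTTP/1.1
# Group 1 is the method, group 2 is the requested URL.
RewriteCond %{THE_REQUEST} ^(GET|HEAD)\ ([^\ ]+)

# %2 refers to group 2 of the last matched RewriteCond, i.e. the original URL,
# so the redirect target no longer depends on REQUEST_URI.
RewriteRule ^ https://%{HTTP_HOST}%2 [L,R=301]
```

Note that the pattern only matches GET and HEAD, so requests using other methods (POST, for example) would not be redirected by this rule.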

乜一 2025-01-11 17:52:28

I think I found an even better solution to this!

Just add this to your .htaccess

ErrorDocument 401 "Unauthorized"

Solution found at:

http://forum.kohanaframework.org/discussion/8934/solved-for-reall-this-time-p-htaccess-folder-password-protection/
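
For illustration, a minimal sketch of how this could sit in the protected directory's .htaccess. The AuthName text and the AuthUserFile path are placeholders, not values from the original setup (a control panel would normally fill them in), and putting ErrorDocument in the /administration file rather than the root one is also an assumption.

```apache
# /administration/.htaccess (sketch; realm name and password file path are placeholders)
AuthType Basic
AuthName "Administration"
AuthUserFile /path/to/.htpasswd
Require valid-user

# Send a short inline text body for 401 responses instead of
# pointing the server at a separate error document such as 401.shtml.
ErrorDocument 401 "Unauthorized"
```

One thing to keep in mind: with this variant the password prompt may appear while the browser is still on http, before the redirect to https happens.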

-- EDIT

I eventually found the root cause of the issue was ModSecurity flagging my POST data (script and iframe tags cause issues). It would try to return a 401/403 but couldn't find the default error document because ModSecurity had made my htaccess go haywire.

Using ErrorDocument 401 "Unauthorized" bypassed the missing error document problem but did nothing to address the root cause.

To work around the root cause, I ended up using JavaScript to add 'salt' around anything that was neither whitespace nor a word character...

  $("form").submit(function(event) {
    $("textarea,[type=text]").each(function() {
      $(this).val($(this).val().replace(/([^\s\w])/g, "foobar$1salt"));
    });
  });

then PHP to strip the salt again...

function stripSalt($value) {
  if (is_array($value)) $value = array_map('stripSalt', $value);
  else $value = preg_replace("/(?:foobar)+(.)(?:salt)+/", "$1", $value);

  return $value;
}
$_POST = stripSalt($_POST);

Very, Very, Very Important Note:
Do not use "foobar$1salt" itself; otherwise this post has just shown hackers how to bypass your ModSecurity!

Regex Notes:
I thought it may be worth mentioning what's going on here...

(?:foobar)+ = match first half of salt one or more times but don't store this as a matched group;

(.) = match any character and store this as the first and only group (accessible via $1);

(?:salt)+ = match second half of salt one or more times but don't store this as a matched group.

It's important to match the salt multiple times per character because if you've hit submit and then you use the back button you will go back to the form with all the salt still in there. Hit submit again and more salt gets added. This can happen again and again until you end up with something like:
foobarfoobarfoobarfoobar>saltsaltsaltsalt

掩耳倾听 2025-01-11 17:52:28

I was not satisfied with the solutions above, so I came up with another one:

In a modern web server configuration we should redirect all traffic to HTTPS so that the user cannot reach any content without HTTPS. Once the user is browsing our content over HTTPS, we can use authentication. With this in mind, we can wrap the authentication directives in an If directive:

<If "%{HTTPS} == 'on'">
  AuthType Basic
  ...
</If>

You can leave and use Rewrite directives as you like.
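
A fuller sketch of what the wrapped block might contain. The <If> directive requires Apache 2.4 or later; the AuthName text and the AuthUserFile path below are placeholders rather than values from the original setup.

```apache
# /administration/.htaccess (sketch for Apache 2.4+; path and realm are placeholders)
# Authentication is only required once the request is already on HTTPS.
# Plain-HTTP requests skip this block, so no 401 is generated for them
# and the redirect rule in the root .htaccess can run first.
<If "%{HTTPS} == 'on'">
  AuthType Basic
  AuthName "Administration"
  AuthUserFile /path/to/.htpasswd
  Require valid-user
</If>
```

This relies on the HTTP-to-HTTPS redirect always being in place: if it were ever removed, the directory would be reachable over http without a password.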

With this solution:

  • you do not need to change ErrorDocument as suggested by Hoogs
  • you do not need to extract the path from THE_REQUEST in a hackish way as suggested by Gerben

反差帅 2025-01-11 17:52:28

This is the type of thing that is a bit tricky to troubleshoot on Apache without the box right in front of you, but what I think is happening is that your rewrite directive is being processed after path resolution, and it's the path resolution that has the password requirement.

Backing up a bit, the way a URL is resolved in Apache is that the request comes in and gets handed from module to module, kind of like a bucket brigade. Each module does its own thing....some modules do content negotiation, some translate URLs to file paths, some check authentication, one of them is mod_rewrite ...

One place where you see this in the configuration is actually that there is both a Location directive and a Directory directive which seem the same in most respects, but they are different because Locations talk about URLs and Directories talk about filesystem paths.

Anyhow, my guess is that going down the bucket brigade, Apache figures out that it needs a password to access that content before it figures out that it needs to redirect to HTTPS. (mod_rewrite is kind of a crazy module and it can mess with all kinds of things in surprising ways: it can do path translation, bits and pieces of rewrite, make subrequests, and a bunch of other nutty things.)

There are a few ways I can think of to fix this.

  1. Change your directory root in the vhosts container for the http site so that it can't find the passworded file (this would be my approach)
  2. Change your module load order so that mod_rewrite happens earlier in the chain (may have unexpected consequences)
  3. Use setenvif

That last one needs more explanation. Remember the bucket brigade I told you about? Apache modules can also set environment variables, which live completely outside of the module->module->module chain. You could, perhaps, set an environment variable if the site is not HTTPS. Then, however you set up your access control, it could use the variable set via SetEnvIf to always allow access to the resource when the variable is set, BUT you have to be absolutely sure that the request is going to hit that rewrite rule.

As I said, my choice would be #1, but sometimes people need to do crazy things, and Apache will let you.

My real-world SOP for https:// sites these days is that I just shoot all of my port 80 content over to a single vhost that can't serve any content at all. Then I mod_rewrite everything over to https://... badda bing, badda boom, no http and no convoluted security risks.
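
A rough sketch of what such a port 80 vhost could look like. This goes in the main server configuration rather than in .htaccess, so it is probably not available on shared hosting; the ServerName is taken from the question's example domain, and the rest is an assumption about how the setup described above might be written.

```apache
# Port 80 vhost that serves no content of its own (sketch).
<VirtualHost *:80>
    ServerName mydomain.com

    # Every plain-HTTP request is answered with a redirect to the same
    # host and path over HTTPS; no directory or auth configuration is consulted.
    RewriteEngine On
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
</VirtualHost>
```

Password protection then only needs to be configured on the HTTPS vhost, which is where the protected directory is actually served.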
