404 vs. 403 when the directory index is missing

Posted on 2024-10-19 03:15:16

This is mostly a philosophical question about the best way to interpret the HTTP spec. Should a directory with no directory index (e.g. index.html) return 404 or 403? (403 is the default in Apache.)

For example, suppose the following URLs exist and are accessible:

http://example.com/files/file_1/
http://example.com/files/file_2/

But there's nothing at:

http://example.com/files/

(Assume we're using 301s to force trailing slashes for all URLs.)

I think several things should be taken into account:

  • By default, Apache returns 403 in this scenario. That's significant to me. They've thought about this stuff, and they made the decision to use 403.
  • According to the HTTP/1.1 spec (RFC 2616, as published on the W3C site), 403 means "The server understood the request, but is refusing to fulfill it." I take that to mean you should return 403 if the URL is meaningful but nonetheless forbidden.
  • 403 might result in information disclosure if the client correctly guesses that the URL maps to a real directory on disk.
  • http://example.com/files/ isn't a resource, and the fact that it internally maps to a directory shouldn't be relevant to the status code.
  • If you interpret the URL scheme as defining a directory structure from the client's perspective, the internal implementation is still irrelevant, but perhaps the outward appearance should indeed have some bearing on the status codes. Maybe, even if you created the same URL structure without using directories internally, you should still use 403s, because it's about the client's perception of a directory structure.

On balance, what do you think is the best approach? Should we just say "a resource is a resource, and if it doesn't exist, it's a 404"? Or should we say "if it has slashes, it looks like a directory to the client, and therefore it's a 403 if there's no index"?

If you're in the 403 camp, do you think you should go out of your way to return 403s even if the internal implementation doesn't use directories? Suppose, for example, that you have a dynamic web app with this URL: http://example.com/users/joe, which maps to some code that generates the profile page for Joe. Assuming you don't write something that lists all users, should http://example.com/users/ return 403? (Many if not all web frameworks return 404 in this case.)
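To make the dynamic-app case concrete, here is a minimal sketch in Python, not tied to any real framework. The `USERS` store, the `route` function, and the `collection_status` parameter are all hypothetical names introduced for illustration. The point is only that the status for `/users/` becomes an explicit policy choice once no directory exists on disk.

```python
from http import HTTPStatus

# Hypothetical user store; in a real app this would be a database lookup.
USERS = {"joe": "Profile page for Joe"}


def route(path, collection_status=HTTPStatus.NOT_FOUND):
    """Return (status, body) for a request path under /users/.

    collection_status is the policy knob discussed in the question:
    what to answer for /users/ when no listing view has been written.
    """
    parts = [p for p in path.split("/") if p]
    if parts == ["users"]:
        # No "list all users" view exists, so the policy decides 404 vs. 403.
        return collection_status, ""
    if len(parts) == 2 and parts[0] == "users" and parts[1] in USERS:
        return HTTPStatus.OK, USERS[parts[1]]
    # Anything else matches no route at all.
    return HTTPStatus.NOT_FOUND, ""
```

With the default policy, `route("/users/")` behaves like the frameworks mentioned above and answers 404; passing `collection_status=HTTPStatus.FORBIDDEN` gives the 403 camp's answer without changing anything else.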

Comments (2)

剩余の解释 2024-10-26 03:15:16

The first step to answering this is to refer to RFC 2616: HTTP/1.1, specifically the sections on 403 Forbidden and 404 Not Found.

  • 10.4.4 403 Forbidden

The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.

  • 10.4.5 404 Not Found

The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.

My interpretation of this is that 404 is the more general error code that just says "there's nothing there". 403 says "there's nothing there, don't try again!".

One reason why Apache might return 403 on directories without explicit index files is that auto-indexing (i.e. listing all files in the directory) is disabled (a.k.a. "forbidden"). In that case, saying "listing all files in this directory is forbidden" makes more sense than saying "there is no directory".
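The decision described above can be sketched as a small Python function. This is an illustration of the reasoning, not Apache's actual implementation; `status_for` and its parameters (`index_names`, `autoindex`, `mask_forbidden`) are hypothetical names chosen here. The `mask_forbidden` flag models the option in RFC 2616 section 10.4.4 of answering 404 instead of 403 when the server does not want to explain the refusal.

```python
import os
from http import HTTPStatus


def status_for(root, url_path, index_names=("index.html",),
               autoindex=False, mask_forbidden=False):
    """Pick a status code for a static-file request (illustrative only)."""
    root = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse paths that escape the document root (e.g. "/../").
    if os.path.commonpath([root, target]) != root:
        return HTTPStatus.NOT_FOUND
    if os.path.isfile(target):
        return HTTPStatus.OK
    if not os.path.isdir(target):
        # Nothing on disk matches the request at all.
        return HTTPStatus.NOT_FOUND
    # The directory exists: serve an index file if one is present.
    if any(os.path.isfile(os.path.join(target, name)) for name in index_names):
        return HTTPStatus.OK
    if autoindex:
        # A generated listing would be served here.
        return HTTPStatus.OK
    # Directory exists, no index, listing disabled: the listing is "forbidden".
    # Optionally hide that fact behind a 404.
    return HTTPStatus.NOT_FOUND if mask_forbidden else HTTPStatus.FORBIDDEN
```

Under the default settings, an existing directory without an index yields 403 while a missing one yields 404, which is exactly the difference a client could use to guess that the URL maps to a real directory. With `mask_forbidden=True` both cases answer 404 and that signal disappears.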

爱你是孤单的心事 2024-10-26 03:15:16

Another argument for preferring 404: Google Webmaster Tools.

For a 404, Google Webmaster Tools displays the referrer (allowing you to clean up the bad link to the directory), whereas for a 403 it does not.
