URL encoding of underscores in directory names?

Posted 2024-08-20 20:19:56

We've run into an odd argument where I work, and I may be wrong on this, so this is why I am asking.

Our software outputs a directory to an Apache server, and in doing so it replaces each underscore in the directory name with %5F.

For instance, if the name of the directory is listed as a string in our software, it is "andy_test", but when the software outputs the directory to the Apache server, it becomes "andy%5Ftest". Unfortunately, when you access the URL on the server, it ends up becoming "andy%255Ftest".

Somehow this seems wrong to me. Once again, the progression is:

  1. andy_test <- (as a string in the software)
  2. andy%5Ftest <- (listed as a directory on the server)
  3. andy%255Ftest <- (must be used when calling the same directory as a URL on the server from a web browser.)

I'm assuming that "%5F" is the encoding for underscore, and that "%25" is the encoding for "%".
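That assumption can be checked quickly. The following is only an illustrative sketch in Python (not something from our software): a percent escape is "%" followed by the character's byte value in hexadecimal.

```python
# "%XX" in a URL stands for the byte whose hexadecimal value is XX.
underscore_escape = "%{:02X}".format(ord("_"))  # "_" is byte 0x5F
percent_escape = "%{:02X}".format(ord("%"))     # "%" is byte 0x25

print(underscore_escape)  # %5F
print(percent_escape)     # %25
```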

Now it would seem to me that the directory name should be listed on the server as just plain andy_test, and that only if you were using an encoded URI would you maybe end up with "andy%5Ftest" to access the directory on the Apache server.

I asked the guys on the backend about it, and they said that they were just "encoding anything that was not a letter or a number."
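To make the progression concrete, here is a minimal sketch of the rule as the backend described it. The function name `encode_non_alnum` is hypothetical and this is not the actual backend code; it only assumes that "anything that is not a letter or a number" means every byte other than ASCII letters and digits.

```python
def encode_non_alnum(text: str) -> str:
    """Hypothetical stand-in for the backend rule: percent-escape every
    byte that is not an ASCII letter or digit."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


name_in_software = "andy_test"
# Encoding when the directory is created puts a literal "%" into its name.
name_on_server = encode_non_alnum(name_in_software)
# Anything that later builds a URL for that directory must escape the "%".
name_in_url = encode_non_alnum(name_on_server)

print(name_on_server)  # andy%5Ftest
print(name_in_url)     # andy%255Ftest
```

Applying the same rule a second time reproduces step 3 exactly, which suggests the name is being encoded at two different points.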

So I guess I'm a bit confused on this. Can you tell me who is right, and direct me to some information on why?

Comments (2)

魔法少女 2024-08-27 20:19:57

You should not encode the directory names as you create them (as you suggested). Encoding should only happen at the last stage where it is handed out to the browser. That's why you are ending up with 'double' encoding: %25 is % and 5F is the leftover from the first encoding of underscore.

Also, note that according to RFC 1738 you don't need to encode underscores at all:

2.2. URL Character Encoding Issues

...

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
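As an illustration (using Python's standard library, which is an assumption here since the question does not say what the backend is written in), `urllib.parse.quote` leaves the underscore untouched and only escapes characters that actually need it:

```python
from urllib.parse import quote

# Underscore does not need escaping, so the name passes through unchanged.
plain = quote("andy_test")
# A space does need escaping.
spaced = quote("andy test")
# If the name already contains a literal "%", quoting escapes it as %25.
already_encoded = quote("andy%5Ftest")

print(plain)            # andy_test
print(spaced)           # andy%20test
print(already_encoded)  # andy%255Ftest
```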

与之呼应 2024-08-27 20:19:57

There is double encoding happening in what you are showing. Two steps should be enough:

andy_test is both the string in the software and the actual name of the directory or script in the filesystem (the resource the web server accesses).

andy%5Ftest is andy_test URL encoded. This is the string the browser should use (encoding is not really needed in the underscore case, but may be in other cases).

andy%255Ftest is just andy_test URL encoded twice, which makes no sense; there should be no need for it. Just decide WHERE you will do the encoding. If you do it both at the code level and at the web server level, this is what can happen, and the result is broken links unless you also decode twice, which is neither necessary nor sane.
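A minimal sketch of "encode in one place only", assuming Python and a made-up base URL (`http://example.com/files` and the helper name `directory_url` are hypothetical): the directory keeps its raw name on disk, and the name is percent-encoded exactly once, at the moment a link is built.

```python
from urllib.parse import quote, unquote


def directory_url(base_url: str, directory_name: str) -> str:
    """Build a link to a directory, encoding the raw name exactly once."""
    # safe="" so that a "/" inside a name would also be escaped.
    return base_url.rstrip("/") + "/" + quote(directory_name, safe="") + "/"


# The raw name is what exists on disk; the URL is derived from it on demand.
url_plain = directory_url("http://example.com/files", "andy_test")
url_special = directory_url("http://example.com/files", "andy & test")

print(url_plain)    # http://example.com/files/andy_test/
print(url_special)  # http://example.com/files/andy%20%26%20test/

# A single decode, as the web server performs, recovers the name on disk.
print(unquote("andy%20%26%20test"))  # andy & test
```

With this arrangement a single decode on the server side is enough, so no "%25" ever appears in the links.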
