PHP: Comparing URIs that differ in percent-encoding

Posted 2024-09-27 01:45:25

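In PHP, I want to compare two relative URLs for equality. The catch: the URLs may differ in their percent-encoding, e.g.

/dir/file+file vs. /dir/file%20file
/dir/file(file) vs. /dir/file%28file%29
/dir/file%5bfile vs. /dir/file%5Bfile

According to RFC 3986, servers should treat these URIs identically. But if I compare them with ==, I end up with a mismatch.

So I'm looking for a PHP function that accepts two strings and returns TRUE if they represent the same URI (disregarding encoded/decoded variants of the same character, upper-case/lower-case hex digits in encoded characters, and + vs. %20 for spaces), and FALSE if they differ.

I know in advance that these strings contain only ASCII characters, no Unicode.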


君勿笑 2024-10-04 01:45:25

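function uriMatches($uri1, $uri2)
{
    // urldecode() decodes %XX escapes (hex digits in either case) and turns '+' into a space
    return urldecode($uri1) === urldecode($uri2);
}

// var_dump() is used because echo prints a boolean TRUE as "1"
var_dump(uriMatches('/dir/file+file', '/dir/file%20file'));      // bool(true)
var_dump(uriMatches('/dir/file(file)', '/dir/file%28file%29'));  // bool(true)
var_dump(uriMatches('/dir/file%5bfile', '/dir/file%5Bfile'));    // bool(true)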
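One detail worth checking before relying on this: urldecode() follows form-encoding rules and converts '+' into a space, which is what the question asks for, whereas rawurldecode() only decodes %XX sequences and leaves '+' alone. A minimal sketch of the difference (the variable names are only illustrative):

```php
<?php
// urldecode() treats '+' as an encoded space (form-style decoding).
$formStyle = urldecode('/dir/file+file');    // "/dir/file file"

// rawurldecode() only decodes %XX sequences and keeps '+' literal.
$rawStyle = rawurldecode('/dir/file+file');  // "/dir/file+file"

var_dump($formStyle === urldecode('/dir/file%20file'));    // bool(true)
var_dump($rawStyle === rawurldecode('/dir/file%20file'));  // bool(false)
```

If '+' should be treated as a literal plus sign in your paths, swapping in rawurldecode() is the likely adjustment.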

See the PHP manual entry for urldecode.


就像说晚安 2024-10-04 01:45:25

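Edit: Please look at @webbiedave's answer. His is much better (I wasn't even aware PHP had a function for this; you learn something new every day).

You will have to parse the strings for anything matching %## (a percent sign followed by two hex digits) to find the percent-encoded sequences. Convert the two hex digits to a number (for example with hexdec()) and pass it to chr() to get the character each sequence stands for. Rebuild the strings, and then you should be able to compare them.

I'm not sure that's the most efficient method, but since URLs are usually not that long, it shouldn't be much of a performance hit.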
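The manual approach described above could look roughly like this. It is only a sketch: decodePercentEscapes() is a hypothetical helper name, and it does by hand what rawurldecode() already provides, so it does not treat '+' as a space.

```php
<?php
// Hypothetical helper: decode %XX escapes by hand with a regex callback.
function decodePercentEscapes(string $uri): string
{
    return preg_replace_callback(
        '/%([0-9A-Fa-f]{2})/',
        // hexdec() accepts upper- and lower-case digits, so %5b and %5B agree.
        function (array $m): string {
            return chr(hexdec($m[1]));
        },
        $uri
    );
}

var_dump(decodePercentEscapes('/dir/file%28file%29'));  // "/dir/file(file)"
var_dump(
    decodePercentEscapes('/dir/file%5bfile') === decodePercentEscapes('/dir/file%5Bfile')
);  // bool(true)
```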


海未深 2024-10-04 01:45:25

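I know this question seems to have been solved by webbiedave, but I had my own problems with that solution.

First problem: the hex digits in an encoded character are case-insensitive. So %C3 and %c3 are exactly the same character, although the URI strings differ. Both URIs therefore point to the same location.

Second problem: folder%20(2) and folder%20%282%29 are both validly URL-encoded URIs that point to the same location, although they are different strings.

Third problem: if I simply decode the URL-encoded characters, two different locations such as bla%2Fblubb and bla/blubb end up with the same URI.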
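The third problem is easy to reproduce. A small sketch, using only the two example strings from above:

```php
<?php
// Decoding the whole string erases the difference between an encoded
// slash inside a segment and a real path separator.
var_dump(urldecode('bla%2Fblubb') === urldecode('bla/blubb'));  // bool(true)

// Counting segments before decoding shows they are different paths.
var_dump(count(explode('/', 'bla%2Fblubb')));  // int(1)
var_dump(count(explode('/', 'bla/blubb')));    // int(2)
```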

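So what to do? To compare two URIs, I need to normalize both: split them into their components, urldecode each path segment and query part once, rawurlencode them, glue everything back together, and then compare the results.

This could be the function that normalizes a URI: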

function normalizeURI($uri) {
    $components = parse_url($uri);
    if ($components === false) {
        // parse_url() returns false for seriously malformed URIs
        throw new InvalidArgumentException("Cannot parse URI: " . $uri);
    }
    $normalized = "";
    // isset() avoids "undefined array key" warnings for missing components
    if (isset($components['scheme'])) {
        $normalized .= $components['scheme'] . ":";
    }
    if (isset($components['host'])) {
        $normalized .= "//";
        if (isset($components['user'])) { // userinfo is rare in URIs, but handle it anyway
            $normalized .= rawurlencode(urldecode($components['user']));
            if (isset($components['pass'])) {
                $normalized .= ":" . rawurlencode(urldecode($components['pass']));
            }
            $normalized .= "@";
        }
        $normalized .= $components['host'];
        if (isset($components['port'])) {
            $normalized .= ":" . $components['port'];
        }
    }
    if (isset($components['path'])) {
        // The path already carries its own leading slash, so none is added here.
        // Each segment is decoded and re-encoded separately so that an encoded
        // slash (%2F) is never confused with a real path separator.
        $path = explode("/", $components['path']);
        $path = array_map("urldecode", $path);
        $path = array_map("rawurlencode", $path);
        $normalized .= implode("/", $path);
    }
    if (isset($components['query'])) {
        $query = explode("&", $components['query']);
        foreach ($query as $i => $c) {
            // Normalize keys and values separately, keeping the "=" delimiter
            $c = explode("=", $c);
            $c = array_map("urldecode", $c);
            $c = array_map("rawurlencode", $c);
            $query[$i] = implode("=", $c);
        }
        $normalized .= "?" . implode("&", $query);
    }
    return $normalized;
}

Now you can change webbiedave's function to this:

function uriMatches($uri1, $uri2) {
    return normalizeURI($uri1) === normalizeURI($uri2);
}

That should do. And yes, it is considerably more complicated than I wanted it to be.
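Since the question only involves relative URLs, the same decode-then-re-encode idea can be tried on the path alone. The following is a minimal sketch under that assumption (no scheme, host, or query string); normalizeRelativePath() and relativePathsMatch() are hypothetical names, not part of the answer above.

```php
<?php
// Hypothetical path-only variant of the normalization described above.
function normalizeRelativePath(string $path): string
{
    $segments = explode('/', $path);
    // Decode once, then re-encode, so every segment ends up in one canonical form.
    $segments = array_map('urldecode', $segments);
    $segments = array_map('rawurlencode', $segments);
    return implode('/', $segments);
}

function relativePathsMatch(string $a, string $b): bool
{
    return normalizeRelativePath($a) === normalizeRelativePath($b);
}

var_dump(relativePathsMatch('/dir/file+file', '/dir/file%20file'));      // bool(true)
var_dump(relativePathsMatch('/dir/file(file)', '/dir/file%28file%29'));  // bool(true)
var_dump(relativePathsMatch('/dir/file%5bfile', '/dir/file%5Bfile'));    // bool(true)
var_dump(relativePathsMatch('bla%2Fblubb', 'bla/blubb'));                // bool(false)
```

The last call is the case that a plain urldecode() comparison gets wrong: the encoded slash stays inside its segment, so the two paths remain distinct.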



About the author: ﹉夏雨初晴づ
