Checking for a subdomain in parse_url

Posted on 2024-09-20 00:56:25

I am trying to write a function that just gets the user's profile ID or username from Facebook. Users enter their URL into a form, and I then try to figure out whether it's a Facebook profile page or some other page. The problem is that if they enter an app page or another page that has a subdomain, I would like to ignore that request.

Right now I have:

    $author_url = 'http://facebook.com/profile?id=12345';
    $grav_url = '';
    if (preg_match('/facebook/i', $author_url)) {
        $parse_author_url = parse_url($author_url);
        // 'query' is absent when the URL has no query string
        $parse_author_url_q = isset($parse_author_url['query']) ? $parse_author_url['query'] : '';
        if (preg_match('/id=([0-9]+)/', $parse_author_url_q, $match)) {
            $fb_id = '/' . $match[1];
        } else {
            $fb_id = $parse_author_url['path'];
        }
        $grav_url = 'http://graph.facebook.com' . $fb_id . '/picture?type=square';
    }
    echo $grav_url;

This works: if $author_url has "id=", that value is used as the profile ID; if not, it must be a user name or page name, so the path is used instead. I need to run one more check: if the URL contains facebook but is on a subdomain, ignore it. I believe I can do that in the first preg_match, preg_match("/facebook/i",$author_url).

Thanks!

浪荡不羁 2024-09-27 00:56:25

To ignore Facebook subdomains, you can ensure that

$parse_author_url['host']

is exactly facebook.com.

If it's anything else, like login.facebook.com or apps.facebook.com, you need not proceed.
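The host check described above could be sketched roughly as follows. The function name `facebook_profile_segment` is made up for illustration, and accepting `www.facebook.com` alongside the bare domain is an assumption on my part, not something the answer states.

```php
<?php
// Hypothetical helper, not part of the original answer: returns "/<id>" or
// the path for a bare facebook.com URL, or null for anything else.
function facebook_profile_segment(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null; // malformed URL, or no scheme so no host was parsed
    }

    // Accept only the bare domain; allowing www. is an assumption.
    $host = strtolower($parts['host']);
    if ($host !== 'facebook.com' && $host !== 'www.facebook.com') {
        return null; // apps.facebook.com, login.facebook.com, ...
    }

    // Prefer a numeric id from the query string when present.
    parse_str($parts['query'] ?? '', $query);
    if (isset($query['id']) && is_string($query['id']) && ctype_digit($query['id'])) {
        return '/' . $query['id'];
    }

    // Otherwise fall back to the path (a user name or page name).
    $path = $parts['path'] ?? '';
    return ($path !== '' && $path !== '/') ? $path : null;
}
```

A null result means the URL should be ignored; any other result can be appended to the graph URL the same way `$fb_id` is in the question.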

Alternatively, you can ensure that the URL begins with http://facebook.com, like this:

if(preg_match('@^(?:https?://)?facebook\.com(?:[/?#]|$)@i',$author_url)){
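A pattern only does what this alternative intends if it is anchored at the start of the string and names the full domain. Here is a minimal sketch that runs one such pattern over a few made-up sample URLs; the pattern and the samples are illustrative assumptions, not taken from the answer.

```php
<?php
// Illustrative pattern (an assumption): optional scheme, then facebook.com,
// then either the end of the string or a URL delimiter.
$pattern = '@^(?:https?://)?facebook\.com(?:[/?#]|$)@i';

$samples = [
    'http://facebook.com/profile?id=12345',
    'facebook.com/someuser',
    'http://apps.facebook.com/someapp',
    'http://facebook.com.example.org/',
];

foreach ($samples as $url) {
    // preg_match() returns 1 on a match and 0 otherwise
    echo $url, ' => ', preg_match($pattern, $url) === 1 ? 'accept' : 'ignore', PHP_EOL;
}
```

The trailing `(?:[/?#]|$)` is there so that a host which merely starts with facebook.com, like the last sample, is not accepted.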

星星的轨迹 2024-09-27 00:56:25

This isn't a direct solution to what you were asking, but the parts needed to do it are here.

I found that a subdomain caused a problem with parse_url: it returned an array with only $result['path'] and no 'host' or 'scheme'.
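A quick way to see this behaviour is to parse the same address with and without a scheme. The sample URLs below are made up. As far as I can tell, the missing keys come from the missing `http://` prefix rather than from the subdomain as such.

```php
<?php
// Made-up sample URLs; only parse_url() itself is standard PHP.
$withScheme    = parse_url('http://apps.example.com/img/logo.png');
$withoutScheme = parse_url('apps.example.com/img/logo.png');

print_r($withScheme);    // contains 'scheme', 'host' and 'path'
print_r($withoutScheme); // contains only 'path'
```

In the second case the whole string, host name included, ends up in `'path'`, which is why a separate test is needed to tell it apart from a genuinely relative path.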

My theory is that if parse_url returns no 'host' or 'scheme' and the string contains a domain suffix ( .ext ), it is a subdomain.

Here is the code
($src is a URL; I had to tell relative src values apart from subdomains):

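$srcA = parse_url( $src );
//..if there is no scheme and no host, test whether $src is a subdomain.
//..empty() avoids undefined-index notices when the keys are missing.
if( empty( $srcA['scheme'] ) && empty( $srcA['host'] ) ){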
    //..this string / array is set elsewhere but for this example I will put it here
    $tld = "AC,AD,AE,AERO,AF,AG,AI,AL,AM,AN,AO,AQ,AR,ARPA,AS,ASIA,AT,AU,AW,AX,AZ,BA,BB,BD,BE,BF,BG,BH,BI,BIZ,BJ,BM,BN,BO,BR,BS,BT,BV,BW,BY,BZ,CA,CAT,CC,CD,CF,CG,CH,CI,CK,CL,CM,CN,CO,COM,COOP,CR,CU,CV,CW,CX,CY,CZ,DE,DJ,DK,DM,DO,DZ,EC,EDU,EE,EG,ER,ES,ET,EU,FI,FJ,FK,FM,FO,FR,GA,GB,GD,GE,GF,GG,GH,GI,GL,GM,GN,GOV,GP,GQ,GR,GS,GT,GU,GW,GY,HK,HM,HN,HR,HT,HU,ID,IE,IL,IM,IN,INFO,INT,IO,IQ,IR,IS,IT,JE,JM,JO,JOBS,JP,KE,KG,KH,KI,KM,KN,KP,KR,KW,KY,KZ,LA,LB,LC,LI,LK,LR,LS,LT,LU,LV,LY,MA,MC,MD,ME,MG,MH,MIL,MK,ML,MM,MN,MO,MOBI,MP,MQ,MR,MS,MT,MU,MUSEUM,MV,MW,MX,MY,MZ,NA,NAME,NC,NE,NET,NF,NG,NI,NL,NO,NP,NR,NU,NZ,OM,ORG,PA,PE,PF,PG,PH,PK,PL,PM,PN,POST,PR,PRO,PS,PT,PW,PY,QA,RE,RO,RS,RU,RW,SA,SB,SC,SD,SE,SG,SH,SI,SJ,SK,SL,SM,SN,SO,SR,ST,SU,SV,SX,SY,SZ,TC,TD,TEL,TF,TG,TH,TJ,TK,TL,TM,TN,TO,TP,TR,TRAVEL,TT,TV,TW,TZ,UA,UG,UK,US,UY,UZ,VA,VC,VE,VG,VI,VN,VU,WF,WS,XXX,YE,YT,ZA,ZM,ZW";

    $tldA = explode( ',' , strtolower( $tld ) );

    $isSubdomain = false;
    foreach( $tldA as $tld ){
        //..case-insensitive search for ".tld" anywhere in $src
        if( stripos( $src , '.'.$tld ) !== false ){
            $isSubdomain = true;
            break;
        }
    }
    //..prefix with $host (set elsewhere) if it is not a subdomain.
    if( !$isSubdomain ){
        $src = $host . '/' . ltrim( $srcA['path'] , '/' );
    }

}

A further confirmation could be written by taking each string flagged as a subdomain, extracting the part before the first '/', and testing its characters against a regular expression.
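That further confirmation might look something like the sketch below. The helper name `looks_like_hostname` and the exact pattern are my own assumptions; it only checks the shape of the text before the first slash and does not consult a real TLD list.

```php
<?php
// Hypothetical helper: does the text before the first "/" look like a host name?
function looks_like_hostname(string $src): bool
{
    // Everything before the first slash (the whole string if there is none).
    $candidate = strtolower(explode('/', $src, 2)[0]);

    // Dot-separated labels of letters, digits and hyphens, ending in an
    // alphabetic suffix of at least two letters. This is a shape check only.
    return preg_match(
        '/^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$/',
        $candidate
    ) === 1;
}

var_dump(looks_like_hostname('apps.example.com/img/logo.png')); // bool(true)
var_dump(looks_like_hostname('img/photo.com.jpg'));             // bool(false)
```

Restricting the test to the first segment avoids flagging a path such as `img/photo.com.jpg` just because `.com` appears somewhere in it. A bare file name like `logo.png` would still pass this shape check, so it works best combined with the suffix list above.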

Hope this helps some people out.
