Parsing the domain from a URL in PHP

Posted on 2024-08-23 00:23:50

How can I parse the domain from a URL in PHP? It seems that I need a database of country-code domains.

Examples:

http://mail.google.com/hfjdhfjd/jhfjd.html -> google.com
http://www.google.bg/jhdjhf/djfhj.html -> google.bg
http://www.google.co.uk/djhdjhf.php -> google.co.uk
http://www.tsk.tr/jhjgc.aspx -> tsk.tr
http://subsub.sub.nic.tr/ -> nic.tr
http://subsub.sub.google.com.tr -> google.com.tr
http://subsub.sub.itoy.info.tr -> itoy.info.tr

Can it be done with a whois request?

Edit: Only a few domain names sit directly under .tr (www.nic.tr, www.tsk.tr); the others, as you know, look like www.something.com.tr or www.something.org.tr.

Also, there is no www.something.com.bg or www.something.org.bg. Bulgarian domains are www.something.bg, like the Germans' .de.

But there are also www.something.a.bg and www.something.b.bg, and thus a.bg, b.bg, c.bg and so on (a.bg is like co.uk).

There must be a list of these top-level domain names somewhere on the net.

Check how the URL http://www.agrotehnika97.a.bg/ is coloured in Internet Explorer.
Also check:

www.google.co.uk
www.google.com.tr
www.nic.tr
www.tsk.tr

Comments (4)

饮湿 2024-08-30 00:23:50

The domain of the current request is stored in $_SERVER['HTTP_HOST'].

EDIT: I believe this returns the whole host name. To get just the top-level part together with the name registered under it (for example google.co.uk), you could do this:

// Add all your wanted subdomains that act as top-level domains, here (e.g. 'co.cc' or 'co.uk')
// As array key, use the last part ('cc' and 'uk' in the above examples) and the first part as sub-array elements for that key
$allowed_subdomains = array(
    'cc'    => array(
        'co'
    ),
    'uk'    => array(
        'co'
    )
);

$domain = $_SERVER['HTTP_HOST'];
$parts = explode('.', $domain);
$top_level = array_pop($parts);

// Take care of allowed subdomains: fold e.g. 'co' into the suffix so it becomes 'co.uk'
if (isset($allowed_subdomains[$top_level])
    && in_array(end($parts), $allowed_subdomains[$top_level], true))
{
    $top_level = array_pop($parts).'.'.$top_level;
}

// Prepend the next label, giving e.g. 'example.co.uk'
$top_level = array_pop($parts).'.'.$top_level;
泛泛之交 2024-08-30 00:23:50

You can use parse_url() to split it up and get what you want.
Here's an example...

    $url = 'http://www.google.com/search?hl=en&source=hp&q=google&btnG=Google+Search&meta=lr%3D&aq=&oq=dasd';
    print_r(parse_url($url));

This will print:

Array
(
    [scheme] => http
    [host] => www.google.com
    [path] => /search
    [query] => hl=en&source=hp&q=google&btnG=Google+Search&meta=lr%3D&aq=&oq=dasd
)
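
If only the host is needed, parse_url() can return it directly through its second argument. The sketch below assumes a hypothetical helper name, hostFromUrl(), and assumes that an input without a scheme (such as www.google.co.uk from the question) should be treated as http. Note that it still yields the full host name, not the registrable domain the question asks for.

```php
<?php
// Hypothetical helper (not part of PHP): returns the lower-cased host of a
// URL, or null if no host can be found.
function hostFromUrl(string $url): ?string
{
    // Without a scheme, parse_url() treats "www.google.co.uk" as a path,
    // so this sketch assumes http:// when no scheme is present.
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    // With PHP_URL_HOST, parse_url() returns just the host component.
    $host = parse_url($url, PHP_URL_HOST);

    return is_string($host) ? strtolower($host) : null;
}

echo hostFromUrl('http://mail.google.com/hfjdhfjd/jhfjd.html'), "\n"; // mail.google.com
echo hostFromUrl('www.google.co.uk'), "\n";                            // www.google.co.uk
```

The host still has to be reduced to google.com or google.co.uk by some other means, which is where a suffix list comes in.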
陌路黄昏 2024-08-30 00:23:50

I reckon you'll need a list of all suffixes used after a domain name.
http://publicsuffix.org/list/ provides an up-to-date list (or so they claim) of all suffixes currently in use.
The list is actually here
The idea would be to parse that list into a structure in which the levels are split on the dot, starting from the last level.

So, for instance, for the suffixes:
com.la
com.tr
com.lc

you'd end up with:

[la]=>[com]
[tr]=>[com]
[lc]=>[com]

etc.

Then you'd get the host from the URL (using parse_url) and explode it on dots. You then match the pieces against your structure, starting with the last one.

So for google.com.tr you'd start by matching tr, then com, and you won't find a match once you get to google. That first unmatched label plus the matched suffix (google.com.tr) is the domain you want.
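
The matching described above can be sketched roughly as follows. The function names buildSuffixTree() and registrableDomain() are hypothetical, and the $suffixes array is a small hand-picked sample chosen to cover the question's examples, not the contents of the real Public Suffix List. The sketch also assumes every node in the tree is itself a valid suffix and ignores the list's wildcard and exception rules.

```php
<?php
// Hand-picked sample only (an assumption for illustration); a real program
// would load the full Public Suffix List instead.
$suffixes = ['com', 'bg', 'a.bg', 'b.bg', 'uk', 'co.uk',
             'tr', 'com.tr', 'org.tr', 'info.tr'];

// Build a nested array keyed by label, starting from the last label:
// 'com.tr' becomes $tree['tr']['com'].
function buildSuffixTree(array $suffixes): array
{
    $tree = [];
    foreach ($suffixes as $suffix) {
        $node = &$tree;
        foreach (array_reverse(explode('.', $suffix)) as $label) {
            if (!isset($node[$label])) {
                $node[$label] = [];
            }
            $node = &$node[$label];
        }
        unset($node); // break the reference before the next suffix
    }
    return $tree;
}

// Walk the host's labels from right to left while they match the tree;
// the first label that does not match is the registered name.
function registrableDomain(string $host, array $tree): ?string
{
    $labels  = explode('.', strtolower(rtrim($host, '.')));
    $node    = $tree;
    $matched = 0;
    for ($i = count($labels) - 1; $i >= 0; $i--) {
        if (!isset($node[$labels[$i]])) {
            break;
        }
        $node = $node[$labels[$i]];
        $matched++;
    }
    // Unknown suffix, or the host is nothing but a suffix.
    if ($matched === 0 || $matched === count($labels)) {
        return null;
    }
    return implode('.', array_slice($labels, -($matched + 1)));
}

$tree = buildSuffixTree($suffixes);
echo registrableDomain('subsub.sub.google.com.tr', $tree), "\n"; // google.com.tr
echo registrableDomain('subsub.sub.nic.tr', $tree), "\n";        // nic.tr
echo registrableDomain('www.agrotehnika97.a.bg', $tree), "\n";   // agrotehnika97.a.bg
```

Returning null for an unknown or bare suffix is a design choice of this sketch; a caller could just as well fall back to the full host name.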

北城半夏 2024-08-30 00:23:50

Regex and parse_url() alone aren't a solution for you.

You need a package that uses the Public Suffix List; only that way can you correctly extract domains under two- and three-level suffixes (co.uk, a.bg, b.bg, etc.). I recommend using TLD Extract.

Here is a code example:

$extract = new LayerShifter\TLDExtract\Extract();

$result = $extract->parse('http://subsub.sub.google.com.tr');
$result->getRegistrableDomain(); // will return (string) 'google.com.tr'