Parsing a domain from a URL in PHP
How can I parse a domain from a URL in PHP? It seems that I need a country domain database.
Examples:
http://mail.google.com/hfjdhfjd/jhfjd.html -> google.com
http://www.google.bg/jhdjhf/djfhj.html -> google.bg
http://www.google.co.uk/djhdjhf.php -> google.co.uk
http://www.tsk.tr/jhjgc.aspx -> tsk.tr
http://subsub.sub.nic.tr/ -> nic.tr
http://subsub.sub.google.com.tr -> google.com.tr
http://subsub.sub.itoy.info.tr -> itoy.info.tr
Can it be done with a whois request?
Edit: There are a few domain names directly under .tr (www.nic.tr, www.tsk.tr); the others are, as you know, of the form www.something.com.tr, www.something.org.tr.
Also, there is no www.something.com.bg or www.something.org.bg. They are www.something.bg, like the Germans' .de. But there are www.something.a.bg, www.something.b.bg, thus a.bg, b.bg, c.bg and so on. (a.bg is like co.uk.)
There must be a list of these top domain names somewhere on the net.
Check how the URL http://www.agrotehnika97.a.bg/ is coloured in Internet Explorer.
Check also
www.google.co.uk
www.google.com.tr
www.nic.tr
www.tsk.tr
Comments (4)
The domain is stored in $_SERVER['HTTP_HOST'].
EDIT: I believe this returns the whole domain. To get just the top-level domain, you could do this:
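One way to do that, as a minimal sketch, is to take the last dot-separated label of the host. The helper name `lastLabel` and the fallback host used for command-line runs are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: returns the last dot-separated label of a host name.
function lastLabel(string $host): string
{
    // Drop an optional port ("example.com:8080") and any trailing dot.
    $host = rtrim(preg_replace('/:\d+$/', '', $host), '.');
    $labels = explode('.', strtolower($host));
    return end($labels);
}

// HTTP_HOST is only set for web requests, so fall back to a sample value.
$host = $_SERVER['HTTP_HOST'] ?? 'www.google.com.tr';
echo lastLabel($host), PHP_EOL;
```

Note that this yields only the final label (for example `tr`), not a registrable domain such as `google.com.tr`.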
You can use parse_url() to split it up and get what you want. Here's an example...
Will echo...
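A minimal sketch of this approach, using one of the URLs from the question (the variable names are illustrative): passing `PHP_URL_HOST` makes `parse_url()` return only the host component.

```php
<?php
// Sample URL taken from the question's examples.
$url = 'http://mail.google.com/hfjdhfjd/jhfjd.html';

// With PHP_URL_HOST, parse_url() returns just the host part of the URL.
$host = parse_url($url, PHP_URL_HOST);

echo $host, PHP_EOL; // mail.google.com
```

This gives the full host including subdomains, so on its own it does not reduce `mail.google.com` to `google.com`.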
I reckon you'll need a list of all suffixes used after a domain name.
http://publicsuffix.org/list/ provides an up-to-date list (or so they claim) of all suffixes currently in use.
The list is actually here
Now the idea would be to parse that list into a structure, with the different levels split by the dot, starting from the last level.
So, for instance, for the domains:
com.la
com.tr
com.lc
you'd end up with:
etc...
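One possible shape for such a structure, as a sketch only: a nested PHP array keyed by label, with the last label of each suffix at the outer level. The function name `buildSuffixTree` is hypothetical, and the sketch ignores the wildcard (`*.`) and exception (`!`) rules that the real Public Suffix List contains.

```php
<?php
// Hypothetical helper: builds a nested array from a list of suffixes,
// keyed label by label, starting from the last label of each suffix.
function buildSuffixTree(array $suffixes): array
{
    $tree = [];
    foreach ($suffixes as $suffix) {
        $node = &$tree;
        // "com.tr" becomes ["tr", "com"], so "tr" ends up as the outer key.
        foreach (array_reverse(explode('.', $suffix)) as $label) {
            if (!isset($node[$label])) {
                $node[$label] = [];
            }
            $node = &$node[$label];
        }
        unset($node); // break the reference before handling the next suffix
    }
    return $tree;
}

$tree = buildSuffixTree(['com.la', 'com.tr', 'com.lc']);
// $tree is now:
// ['la' => ['com' => []], 'tr' => ['com' => []], 'lc' => ['com' => []]]
print_r($tree);
```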
Then you'd get the host from base_url (by using parse_url) and explode it by dots. You then match the values against your structure, starting with the last one.
So for google.com.tr you'd start by matching tr, then com, and then you won't find a match once you get to google, which is the label you want.
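The matching step could look roughly like the sketch below. The `$tree` literal is a tiny hand-written sample, not the real list, and `registrableDomain` is a hypothetical name. Treating an unknown last label as a one-label suffix is an assumption made here so that hosts outside the sample still resolve sensibly.

```php
<?php
// Hand-written sample structure; a real one would be built from the full
// Public Suffix List. Keys are labels, outermost level is the last label.
$tree = [
    'com' => [],
    'bg'  => ['a' => [], 'b' => []],
    'uk'  => ['co' => []],
    'tr'  => ['com' => [], 'info' => []],
];

// Hypothetical helper: returns the suffix plus one more label, or null if
// the URL has no host or the host is itself just a suffix.
function registrableDomain(string $url, array $tree): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    // "www.google.com.tr" becomes ["tr", "com", "google", "www"].
    $labels = array_reverse(explode('.', strtolower(rtrim($host, '.'))));

    // Walk the structure from the last label until a label has no match.
    $node = $tree;
    $depth = 0; // number of trailing labels that belong to the suffix
    foreach ($labels as $label) {
        if (!isset($node[$label])) {
            break;
        }
        $node = $node[$label];
        $depth++;
    }
    $depth = max($depth, 1); // assumption: unknown last label counts as a suffix

    if (count($labels) <= $depth) {
        return null;
    }
    // Keep the suffix labels plus the first label that did not match.
    return implode('.', array_reverse(array_slice($labels, 0, $depth + 1)));
}

echo registrableDomain('http://subsub.sub.google.com.tr', $tree), PHP_EOL; // google.com.tr
echo registrableDomain('http://www.tsk.tr/jhjgc.aspx', $tree), PHP_EOL;    // tsk.tr
```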
Regex and parse_url() aren't a solution for you.
You need a package that uses the Public Suffix List; only in this way can you correctly extract domains with two- and three-level TLDs (co.uk, a.bg, b.bg, etc.). I recommend using TLD Extract.
Here is a code example: