Match hostname when the string doesn't have protocol://?

Posted on 2024-12-10 12:20:39

I use this JS code to match a hostname in a string:

url.match(/:\/\/(www\.)?(.[^/:]+)/);

This works when the URL has protocol:// at the beginning. For example, this works fine:

var url = "http://domain.com/page";
url.match(/:\/\/(www\.)?(.[^/:]+)/);

But this doesn't:

var url = "domain.com/page";
url.match(/:\/\/(www\.)?(.[^/:]+)/);

I have tried:

url.match(/(:\/\/)?(www\.)?(.[^/:]+)/);

That matches the hostname fine when the string doesn't contain protocol://, but when it does, it returns only the protocol and not the hostname.

How can I match the hostname when the string doesn't contain a protocol?
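
To make the reported behaviour concrete, here is a small sketch that runs both patterns from the question against both inputs. The helper name `lastGroup` is purely illustrative and not part of the original code. With `(:\/\/)?` made optional, the engine can succeed at index 0 by skipping that group entirely, so the last capture group picks up the scheme text instead of the hostname.

```javascript
// Both patterns are copied from the question; only the helper is new.
const strict = /:\/\/(www\.)?(.[^/:]+)/;        // requires "://"
const optional = /(:\/\/)?(www\.)?(.[^/:]+)/;   // "://" made optional

// Hypothetical helper: returns the last capture group, or null on no match.
function lastGroup(re, input) {
  const m = input.match(re);
  return m ? m[m.length - 1] : null;
}

console.log(lastGroup(strict, "http://domain.com/page"));   // "domain.com"
console.log(lastGroup(strict, "domain.com/page"));          // null
console.log(lastGroup(optional, "domain.com/page"));        // "domain.com"
console.log(lastGroup(optional, "http://domain.com/page")); // "http"
```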

Comments (4)

-黛色若梦 2024-12-17 12:20:39

I used the parseUri function from Steven Levithan; it parses URLs quite decently.

Here's how you use this function:

  alert(parseUri("www.domain.com/foo").host)
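
If pulling in an external function is not an option, a different technique is the standard `URL` constructor that ships with browsers and Node.js. It is not what this answer uses, and it rejects input without a scheme, so the sketch below prepends a default one. The helper name `hostnameOf` and the choice of `http://` as the default scheme are assumptions for illustration.

```javascript
// Sketch using the built-in URL constructor instead of parseUri.
// hostnameOf is a hypothetical helper, not part of any library.
function hostnameOf(input) {
  // Does the string already start with something like "http://"?
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  // Assumption: treat scheme-less input as http so URL can parse it.
  const url = new URL(hasScheme ? input : "http://" + input);
  return url.hostname;
}

console.log(hostnameOf("http://domain.com/page")); // "domain.com"
console.log(hostnameOf("domain.com/page"));        // "domain.com"
console.log(hostnameOf("www.domain.com/foo"));     // "www.domain.com"
```

Note that `hostname` keeps a leading `www.` and drops any port; strip the prefix separately if you need the bare domain.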
会发光的星星闪亮亮i 2024-12-17 12:20:39

OK, before you have a brain meltdown from @xanatos's answer, here is a simple regex for basic needs. The other answers are more complete and handle more cases than this one:

(?:(?:(?:\bhttps?|ftp)://)|^)([-A-Z0-9.]+)/

Group 1 will have your host name. URL parsing is a fragile thing to do with regexes. You were on the right track. You had two regexes that worked partially. I simply combined them.

Edit: I was tired last night. Here is the regex for JavaScript:

if (subject.match(/(?:(?:(?:\bhttps?|ftp):\/\/)|^)([\-a-z0-9.]+)\//i)) {
    // Successful match
} else {
    // Match attempt failed
}
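
As a usage sketch, the snippet below wraps the same pattern in a helper that returns group 1. The name `hostFromRegex` is an assumption for illustration. One thing to watch: the pattern ends with a literal `/`, so input with no slash after the host does not match at all.

```javascript
// Same regex as in the answer above; only the wrapper is new.
const hostRe = /(?:(?:(?:\bhttps?|ftp):\/\/)|^)([\-a-z0-9.]+)\//i;

// Hypothetical helper: returns capture group 1 (the host) or null.
function hostFromRegex(subject) {
  const m = subject.match(hostRe);
  return m ? m[1] : null;
}

console.log(hostFromRegex("http://domain.com/page")); // "domain.com"
console.log(hostFromRegex("domain.com/page"));        // "domain.com"
console.log(hostFromRegex("www.domain.com/foo"));     // "www.domain.com"
console.log(hostFromRegex("domain.com"));             // null (no trailing slash)
```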
若无相欠,怎会相见 2024-12-17 12:20:39

This

var rx = /^(?:(?:ht|f)tp(?:s?)\:\/\/|~\/|\/)?(?:\w+:\w+@)?(?:(?:[-\w]+\.)+(?:com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|travel|[a-z]{2}))(?::[\d]{1,5})?(?:(?:(?:\/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|\/)+|\?|#)?(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d{2}])+=?(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$/;

should be the uber-url parsing regex :-) Taken from here http://flanders.co.nz/2009/11/08/a-good-url-regular-expression-repost/

Test here: http://jsfiddle.net/Qznzx/1/

It shows the uselessness of regexes.

蓝天白云 2024-12-17 12:20:39

This might be a bit more complex than necessary but it seems to work:

^((?:.+?:\/\/)?(?:.[^/:]+)+)$ 
  1. A non-capturing group for the protocol. From the start of the string,
    match as few characters as possible up to and including ://. There may
    be zero or one protocol.
  2. A non-capturing group for the rest of the URL. This part must exist.
  3. Wrap it all in a single capturing group.
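
The same idea, an optional non-capturing protocol group anchored at the start of the string, can be narrowed so that the capture holds only the hostname. The pattern and the helper name `matchHost` below are assumptions for illustration, not the pattern from this answer.

```javascript
// Sketch: optional protocol and optional "www." are non-capturing,
// and the single capture group stops at the first "/", ":", "?" or "#".
const hostOnly = /^(?:[a-z][a-z0-9+.-]*:\/\/)?(?:www\.)?([^\/:?#]+)/i;

// Hypothetical helper: returns the captured hostname or null.
function matchHost(input) {
  const m = input.match(hostOnly);
  return m ? m[1] : null;
}

console.log(matchHost("http://domain.com/page"));           // "domain.com"
console.log(matchHost("domain.com/page"));                  // "domain.com"
console.log(matchHost("www.domain.com/foo"));               // "domain.com"
console.log(matchHost("https://www.domain.com:8080/x?y=1")); // "domain.com"
```

Because the protocol group is anchored with `^` and cannot be skipped when a scheme is actually present, the capture never falls back to the scheme text.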