@7c/parsedomain 中文文档教程
parsedomain
将 URL 拆分为子域、域和顶级域。
由于不同国家和组织对域的处理方式不同,因此将 URL 拆分为子域、域和顶级域部分不是一个简单的正则表达式。 parsedomain 使用来自 publicsuffix.org 的大型已知顶级域列表来识别不同的部分域的。
该模块在底层使用 trie 数据结构来确保尽可能小的库大小和最快的查找速度。 该库经过压缩和 gzip 压缩后大约有 30KB。 由于 publicsuffix.org 经常更新,数据结构构建在 npm install
上作为 postinstall
挂钩。 如果在该步骤中出现问题,库将回退到发布时已构建的预构建列表。
Installation
npm install parsedomain
Updating Database
您可能需要不时更新数据库。 有时数据库下载可能会失败,您可能需要手动下载它
cd node-modules/parsedomain && node scripts-build-tries.js
Usage
// long subdomains can be handled
expect(parseDomain("some.subdomain.example.co.uk")).to.eql({
subdomain: "some.subdomain",
domain: "example",
tld: "co.uk"
});
// protocols, usernames, passwords, ports, paths, queries and hashes are disregarded
expect(parseDomain("https://user:password@example.co.uk:8080/some/path?and&query#hash")).to.eql({
subdomain: "",
domain: "example",
tld: "co.uk"
});
// unknown top-level domains are ignored
expect(parseDomain("unknown.tld.kk")).to.equal(null);
// invalid urls are also ignored
expect(parseDomain("invalid url")).to.equal(null);
expect(parseDomain({})).to.equal(null);
Introducing custom tlds
// custom top-level domains can optionally be specified
expect(parseDomain("mymachine.local",{ customTlds: ["local"] })).to.eql({
subdomain: "",
domain: "mymachine",
tld: "local"
});
// custom regexps can optionally be specified (instead of customTlds)
expect(parseDomain("localhost",{ customTlds:/localhost|\.local/ })).to.eql({
subdomain: "",
domain: "",
tld: "localhost"
});
有时使用辅助函数应用 customTlds 参数会很有帮助
function parseLocalDomains(url) {
return parseDomain(url, {
customTlds: /localhost|\.local/
});
}
expect(parseLocalDomains("localhost")).to.eql({
subdomain: "",
domain: "",
tld: "localhost"
});
expect(parseLocalDomains("mymachine.local")).to.eql({
subdomain: "",
domain: "mymachine",
tld: "local"
});
API
parseDomain(url: string, options: ParseOptions): ParsedDomain|null
如果 url
有一个,则返回 null
未知的顶级域名,或者它不是有效的网址。
ParseOptions
{
// A list of custom tlds that are first matched against the url.
// Useful if you also need to split internal URLs like localhost.
customTlds: RegExp|Array<string>,
// There are lot of private domains that act like top-level domains,
// like blogspot.com, googleapis.com or s3.amazonaws.com.
// By default, these domains would be split into:
// { subdomain: ..., domain: "blogspot", tld: "com" }
// When this flag is set to true, the domain will be split into
// { subdomain: ..., domain: ..., tld: "blogspot.com" }
// See also https://github.com/peerigon/parse-domain/issues/4
privateTlds: boolean - default: false
}
ParsedDomain
{
tld: string,
domain: string,
subdomain: string
}
Forked
这个 repo 是从 peerigon 分叉出来的,并进行了修改/扩展
parsedomain
Splits a URL into sub-domain, domain and the top-level domain.
Since domains are handled differently across different countries and organizations, splitting a URL into sub-domain, domain and top-level-domain parts is not a simple regexp. parsedomain uses a large list of known top-level domains from publicsuffix.org to recognize different parts of the domain.
This module uses a trie data structure under the hood to ensure the smallest possible library size and the fastest lookup. The library is roughly 30KB minified and gzipped. Since publicsuffix.org is frequently updated, the data structure is built on npm install
as a postinstall
hook. If something goes wrong during that step, the library falls back to a prebuilt list that has been built at the time of publishing.
Installation
npm install parsedomain
Updating Database
You might need to update the database from time to time. Sometimes database downloading might fail and you might need to download it manually
cd node-modules/parsedomain && node scripts-build-tries.js
Usage
// long subdomains can be handled
expect(parseDomain("some.subdomain.example.co.uk")).to.eql({
subdomain: "some.subdomain",
domain: "example",
tld: "co.uk"
});
// protocols, usernames, passwords, ports, paths, queries and hashes are disregarded
expect(parseDomain("https://user:password@example.co.uk:8080/some/path?and&query#hash")).to.eql({
subdomain: "",
domain: "example",
tld: "co.uk"
});
// unknown top-level domains are ignored
expect(parseDomain("unknown.tld.kk")).to.equal(null);
// invalid urls are also ignored
expect(parseDomain("invalid url")).to.equal(null);
expect(parseDomain({})).to.equal(null);
Introducing custom tlds
// custom top-level domains can optionally be specified
expect(parseDomain("mymachine.local",{ customTlds: ["local"] })).to.eql({
subdomain: "",
domain: "mymachine",
tld: "local"
});
// custom regexps can optionally be specified (instead of customTlds)
expect(parseDomain("localhost",{ customTlds:/localhost|\.local/ })).to.eql({
subdomain: "",
domain: "",
tld: "localhost"
});
It can sometimes be helpful to apply the customTlds argument using a helper function
function parseLocalDomains(url) {
return parseDomain(url, {
customTlds: /localhost|\.local/
});
}
expect(parseLocalDomains("localhost")).to.eql({
subdomain: "",
domain: "",
tld: "localhost"
});
expect(parseLocalDomains("mymachine.local")).to.eql({
subdomain: "",
domain: "mymachine",
tld: "local"
});
API
parseDomain(url: string, options: ParseOptions): ParsedDomain|null
Returns null
if url
has an unknown tld or if it's not a valid url.
ParseOptions
{
// A list of custom tlds that are first matched against the url.
// Useful if you also need to split internal URLs like localhost.
customTlds: RegExp|Array<string>,
// There are lot of private domains that act like top-level domains,
// like blogspot.com, googleapis.com or s3.amazonaws.com.
// By default, these domains would be split into:
// { subdomain: ..., domain: "blogspot", tld: "com" }
// When this flag is set to true, the domain will be split into
// { subdomain: ..., domain: ..., tld: "blogspot.com" }
// See also https://github.com/peerigon/parse-domain/issues/4
privateTlds: boolean - default: false
}
ParsedDomain
{
tld: string,
domain: string,
subdomain: string
}
Forked
This repo was forked from peerigon and modified/extended