Safest way to validate a URL natively in Node?
I have a JS app that needs to validate URLs. I've seen posts that suggest using libraries or regex, but that is not what I am looking for.
I want to know if there is a native method that validates URLs (it is fine if it only checks the http and https protocols).
I've tried the dns module:
    dns.lookup(<MY_URL>, {}, (err) => {
      if (err) {
        /* do something */
      } else {
        /* do something else */
      }
    });
but it does not seem to work in every case.
There are various things you can check about a URL.
1. You can check that it's a properly formed URL that has all the expected parts. If it's not properly formed, then it's not a valid URL, but just because it's properly formed does not mean it's actually a working URL.
2. If it passes the first test, you can parse the URL, get the host and see if the host exists in DNS. If it doesn't exist in DNS, then it's probably not a working URL.
3. If it exists in DNS, you can probe the host on whatever port is present in or implied by the URL and see if you get any response. Assuming this is an http(s) URL, you can try a GET / request and see if you get any sort of connection and/or network response. If you do, then an http(s) host apparently exists at that domain/protocol/port. If you get a connection error, then it would appear that the host is not operating.
4. If you expect the URL to be something that works for a GET request, then you can do a GET on the full path of the URL and see if you get a 2xx or 3xx response. If you do, then you appear to have an operating URL. If you don't, then apparently not.
If you don't know whether the URL is supposed to work for a GET, then checking the host for any response (as in step 3) is really all you can do, because you won't be able to conclude anything from a GET of the whole path.
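The checks above can be sketched with Node's built-in modules only. This is a minimal sketch, not a definitive implementation: the function names (parseHttpUrl, hostResolves, probe, checkUrl) and the 5-second timeout are illustrative assumptions, not part of any Node API. Note that dns.lookup expects a bare hostname rather than a full URL, which may be one reason passing the whole URL to it gives inconsistent results.

```javascript
const dns = require('node:dns').promises;
const http = require('node:http');
const https = require('node:https');

// Step 1: syntax check using the built-in WHATWG URL parser.
// Returns a URL object for well-formed http(s) URLs, otherwise null.
function parseHttpUrl(input) {
  let url;
  try {
    url = new URL(input); // throws a TypeError on malformed input
  } catch {
    return null;
  }
  return url.protocol === 'http:' || url.protocol === 'https:' ? url : null;
}

// Step 2: check whether the host part of the URL resolves.
async function hostResolves(url) {
  // url.hostname wraps IPv6 literals in brackets; dns.lookup wants them bare.
  const host = url.hostname.replace(/^\[|\]$/g, '');
  try {
    await dns.lookup(host);
    return true;
  } catch {
    return false;
  }
}

// Steps 3 and 4: send a GET and report the status code,
// or null on a connection error or timeout. Redirects are not followed,
// so a 3xx status is returned as-is.
function probe(url, timeoutMs = 5000) {
  const client = url.protocol === 'https:' ? https : http;
  return new Promise((resolve) => {
    const req = client.get(url, { timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body, only the status matters here
      resolve(res.statusCode);
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve(null));
  });
}

// Runs the checks in order and stops at the first failure.
async function checkUrl(input) {
  const url = parseHttpUrl(input);
  if (!url) return { wellFormed: false, resolves: false, status: null };
  if (!(await hostResolves(url))) {
    return { wellFormed: true, resolves: false, status: null };
  }
  const status = await probe(url); // use new URL('/', url) to probe only the host
  return { wellFormed: true, resolves: true, status };
}
```

Calling checkUrl('https://example.com/').then(console.log) would report how far the URL got. A caller expecting the URL to answer a GET would treat a status from 200 to 399 as working. Probing arbitrary user-supplied URLs makes real network requests, so whether steps 2 to 4 are appropriate depends on the application.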