My HTTPS website cannot be downloaded with wget

Posted on 2024-12-09 11:28:21

I can open the page in a browser, but I can't download the HTML page with wget:
https://money.benck.tw

When I use wget, it can't even connect to the site:

--2011-10-12 05:30:24--  https://money.benck.tw/
Resolving money.benck.tw... 97.107.135.68
Connecting to money.benck.tw|97.107.135.68|:443... failed: Connection timed out.
Retrying.

--2011-10-12 05:33:35--  (try: 2)  https://money.benck.tw/
Connecting to money.benck.tw|97.107.135.68|:443...

However, I can download from other HTTPS sites, such as https://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js, which makes this very strange.

Comments (3)

清旖 2024-12-16 11:28:21

For this website you have to use the --no-check-certificate option:

wget --no-check-certificate https://money.benck.tw

呆头 2024-12-16 11:28:21

I'm experiencing the same issue. I'm trying to download files from an external site, such as https://downloads.wordpress.org/plugin/easy-wp-smtp.zip, and wget still does not work even with --no-check-certificate. It freezes at this line:

Connecting to downloads.wordpress.org (downloads.wordpress.org)|198.143.164.250|:443...

Does anyone have the same issue?

No iptables rules are configured. When I do this on another server on the same network, it works fine. This only happens on this particular server.

Regards,
Francisco Yu
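
A note on the symptom above: wget prints "Connecting to ...|:443..." before any TLS handshake starts, so a hang at that line means the TCP connection itself is not being established, and certificate options such as --no-check-certificate cannot affect it. A minimal Python sketch for checking raw TCP reachability from the affected server follows; the function name can_connect and the 5-second default timeout are illustrative assumptions, not something taken from the posts above.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs only the TCP handshake;
        # no TLS negotiation or certificate check happens at this stage.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False
```

If can_connect("downloads.wordpress.org") returns False on this server but True on another machine in the same network, the cause is more likely routing, a firewall, or an outbound filter than wget or certificates.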

伏妖词 2024-12-16 11:28:21

This is probably because the page is scraped with wget too often. You need to modify the request headers, especially the User-Agent.

Example from another website:

--no-check-certificate does not help:

wget --no-check-certificate "https://www.money.pl/pieniadze/depozyty/walutowearch/1921-02-05,2021-02-05,LIBORCHF3M,strona,1.html"
--2021-02-05 17:05:34--  https://www.money.pl/pieniadze/depozyty/walutowearch/1921-02-05,2021-02-05,LIBORCHF3M,strona,1.html
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving www.money.pl (www.money.pl)... 212.77.101.20
Connecting to www.money.pl (www.money.pl)|212.77.101.20|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-02-05 17:05:34 ERROR 403: Forbidden.

But another download tool that sends different headers works:

 http -h "https://www.money.pl/pieniadze/depozyty/walutowearch/1921-02-05,2021-02-05,LIBORCHF3M,strona,1.html"  
HTTP/1.1 200 OK
Cache-control: max-age=60, public,stale-while-revalidate=5
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 20756
Content-Security-Policy: upgrade-insecure-requests;
Content-Type: text/html; charset=iso-8859-2
Date: Fri, 05 Feb 2021 16:04:16 GMT
Link: <https://money.wp.pl/dGxwOTV0SyYZFTlneUtGM1pNbSY9EkhlJ1V1dglvOxgnKBALCW87GCcoEAsJbzsYJygQCwlvOxgnKBALCW87GCcoEAsJbzsYJygQCwlvOxgnKBALCW87GCcoEAsJbzsYJygQCwlvOxgnKBALCW87GCcoEAsJbzsYJygQCwlvOxgnKBALCW87GCcobXh0RUZ9WlgoNTAeDjRHBTlpZxYWIhMeKydrAld1TER2ciZYECoUSjgjIR4JKBYSNnomXEF1TUUJJD9VCi4ZEzUxcwJRdT4TKiQ5Sh0zAVJ9YWR2EyYUAjs7IVUFNRsfamZjAiJ2QUV-eWYCSXdNUn1hZHNWd0pGYmRkHVRyXUV6ZhV8LQU3JQwcEAMpYkpCfRclRBYoFhZqZmMCJ3ZWHzs5OhY0EDkoLjA0VFl1XgQ_PTgNKRMbQgIuB0lCIRQEOzUiWQB6XhYrIgVcCzMLSn9lZhYHJBkDKjM5Qh16DxYjISJJRjo=>;rel="preload";as="script";
Server: nginx
Set-Cookie: mny_ver2=v8c;Domain=.money.pl;Path=/;Max-Age=2592000;
Vary: Accept-Encoding
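
To illustrate the point about headers, here is a minimal Python sketch that sends a request with a custom User-Agent. The User-Agent string, the helper names build_request and fetch, and the 10-second timeout are assumptions for illustration; whether www.money.pl accepts this particular value is not confirmed by the output above. wget can do the same with its --user-agent option.

```python
import urllib.request

# Hypothetical browser-like User-Agent string; any value the server accepts would do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=BROWSER_UA):
    """Build a GET request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=10.0):
    """Download `url` with the custom User-Agent and return the body as bytes."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read()
```

Calling fetch() with the URL from the example above would show whether the 403 depends on the User-Agent alone or on other headers as well.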