How do I fetch password-protected web wiki pages?
I want to download a few password-protected web pages, along with the pages they link to. I have the user name and password and can access the pages from a normal browser. Since I want to save them to my local drive for later reference, I am using wget to fetch them:
wget --http-user=USER --http-password=PASS http://mywiki.mydomain.com/myproject
But this does not work: I am asked for the password again. Is there a better way to do this without being prompted for the password again? Also, what is the best way to fetch all the links and sub-links on a particular page and store them in a single folder?
Update:
The actual page I am trying to access is behind an HTTPS gateway, and its certificate is not getting validated. Is there any way to get past this?
mysystem-dsktp ~ $ wget --http-user=USER --http-password=PASS https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
--2010-01-24 18:09:21-- https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f
Resolving secure.site.mydomain.com... 124.123.23.12, 124.123.23.267, 124.123.102.191, ...
Connecting to secure.site.mydomain.com|124.123.23.12|:443... connected.
ERROR: cannot verify secure.site.mydomain.com's certificate, issued by `/C=US/O=Equifax/OU=Equifax Secure Certificate Authority':
Unable to locally verify the issuer's authority.
To connect to secure.site.mydomain.com insecurely, use `--no-check-certificate'.
Unable to establish SSL connection.
I also tried the --no-check-certificate option, but it does not work either. With this option I only get the login page, not the page I actually requested.
Comments (2)
Could you try it like this?
It seems you're trying to access a page secured by a login form rather than by HTTP authentication.
You could use the
--no-check-certificate
option and follow the suggestions in this forum thread: Can't log in with wget.
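If the gateway really uses a login form, --http-user and --http-password have no effect, because those options only answer HTTP authentication challenges. A common approach is to POST the form once, save the session cookie, and reuse it for the recursive download. Below is a rough sketch of that idea, not a tested recipe for this site. The form field names (username, password), the cookie file name, the output folder, and the recursion depth are all assumptions; the real field names have to be read from the HTML of the login page. The wiki URL is the decoded url= parameter from the question.

```shell
#!/usr/bin/env bash
# Sketch: log in through an HTML form with wget, then mirror the wiki.
# Assumptions (not from the original post): the form field names
# "username" and "password", the cookie file name, the output folder,
# and the recursion depth of 2.

LOGIN_URL='https://secure.site.mydomain.com/login?url=http://mywiki.mydomain.com%2fsite%2fmyproject%2f'
WIKI_URL='http://mywiki.mydomain.com/site/myproject/'
COOKIES='cookies.txt'
OUT_DIR='myproject-offline'
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Step 1: submit the login form and keep the session cookie it sets.
wiki_login() {
  run wget --no-check-certificate \
    --save-cookies "$COOKIES" --keep-session-cookies \
    --post-data 'username=USER&password=PASS' \
    -O /dev/null \
    "$LOGIN_URL"
}

# Step 2: reuse the cookie to fetch the page and its sub-links,
# flattening everything into a single folder.
wiki_mirror() {
  run wget --no-check-certificate \
    --load-cookies "$COOKIES" \
    --recursive --level=2 --no-parent \
    --page-requisites --convert-links \
    --no-directories --directory-prefix "$OUT_DIR" \
    "$WIKI_URL"
}

wiki_login && wiki_mirror
```

By default the script only prints the two wget commands so they can be checked first; running it with DRY_RUN=0 executes them. --no-directories together with --directory-prefix is what puts every downloaded file into one folder, as asked in the question. If the wiki is itself served over HTTPS with the same unverifiable certificate, --no-check-certificate is needed on the second command as well, which is why it appears in both.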