wget: visiting a URL with a parent directory ("..") right after the hostname
Update: I upgraded wget from 1.10 to 1.12, and that solved the problem.
For example, given
www.example.com/level1/level2/../test.html
both wget and the browser will visit
www.example.com/level1/test.html
But given
www.example.com/../test.html
wget will visit
www.example.com/../test.html
while the browser will visit
www.example.com/test.html
I am using wget to fetch some webpages in order to get the size of each page and of the elements inside it.
I have now found that some webpages reference "../css/xxx.jpg" instead of "css/xxx.jpg".
Visiting such a page works fine in a browser, but not with wget.
Is there a way to solve this? Thank you.
Comments (1)
Before passing URLs to wget, trim any "../" from the beginning of the path. (Splitting the URLs into components would help.)
How to do this depends on what language or framework you are using.
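
As one possible illustration of the trimming step, here is a minimal Python sketch. It assumes the URLs are absolute and include a scheme (e.g. "http://"), and it collapses "." and ".." segments roughly the way a browser would, dropping any ".." that would climb above the root. The function name normalize_url and the base page URL are made up for this example; as a side effect, this sketch also collapses repeated slashes in the path.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Collapse "." and ".." path segments (hypothetical helper).

    A ".." that would go above the root is simply dropped, which
    matches what the browser does in the question.
    """
    parts = urlsplit(url)
    kept = []
    for segment in parts.path.split("/"):
        if segment == "..":
            if kept:
                kept.pop()  # step up one level when possible
        elif segment not in ("", "."):
            kept.append(segment)
    path = "/" + "/".join(kept)
    # Preserve a trailing slash for directory-style paths.
    if kept and parts.path.endswith(("/", "/.", "/..")):
        path += "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    # Assumed base page; relative links are resolved against it first.
    page = "http://www.example.com/index.html"
    print(normalize_url(urljoin(page, "../css/xxx.jpg")))
    print(normalize_url("http://www.example.com/../test.html"))
```

The cleaned URL can then be handed to wget as usual. Per the update in the question, upgrading wget to 1.12 also made the problem go away, so this step may only be needed with older versions.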