Is it possible to suppress the "skipping" behavior between wget tries?

Posted 2024-11-26 10:36:53


I'm using wget to download a set of files over HTTP, one wget call per URL, from a simple cmd.exe batch script.

Also, I alternate between mirrors randomly and want to keep a separate tree for each mirror, like:

http://server06//files/file1.txt  -> temp\server06\files\file1.txt
http://server03//files/file65.txt -> temp\server03\files\file65.txt

What I do now is:

echo !url! | .\runners\wget.exe --tries=3 --force-directories --directory-prefix=.\temp\ --input-file=-

Sometimes, for some reason, the server closes the TCP connection. I'm using --tries=3 to work around this. In such a case, wget's default behavior is to skip the bytes it has already downloaded and continue from that point, something like this:

2011-07-19 13:24:52 (68.1 KB/s) - Connection closed at byte 65396. Retrying.

--2011-07-19 13:24:54--  (try: 3)  http://server06//files/filex.txt
Connecting to server|10.10.0.108|:80... failed: Unknown error.
Resolving server... 10.10.0.108
Connecting to server|10.10.0.108|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 166400 (163K), 101004 (99K) remaining [text/plain]
Saving to: `./temp/server06/files/filex.txt'

        [ skipping 50K ]
    50K ,,,,,,,,,, ,,,....... .......... .......... .......... 61% 2.65M 0s
   100K .......... .......... .......... .......... .......... 92% 1.62M 0s
   150K .......... ..                                         100% 1.64M=0.06s

utime(./temp/server06/files/filex.txt): Permission denied
2011-07-19 13:25:15 (1.72 MB/s) - `./temp/server06/files/filex.txt' saved [166400/166400]

My problem is that I don't want wget to download the file in two parts. I want wget to try more times, but if any attempt fails for any reason, I want it to start over (even at the cost of not downloading the file at all!).

The background is that I'm testing code in a filter driver that is only exercised if the file is downloaded in one piece, and my tests fail because of this behavior.

The question is: is it possible to suppress this behavior? I.e., make wget retry as many times as configured by a parameter, while downloading either the complete file or zero bytes on each attempt?

Or should I look for another workaround?
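
One possible workaround is to move the retry loop out of wget and into the batch script itself: run each attempt with --tries=1, and delete any partially downloaded file before (and after) a failed attempt, so every attempt is effectively all-or-nothing. A minimal sketch, assuming the script can derive the local path that --force-directories will produce; the url, outfile, and MAXTRIES names below are illustrative placeholders, not part of the original script:

@echo off
setlocal EnableDelayedExpansion

rem Illustrative placeholders -- adapt to how the surrounding batch sets them.
set "url=http://server06//files/file1.txt"
set "outfile=.\temp\server06\files\file1.txt"
set MAXTRIES=3

set tries=0
:retry
set /a tries+=1
rem Delete any leftover partial file so every attempt starts from byte zero.
if exist "!outfile!" del "!outfile!"
echo !url! | .\runners\wget.exe --tries=1 --force-directories --directory-prefix=.\temp\ --input-file=-
if errorlevel 1 (
    if !tries! lss %MAXTRIES% goto retry
    rem All attempts failed: leave no partial file behind,
    rem matching the "zero bytes" requirement above.
    if exist "!outfile!" del "!outfile!"
)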


Comments (1)

南街女流氓 2024-12-03 10:36:54


I am sure you will be happier with the libcurl library. It takes just one call per URL and libcurl does all the rest of the work. On top of that, there's first-rate support for the package.

The particular case you're having trouble with won't be a problem using libcurl.
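
For a rough sense of what this could look like without leaving the batch file, here is a sketch that uses the curl command-line tool (which is built on libcurl) rather than calling the library directly. The explicit --output path stands in for wget's --force-directories, and whether --retry restarts a failed transfer from byte zero on your curl version is an assumption worth verifying (resuming normally requires an explicit -C); the curl.exe path and retry count are placeholders as well:

rem Sketch only: paths, retry count, and output layout are assumptions.
rem --create-dirs builds the temp\server06\files\ tree for the given --output path.
.\runners\curl.exe --fail --retry 3 --create-dirs ^
    --output ".\temp\server06\files\file1.txt" ^
    "http://server06//files/file1.txt"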

HTH
