Speed up Python requests download speed (by behaving appropriately to avoid throttling)

Posted 2025-01-21 07:57:25 · 1995 characters · 3 views · 0 comments

How can I download a file quickly using Python?

I tried different modules like wget and they all take about the same time to execute.
In this example I will get a file from Reddit:

https://v.redd.it/rfxd2e2zhet81/DASH_1080.mp4?source=fallback

    import datetime

    import requests

    video_url = "https://v.redd.it/rfxd2e2zhet81/DASH_1080.mp4?source=fallback"
    start = datetime.datetime.now()
    print(start)
    # Blocking GET: the whole body is read into memory before this returns.
    response = requests.get(video_url)
    stop = datetime.datetime.now()
    print(stop)
    print("status: " + str(response.status_code))

Output:

    2022-04-14 15:59:52.258759
    2022-04-14 16:02:03.791324
    status: 200

Using Firefox, the same request seemingly completes in less than a second.

[Screenshot: browser download]

A right-click and "Save Video As" is indistinguishable from instant.

My understanding from researching similar questions on Stack Overflow is that a minimal example like the one above should result in acceptable download times that depend only on my internet connection. https://www.speedtest.net/ configured for a single connection gives me the following result:

[Screenshot: connection speed]

The file is about 20 MB in size and really should not take long to download.

As a control, this call finishes fast.

    # Control request against a small HTML page (same imports as the snippet above).
    video_url = "https://stackoverflow.com/questions/71872663/speed-up-python-requests-download-speed"
    start = datetime.datetime.now()
    print(start)
    response = requests.get(video_url)
    stop = datetime.datetime.now()
    print(stop)
    print("status: " + str(response.status_code))

Output:

    2022-04-14 15:58:47.022299
    2022-04-14 15:58:47.418743
    status: 200

I ran the same request against a 40 MB file hosted on my own blob storage:

    2022-04-14 16:07:59.304382
    2022-04-14 16:08:00.729495
    status: 200

Based on the speed differences between Firefox, Python, and Python against other targets, it looks like Python is being throttled.
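To put numbers on that comparison, the posted timestamps can be turned into an approximate throughput. The sketch below is only illustrative: the helper name `throughput_mb_per_s` is hypothetical, and the file sizes are the rough figures stated in the question ("about 20 MB" and "40 MB"), not measured values.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def throughput_mb_per_s(size_mb, start, stop):
    """Return approximate MB/s given a size and two timestamp strings."""
    elapsed = (
        datetime.strptime(stop, TIMESTAMP_FORMAT)
        - datetime.strptime(start, TIMESTAMP_FORMAT)
    ).total_seconds()
    return size_mb / elapsed


# Sizes are the approximate values from the question (assumptions, not measurements).
reddit = throughput_mb_per_s(20, "2022-04-14 15:59:52.258759", "2022-04-14 16:02:03.791324")
blob = throughput_mb_per_s(40, "2022-04-14 16:07:59.304382", "2022-04-14 16:08:00.729495")
print(f"reddit: {reddit:.2f} MB/s, blob storage: {blob:.2f} MB/s")
```

Under those assumptions the Reddit download ran at well under 1 MB/s, while the blob storage download ran at tens of MB/s, which is the gap that suggests throttling rather than a slow connection.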

How can I make a Python script behave appropriately so that it avoids being throttled?

I tried using the headers that Firefox sent in its first request, to no avail: the outcome was the same.

Comments (2)

奈何桥上唱咆哮 2025-01-28 07:57:25

Using Firefox the same request completes in seemingly less than a
second. A right click and "save video as" is not distinguishable from
instant.

Observe that you got responses with code 206. 206 Partial Content means that range requests were sent, each presumably for a different part of the file. After the downloads finish, the parts are joined to recreate the file. This might allow a shorter download time, if every part is served at a speed similar to that of a single download. Such behavior might be emulated using requests by sending requests with the appropriate headers (see the linked description of 206 Partial Content) and using, for example, multiprocessing. Before that, I must warn you that not all servers support partial requests, and you should carefully consider whether the additional burden of writing such code is worth the gain you can achieve.
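A minimal sketch of that idea follows, under several assumptions: it uses a thread pool rather than multiprocessing (the work is I/O-bound, so threads are a simpler stand-in), the function names `split_ranges`, `fetch_range`, and `download_in_parts` are hypothetical, and the caller is expected to already know the file size, for example from the `Content-Length` header of an earlier response. It has not been tested against the Reddit URL from the question.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size, parts):
    """Split total_size bytes into up to `parts` inclusive (start, end) ranges."""
    if total_size <= 0 or parts <= 0:
        return []
    parts = min(parts, total_size)
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for index in range(parts):
        # The first `extra` ranges take one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_range(url, start, end):
    """Fetch one inclusive byte range; fail if the server ignores the Range header."""
    import requests  # third-party; imported lazily so split_ranges has no dependencies

    response = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=30)
    if response.status_code != 206:
        raise RuntimeError(f"expected 206 Partial Content, got {response.status_code}")
    return response.content


def download_in_parts(url, total_size, parts=4):
    """Download the file as concurrent range requests and join the parts in order."""
    ranges = split_ranges(total_size, parts)
    with ThreadPoolExecutor(max_workers=max(1, len(ranges))) as pool:
        # pool.map yields results in input order, so the chunks join correctly.
        chunks = pool.map(lambda byte_range: fetch_range(url, *byte_range), ranges)
        return b"".join(chunks)
```

The check for status 206 matters because a server that does not support ranges may answer 200 with the full body, in which case joining the "parts" would produce a corrupted file.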

木格 2025-01-28 07:57:25

It looks like the solution is to go outside the Python ecosystem.
I tested the solution that user @Daweo suggested in the comments.

It requires an aria2 installation.

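    import datetime
    from os import system

    video_url = "https://v.redd.it/rfxd2e2zhet81/DASH_1080.mp4?source=fallback"
    start = datetime.datetime.now()
    print(start)
    # Hand the download to the external aria2c tool; the URL is quoted for the shell.
    system('aria2c "' + video_url + '"')
    stop = datetime.datetime.now()
    print(stop)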

The output is:

    2022-04-15 21:38:09.693262

    04/15 21:38:09 [NOTICE] Downloading 1 item(s)

    04/15 21:38:10 [NOTICE] Download complete: /.../DASH_1080.mp4

    Download Results:
    gid   |stat|avg speed  |path/URI
    ======+====+===========+=======================================================
    c2e8ce|OK  |    59MiB/s|/.../DASH_1080.mp4

    Status Legend:
    (OK):download completed.
    2022-04-15 21:38:10.131280

So that took something like 400 ms.
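The same call can also be made without going through a shell. The following is a sketch, not the answer's original code: the helper names `build_aria2_command` and `download_with_aria2` are hypothetical, and it assumes `aria2c` is installed and on `PATH`. Passing an argument list to `subprocess.run` avoids shell quoting problems with characters such as `?` in the URL.

```python
import shutil
import subprocess
import time


def build_aria2_command(url, out_dir="."):
    """Build the aria2c argument list for downloading `url` into `out_dir`."""
    return ["aria2c", f"--dir={out_dir}", url]


def download_with_aria2(url, out_dir="."):
    """Run aria2c for `url` and return the elapsed wall-clock time in seconds."""
    if shutil.which("aria2c") is None:
        raise FileNotFoundError("aria2c was not found on PATH; install aria2 first")
    started = time.perf_counter()
    # check=True raises CalledProcessError if aria2c exits with a non-zero status.
    subprocess.run(build_aria2_command(url, out_dir), check=True)
    return time.perf_counter() - started
```

Calling `download_with_aria2(video_url)` with the URL from the question would then return the elapsed time directly, instead of requiring two timestamps to be subtracted by hand.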
