Python:下载大文件时出现不可预测的内存错误
我编写了一个 python 脚本,用于从 HTTP 服务器下载大量视频文件(每个 50-400 MB)。到目前为止,它在长下载列表上运行良好,但由于某种原因,它很少出现内存错误。
该机器有大约 1 GB 的可用 RAM,但我认为在运行此脚本时它不会耗尽 RAM。
我监视了任务管理器和 perfmon 中的内存使用情况,它的行为始终与我所看到的相同:在下载过程中缓慢增加,然后在完成下载后恢复到正常水平(没有小泄漏蔓延或类似的东西)。
下载的行为方式是创建文件,该文件保持为 0 KB,直到下载完成(或程序崩溃),然后立即写入整个文件并关闭它。
for i in range(len(urls)):
if os.path.exists(folderName + '/' + filenames[i] + '.mov'):
print 'File exists, continuing.'
continue
# Request the download page
req = urllib2.Request(urls[i], headers = headers)
sock = urllib2.urlopen(req)
responseHeaders = sock.headers
body = sock.read()
sock.close()
# Search the page for the download URL
tmp = body.find('/getfile/')
downloadSuffix = body[tmp:body.find('"', tmp)]
downloadUrl = domain + downloadSuffix
req = urllib2.Request(downloadUrl, headers = headers)
print '%s Downloading %s, file %i of %i'
% (time.ctime(), filenames[i], i+1, len(urls))
f = urllib2.urlopen(req)
# Open our local file for writing, 'b' for binary file mode
video_file = open(foldername + '/' + filenames[i] + '.mov', 'wb')
# Write the downloaded data to the local file
video_file.write(f.read()) ##### MemoryError: out of memory #####
video_file.close()
print '%s Download complete!' % (time.ctime())
# Free up memory, in hopes of preventing memory errors
del f
del video_file
这是堆栈跟踪:
File "downloadVideos.py", line 159, in <module>
main()
File "downloadVideos.py", line 136, in main
video_file.write(f.read())
File "c:\python27\lib\socket.py", line 358, in read
buf.write(data)
MemoryError: out of memory
I wrote a python script which I am using to download a large number of video files (50-400 MB each) from an HTTP server. It has worked well so far on long lists of downloads, but for some reason it rarely has a memory error.
The machine has about 1 GB of RAM free, but I don't think it's ever maxed out on RAM while running this script.
I've monitored the memory usage in the task manager and perfmon and it always behaves the same from what I've seen: slowly increases during the download, then returns to normal level after it finishes the download (There's no small leaks that creep up or anything like that).
The way the download behaves is that it creates the file, which remains at 0 KB until the download finishes (or the program crashes), then it writes the whole file at once and closes it.
for i in range(len(urls)):
if os.path.exists(folderName + '/' + filenames[i] + '.mov'):
print 'File exists, continuing.'
continue
# Request the download page
req = urllib2.Request(urls[i], headers = headers)
sock = urllib2.urlopen(req)
responseHeaders = sock.headers
body = sock.read()
sock.close()
# Search the page for the download URL
tmp = body.find('/getfile/')
downloadSuffix = body[tmp:body.find('"', tmp)]
downloadUrl = domain + downloadSuffix
req = urllib2.Request(downloadUrl, headers = headers)
print '%s Downloading %s, file %i of %i'
% (time.ctime(), filenames[i], i+1, len(urls))
f = urllib2.urlopen(req)
# Open our local file for writing, 'b' for binary file mode
video_file = open(foldername + '/' + filenames[i] + '.mov', 'wb')
# Write the downloaded data to the local file
video_file.write(f.read()) ##### MemoryError: out of memory #####
video_file.close()
print '%s Download complete!' % (time.ctime())
# Free up memory, in hopes of preventing memory errors
del f
del video_file
Here is the stack trace:
File "downloadVideos.py", line 159, in <module>
main()
File "downloadVideos.py", line 136, in main
video_file.write(f.read())
File "c:\python27\lib\socket.py", line 358, in read
buf.write(data)
MemoryError: out of memory
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您的问题在这里:
f.read()
。该行尝试将整个文件下载到内存中。相反,请分块读取 (chunk = f.read(4096)
),并将这些片段保存到临时文件中。Your problem is here:
f.read()
. That line attempts to download the entire file into memory. Instead of that, read in chunks (chunk = f.read(4096)
), and save the pieces to temporary file.