How do I debug a MemoryError in Python? What tools are there for tracking memory usage?
I have a Python program that dies with a MemoryError when I feed it a large file. Are there any tools that I could use to figure out what's using the memory?
This program ran fine on smaller input files. The program obviously needs some scalability improvements; I'm just trying to figure out where. "Benchmark before you optimize", as a wise person once said.
(Just to forestall the inevitable "add more RAM" answer: This is running on a 32-bit WinXP box with 4GB RAM, so Python has access to 2GB of usable memory. Adding more memory is not technically possible. Reinstalling my PC with 64-bit Windows is not practical.)
EDIT: Oops, this is a duplicate of "Which Python memory profiler is recommended?"
4 Answers
Heapy is a memory profiler for Python, which is the type of tool you need.
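For reference, a minimal sketch of how Heapy is typically used, assuming the guppy package that provides it is installed (the port for Python 3 is named guppy3); the workload list is a hypothetical stand-in for your program's data:

    from guppy import hpy

    hp = hpy()
    hp.setrelheap()  # measure allocations relative to this point

    # Hypothetical workload standing in for parsing a large file.
    data = [str(i) * 10 for i in range(100000)]

    # Prints a breakdown of live objects by type, count, and size.
    print(hp.heap())

Calling setrelheap() first filters out objects that already existed at startup, so the report focuses on what your own code allocated.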
The simplest and most lightweight way would likely be to use the built-in memory query capabilities of Python, such as sys.getsizeof - just run it on your objects for a reduced problem (i.e. a smaller file) and see what takes a lot of memory.
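For example, a small sketch of that approach; the lines and index containers here are hypothetical stand-ins for whatever your program builds from a smaller input file:

    import sys

    # Hypothetical structures built from a reduced (smaller) input file.
    lines = ["some parsed line"] * 100000
    index = {i: line for i, line in enumerate(lines)}

    # Shallow sizes only: sys.getsizeof does not follow references, so a
    # list of strings reports the list's own footprint, not the strings
    # inside it. Nested structures need recursive accounting.
    print("lines:", sys.getsizeof(lines), "bytes (list object only)")
    print("index:", sys.getsizeof(index), "bytes (dict object only)")
    print("one line:", sys.getsizeof(lines[0]), "bytes")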
In your case, the answer is probably very simple: Do not read the whole file at once but process the file chunk by chunk. That may be very easy or complicated depending on your usage scenario. Just for example, an MD5 checksum computation can be done much more efficiently for huge files without reading the whole file in. The latter change has dramatically reduced memory consumption in some SCons usage scenarios but was almost impossible to trace with a memory profiler.
If you still need a memory profiler: eliben already suggested sys.getsizeof. If that doesn't cut it, try Heapy or Pympler.
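As a concrete illustration of the chunk-by-chunk idea, here is a minimal sketch of an incremental MD5 computation; the function name and chunk size are arbitrary choices for this example, not code from SCons:

    import hashlib

    def md5_of_file(path, chunk_size=64 * 1024):
        """Compute an MD5 checksum without loading the whole file into memory."""
        digest = hashlib.md5()
        with open(path, "rb") as f:
            # read(chunk_size) returns at most chunk_size bytes, so memory
            # use stays bounded no matter how large the file is.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

The same pattern (read a bounded chunk, update incremental state, discard the chunk) applies to most line- or record-oriented processing as well.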
You asked for a tool recommendation:
Python Memory Validator allows you to monitor the memory usage, allocation locations, GC collections, object instances, memory snapshots, etc., of your Python application. Windows only.
http://www.softwareverify.com/python/memory/index.html
Disclaimer: I was involved in the creation of this software.