Python memory allocation error using subprocess.Popen

Posted 2024-10-21 16:43:12

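I am doing some bioinformatics work. I have a Python script that at one point calls an external program to do an expensive job (sequence alignment, which uses a lot of computational power and memory). I call it using subprocess.Popen. When I run the script on a test case, it completes fine. However, when I run it on the full file, where it has to do this multiple times for different sets of inputs, it dies. subprocess throws: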

OSError: [Errno 12] Cannot allocate memory

I found a few links to similar problems, but I'm not sure that they apply in my case.

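By default, the sequence aligner will try to request 51000M of memory. It doesn't always use that much, but it might. With the full input loaded and processed, that much is not available. However, capping the amount it requests at a lower value that might be available at run time still gives me the same error. I've also tried running with shell=True, with the same result.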

This has been bugging me for a few days now. Thanks for any help.

Edit: Expanding the traceback, this is the call that throws the error:

File "..../python2.6/subprocess.py", line 1037, in _execute_child
    self.pid=os.fork()
OSError: [Errno 12] Cannot allocate memory

Edit 2: Running Python 2.6.4 on 64-bit Ubuntu 10.04



鯉魚旗 2024-10-28 16:43:12

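I really feel sorry for the OP. Six years later, nobody has mentioned that this is a very common problem on Unix and actually has nothing to do with Python or bioinformatics. A call to os.fork() temporarily doubles the memory of the parent process (the parent's memory must be made available to the child), before throwing it all away to do an exec(). While this memory isn't always actually copied, the system must have enough memory to allow for it to be copied. So if your parent process is using more than half of the system memory and you spawn a child, even one as small as "wc -l", you are going to run into a memory error.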

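The solution is to use posix_spawn, or to create all your subprocesses at the beginning of the script, while memory consumption is low, and then use them later on, after the parent process has done its memory-intensive work.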
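As a rough illustration of the posix_spawn route: Python exposes it as os.posix_spawn and os.posix_spawnp from version 3.8 on platforms that provide it, so this is not available on the Python 2.6 from the question. The helper name run_with_posix_spawn is made up for this sketch, and whether it actually avoids the ENOMEM depends on how the system's C library implements posix_spawn.

```python
import os
import sys


def run_with_posix_spawn(argv):
    """Start argv via posix_spawnp, wait for it, and return its exit code.

    Hypothetical helper for illustration; requires Python 3.8+.
    """
    # posix_spawnp looks argv[0] up on PATH, like the shell would.
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    # Wait for the child so it is reaped and does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Killed by a signal: report it as a negative number, as subprocess does.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    code = run_with_posix_spawn([sys.executable, "-c", "print('child ran')"])
    print("exit code:", code)
```

In a real script the argv list would be the aligner's command line instead of the small Python one-liner used here.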

Googling the keywords "os.fork" and "memory" will turn up several Stack Overflow posts on the topic that explain in more detail what is going on :)
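The other workaround, creating the child early while the parent is still small, could look roughly like the sketch below. It assumes Python 3.7+ and a made-up protocol: a small helper process is started first, then receives one JSON-encoded command line per line on its stdin and replies with the exit code. All names here (HELPER_SOURCE, start_helper, run_via_helper, stop_helper) are hypothetical. The helper still forks to run each command, but it forks from a tiny process rather than from the memory-heavy parent.

```python
import json
import subprocess
import sys

# Source of the helper: read one JSON argv per line, run it, reply with
# the exit code. The command's own stdout is discarded so that it cannot
# mix with the replies on the pipe.
HELPER_SOURCE = r"""
import json, subprocess, sys
for line in sys.stdin:
    argv = json.loads(line)
    code = subprocess.call(argv, stdout=subprocess.DEVNULL)
    sys.stdout.write(json.dumps(code) + "\n")
    sys.stdout.flush()
"""


def start_helper():
    """Launch the helper; call this before the parent allocates much memory."""
    return subprocess.Popen(
        [sys.executable, "-c", HELPER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )


def run_via_helper(helper, argv):
    """Ask the helper to run argv and return the command's exit code."""
    helper.stdin.write(json.dumps(argv) + "\n")
    helper.stdin.flush()
    return json.loads(helper.stdout.readline())


def stop_helper(helper):
    """Close the pipes so the helper's loop ends, then reap the helper."""
    helper.stdin.close()
    helper.stdout.close()
    return helper.wait()


if __name__ == "__main__":
    helper = start_helper()  # parent is still small at this point
    # ... the memory-intensive work of the parent would happen here ...
    print(run_via_helper(helper, [sys.executable, "-c", "pass"]))
    stop_helper(helper)
```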


错々过的事 2024-10-28 16:43:12

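This doesn't have anything to do with Python or the subprocess module. subprocess.Popen is merely reporting the error that it receives from the operating system. (What operating system are you using, by the way?) From man 2 fork on Linux: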

ENOMEM    fork()  failed  to  allocate  the  necessary  kernel  structures
          because memory is tight.

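Are you calling subprocess.Popen multiple times? If so, then I think the best you can do is make sure that the previous invocation of your process has terminated and been reaped before the next invocation.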
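A minimal sketch of that advice, assuming the commands can run one after another: each child is waited on before the next one is started, so at most one is alive at a time. The function name run_sequentially is invented for this example, and the command lines are placeholders for the real aligner invocations.

```python
import subprocess
import sys


def run_sequentially(commands):
    """Run each argv list in turn, reaping every child before starting the next."""
    exit_codes = []
    for argv in commands:
        proc = subprocess.Popen(argv)
        # wait() blocks until the child has exited and collects its status,
        # so no finished or still-running child is left behind.
        exit_codes.append(proc.wait())
    return exit_codes


if __name__ == "__main__":
    # Placeholder commands; substitute the aligner's command lines.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(3)]
    print(run_sequentially(jobs))
```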

