python mmap.error: Too many open files. What's wrong?

Posted 2024-11-04 05:28:58

I'm reading a bunch of NetCDF files using the pupynere interface (on Linux). The following code results in an mmap error:

import numpy as np
import os, glob
from pupynere import NetCDFFile as nc
alts = []
vals = []
path = 'coll_mip'
filter = '*.nc'
for infile in glob.glob(os.path.join(path, filter)):
    curData = nc(infile, 'r')
    vals.append(curData.variables['O3.MIXING.RATIO'][:])
    alts.append(curData.variables['ALTITUDE'][:])
    curData.close()

Error:

$ python2.7 /mnt/grid/src/profile/contra.py
Traceback (most recent call last):
  File "/mnt/grid/src/profile/contra.py", line 15, in <module>
  File "/usr/lib/python2.7/site-packages/pupynere-1.0.13-py2.7.egg/pupynere.py", line 159, in __init__
  File "/usr/lib/python2.7/site-packages/pupynere-1.0.13-py2.7.egg/pupynere.py", line 386, in _read
  File "/usr/lib/python2.7/site-packages/pupynere-1.0.13-py2.7.egg/pupynere.py", line 446, in _read_var_array
mmap.error: [Errno 24] Too many open files

Interestingly, if I comment out either one of the append calls, it works! What am I doing wrong? I am closing the file, right? This seems to be related to the Python lists somehow. I previously used a different, less efficient approach (always copying each element), and that worked.

PS: ulimit -n yields 1024; the program fails at file number 498.

Possibly related, but the solution there doesn't work for me: NumPy and memmap: [Errno 24] Too many open files

Comments (4)

没企图 2024-11-11 05:29:01

Hmm... Maybe, just maybe, using curData in a with statement might fix it? Just a wild guess.

EDIT: Does curData have a flush method, perchance? Have you tried calling that before close?

EDIT 2:
Python 2.5's with statement (lifted straight from Understanding Python's "with" statement):

with open("x.txt") as f:
    data = f.read()
    # do something with data

... basically, it always closes the resource (much like C#'s using construct).
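
For objects that only expose close(), the standard library's contextlib.closing gives the same guarantee as the with statement above. The sketch below uses a hypothetical FakeDataset class as a stand-in, since the thread does not say whether pupynere's NetCDFFile supports with directly; that part is an assumption, not a statement about the library.

```python
from contextlib import closing


class FakeDataset(object):
    """Hypothetical stand-in for a file-like object that only offers close()."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def read_values(self):
        # Pretend these values were read from the file.
        return [1.0, 2.0, 3.0]

    def close(self):
        self.closed = True


vals = []
dataset = FakeDataset("example.nc")

# closing() calls dataset.close() when the block exits, even on an exception.
with closing(dataset) as cur:
    vals.append(list(cur.read_values()))

print(dataset.closed)
```

Note that this only guarantees the close() call itself. If close() does not release a resource that is still referenced elsewhere (for example by data kept in a list), the with statement alone would not change that.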

喜爱皱眉﹌ 2024-11-11 05:29:01

How expensive is the nc() call? If it is 'cheap enough' to run twice on every file, does this work?

for infile in glob.glob(os.path.join(path, filter)):
    curData = nc(infile, 'r')
    vals.append(curData.variables['O3.MIXING.RATIO'][:])
    curData.close()
    curData = nc(infile, 'r')
    alts.append(curData.variables['ALTITUDE'][:])
    curData.close()

旧城烟雨 2024-11-11 05:29:00

My guess is that the mmap.mmap call in pupynere is holding the file descriptor open (or creating a new one). What if you do this:

vals.append(curData.variables['O3.MIXING.RATIO'][:].copy())
alts.append(curData.variables['ALTITUDE'][:].copy())
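
The guess above can be checked with the standard library alone. On Unix, CPython's mmap.mmap duplicates the file descriptor it is given, so the mapping keeps a descriptor open after the original file object is closed. The sketch below is Linux-only (it counts entries in /proc/self/fd) and uses a memoryview as a stand-in for a NumPy array backed by the mapping; bytes(view) plays the role that .copy() plays for an array. How pupynere manages its mappings internally is not shown here and remains an assumption.

```python
import mmap
import os
import tempfile


def open_fd_count():
    # Linux-only: each entry in /proc/self/fd is one open descriptor.
    return len(os.listdir("/proc/self/fd"))


# Create a small scratch file to map.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096)
    path = tmp.name

baseline = open_fd_count()

f = open(path, "rb")
mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
f.close()
# The file object is closed, but the mapping still holds its own descriptor.
after_file_close = open_fd_count()

view = memoryview(mm)[:8]   # data that still points into the mapping
detached = bytes(view)      # independent copy that no longer needs the mapping
view.release()
mm.close()                  # possible now that nothing references the mapping
after_mmap_close = open_fd_count()
os.unlink(path)

leaked_while_mapped = after_file_close - baseline
leaked_after_unmap = after_mmap_close - baseline
print(leaked_while_mapped, leaked_after_unmap)
```

If every stored array kept one mapping alive, two variables per file would cost two descriptors per file, and 498 files would account for 996 of the 1024 allowed. That is consistent with the numbers in the question, though it does not prove this is what happens inside pupynere.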
掀纱窥君容 2024-11-11 05:29:00

@corlettk: yes, since this is Linux, strace -e trace=file will do:

strace -e trace=file,desc,munmap python2.7 /mnt/grid/src/profile/contra.py

This will show exactly which file is opened when, and even the file descriptors.

You can also use

ulimit -a

to see which limits are currently in effect.

Edit

gdb --args python2.7 /mnt/grid/src/profile/contra.py
(gdb) break dup
(gdb) run

If that hits too many breakpoints before the ones related to the mapped files, you might want to run it without breakpoints for a while, interrupt it manually (Ctrl+C), and set the breakpoint during 'normal' operation; that is, if you have enough time for that :)

Once it breaks, inspect the call stack with

(gdb) bt
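
The same numbers can also be read from inside the Python process, which makes it easy to log descriptor usage while the loop runs. This is a minimal sketch using the standard resource module; the open_fd_count helper is a name made up for this example and relies on /proc/self/fd, so it assumes Linux.

```python
import os
import resource

# Soft and hard limits on open file descriptors for this process
# (the soft limit is what `ulimit -n` reports in the shell).
soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    # Linux-only assumption: /proc/self/fd has one entry per open descriptor.
    return len(os.listdir("/proc/self/fd"))


in_use = open_fd_count()
print("descriptors in use: %d, soft limit: %d, hard limit: %d"
      % (in_use, soft_limit, hard_limit))
```

Calling open_fd_count() once per loop iteration would show whether the count grows by one or two per file, or stays flat.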