Python causing: IOError: [Errno 28] No space left on device: '../results/32766.html' on a disk with lots of space

I am running a Python script that is causing the above error. The unusual thing is that the same script runs on a different machine with no problems.

The difference is that on the machine causing the problems I am writing to an external hard drive. To make things even weirder, this script has already run on the problem machine and written over 30,000 files.

Some relevant information (The code that is causing the error):

nPage = 0
while nPage != -1:
    for d in data:
        if len(d.contents) > 1:
            if '<script' in str(d.contents):
                l = str(d.contents[1])
                start = l.find('http://')
                end = l.find('>',start)
                out = get_records.openURL(l[start:end])
                print COUNT

                with open('../results/'+str(COUNT)+'.html','w') as f:
                    f.write(out)
                COUNT += 1

    nPage = nextPage(mOut,False)

The directory I'm writing to:

10:32@lorax:~/econ/estc/bin$ ll ../
total 56
drwxr-xr-x 3 boincuser boincuser  4096 2011-07-31 14:29 ./
drwxr-xr-x 3 boincuser boincuser  4096 2011-07-31 14:20 ../
drwxr-xr-x 2 boincuser boincuser  4096 2011-08-09 10:38 bin/
lrwxrwxrwx 1 boincuser boincuser    47 2011-07-31 14:21 results -> /media/cavalry/server_backup/econ/estc/results//
-rw-r--r-- 1 boincuser boincuser 44759 2011-08-09 10:32 test.html

Proof there is enough space:

10:38@lorax:~/econ/estc/bin$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.0G  5.3G  3.3G  63% /
none                  495M  348K  495M   1% /dev
none                  500M  164K  500M   1% /dev/shm
none                  500M  340K  500M   1% /var/run
none                  500M     0  500M   0% /var/lock
none                  9.0G  5.3G  3.3G  63% /var/lib/ureadahead/debugfs
/dev/sdc10            466G  223G  244G  48% /media/cavalry

Some things I have tried:

  • Changing the write path to the direct location instead of going through the symlink
  • Rebooting the machine
  • Unmounting and re-mounting the drive

10 Answers

惟欲睡 2024-12-05 19:48:29

The ENOSPC ("No space left on device") error will be triggered in any situation in which the data or the metadata associated with an I/O operation can't be written down anywhere because of lack of space. This doesn't always mean disk space – it could mean physical disk space, logical space (e.g. maximum file length), space in a certain data structure or address space. For example you can get it if there isn't space in the directory table (vfat) or there aren't any inodes left. It roughly means “I can't find where to write this down”.

Particularly in Python, this can happen on any write I/O operation. It can happen during f.write, but it can also happen on open, on f.flush and even on f.close. Where it happens provides a vital clue to why: if it happens on open, there wasn't enough space to write the metadata for the entry; if it happens during f.write, f.flush or f.close, there wasn't enough disk space left, or you've exceeded the maximum file size.
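
For instance, here is a minimal sketch of how the failure point can be observed (assuming Python 3, where IOError is an alias of OSError; save_page and its arguments are illustrative, not from the question):

import errno

def save_page(path, out):
    # ENOSPC can surface at open() when there is no room for the new
    # directory entry, or at write()/flush()/close() when there is no
    # room left for the data blocks.
    try:
        with open(path, 'w') as f:
            f.write(out)
    except OSError as e:
        if e.errno == errno.ENOSPC:
            print('no space (or directory entries) left while writing', path)
        raise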

If the filesystem in the given directory is vfat, you'd hit the maximum file limit at about the same point that you did. The limit is supposed to be 2^16 directory entries, but if I recall correctly some other factors can affect it (e.g. some files require more than one entry).

It would be best to avoid creating so many files in a directory. Few filesystems handle so many directory entries with ease. Unless you're certain that your filesystem deals well with many files in a directory, you can consider another strategy (e.g. create more directories).
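
For example, here is a minimal sketch of the create-more-directories strategy (sharded_path and the bucket size of 1000 are illustrative, not from the question):

import os

def sharded_path(base, count, bucket=1000):
    # Returns e.g. '../results/032/32766.html' so that no single
    # directory accumulates more than `bucket` files.
    subdir = os.path.join(base, '%03d' % (count // bucket))
    if not os.path.isdir(subdir):
        os.makedirs(subdir)
    return os.path.join(subdir, '%d.html' % count)

# In the question's loop this would replace the open() call:
#     with open(sharded_path('../results', COUNT), 'w') as f:
#         f.write(out)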

P.S. Also do not trust the remaining disk space – some file systems reserve some space for root and others miscalculate the free space and give you a number that just isn't true.

故人的歌 2024-12-05 19:48:29

Try to delete the temp files:

rm -r /tmp/

不再见 2024-12-05 19:48:29

It turns out the best solution for me here was to just reformat the drive. Once reformatted all these problems were no longer problems.

冷了相思 2024-12-05 19:48:29

  1. Show where disk space is being used: sudo du -x -h / | sort -h | tail -40
  2. Delete from your /tmp or /home/user_name/.cache folder if these are taking up a lot of space. You can do this by running sudo rm -R /path/to/folder

Step 2 outlines fairly common folders to delete from (/tmp and /home/user_name/.cache). If the first command shows lots of space being used elsewhere, I advise being a bit more cautious when deleting from those locations.

意中人 2024-12-05 19:48:29

In my case, running df -i showed that my inodes were full, and I had to delete some of the small files or folders. Once the inodes are exhausted, the filesystem will not let you create any new files or folders, even if plenty of disk space remains.

All you have to do is delete files or folders that do not take up much space but are responsible for filling the inodes.
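
To find the offenders, you can count directory entries per subtree, since every file and directory consumes an inode (a minimal sketch; the base path is a placeholder):

import os

def count_entries(root):
    # Total number of files and directories under root.
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total

base = '/home'  # placeholder: point this at the suspect filesystem
for name in sorted(os.listdir(base)):
    path = os.path.join(base, name)
    if os.path.isdir(path):
        print(count_entries(path), path)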

勿忘初心 2024-12-05 19:48:29

Run export TMPDIR=/someDir, where /someDir is a valid directory other than /tmp, at the prompt before running your Python command. In my case it was pip install rasa[spacy], which was failing earlier. (Python's tempfile module and pip read TMPDIR, not TEMPDIR.)

The export command lets you temporarily use the specified directory as the temporary directory.
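
For reference, here is a quick way to check that Python picks the directory up from TMPDIR (a sketch; /someDir is a placeholder and must already exist and be writable):

import os
import tempfile

os.environ['TMPDIR'] = '/someDir'
tempfile.tempdir = None        # clear the cached value so that
                               # gettempdir() re-reads the environment
print(tempfile.gettempdir())   # '/someDir', if it exists and is writable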

月亮坠入山谷 2024-12-05 19:48:29

Minimal reproducible error example with too many files in a single directory

The approximate maximum number of files per directory for different filesystems can be seen at: How many files can I put in a directory?

The following example eventually blows up, giving us a concrete estimate of the size limit:

#!/usr/bin/env python

import os
import shutil

tmpdir = 'tmp'
if os.path.isdir(tmpdir):
    shutil.rmtree(tmpdir)
os.mkdir(tmpdir)
for i in range(10000000):
    print(i)
    with open(os.path.join(tmpdir, f'{i:064}'), 'w') as f:
        pass

and always blows up on my test system at:

5590508
Traceback (most recent call last):
  File "/home/ciro/test/tmp/./main.py", line 12, in <module>
    with open(os.path.join(tmpdir, f'{i:064}'), 'w') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 28] No space left on device: 'tmp/0000000000000000000000000000000000000000000000000000000005590508'

so near 5.6M files.

But also note that the length of the file names in the directory matters a lot. If you reduce them from 64 bytes, e.g. with:

    with open(os.path.join(tmpdir, f'{i}'), 'w') as f:

you would be able to store far more files in the directory: it went up to 20M files and then I lost patience. So you can't easily put a single specific value on the limit. I first came across this issue exactly when storing a large number of files with long basenames, because the basenames were Bitcoin transaction IDs, which are 64 hex characters each.

Tested on Python 3.11.6, Ubuntu 23.10, Linux kernel 6.5.0 on an ext4 filesystem with:

df -hT

containing:

Filesystem                        Type      Size  Used Avail Use% Mounted on
/dev/mapper/ubuntu--vg-ubuntu--lv ext4      1.9T  1.5T  246G  87% /

and:

sudo file -L -s /dev/mapper/ubuntu--vg-ubuntu--lv

giving ext4, and:

sudo dumpe2fs /dev/mapper/ubuntu--vg-ubuntu--lv | grep 'Filesystem features'

giving (dumpe2fs 1.47.0 (5-Feb-2023)):

Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg sparse_super large_file huge_file dir_nlink extra_isize metadata_csum

so no large_dir.

Note that the total size of the file names themselves in the above example is 64 B * 10000000 = 640 MB, so there was more than enough space on the disk for them.

All of this makes part of me want to just use one big indexed SQLite file everywhere. 5 million is a small enough number for a modern computer!

The underlying system call that fails is open itself, which creates the file, as opposed to write, which writes data into a file and would blow up instead if you ran out of total disk space. man 2 open documents:

ERRORS

ENOSPC pathname was to be created but the device containing pathname has no room for the new file.

and we can confirm that the cryptic "28" is ENOSPC by running, as per https://unix.stackexchange.com/questions/326766/what-are-the-standard-error-codes-in-linux:

errno -ls

which contains:

ENOSPC 28 No space left on device

On the Linux kernel it is at: https://github.com/torvalds/linux/blob/a4145ce1e7bc247fd6f2846e8699473448717b37/include/uapi/asm-generic/errno-base.h#L32
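
The same mapping is visible from Python's standard errno module (a quick sanity check on Linux):

import errno
import os

print(errno.ENOSPC)               # 28
print(errno.errorcode[28])        # 'ENOSPC'
print(os.strerror(errno.ENOSPC))  # 'No space left on device'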

甜警司 2024-12-05 19:48:29

I faced a similar issue. The above solutions for deleting the /tmp directory worked for me.

Instead of using the default /tmp location, where the service account might not have full access (if following best practices and not using sudo to install Python packages), I moved the temp directory into the user's home directory via the TMPDIR environment setting, which is honored by the pip install --user ... command.

I faced the running-out-of-space issue above most likely, as mentioned in the answers above, because so many files/directories were being created, not because volume storage was actually exhausted. The solution that worked for me was to delete the /home/<some_domain>/$USER/tmp directory and recreate it every time my continuous deployment pipeline ran.

rm -rf /tmp

还在原地等你 2024-12-05 19:48:29

I ran into this problem when running tests. The Django app I was running was inside a Docker container, so I had to log in to the container as root and run rm -r /tmp/, which fixed the issue.

┾廆蒐ゝ 2024-12-05 19:48:29

I had the same problem, but eventually found that the root cause was printing too frequently. My code had some print(...) calls running in a loop, and when I commented them out the "No space left on device" error disappeared. Perhaps it is a Python implementation problem. You may try this solution.

In more detail, the code looked roughly like the following:

loop:
    do some logic 
    call "print" to check result

When your logic runs very fast, you end up calling print very frequently, and then the "[Errno 28] No space left on device" error sometimes appears.

I think this is an implementation limitation of Python's print, though I have not read its code yet. My Python version is 3.9.7.

Thanks to the community for the replies.
