Does reading an entire file leave the file handle open?

Posted 2024-12-04 10:58:14

If you read an entire file with content = open('Path/to/file', 'r').read(), is the file handle left open until the script exits? Is there a more concise way to read an entire file?

注定孤独终老 2024-12-11 10:58:14

The answer to that question depends somewhat on the particular Python implementation.

To understand what this is all about, pay particular attention to the actual file object. In your code, that object is mentioned only once, in an expression, and becomes inaccessible immediately after the read() call returns.

This means that the file object is garbage. The only remaining question is "When will the garbage collector collect the file object?".

In CPython, which uses reference counting, this kind of garbage is noticed immediately, so the file object is collected (and the file closed) right away. This is not generally true of other Python implementations.

A better solution, to make sure that the file is closed, is this pattern:

with open('Path/to/file', 'r') as content_file:
    content = content_file.read()

which will always close the file as soon as the block ends, even if an exception occurs.
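To make the "even if an exception occurs" part concrete, here is a minimal sketch. The helper name closed_after_exception and the simulated ValueError are made up for illustration; the only thing it relies on is the closed attribute of file objects.

```python
def closed_after_exception(path):
    """Return True if the file was closed although the with block raised."""
    handle = None
    try:
        with open(path, 'r') as content_file:
            handle = content_file          # keep a reference for inspection
            content_file.read()
            raise ValueError("simulated failure inside the block")
    except ValueError:
        pass                               # the with statement already ran __exit__
    # The file object still exists, but its underlying handle is closed.
    return handle.closed
```

Calling closed_after_exception on any readable file should report that the handle is closed, regardless of which Python implementation runs it, because the closing is done by the with statement rather than by the garbage collector.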

Edit: To put a finer point on it:

Other than file.__exit__(), which is called "automatically" by the with statement, the only other way file.close() gets called automatically (that is, without you calling it explicitly) is via file.__del__(). This raises the question: when does __del__() get called?

A correctly-written program cannot assume that finalizers will ever run at any point prior to program termination.

-- https://devblogs.microsoft.com/oldnewthing/20100809-00/?p=13203

In particular:

Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
[...]
CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references.

-- https://docs.python.org/3.5/reference/datamodel.html#objects-values-and-types

(Emphasis mine)

But as it suggests, other implementations may behave differently. For example, PyPy has 6 different garbage collection implementations!
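The difference between "collected immediately" and "collected eventually" can be observed with a small experiment. This is only a sketch and its results are specific to CPython 3.4 or later; the Resource class and the finalizer_timing function are hypothetical names, and automatic cycle collection is switched off temporarily so the timing is predictable.

```python
import gc


class Resource:
    """Stand-in for a file object; records when its finalizer runs."""
    finalized = []

    def __init__(self, name):
        self.name = name

    def __del__(self):
        Resource.finalized.append(self.name)


def finalizer_timing():
    was_enabled = gc.isenabled()
    gc.disable()                      # no automatic cycle collection during the demo
    try:
        Resource.finalized.clear()

        Resource("plain")             # unreachable at once, like open(...).read()
        after_plain = list(Resource.finalized)

        cyclic = Resource("cyclic")
        cyclic.me = cyclic            # reference cycle keeps the refcount above zero
        del cyclic
        after_del = list(Resource.finalized)

        gc.collect()                  # the cycle detector finally finds it
        after_collect = list(Resource.finalized)
    finally:
        if was_enabled:
            gc.enable()
    return after_plain, after_del, after_collect
```

On CPython the "plain" object should be finalized right away, while the "cyclic" one should wait for gc.collect(). Other implementations are free to delay both, which is why the with statement is the safer choice.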

滴情不沾 2024-12-11 10:58:14

You can use pathlib.

For Python 3.5 and above:

from pathlib import Path
contents = Path(file_path).read_text()

For older versions of Python use pathlib2:

$ pip install pathlib2

Then:

from pathlib2 import Path
contents = Path(file_path).read_text()

This is the actual read_text implementation:

def read_text(self, encoding=None, errors=None):
    """
    Open the file in text mode, read it, and close the file.
    """
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
        return f.read()
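As the signature above shows, read_text accepts an encoding argument. A minimal sketch of using it, together with the companion read_bytes method, might look like this. The wrapper names read_utf8 and read_raw are invented for illustration, and UTF-8 is an assumption about the file, not something pathlib detects.

```python
from pathlib import Path


def read_utf8(file_path):
    # Pass the encoding explicitly instead of relying on the platform default.
    return Path(file_path).read_text(encoding="utf-8")


def read_raw(file_path):
    # read_bytes opens the file in binary mode, reads it, and closes it.
    return Path(file_path).read_bytes()
```

Both calls open and close the file internally, so no handle is left for the garbage collector to clean up.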

轻拂→两袖风尘 2024-12-11 10:58:14

Well, if you need to read the file line by line and process each line, you can use

with open('Path/to/file', 'r') as f:
    s = f.readline()
    while s:
        # do whatever you want to
        s = f.readline()

Or, even better:

with open('Path/to/file') as f:
    for line in f:
        pass  # do whatever you want to
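As a concrete instance of the loop above, here is a small sketch that counts the non-blank lines of a file without loading it all into memory. The function name count_nonblank_lines is made up for illustration.

```python
def count_nonblank_lines(path):
    """Count lines that contain something other than whitespace."""
    count = 0
    with open(path, 'r') as f:
        for line in f:            # the file object yields one line at a time
            if line.strip():      # an empty string after stripping means a blank line
                count += 1
    return count
```

Because the file object is iterated lazily, this works for files much larger than available memory, and the with block still closes the handle when the loop finishes.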

惜醉颜 2024-12-11 10:58:14

Instead of retrieving the file content as a single string,
it can be handy to store the content as a list of all the lines in the file:

with open('Path/to/file', 'r') as content_file:
    content_list = content_file.read().strip().split("\n")

As can be seen, this just chains .strip().split("\n") onto the read() call from the main answer in this thread.

Here, .strip() removes whitespace and newline characters from both ends of the entire file string,
and .split("\n") produces the actual list by splitting the string at every newline character \n.

Moreover,
this way the entire file content is stored in a variable, which may be preferable in some cases to looping over the file line by line as shown in a previous answer.
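One thing to keep in mind is that .strip() also removes leading whitespace from the first line and drops trailing blank lines. If that is not wanted, str.splitlines() may be a closer fit. The sketch below compares the two; the function names are hypothetical and chosen only for this comparison.

```python
def lines_via_strip_split(path):
    # The approach from this answer: trim both ends, then split on "\n".
    with open(path, 'r') as content_file:
        return content_file.read().strip().split("\n")


def lines_via_splitlines(path):
    # Alternative: keep the content as-is and split at line boundaries.
    with open(path, 'r') as content_file:
        return content_file.read().splitlines()
```

For a file containing "  indented\nsecond\n\n", the first function should return ['indented', 'second'] and the second ['  indented', 'second', '']. Which result is preferable depends on whether leading indentation and blank lines carry meaning in the file.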
