Are Python docstrings and comments stored in memory when a module is loaded?

Posted on 2024-08-16 14:48:08

Are Python docstrings and comments stored in memory when a module is loaded?

I've wondered if this is true, because I usually document my code well; could this affect memory usage?

Usually every Python object has a __doc__ attribute. Are those docstrings read from the file, or processed otherwise?

I've searched here in the forums, on Google, and in mailing lists, but I haven't found any relevant information.

Does anyone know more?

Comments (5)

风为裳 2024-08-23 14:48:08

By default, docstrings are present in the .pyc bytecode file, and are loaded from it (comments are not). If you use python -OO (the -OO flag stands for "optimize intensely", as opposed to -O, which stands for "optimize mildly"), you get and use .pyo files instead of .pyc files, and those are optimized by omitting the docstrings (in addition to the optimization done by -O, which removes assert statements). E.g., consider a file foo.py that has:

"""This is the documentation for my module foo."""

def bar(x):
  """This is the documentation for my function foo.bar."""
  return x + 1

You could have the following shell session:

$ python -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyc
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyc
$ python -O -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyo
$ python -OO -c'import foo; print foo.bar(22); print foo.__doc__'
23
This is the documentation for my module foo.
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  327 Dec 30 16:17 foo.pyo
$ rm foo.pyo
$ python -OO -c'import foo; print foo.bar(22); print foo.__doc__'
23
None
$ ls -l foo.pyo
-rw-r--r--  1 aleax  eng  204 Dec 30 16:17 foo.pyo

Note that, since we used -O first, the .pyo file was 327 bytes, and it stayed that size even after using -OO, because the .pyo file was still around and Python didn't rebuild or overwrite it; it just used the existing one. Removing the existing .pyo (or, equivalently, running touch foo.py so that Python knows the .pyo is "out of date") means that Python rebuilds it (and, in this case, saves 123 bytes on disk, and a little bit more when the module is imported, but all .__doc__ entries disappear and are replaced by None).
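The session above is from Python 2. Since Python 3.5 (PEP 488) there are no .pyo files; the optimized bytecode is cached as .opt-1.pyc and .opt-2.pyc files under __pycache__ instead. The effect of -OO on docstrings can be reproduced without touching the cache by passing optimize=2 to the built-in compile(). The following is a minimal sketch under that assumption; the helper name load and the in-memory SOURCE string are made up for illustration.

```python
import marshal

# Same content as foo.py above, kept in memory so no files are written.
SOURCE = '''"""This is the documentation for my module foo."""

def bar(x):
    """This is the documentation for my function foo.bar."""
    return x + 1
'''


def load(optimize):
    """Compile SOURCE at the given optimization level and execute it.

    optimize=0 behaves like plain `python`, optimize=2 like `python -OO`.
    """
    code = compile(SOURCE, "foo.py", "exec", optimize=optimize)
    namespace = {}
    exec(code, namespace)
    return code, namespace


code_default, ns_default = load(0)
code_stripped, ns_stripped = load(2)

# With docstrings kept, both the module and the function expose them.
print(ns_default.get("__doc__"))
print(ns_default["bar"].__doc__)

# With optimize=2 the docstrings are gone and the lookups give None.
print(ns_stripped.get("__doc__"))
print(ns_stripped["bar"].__doc__)

# The marshalled code object (the payload of a .pyc) shrinks accordingly.
size_default = len(marshal.dumps(code_default))
size_stripped = len(marshal.dumps(code_stripped))
print(size_default, size_stripped)
```

The exact byte counts depend on the Python version, so they will not match the 327 and 204 shown in the original session.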

独留℉清风醉 2024-08-23 14:48:08

Yes, the docstrings are read from the file, but that shouldn't stop you from writing them. Never compromise the readability of code for performance until you have done a performance test and found that the thing you are worried about is in fact the bottleneck that is causing a problem in your program. I would think it extremely unlikely that a docstring will cause any measurable performance impact in any real-world situation.
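If you want a number rather than a guess, the memory held by docstrings can be estimated after import. The sketch below is only a rough measurement: docstring_bytes is a hypothetical helper, it only looks at the module itself, its top-level functions and classes, and the functions defined directly on those classes, and sys.getsizeof reports the size of each str object only. The json module is used purely as an example target.

```python
import json
import sys
import types


def docstring_bytes(module):
    """Return a rough total size, in bytes, of the docstrings in `module`."""
    seen = set()  # ids of docstrings already counted, to avoid double counting
    total = 0

    def add(obj):
        nonlocal total
        doc = getattr(obj, "__doc__", None)
        if isinstance(doc, str) and id(doc) not in seen:
            seen.add(id(doc))
            total += sys.getsizeof(doc)

    add(module)
    for obj in vars(module).values():
        if isinstance(obj, (types.FunctionType, type)):
            add(obj)
        if isinstance(obj, type):
            # Also count functions defined directly in the class body.
            for member in vars(obj).values():
                if isinstance(member, types.FunctionType):
                    add(member)
    return total


total = docstring_bytes(json)
print(f"json docstrings: about {total} bytes")
```

Comparing that figure with the total memory of the process gives a sense of scale for a particular program.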

月棠 2024-08-23 14:48:08

They are read from the file (when the file is compiled to .pyc or when the .pyc is loaded, since they must be available under object.__doc__), but no, this will not significantly impact performance under any reasonable circumstances. Or are you really writing multi-megabyte docstrings?

二手情话 2024-08-23 14:48:08

"Are Python docstrings and comments stored in memory when a module is loaded?"

Docstrings are compiled into the .pyc file, and are loaded into memory. Comments are discarded during compilation and have no impact on anything except the insignificant extra time taken to ignore them during compilation (which happens once only after any change to a .py file, except for the main script which is re-compiled every time it is run).

Also note that these strings are preserved only if they are the first thing in the module, class definition, or function definition. You can include additional strings pretty much anywhere, but they will be discarded during compilation just as comments are.
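A small experiment illustrates this. It is a sketch for CPython 3; the function fn and its strings are invented for the example, and optimize=0 is passed explicitly so that the result does not depend on whether the interpreter was started with -OO.

```python
source = '''
def fn():
    """Kept: first statement in the function body."""
    x = 1
    """Discarded: a bare string that is not the first statement."""
    # Discarded: an ordinary comment.
    return x
'''

namespace = {}
exec(compile(source, "<example>", "exec", optimize=0), namespace)
fn = namespace["fn"]

# Only the leading string became the docstring.
print(fn.__doc__)

# Neither the later bare string nor the comment survives as a constant.
leftovers = [c for c in fn.__code__.co_consts
             if isinstance(c, str) and "Discarded" in c]
print(leftovers)
```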

各自安好 2024-08-23 14:48:08

As other answers mentioned, comments are discarded during compilation, but docstrings are stored in the .pyc file and are loaded into memory.

In .pyc files, there are code objects that are serialized with marshal. Although the result is not meant to be human-readable, you can still find things in it. So why not just check that the docstring is indeed in the .pyc data?

import marshal

text = '''def fn():
    """ZZZZZZZZZZZZZZZZZZ"""
    # GGGGGGGGGGGGGGGGGGG'''

code_object = compile(text, '<string>', 'exec')
serialized = marshal.dumps(code_object)
print(serialized)
print(b"ZZZZZZZZZZZZZZZZZZ" in serialized)
print(b"GGGGGGGGGGGGGGGGGGG" in serialized)

Output:

b'\xe3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00@\x00\x00\x00s\x0c\x00\x00\x00d\x00d\x01\x84\x00Z\x00d\x02S\x00)\x03c\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00C\x00\x00\x00s\x04\x00\x00\x00d\x01S\x00)\x02Z\x12ZZZZZZZZZZZZZZZZZZN\xa9\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00\xfa\x08<string>\xda\x02fn\x01\x00\x00\x00s\x02\x00\x00\x00\x00\x01r\x03\x00\x00\x00N)\x01r\x03\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x01\x00\x00\x00r\x02\x00\x00\x00\xda\x08<module>\x01\x00\x00\x00\xf3\x00\x00\x00\x00'
True
False

Where is it referenced in the function's code object? In .co_consts:

new_code_object = marshal.loads(serialized)
print(new_code_object.co_consts[0].co_consts[0])

Output:

ZZZZZZZZZZZZZZZZZZ

def fn():
    """ZZZZZZZZZZZZZZZZZZ"""
    # GGGGGGGGGGGGGGGGGGG

print(fn.__code__.co_consts[0] is fn.__doc__) # True
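The same check can be run against an actual .pyc file on disk rather than the output of marshal.dumps. This is a sketch that assumes Python 3 and uses py_compile.compile with its optimize parameter; the file names demo.py and demo_opt0.pyc / demo_opt2.pyc are chosen only for this example and live in a temporary directory.

```python
import pathlib
import py_compile
import tempfile

SOURCE = 'def fn():\n    """ZZZZZZZZZZZZZZZZZZ"""\n    # GGGGGGGGGGGGGGGGGGG\n'

results = {}
with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "demo.py"
    src.write_text(SOURCE)
    for level in (0, 2):  # 0: like plain python, 2: like python -OO
        target = pathlib.Path(tmp) / f"demo_opt{level}.pyc"
        pyc_path = py_compile.compile(
            str(src), cfile=str(target), optimize=level, doraise=True
        )
        data = pathlib.Path(pyc_path).read_bytes()
        # (docstring present?, comment present?) in the raw .pyc bytes
        results[level] = (b"Z" * 18 in data, b"G" * 19 in data)

print(results)
```

At level 0 the docstring bytes are found in the file and the comment is not; at level 2 neither is found.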