Is there a Python module for handling Python object addresses?

Posted on 2024-09-18 12:34:20

(When I say "object address", I mean the string that you type in Python to access an object. For example 'life.State.step'. Most of the time, all the objects before the last dot will be packages/modules, but in some cases they can be classes or other objects.)

In my Python project I often have the need to play around with object addresses. Some tasks that I have to do:

  1. Given an object, get its address.
  2. Given an address, get the object, importing any needed modules on the way.
  3. Shorten an object's address by getting rid of redundant intermediate modules. (For example, 'life.life.State.step' may be the official address of an object, but if 'life.State.step' points at the same object, I'd want to use it instead because it's shorter.)
  4. Shorten an object's address by "rooting" a specified module. (For example, 'garlicsim_lib.simpacks.prisoner.prisoner.State.step' may be the official address of an object, but I assume that the user knows where the prisoner package is, so I'd want to use 'prisoner.prisoner.State.step' as the address.)

Is there a module/framework that handles things like that? I wrote a few utility modules to do these things, but if someone has already written a more mature module that does this, I'd prefer to use that.

One note: Please, don't try to show me a quick implementation of these things. It's more complicated than it seems, there are plenty of gotchas, and any quick-n-dirty code will probably fail for many important cases. These kind of tasks call for battle-tested code.

UPDATE: When I say "object", I mostly mean classes, modules, functions, methods, stuff like these. Sorry for not making this clear before.


Comments (4)

长不大的小祸害 2024-09-25 12:34:20

Short answer: No. What you want is impossible.

The long answer is that what you think of as the "address" of an object is anything but. life.State.step is merely one of the ways to get a reference to the object at that particular time. The same "address" at a later point can give you a different object, or it could be an error. What's more, this "address" of yours depends on the context. In life.State.step, the end object depends not just on what life.State and life.State.step are, but what object the name life refers to in that namespace.
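A small sketch of that point, using a stand-in module object. The names life, State and OtherState here are hypothetical, built only for the illustration; they are not the asker's real package.

```python
import types

# A stand-in for a module called "life" (hypothetical, for illustration only).
life = types.ModuleType("life")


class State:
    def step(self):
        return "original"


class OtherState:
    def step(self):
        return "replacement"


life.State = State
first = eval("life.State.step", {"life": life})

# Rebinding an attribute changes what the very same expression evaluates to.
life.State = OtherState
second = eval("life.State.step", {"life": life})

# The result also depends on what the name "life" is bound to in the namespace.
other_life = types.SimpleNamespace(State=State)
third = eval("life.State.step", {"life": other_life})

print(first is second)  # False: same string, different object
print(first is third)   # True: different "life", same object
```

The string stays the same in all three lookups; only the bindings around it change.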

Specific answers to your requests:

  1. The end object has no way whatsoever of finding out how you referred to it, and neither has any code that you give the object to. The "address" is not a name, it's not tied to the object, it's merely an arbitrary Python expression that results in an object reference (as all expressions do.) You can only make this work, barely, with specific objects that aren't expected to move around, such as classes and modules. Even so, those objects can move around, and frequently do move around, so what you attempt is likely to break.

  2. As mentioned, the "address" depends on many things, but this part is fairly easy: __import__() and getattr() can give you these things. They will, however, be extremely fragile, especially when there's more involved than just attribute access. It can only remotely work with things that are in modules.

  3. "Shortening" the name requires examining every possible name, meaning all modules and all local names, and all attributes of them, recursively. It would be a very slow and time-consuming process, and extremely fragile in the face of anything with a __getattr__ or __getattribute__ method, or with properties that do more than return a value.

  4. is the same thing as 3.
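For point 2, a rough sketch of the import-then-getattr idea might look like the following. It uses importlib.import_module rather than calling __import__() directly, and the helper name resolve is made up for this example. It assumes the address is a plain dotted chain of attribute accesses, and it shares the fragility described above.

```python
import importlib


def resolve(address):
    """Return the object a dotted address names, importing modules on the way."""
    parts = address.split(".")
    # Try the longest importable prefix first, then walk the rest with getattr.
    for cut in range(len(parts), 0, -1):
        module_name = ".".join(parts[:cut])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            # Caveat: this also hides genuine import failures inside a module.
            continue
        for attribute in parts[cut:]:
            obj = getattr(obj, attribute)
        return obj
    raise ImportError("no importable prefix in %r" % (address,))


print(resolve("json.decoder.JSONDecoder").__name__)  # JSONDecoder
print(resolve("collections.abc.Mapping").__name__)   # Mapping
```

Catching ImportError this broadly is one of the traps: a module that exists but fails while importing looks the same as a module that does not exist.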

灵芸 2024-09-25 12:34:20

I released the address_tools module which does exactly what I asked for.

Here is the code. Here are the tests.

It's part of GarlicSim, so you can use it by installing garlicsim and doing from garlicsim.general_misc import address_tools. Its main functions are describe and resolve, which are parallel to repr and eval. The docstrings explain everything about how these functions work.

There is even a Python 3 version on the Python 3 fork of GarlicSim. Install it if you want to use address_tools on Python 3 code.

旧时浪漫 2024-09-25 12:34:20

For points 3 and 4, I guess that you are looking for facilities like

from life import life  # life represents life.life
from garlicsim_lib.simpacks import prisoner

However, this is not recommended, as it makes it harder for you or people who read your code to quickly know what prisoner represents (which module does it come from? you have to look at the beginning of the code to get this information).

For point 1, you can do:

from uncertainties import UFloat

print(UFloat.__module__)  # prints 'uncertainties'

import sys
module_of_UFloat = sys.modules[UFloat.__module__]
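On Python 3.3 and later, the same idea extends to nested names by combining __module__ with __qualname__. This is only a sketch: the helper name dotted_name is made up here, and it returns the name under which the object was defined, which is not necessarily the shortest one.

```python
import json
import types


def dotted_name(obj):
    """Best-effort dotted name for a module, class, or function."""
    if isinstance(obj, types.ModuleType):
        return obj.__name__
    module_name = getattr(obj, "__module__", None)
    qualified_name = getattr(obj, "__qualname__", None)
    if not module_name or not qualified_name:
        raise ValueError("cannot name %r" % (obj,))
    return module_name + "." + qualified_name


print(dotted_name(json))                     # json
print(dotted_name(json.JSONDecoder))         # json.decoder.JSONDecoder
print(dotted_name(json.JSONDecoder.decode))  # json.decoder.JSONDecoder.decode
```

Note that json.JSONDecoder is reported under its defining module json.decoder, and that functions defined inside other functions get a '<locals>' component that cannot be looked up again.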

For point 2, given the string 'garlicsim_lib.simpacks.prisoner', you can get the object it refers to with:

obj = eval('garlicsim_lib.simpacks.prisoner')

This supposes that you have imported the module with

import garlicsim_lib  # or garlicsim_lib.simpacks

If you even want this to be automatic, you can do something along the lines of

import imp

module_name = address_string.split('.', 1)[0]
mod_info = imp.find_module(module_name)
try:
    imp.load_module(module_name, *mod_info)
finally:
    # Proper closing of the module file:
    if mod_info[0] is not None:
        mod_info[0].close()

This works only in the simplest cases (garlicsim_lib.simpacks needs to be available in garlicsim_lib, for instance).

Coding things this way is, however, highly unusual.
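On newer interpreters the imp module is deprecated (and removed in Python 3.12), and the standard library has a helper for this step. A minimal sketch, assuming Python 3.9 or later where pkgutil.resolve_name is available:

```python
import os.path
import pkgutil

# Dotted form: modules are imported as far as possible, the rest is getattr.
join = pkgutil.resolve_name("os.path.join")
print(join is os.path.join)  # True

# Colon form: the module part and the attribute part are stated explicitly.
decoder_class = pkgutil.resolve_name("json.decoder:JSONDecoder")
print(decoder_class.__name__)  # JSONDecoder
```

The colon form removes the guesswork about where the module path ends and the attribute path begins.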

太阳男子 2024-09-25 12:34:20

Twisted has #2 as twisted/python/reflect.py. You need something like it for making a string-based configuration system, like with Django's urls.py configuration.

Take a look at the code and the version control log to see what they had to do to make it work - and fail! - the right way.

The other things you are looking for place enough restrictions on the Python environment that there is no such thing as a general purpose solution.

Here's something which somewhat implements your #1

>>> import math
>>> import pickle
>>> def identify(f):
...   name = f.__name__
...   module_name = pickle.whichmodule(f, name)
...   return module_name + "." + name
... 
>>> identify(math.cos)
'math.cos'
>>> from xml.sax.saxutils import handler
>>> identify(handler)
'__main__.xml.sax.handler'
>>> 

Your #3 is underdefined. If I do

__builtin__.step = path.to.your.step

then should the search code find it as "step"?

The simplest implementation I can think of simply searches all modules and looks for top-level elements which are exactly what you want

>>> import sys
>>> def _find_paths(x):
...   for module_name, module in sys.modules.items():
...     if module is None:
...         continue
...     for (member_name, obj) in module.__dict__.items():
...       if obj is x:
...         yield module_name + "." + member_name
... 
>>> def find_shortest_name_to_object(x):
...   return min( (len(name), name) for name in _find_paths(x) )[1]
... 
>>> find_shortest_name_to_object(handler)
'__builtin__._'
>>> 5
5
>>> find_shortest_name_to_object(handler)
'xml.sax.handler'
>>> 

Here you can see that 'handler' was actually in _ from the previous expression return, making it the shortest name.

If you want something else, like recursively searching all members of all modules, then just code it up. But as the "_" example shows, there will be surprises. Plus, this isn't stable, since importing another module might make another object path available and shorter.
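A Python 3 rendering of the same top-level scan, as a sketch, shows how one object can be reachable under several names. The helper name find_paths is made up here, and the result depends entirely on which modules happen to be imported when it runs.

```python
import json
import sys


def find_paths(target):
    """Yield every 'module.name' in sys.modules whose value is target."""
    # Snapshot the items so the dicts cannot change size while we iterate.
    for module_name, module in list(sys.modules.items()):
        namespace = getattr(module, "__dict__", None)
        if not isinstance(namespace, dict):
            continue
        for member_name, value in list(namespace.items()):
            if value is target:
                yield module_name + "." + member_name


# Shortest candidates first; ties are broken alphabetically.
paths = sorted(find_paths(json.JSONDecoder), key=lambda p: (len(p), p))
print(paths)  # includes 'json.JSONDecoder' and 'json.decoder.JSONDecoder'
```

Any module that does from json import JSONDecoder adds one more entry, so the "shortest" answer can change from one run to the next.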

That's why people say over and over again that what you want isn't actually useful for anything, and that's why there are no modules for it.

And as for your #4, how in the world will any general package cater to those naming needs?

In any case, you wrote

Please, don't try to show me a quick
implementation of these things. It's
more complicated than it seems, there
are plenty of gotchas, and any
quick-n-dirty code will probably fail
for many important cases. These kind
of tasks call for battle-tested code.

so don't think of my examples as solutions but as examples of why what you're asking for makes little sense. It's such a fragile solution space, and the few who venture there (mostly out of curiosity) have such different concerns, that a one-off custom solution is the best thing. A module for most of these makes no sense, and if it did make sense the explanation of what the module does would probably be longer than the code.

And hence the answer to your question is "no, there are no such modules."

What makes your question even more confusing is that the C implementation of Python already defines an "object address". The docs for id() say:

CPython implementation detail: This is the address of the object.

What you're looking for is the name, or the path to the object. Not the "Python object address."
