How can I create private class variables in Python using setattr or exec?

Posted on 2024-12-10 07:28:03

I've just run into a situation where pseudo-private class member names aren't getting mangled when using setattr or exec.

In [1]: class T:
   ...:     def __init__(self, **kwargs):
   ...:         self.__x = 1
   ...:         for k, v in kwargs.items():
   ...:             setattr(self, "__%s" % k, v)
   ...:         
In [2]: T(y=2).__dict__
Out[2]: {'_T__x': 1, '__y': 2}

I've tried exec("self.__%s = %s" % (k, v)) as well with the same result:

In [1]: class T:
   ...:     def __init__(self, **kwargs):
   ...:         self.__x = 1
   ...:         for k, v in kwargs.items():
   ...:             exec("self.__%s = %s" % (k, v))
   ...:         
In [2]: T(z=3).__dict__
Out[2]: {'_T__x': 1, '__z': 3}

Doing self.__dict__["_%s__%s" % (self.__class__.__name__, k)] = v would work, but __dict__ is a readonly attribute.

Is there another way that I can dynamically create these pseudo-private class members (without hard-coding in the name mangling)?

A better way to phrase my question:

What does Python do "under the hood" when it encounters a double-underscore attribute (self.__x) being set? Is there a magic function that is used to do the mangling?

帅气尐潴 2024-12-17 07:28:03

I believe Python does private attribute mangling during compilation. Specifically, it happens at the stage where the source has just been parsed into an abstract syntax tree and is being rendered to bytecode. That is the only point at which Python knows the name of the class within whose (lexical) scope the function is defined. It then mangles pseudo-private attributes and variables, and leaves everything else unchanged. This has a couple of implications...

String constants in particular are not mangled, which is why your setattr(self, "__X", x) is being left alone.
Since mangling relies on the lexical scope of the function within the source, functions defined outside of the class and then "inserted" do not have any mangling done, since the information about the class they "belong to" was not known at compile-time.
As far as I know, there isn't an easy way to determine (at runtime) what class a function was defined in... At least not without a lot of inspect calls that rely on source reflection to compare line numbers between the function and class sources. Even that approach isn't 100% reliable; there are edge cases that can cause erroneous results.
The process is actually rather indelicate about the mangling - if you try to access the __X attribute on an object that isn't an instance of the class the function is lexically defined within, it'll still mangle it for that class... letting you store private class attrs in instances of other objects! (I'd almost argue this last point is a feature, not a bug)
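
The points above can be seen in a short session. This is only a sketch, assuming Python 3; the names Box, Other, stash, and injected are made up for illustration and do not come from the question.

```python
class Box:
    def __init__(self):
        self.__x = 1                # compiled as self._Box__x

    def stash(self, other):
        # Mangled for Box even though `other` is not a Box instance.
        other.__x = 2               # compiled as other._Box__x


class Other:
    pass


def injected(self):
    # Defined outside any class body, so the compiler leaves __y alone.
    self.__y = 3


Box.injected = injected             # "inserted" after the fact

box, other = Box(), Other()
box.stash(other)
box.injected()
setattr(box, "__z", 4)              # string constants are never mangled

print(sorted(vars(box)))            # ['_Box__x', '__y', '__z']
print(vars(other))                  # {'_Box__x': 2}
```

Only the attribute written inside the class body picks up the _Box prefix; the injected function and the setattr call store their names exactly as written.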

So the variable mangling is going to have to be done manually, so that you calculate what the mangled attr should be in order to call setattr.

Regarding the mangling itself, it's done by the _Py_Mangle function, which uses the following logic:

__X gets an underscore and the class name prepended. E.g. if the class is Test, the mangled attr is _Test__X.
Leading underscores in the class name are stripped first. E.g. if the class is __Test, the mangled attr is still _Test__X. If the class name consists only of underscores, no mangling is done at all.
Trailing underscores in a class name are not stripped.
Names that end with two underscores (such as __X__) or that contain a dot are left unchanged.

To wrap this all up in a function...

def mangle_attr(source, attr):
    # return public attrs unchanged
    if not attr.startswith("__") or attr.endswith("__") or '.' in attr:
        return attr
    # if source is an object, get the class
    if not hasattr(source, "__bases__"):
        source = source.__class__
    # strip leading underscores from the class name
    cls_name = source.__name__.lstrip("_")
    # a class name made only of underscores disables mangling
    if not cls_name:
        return attr
    # mangle attr
    return "_%s%s" % (cls_name, attr)

I know this somewhat "hardcodes" the name mangling, but it is at least isolated to a single function. It can then be used to mangle strings for setattr:

# you should then be able to use this w/in the code...
setattr(self, mangle_attr(self, "__X"), value)

# note that would set the private attr for type(self),
# if you wanted to set the private attr of a specific class,
# you'd have to choose it explicitly...
setattr(self, mangle_attr(somecls, "__X"), value)
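
Applied to the class from the question, the manual approach might look like the sketch below. It assumes Python 3, repeats a small version of the helper so the example runs on its own, and adds a hypothetical get_y method and Sub subclass purely to show that the stored name matches what compiled code inside T looks up.

```python
def mangle_attr(source, attr):
    """Return attr mangled the way the compiler would inside class `source`."""
    if not attr.startswith("__") or attr.endswith("__") or "." in attr:
        return attr                      # not a private name
    if not isinstance(source, type):
        source = type(source)            # an instance was passed
    cls_name = source.__name__.lstrip("_")
    if not cls_name:
        return attr                      # class name is only underscores
    return "_%s%s" % (cls_name, attr)


class T:
    def __init__(self, **kwargs):
        self.__x = 1
        for k, v in kwargs.items():
            # Pass T explicitly so subclasses still get T's prefix.
            setattr(self, mangle_attr(T, "__%s" % k), v)

    def get_y(self):
        return self.__y                  # compiled as self._T__y


class Sub(T):
    pass


t = T(y=2)
print(vars(t))                           # {'_T__x': 1, '_T__y': 2}
print(t.get_y())                         # 2
print(Sub(y=5).get_y())                  # 5
```

Passing T rather than self is a design choice here: with self, an instance of Sub would get the prefix _Sub, and get_y (compiled inside T) would not find the attribute.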

Alternately, the following mangle_attr implementation uses an eval so that it always uses Python's current mangling logic (though I don't think the logic laid out above has ever changed)...

_mangle_template = """
class {cls}:
    @staticmethod
    def mangle():
        {attr} = 1
cls = {cls}
"""

def mangle_attr(source, attr):
    # if source is an object, get the class
    if not hasattr(source, "__bases__"):
        source = source.__class__
    # mangle attr
    tmp = {}
    code = _mangle_template.format(cls=source.__name__, attr=attr)
    eval(compile(code, '', 'exec'), {}, tmp)
    return tmp['cls'].mangle.__code__.co_varnames[0]

# NOTE: the '__code__' attr above needs to be 'func_code' for python 2.5 and older

十级心震 2024-12-17 07:28:03

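Addressing this:

"What does Python do 'under the hood' when it encounters a double underscore (self.__x) attribute being set? Is there a magic function that is used to do the mangling?"

AFAIK, it's basically special-cased in the compiler. Once the code is in bytecode, the name has already been mangled; the interpreter never sees the unmangled name at all and has no idea that any special handling was needed. This is why references made through setattr, exec, or by looking up a string in __dict__ don't work: the compiler sees all of those as plain strings, doesn't know they have anything to do with attribute access, and passes them through unchanged. The interpreter knows nothing about name mangling, so it simply uses them as they are.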
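
One way to see this is to inspect the compiled code objects. The sketch below assumes CPython 3; the method names set_x and set_via_string are invented for the illustration.

```python
class T:
    def set_x(self):
        self.__x = 1                 # attribute access: mangled at compile time

    def set_via_string(self):
        setattr(self, "__x", 1)      # plain string constant: left untouched


direct = T.set_x.__code__
via_string = T.set_via_string.__code__

print("_T__x" in direct.co_names)        # True
print("__x" in direct.co_names)          # False
print("__x" in via_string.co_consts)     # True
print("_T__x" in via_string.co_consts)   # False
```

The bytecode for set_x only ever contains _T__x, while set_via_string carries the literal "__x" as a constant, which is exactly what setattr then stores.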

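The times I've needed to get around this, I've just done the same name mangling by hand, hacky as that is. I've found that using these "private" names is generally a bad idea unless you know you need them for their intended purpose: letting every class in an inheritance hierarchy use the same attribute name while each class keeps its own copy. Adding double underscores to attribute names just because they are supposed to be private implementation details seems to do more harm than good; I've taken to using a single underscore as a hint that external code shouldn't touch them.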
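
The intended purpose mentioned above can be illustrated with a small hierarchy. This is a sketch assuming Python 3; Base, Child, and the __count attribute are hypothetical names.

```python
class Base:
    def __init__(self):
        self.__count = 0             # stored as _Base__count

    def base_count(self):
        return self.__count          # reads _Base__count


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__count = 10            # stored as _Child__count, no clash

    def child_count(self):
        return self.__count          # reads _Child__count


c = Child()
print(c.base_count(), c.child_count())   # 0 10
print(sorted(vars(c)))                   # ['_Base__count', '_Child__count']
```

Both classes write what looks like the same attribute, yet the instance ends up with one copy per class, so neither can accidentally overwrite the other's state.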

木緿 2024-12-17 07:28:03

Here's the hack I have so far. Suggestions for improvement are welcome.

class T(object):

    def __init__(self, **kwds):
        for k, v in kwds.items():
            d = {}
            cls_name = self.__class__.__name__

            eval(compile(
                'class dummy: pass\n'
                'class {0}: __{1} = 0'.format(cls_name, k), '', 'exec'), d)

            d1, d2 = d['dummy'].__dict__, d[cls_name].__dict__
            k = next(k for k in d2 if k not in d1)

            setattr(self, k, v)

>>> t = T(x=1, y=2, z=3)
>>> t._T__x, t._T__y, t._T__z
(1, 2, 3)
