Should I use name mangling in Python?
In other languages, a general guideline that helps produce better code is to always make everything as hidden as possible. If in doubt about whether a variable should be private or protected, it's better to go with private.
Does the same hold true for Python? Should I use two leading underscores on everything at first, and only make them less hidden (only one underscore) as I need them?
If the convention is to use only one underscore, I'd also like to know the rationale.
Here's a comment I left on JBernardo's answer. It explains why I asked this question and also why I'd like to know why Python is different from the other languages:
I come from languages that train you to think everything should be only as public as needed and no more. The reasoning is that this will reduce dependencies and make the code safer to alter. The Python way of doing things in reverse -- starting from public and going towards hidden -- is odd to me.
Comments (11)
First: Why do you want to hide your data? Why is that so important?
Most of the time you don't really want to do it, but you do because others are doing it.
If you really, really, really don't want people using something, add one underscore in front of it. That's it... Pythonistas know that things with one underscore are not guaranteed to work every time and may change without you knowing.
That's the way we live and we're okay with that.
Using two underscores will make your class so bad to subclass that even you will not want to work that way.
The chosen answer does a good job of explaining how properties remove the need for private attributes, but I would also add that functions at the module level remove the need for private methods.
If you turn a method into a function at the module level, you remove the opportunity for subclasses to override it. Moving some functionality to the module level is more Pythonic than trying to hide methods with name mangling.
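As a rough sketch of that idea (the names Histogram, LoudHistogram and _normalize are made up for illustration and are not from the answer), a helper kept at module level is simply not reachable for overriding by a subclass:

```python
import math


def _normalize(values):
    """Module-level helper: not a method, so subclasses cannot override it."""
    total = math.fsum(values)
    return [v / total for v in values]


class Histogram:
    def __init__(self, counts):
        self.counts = list(counts)

    def frequencies(self):
        # Calls the module-level function directly, not a "private" method.
        return _normalize(self.counts)


class LoudHistogram(Histogram):
    # A method with the same name has no effect on Histogram.frequencies,
    # because that method never looks up _normalize on self.
    def _normalize(self, values):
        raise RuntimeError("never called by frequencies()")


print(LoudHistogram([1, 1, 2]).frequencies())  # [0.25, 0.25, 0.5]
```

The leading underscore on the function name still marks it as internal to the module, without any mangling involved.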
The following code snippet explains all the different cases:
No underscore (a)
Printing all valid attributes of the Test object
In the attribute list, the name __a has been changed to _Test__a to prevent this variable from being overridden by any subclass. This concept is known as "name mangling" in Python.
You can still access it through the mangled name, _Test__a.
Similarly, in the case of _a, the underscore just notifies the developer that the variable should be used as an internal variable of that class. The Python interpreter won't do anything even if you access it, but doing so is not good practice.
The variable a can be accessed from anywhere; it is like a public class variable.
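A minimal sketch of the three cases, assuming a class named Test with attributes a, _a and __a as the text describes (the attribute values are placeholders):

```python
class Test:
    def __init__(self):
        self.a = "public"        # no underscore
        self._a = "internal"     # single leading underscore: convention only
        self.__a = "mangled"     # double leading underscore: stored as _Test__a


t = Test()

# Instance attribute names ending in "a", skipping the built-in dunder names.
names = [n for n in dir(t) if n.endswith("a") and not n.startswith("__")]
print(names)            # ['_Test__a', '_a', 'a']

print(t.a)              # public
print(t._a)             # internal (allowed, but discouraged)
print(t._Test__a)       # mangled (reachable through the mangled name)
```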
Hope the answer helped you :)
At first glance it should be the same as for other languages (by "other" I mean Java or C++), but it isn't.
In Java you make private all variables that shouldn't be accessible from outside. In Python you can't achieve this, since there is no "privateness" (as one of the Python principles says, "We're all adults"). So a double underscore only means "Guys, do not use this field directly". A single underscore has the same meaning, and at the same time it doesn't cause any headache when you have to inherit from the class in question (just one example of a problem a double underscore can cause).
So, I'd recommend using a single underscore by default for "private" members.
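To make the inheritance headache concrete, here is a small sketch (Base, Child, _label and __token are hypothetical names chosen for this example):

```python
class Base:
    def __init__(self):
        self.__token = "base"    # stored as _Base__token
        self._label = "base"     # stored as _label


class Child(Base):
    def describe(self):
        # The single-underscore name works as expected in the subclass.
        label = self._label
        try:
            # Inside Child this compiles to self._Child__token, which was never set.
            token = self.__token
        except AttributeError:
            token = None
        return label, token


print(Child().describe())  # ('base', None)
```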
"If in doubt about whether a variable should be private or protected, it's better to go with private." - yes, same holds in Python.
Some answers here talk about 'conventions' but don't give links to those conventions. The authoritative guide for Python, PEP 8, states explicitly:
The distinction between public and private, and name mangling in Python have been considered in other answers. From the same link,
#EXAMPLE PROGRAM FOR Python name mangling
When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not bother about it. Instead of giving that value a double-underscore name, give it a plain name by default.
This is for sure a controversial way of doing things. Python newbies hate it, and even some old Python guys despise this default - but it is the default anyway, so I recommend you follow it, even if you feel uncomfortable.
If you really want to send the message "Can't touch this!" to your users, the usual way is to precede the variable with one underscore. This is just a convention, but people understand it and take double care when dealing with such stuff:
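A minimal sketch of the convention, assuming a hypothetical Stack class whose internal list is named _storage:

```python
class Stack:
    def __init__(self):
        self._storage = []   # leading underscore: "internal, may change"

    def push(self, item):
        self._storage.append(item)

    def pop(self):
        return self._storage.pop()


s = Stack()
s.push(1)
s.push(2)
print(s.pop())        # 2
# Nothing stops outside code from reaching in; the underscore is only a signal.
print(s._storage)     # [1]
```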
This can be useful, too, for avoiding conflict between property names and attribute names:
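For instance (a sketch with a made-up Person class), a property called name can be backed by an attribute called _name, so the two never clash:

```python
class Person:
    def __init__(self, name):
        self._name = name    # backing attribute

    @property
    def name(self):
        # Public read-only view; the underscore keeps the two names distinct.
        return self._name


p = Person("Ada")
print(p.name)  # Ada
```

Had the backing attribute also been called name, assigning to it in __init__ would have gone through the property itself, which has no setter here.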
What about the double underscore? Well, we use the double underscore magic mainly to avoid accidental overriding of methods and name conflicts with superclasses' attributes. It can be pretty valuable if you write a class to be extended many times.
If you want to use it for other purposes, you can, but it is neither usual nor recommended.
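A sketch of that use case, modeled on the Mapping example in the Python tutorial's section on private variables (details here are simplified and should be read as an illustration, not a quotation):

```python
class Mapping:
    def __init__(self, iterable):
        self.items_list = []
        self.__update(iterable)      # always calls Mapping's own update

    def update(self, iterable):
        for item in iterable:
            self.items_list.append(item)

    __update = update                # private copy, stored as _Mapping__update


class MappingSubclass(Mapping):
    def update(self, keys, values):
        # Different signature; Mapping.__init__ keeps working regardless.
        for item in zip(keys, values):
            self.items_list.append(item)


m = MappingSubclass([1, 2])
m.update(["a"], ["b"])
print(m.items_list)  # [1, 2, ('a', 'b')]
```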
EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are many reasons for that - most of them controversial... Let us see some of them.
Python has properties
Today, most OO languages use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes because no one would change the objects' values recklessly.
However, it is not so simple. For example, Java classes have many getters that only get the values and setters that only set the values. You need, let us say, seven lines of code to declare a single attribute - which a Python programmer would say is needlessly complex. Also, you write a lot of code to end up with what is, in practice, a public field, since its value can still be changed through the getters and setters.
So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java because if you decide to add some validation to your attribute, it would require you to change every direct assignment to that attribute in your code to a call to, let us say, a setAge() method.
So in Java (and other languages), the default is to use getters and setters anyway because they can be annoying to write but can spare you much time if you find yourself in the situation I've described.
However, you do not need to do it in Python since Python has properties. If you have a class with a plain age attribute and then decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property. If you can do that and still use person.age = age, why would you add private fields and getters and setters? (Also, see "Python is not Java" and this article about the harms of using getters and setters.)
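A minimal sketch of adding validation through a property. The Person class and the rule "clamp negative ages to 0" are assumptions made for this example, not something the answer specifies:

```python
class Person:
    def __init__(self, age):
        self.age = age           # goes through the setter below

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        # Hypothetical validation rule: clamp negative ages to 0.
        self._age = value if value >= 0 else 0


person = Person(30)
person.age = -5                  # same syntax callers used before validation existed
print(person.age)  # 0
```

Callers keep writing person.age = age; only the class body changed.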
Everything is visible anyway - and trying to hide complicates your work
Even in languages with private attributes, you can access them through some reflection/introspection library. And people do it a lot, in frameworks and for solving urgent needs. The problem is that introspection libraries are just a complicated way of doing what you could do with public attributes.
Since Python is a very dynamic language, adding this burden to your classes is counterproductive.
The problem is not being possible to see - it is being required to see
For a Pythonista, encapsulation is not the inability to see the internals of classes but the possibility of avoiding looking at them. Encapsulation is the property of a component that the user can use without concerning themselves with the internal details. If you can use a component without bothering yourself about its implementation, then it is encapsulated (in the opinion of a Python programmer).
Now, if you wrote a class that you can use without thinking about implementation details, there is no problem if you want to look inside the class for some reason. The point is: your API should be good, and the rest is details.
Guido said so
Well, this is not controversial: he said so, actually. (Look for "open kimono.")
This is culture
Yes, there are some reasons, but no critical reason. This is primarily a cultural aspect of programming in Python. Frankly, it could be the other way, too - but it is not. Also, you could just as easily ask the other way around: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of these languages, and each choice has advantages and disadvantages.
Since there already is this culture, you are well-advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)
First - What is name mangling?
Name mangling is invoked when you are in a class definition and use __any_name or __any_name_, that is, two (or more) leading underscores and at most one trailing underscore.
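A quick way to see the rule in action (Demo and its attribute names are placeholders for this sketch):

```python
class Demo:
    __any_name = "mangled"
    __any_other_name_ = "mangled too"    # one trailing underscore: still mangled
    __dunder__ = "not mangled"           # two trailing underscores: left alone


print([n for n in dir(Demo) if "any" in n or "dunder" in n])
# ['_Demo__any_name', '_Demo__any_other_name_', '__dunder__']
```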
The ostensible use is to prevent subclassers from using an attribute that the class uses.
A potential value is in avoiding name collisions with subclassers who want to override behavior, so that the parent class functionality keeps working as expected. However, the example in the Python documentation is not Liskov substitutable, and no examples come to mind where I have found this useful.
The downsides are that it increases cognitive load for reading and understanding a code base, and especially so when debugging where you see the double underscore name in the source and a mangled name in the debugger.
My personal approach is to intentionally avoid it. I work on a very large code base. The rare uses of it stick out like a sore thumb and do not seem justified.
You do need to be aware of it so you know it when you see it.
PEP 8
PEP 8, the Python standard library style guide, currently says (abridged):
How does it work?
If you prepend two underscores (without ending double-underscores) in a class definition, the name will be mangled, and an underscore followed by the class name will be prepended on the object:
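For example, with a hypothetical Account class, the instance dictionary shows the mangled name while code inside the class keeps using the short one:

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance     # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance        # mangled the same way, so this works


acct = Account(100)
print(vars(acct))                    # {'_Account__balance': 100}
print(acct.balance())                # 100
print(hasattr(acct, "__balance"))    # False
```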
Note that names will only get mangled when the class definition is parsed:
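A small sketch of that behaviour (Demo, __inside and __outside are made-up names): an attribute assigned from outside the class body keeps its name exactly as written.

```python
class Demo:
    __inside = "set in the class body"


# Assigned outside the class body, so the name is stored exactly as written.
Demo.__outside = "set afterwards"

print("_Demo__inside" in vars(Demo))    # True
print("__outside" in vars(Demo))        # True
print("_Demo__outside" in vars(Demo))   # False
```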
Also, those new to Python sometimes have trouble understanding what's going on when they can't manually access a name they see defined in a class definition. This is not a strong reason against it, but it's something to consider if you have a learning audience.
One Underscore?
When my intention is for users to keep their hands off an attribute, I tend to only use the one underscore, but that's because in my mental model, subclassers would have access to the name (which they always have, as they can easily spot the mangled name anyways).
If I were reviewing code that uses the __ prefix, I would ask why they're invoking name mangling, and if they couldn't do just as well with a single underscore, keeping in mind that if subclassers choose the same names for the class and class attribute there will be a name collision in spite of this.
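The collision is easy to reproduce. In this sketch (Widget, Parent and __state are hypothetical names), the subclass reuses the parent's class name, as could happen with same-named classes in two different modules:

```python
class Widget:
    def __init__(self):
        self.__state = "parent"      # stored as _Widget__state

    def parent_state(self):
        return self.__state


Parent = Widget


class Widget(Parent):                # same class name as the parent
    def __init__(self):
        super().__init__()
        self.__state = "child"       # also mangled to _Widget__state


obj = Widget()
print(obj.parent_state())  # child
```

Because both classes mangle to the same prefix, the subclass silently overwrites the parent's attribute.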
I wouldn't say that practice produces better code. Visibility modifiers only distract you from the task at hand, and as a side effect force your interface to be used as you intended. Generally speaking, enforcing visibility prevents programmers from messing things up if they haven't read the documentation properly.
A far better solution is the route that Python encourages: Your classes and variables should be well documented, and their behaviour clear. The source should be available. This is a far more extensible and reliable way to write code.
My strategy in Python is this:
Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.
As a side note, you should actually be going with protected in those other languages: You never know whether your class might be inherited later or what it might be used for. Best to only protect those variables that you are certain cannot or should not be used by foreign code.
You shouldn't start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. I.e. you should start by figuring out what the world sees (the public stuff) and then figure out what private stuff is necessary for that to happen.
Other languages make it difficult to make private that which once was public. I.e. I'll break lots of code if I make my variable private or protected. But with properties in Python this isn't the case. Rather, I can maintain the same interface even while rearranging the internal data.
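As an illustration (the Point class is invented for this sketch), suppose x and y started out as plain attributes and the storage was later moved into a single list. Properties keep the old interface intact:

```python
class Point:
    def __init__(self, x, y):
        self._coords = [x, y]    # storage rearranged from two attributes to one list

    @property
    def x(self):
        return self._coords[0]

    @x.setter
    def x(self, value):
        self._coords[0] = value

    @property
    def y(self):
        return self._coords[1]

    @y.setter
    def y(self, value):
        self._coords[1] = value


p = Point(1, 2)
p.x = 10                 # callers still read and write p.x and p.y as before
print(p.x, p.y)          # 10 2
```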
The difference between _ and __ is that Python actually makes an attempt to enforce the latter. Of course, it doesn't try really hard, but it does make it difficult. Having _ merely tells other programmers what the intention is; they are free to ignore it at their peril. But ignoring that rule is sometimes helpful. Examples include debugging, temporary hacks, and working with third-party code that wasn't intended to be used the way you use it.
There are already a lot of good answers to this, but I'm going to offer another one. This is also partially a response to people who keep saying that double underscore isn't private (it really is).
If you look at Java/C#, both of them have private/protected/public. All of these are compile-time constructs. They are only enforced at the time of compilation. If you were to use reflection in Java/C#, you could easily access private methods.
Now every time you call a function in Python, you are inherently using reflection. These pieces of code are the same in Python.
The "dot" syntax is only syntactic sugar for the latter piece of code. Mostly because using getattr is already ugly with only one function call. It just gets worse from there.
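A sketch of the two equivalent spellings (Greeter and greet are placeholder names):

```python
class Greeter:
    def greet(self, name):
        return f"Hello, {name}"


g = Greeter()

# Attribute access with the dot ...
print(g.greet("world"))               # Hello, world
# ... and the same lookup spelled out with getattr.
print(getattr(g, "greet")("world"))   # Hello, world
```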
So with that, there can't be a Java/C# version of private, as Python doesn't compile the code. Java and C# can't check if a function is private or public at runtime, as that information is gone (and it has no knowledge of where the function is being called from).
Now with that information, the name mangling of the double underscore makes the most sense for achieving "private-ness". Now when a function is called from the 'self' instance and it notices that it starts with '__', it just performs the name mangling right there. It's just more syntactic sugar. That syntactic sugar allows the equivalent of 'private' in a language that only uses reflection for data member access.
Disclaimer: I have never heard anybody in Python development say anything like this. The real reason for the lack of "private" is cultural, but you'll also notice that most scripting/interpreted languages have no private. A strictly enforceable private is not practical at anything except compile time.