What is name mangling, and how does it work?
Please explain what name mangling is, how it works, what problem it solves, and in which contexts and languages it is used. Name mangling strategies (e.g. what name is chosen by the compiler and why) are a plus.
In the programming language of your choice, if an identifier is exported from a separately compiled unit, it needs a name by which it is known at link time. Name mangling solves the problem of overloaded identifiers in programming languages. (An identifier is "overloaded" if the same name is used in more than one context or with more than one meaning.)
Some examples:
In C++, a function or method `get` may be overloaded at multiple types.
In Ada or Modula-3, a function `get` may appear in multiple modules.
Multiple types and multiple modules cover the usual contexts.
Typical strategies:
Map each type to a string and use the combined high-level identifier and "type string" as the link-time name. Common in C++ (especially easy since overloading is permitted only for functions/methods and only on argument types) and Ada (where you can overload result types as well).
If an identifier is used in more than one module or namespace, join the name of the module with the name of the identifier, e.g., `List_get` instead of `List.get`. Depending on what characters are legal in link-time names, you may have to do additional mangling; for example, it may be necessary to use the underscore as an 'escape' character, so you can distinguish `List_my.get` -> `List__my_get` from `List.my_get` -> `List_my__get`.
(Admittedly this example is reaching, but as a compiler writer, I have to guarantee that distinct identifiers in the source code map to distinct link-time names. That's the whole reason and purpose for name mangling.)
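The underscore-escaping idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names `escape` and `mangle` are hypothetical, and the scheme (double every underscore, then join with a single one) is just one way to get the two results shown above. It is still ambiguous when a module name ends with an underscore or an identifier starts with one, so a real compiler would need something stronger, such as length prefixes.

```python
def escape(part: str) -> str:
    """Double every underscore so a single underscore can act as the separator."""
    return part.replace("_", "__")


def mangle(module: str, name: str) -> str:
    """Join an escaped module name and an escaped identifier with one underscore."""
    return f"{escape(module)}_{escape(name)}"


print(mangle("List", "get"))      # List_get
print(mangle("List_my", "get"))   # List__my_get
print(mangle("List", "my_get"))   # List_my__get
```

The two source names that would collide under plain concatenation now map to different link-time names.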
Simply put, name mangling is a process by which compilers change the names of identifiers in your source code in order to help the linker disambiguate between those identifiers.
Wikipedia has a wonderful article on this subject with several great examples.
In Python, name mangling is a system by which class variables have different names inside and outside the class. The programmer "activates" it by putting two underscores at the start of the variable name.
For example, I can define a simple class with some members:
In Python practice, a variable name starting with an underscore is "internal" and not part of the class interface, so programmers should not rely on it. However, it is still visible:
A variable name starting with two underscores is still public, but it is name-mangled and thus harder to access:
If we know how the name-mangling works, however, we can get at it:
i.e. an underscore followed by the class name is prepended to the variable name.
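The walk-through above can be reproduced with a short script. The class name `Foo` and the attribute names `a`, `_b` and `__c` are placeholders I picked for illustration, not necessarily the names used in the original snippets.

```python
class Foo:
    def __init__(self):
        self.a = 1       # ordinary public attribute
        self._b = 2      # "internal" by convention only, still visible
        self.__c = 3     # name-mangled: actually stored as _Foo__c

    def get_c(self):
        # Inside the class body, __c is rewritten to _Foo__c automatically.
        return self.__c


foo = Foo()
print(foo.a)         # 1
print(foo._b)        # 2

try:
    foo.__c          # outside the class no mangling happens, so the lookup fails
except AttributeError as exc:
    print("AttributeError:", exc)

print(foo._Foo__c)   # 3, reachable once you know the mangled name
print(foo.get_c())   # 3
print(sorted(vars(foo)))  # ['_Foo__c', '_b', 'a']
```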
Python has no concept of 'private' versus 'public' members; everything is public. Name-mangling is the strongest-possible signal a programmer can send that the variable should not be accessed from outside the class.
Name mangling is a means by which compilers modify the "compiled" name of an object so that it differs, in a consistent way, from the name you specified.
This gives a programming language the flexibility to use the same name for multiple compiled objects and still have a consistent way to look up the appropriate one. For example, it allows multiple classes with the same name to exist in different namespaces (often by prepending the namespace to the class name).
Operator and method overloading in many languages take this a step further: each method ends up with a "mangled" name in the compiled library, so that multiple methods with the same name can exist on one type.
Source: http://sickprogrammersarea.blogspot.in/2014/03/technical-interview-questions-on-c_6.html
Name mangling is the process C++ compilers use to give each function in your program a unique name. C++ programs generally have at least a few functions with the same name, so name mangling is an important aspect of C++.
Example:
Commonly, member names are uniquely generated by concatenating the name of the member with that of the class e.g. given the declaration:
val becomes something like:
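The exact output depends on the compiler, so as a rough illustration here is how the Itanium C++ ABI (used by GCC and Clang on most non-Windows platforms) encodes a qualified name. This is a sketch under assumptions: the class name `Class1` and the helper `itanium_nested_name` are hypothetical, the member is assumed to be a static data member (only those get a linker symbol), and templates and substitutions are not handled.

```python
def itanium_nested_name(*parts: str) -> str:
    """Encode a qualified variable name such as Class1::val.

    Each component is written as <length><identifier>; the sequence is
    wrapped in N ... E and prefixed with _Z.
    """
    if len(parts) < 2:
        raise ValueError("expected at least a scope and a member name")
    return "_ZN" + "".join(f"{len(p)}{p}" for p in parts) + "E"


print(itanium_nested_name("Class1", "val"))        # _ZN6Class13valE
print(itanium_nested_name("ns", "Class1", "val"))  # _ZN2ns6Class13valE
```

The length prefixes are what keep the scheme unambiguous: the linker-level name can always be split back into `Class1` and `val`.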
In Fortran, name mangling is needed because the language is case insensitive, meaning that Foo, FOO, fOo, foo, etc. all resolve to the same symbol, whose name must be normalized in some way. Different compilers implement mangling differently, and this is a source of great trouble when interfacing with C or with binary objects compiled by a different compiler. GNU g77/g95, for example, always adds a trailing underscore to the lowercased name, unless the name already contains one or more underscores, in which case two underscores are added.
For example, the following routine
Produces the following mangled symbols:
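As a stand-in illustration, the g77-style rule described above fits in a few lines of Python. The function name `g77_mangle` and the routine names are hypothetical, and the sketch only models the default behaviour as described in this answer, not any compiler flags that change it.

```python
def g77_mangle(name: str) -> str:
    """Lowercase the name, then append "_" (or "__" if it contains an underscore)."""
    lowered = name.lower()
    suffix = "__" if "_" in lowered else "_"
    return lowered + suffix


print(g77_mangle("FOO"))     # foo_
print(g77_mangle("fOo"))     # foo_
print(g77_mangle("My_Sub"))  # my_sub__
```

From C you would then declare and call `foo_` or `my_sub__` rather than the name written in the Fortran source.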
In order to call Fortran code from C, the properly mangled routine name must be invoked (taking into account the possibly different mangling strategies if you want to be truly compiler independent). To call C code from Fortran, an interface written in C must export properly mangled names and forward each call to the C routine. This interface can then be called from Fortran.
Most object-oriented languages provide function overloading.
Function Overloading
If a class has multiple functions with the same name but different parameter types or a different number of parameters, they are said to be overloaded. Function overloading allows you to use the same name for different functions.
Ways to overload a function
How is function overloading achieved with name mangling?
The C++ compiler distinguishes between the different functions when it generates object code: it changes each name by adding information about the type and number of its arguments. This technique of adding extra information to form function names is called name mangling.
The C++ standard doesn't specify any particular technique for name mangling, so different compilers may append different information to function names.
I ran the sample program on gcc 4.8.4.
The program has 3 functions named fun that differ in the number of arguments and their types.
These function names are mangled as below:
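To give a feel for what gcc produces, here is a small Python sketch that imitates the Itanium C++ ABI scheme for free functions taking built-in types. The three signatures are my own guesses at a plausible sample program, and `itanium_mangle` is a hypothetical helper that covers only a handful of type codes; namespaces, pointers, templates and substitutions are left out.

```python
# One-letter codes the Itanium C++ ABI assigns to some built-in types.
TYPE_CODES = {
    "void": "v", "bool": "b", "char": "c", "int": "i",
    "long": "l", "float": "f", "double": "d",
}


def itanium_mangle(name: str, *param_types: str) -> str:
    """Build _Z<len><name><param codes>; an empty parameter list is encoded as v."""
    params = "".join(TYPE_CODES[t] for t in param_types) or "v"
    return f"_Z{len(name)}{name}{params}"


print(itanium_mangle("fun"))                   # _Z3funv
print(itanium_mangle("fun", "int"))            # _Z3funi
print(itanium_mangle("fun", "int", "double"))  # _Z3funid
```

Note that the return type is not part of the name for ordinary functions, which is why C++ cannot overload on return type alone.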
At the time link editors were designed, languages such as C, FORTRAN and COBOL did not have namespaces, classes, class members and other such things. Name mangling is required to support features like these with a link editor that does not support them. The fact that the link editor does not support the additional features is often missed; people only imply it when they say that name mangling is required because of the link editor.
Since there is so much variation among language requirements to support what name mangling does, there is not a simple solution to the problem of how to support it in a link editor. Link editors are designed to work with output (object modules) from a variety of compilers and therefore must have a universal way to support names.
All the previous answers are correct, but here is the Python perspective and reasoning, with an example.
Definition
When a variable in a class has a prefix of `__` (i.e. two underscores) and does not end with `__` (i.e. two or more underscores), it is considered a private identifier.
The Python interpreter rewrites any private identifier, mangling the name to `_class__identifier`.
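The rule can be written down as a small function and checked against what the interpreter actually stores. `python_mangle` and the `Account` class are hypothetical names for this sketch; it follows the documented behaviour that leading underscores of the class name are stripped and that a class name made only of underscores disables mangling, and it ignores corner cases such as dotted names.

```python
def python_mangle(class_name: str, identifier: str) -> str:
    """Return the attribute name Python stores for `identifier` used inside `class_name`."""
    is_private = identifier.startswith("__") and not identifier.endswith("__")
    stripped = class_name.lstrip("_")
    if not is_private or not stripped:
        return identifier  # dunder names and ordinary names are left alone
    return f"_{stripped}{identifier}"


class Account:
    def __init__(self):
        self.__balance = 0  # stored under the mangled name


print(python_mangle("Account", "__balance"))  # _Account__balance
print(python_mangle("Account", "__init__"))   # __init__
print(python_mangle("_Hidden", "__x"))        # _Hidden__x
print(python_mangle("Account", "__balance") in vars(Account()))  # True
```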
Why
This is needed to avoid problems that could be caused by overriding attributes. In other words, to make overriding safe, the Python interpreter has to be able to build distinct names for the child's method and the parent's method, and the `__` (double underscore) prefix lets Python do this. In the example below, the code would not work without `__help`.
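A minimal sketch of that situation, with class and method names chosen for illustration: the parent keeps a private alias of its own method, so a subclass can override the public one with a different signature without breaking the parent's constructor.

```python
class Parent:
    def __init__(self):
        # Resolved as self._Parent__help, so this always reaches Parent.help,
        # even when a subclass overrides help() with another signature.
        self.log = self.__help("take child to school")

    def help(self, activity):
        return f"parent will {activity}"

    __help = help  # private copy of the original method, stored as _Parent__help


class Child(Parent):
    def help(self, activity, day):  # overrides Parent.help with an extra argument
        return f"child will do {activity} on {day}"


child = Child()
print(child.log)                           # parent will take child to school
print(child.help("laundry", "Saturdays"))  # child will do laundry on Saturdays
```

If `Parent.__init__` called `self.help(...)` instead, creating a `Child` would raise a `TypeError`, because the call would be dispatched to `Child.help`, which expects two arguments.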
The answers here are great, so this is just an addition from my own experience: I use name mangling to work out which toolchain (gcc, VS, ...) produced a symbol, how parameters are passed on the stack, and which calling convention I'm dealing with, all based on the name. For example, if I see `_main`, I know it's cdecl. The same goes for the others.
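For what it's worth, here is a rough Python sketch of that kind of guesswork for 32-bit Windows C symbols, where cdecl names get a leading underscore, stdcall names look like `_name@bytes`, and fastcall names look like `@name@bytes`. The function `guess_calling_convention` and the sample symbols are hypothetical, and a leading underscore alone is only a hint, since some other platforms prefix every C symbol that way.

```python
import re


def guess_calling_convention(symbol: str) -> str:
    """Guess the x86 calling convention from a decorated C symbol name."""
    if re.fullmatch(r"@\w+@\d+", symbol):
        return "fastcall"  # @name@<bytes of arguments>
    if re.fullmatch(r"_\w+@\d+", symbol):
        return "stdcall"   # _name@<bytes of arguments>
    if symbol.startswith("_"):
        return "cdecl"     # _name
    return "unknown"


print(guess_calling_convention("_main"))            # cdecl
print(guess_calling_convention("_MessageBoxA@16"))  # stdcall
print(guess_calling_convention("@fast@8"))          # fastcall
```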