链接器如何允许在不同的目标文件中对函数模板进行多个定义,但只允许普通函数的一个定义
我知道如何在使用 C++ 模板时使用 inline 关键字来避免“多重定义”。然而,我很好奇的是,链接器如何区分哪个专业化是完全专业化并违反 ODR 并报告错误,而另一个专业化是隐式的并正确处理它?
从 nm
输出中,我们可以看到 main.o 和 other.o 中 int-version max() 和 char-version max() 都有重复的定义,但 C++ 链接器仅报告“多重定义错误”对于 char-version max()' 但让 'char-version max() 成功链接?链接器如何区分它们?
// tmplhdr.hpp
#include <iostream>
// this function is instantiated in main.o and other.o
// but leads no 'multiple definition' error by linker
template<typename T>
T max(T a, T b)
{
std::cout << "match generic\n";
return (b<a)?a:b;
}
// 'multiple definition' link error if without inline
template<>
inline char max(char a, char b)
{
std::cout << "match full specialization\n";
return (b<a)?a:b;
}
// main.cpp
#include "tmplhdr.hpp"
extern int mymax(int, int);
int main()
{
std::cout << max(1,2) << std::endl;
std::cout << mymax(10,20) << std::endl;
std::cout << max('a','b') << std::endl;
return 0;
}
// other.cpp
#include "tmplhdr.hpp"
int mymax(int a, int b)
{
return max(a, b);
}
Ubuntu 上的测试输出是合理的;但 Cygwin 上的输出相当奇怪且令人困惑...
==== Cygwin 上的测试 ====
g++ 链接器仅报告“char max(char, char)”重复。
$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0):
multiple definition of `char max<char>(char, char)';
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
我转储了 .o 对象文件,但没有发现太多线索(也许我对对象格式规范不太熟悉)。
$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <<-- implicit specialization
U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)
==== 在 Ubuntu 上测试 ====
这是我在从 tmplhdr.hpp
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
'char-version 中删除 inline
后使用 g++-9 在 Ubuntu 上得到的结果max()'用T
标记,不允许有多个定义;但'in-version max()'被标记为W
,它允许多个定义。然而,我开始好奇为什么 nm 在 Cygwin 上与 Ubuntu 上给出不同的标记?为什么 Cgywin 上的链接器可以正确处理两个 T
定义?
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
U mymax(int, int)
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)
I know how to use inline keyword to avoid 'multiple definition' while using C++ template. However, what I am curious is that how linker is distinguishing which specialization is full specialization and violating ODR and reporting error, while another specialization is implicit and correctly handle it?
From the nm
output, we can see duplicated definitions in main.o and other.o for both int-version max() and char-version max(), but C++ linker only reports 'multiple definition error for char-version max()' but let 'char-version max() go a successful link? How linker differentiate them and does this?
// tmplhdr.hpp
#include <iostream>
// this function is instantiated in main.o and other.o
// but leads no 'multiple definition' error by linker
template<typename T>
T max(T a, T b)
{
std::cout << "match generic\n";
return (b<a)?a:b;
}
// 'multiple definition' link error if without inline
template<>
inline char max(char a, char b)
{
std::cout << "match full specialization\n";
return (b<a)?a:b;
}
// main.cpp
#include "tmplhdr.hpp"
extern int mymax(int, int);
int main()
{
std::cout << max(1,2) << std::endl;
std::cout << mymax(10,20) << std::endl;
std::cout << max('a','b') << std::endl;
return 0;
}
// other.cpp
#include "tmplhdr.hpp"
int mymax(int a, int b)
{
return max(a, b);
}
Test output on Ubuntu is reasonable; but output on Cygwin is rather strange and confusing...
==== Test on Cygwin ====
g++ linker only reported 'char max(char, char)' is duplicated.
$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0):
multiple definition of `char max<char>(char, char)';
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
I dumped my .o object file and found no many clues (maybe I am not quite familiar with object format spec.).
$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <<-- implicit specialization
U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)
==== Test on Ubuntu ====
This is what I have got on my Ubuntu with g++-9 after having remove inline
from tmplhdr.hpp
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
'char-version max()' is marked with T
which is not allowed to have multiple definitions; but 'in-version max()' is marked as W
which allows multiple definitions. However, I start to be curious why nm
gives different marks on Cygwin than on Ubuntu?? and Why linker on Cgywin can handle two T
definitions correctly?
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
U mymax(int, int)
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您需要了解
nm
输出不能为您提供完整的情况。nm
是 binutils 的一部分,并使用libbfd
。其工作原理是将各种目标文件格式解析为 libbfd 内部表示,然后像 nm 之类的工具以人类可读的格式打印该内部表示。有些东西会“在翻译中丢失”。这就是为什么你应该~永远不要使用例如
objdump
来查看ELF< /code> 文件(至少不在
ELF
文件的符号表中)。正如您正确推断的那样,Linux 上允许使用多个
max()
符号的原因是编译器将它们作为W
(弱定义)符号发出。Windows 也是如此,除了 Windows 使用较旧的
COFF
格式,该格式没有弱符号。相反,该符号被发送到一个特殊的.linkonce.$name
部分,并且链接器知道它可以选择任何此类部分到链接中,但应该只执行一次一次 (即它知道丢弃任何其他目标文件中该部分的所有其他重复项)。You need to understand that the
nm
output does not give you the full picture.nm
is part of binutils, and useslibbfd
. The way this works is that various object file formats are parsed intolibbfd
-internal representation, and then tools likenm
print that internal representation in human-readable format.Some things get "lost in translation". This is the reason you should ~never use e.g.
objdump
to look atELF
files (at least not at the symbol table of theELF
files).As you correctly deduced, the reason multiple
max<int>()
symbols are allowed on Linux is that the compiler emits them as aW
(weakly defined) symbol.The same is true for Windows, except Windows uses older
COFF
format, which doesn't have weak symbols. Instead, the symbol is emitted into a special.linkonce.$name
section, and the linker knows that it can select any such section into the link, but should only do that once (i.e. it knows to discard all other duplicates of that section in any other object file).