How to remove unused C/C++ symbols with GCC and ld?
I need to optimize the size of my executable severely (ARM development) and I noticed that in my current build scheme (gcc + ld) unused symbols are not getting stripped.
Running arm-strip --strip-unneeded on the resulting executables / libraries doesn't change the output size of the executable (I have no idea why, maybe it simply can't).
What would be the way (if it exists) to modify my build pipeline so that the unused symbols are stripped from the resulting file?
I wouldn't even think of this, but my current embedded environment isn't very "powerful" and saving even 500K out of 2M results in a very nice loading performance boost.
Update:
Unfortunately the current gcc version I use doesn't have the -dead-strip option, and -ffunction-sections... + --gc-sections for ld doesn't give any significant difference for the resulting output.
I'm shocked that this even became a problem, because I was sure that gcc + ld should automatically strip unused symbols (why do they even have to keep them?).
-fdata-sections -ffunction-sections -Wl,--gc-sections minimal example analysis
These options were mentioned at: https://stackoverflow.com/a/6770305/895245 and I just wanted to confirm that they work and inspect a bit how with objdump.
The conclusions we draw are similar to what other posts mentioned:
-flto leads to unused symbols being removed even if other symbols are used in the same compilation unit
Separate files, -O3 only
notmain.c
main.c
Compile only with -O3:
Disassemble notmain.o:
The output contains:
Disassemble the linked executable:
The output contains:
Conclusion: both i2 and f2 were present in the final output file even though they weren't used.
Even if we had added -Wl,--gc-sections to try and remove unused sections, that wouldn't have changed anything, because in the object file notmain.o, i2 appears in the same section as i1 (.data), and f2 appears in the same section as f1 (.text). Those symbols were used and therefore bring their entire sections into the final file.
-fdata-sections -ffunction-sections -Wl,--gc-sections
We modify the compilation commands to:
Disassemble notmain.o:
Output contains:
So we see how everything gets its own section, named after the symbol itself.
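A minimal sketch of what such a notmain.c could look like. Only the names i1, i2, f1 and f2 come from the discussion; the function bodies, the values of i2 and f2, and the commands in the comment are assumptions for illustration:

```c
/* notmain.c: hypothetical reconstruction, not the author's exact listing.
 *
 * Assumed compile command:
 *   gcc -O3 -fdata-sections -ffunction-sections -c notmain.c
 * The section headers can then be listed with:
 *   objdump -h notmain.o
 */

int i1 = 1; /* expected to land in its own section, .data.i1 */
int i2 = 2; /* expected to land in its own section, .data.i2 */

/* Expected to land in its own section, .text.f1 */
int f1(int i) {
    return i + 1;
}

/* Expected to land in its own section, .text.f2 */
int f2(int i) {
    return i + 2;
}
```

If main.c only references f1 and i1, linking with -Wl,--gc-sections lets ld discard .text.f2 and .data.i2, since nothing refers to them.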
Disassemble the linked executable:
The output contains:
and it contains neither i2 nor f2. This is because this time every symbol was in its own section, and so -Wl,--gc-sections was able to remove every single unused symbol.
Inlining makes a symbol not be considered as used
To test the effect of inlining, let's move our test symbols to the same file as main.c:
main2.c
And then:
Disassemble main2.o:
The output contains:
Interesting how main is in a separate section, .text.startup, possibly to allow the rest of the text to be GC'ed.
We also see that f1 was fully inlined at lea 0x1(%rax,%rdi,1),%eax (directly adds 1), while for reasons I don't understand i1 is still used at mov 0x0(%rip),%eax pending relocation, see also: What do linkers do? The relocation will be clear after disassembling main2.out below.
Disassemble main2.out:
The output contains:
and f1 and f2 were entirely removed, because f1 was inlined and therefore not marked as used anymore, so the entire .text section got removed.
If we forced f1 not to be inlined with:
then both f1 and f2 would appear in main2.out.
Sections of different object files are separate even though they have the same name
Obviously, e.g.:
notmain2.c
and then:
does not contain f3 and f4, even though f1 and f2 were included, and both pairs are in sections called .text.
Possible downside of -fdata-sections -ffunction-sections -Wl,--gc-sections: slower link speed
We should find some benchmark, but this is likely, as more relocations are needed when one symbol refers to another symbol from the same compilation unit, because they are no longer in the same section.
-flto leads to symbols being removed even if other symbols in the same compilation unit are used
Also, this happens whether or not LTO would lead to inlining. Consider:
notmain.c
main.c
Compile and disassemble:
The disassembly contains:
and f2 is not present. So f2 was removed even though f1 is used.
We also note that i1 and i2 are gone. The compiler appears to recognize that i1 is never really modified and just "inlines" it as the constant 1 at: add $0x1,%eax.
Related question: Does GCC LTO perform cross-file dead code elimination?
For some reason code elimination does not happen if you compile the object file with -O0: Why GCC does not do function dead code elimination with LTO when compiling the object file with -O0?
Tested on Ubuntu 23.04 amd64, GCC 12.2.0.
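A sketch of how the LTO experiment might be set up, assuming f1 is kept out of line with GCC's noinline attribute so that its survival does not depend on inlining. The function bodies and the build commands in the comment are assumptions, not the author's listing:

```c
/* notmain.c: hypothetical LTO variant of the test file.
 *
 * Assumed build, passing -flto at both compile and link time:
 *   gcc -O3 -flto -c notmain.c
 *   gcc -O3 -flto -c main.c
 *   gcc -O3 -flto notmain.o main.o -o main.out
 *   objdump -d main.out
 */

int i1 = 1;
int i2 = 2;

/* noinline keeps f1 as a real function even at -O3. */
__attribute__((noinline)) int f1(int i) {
    return i + 1;
}

/* Assumed to be unreferenced by main.c, so LTO is free to drop it. */
int f2(int i) {
    return i + 2;
}
```

Here main.c is assumed to call only f1 and read only i1, which matches the result described above: f1 stays, f2 disappears.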
The legacy recommendation was to build static libraries containing all the optional code, and to reduce each compilation unit to the minimum necessary to hold one tiny task (also recommended as a pattern in Unix design).
When you link the code and specify a static library (a .a archive), the linker only pulls in the compiled modules that are referenced, starting from the initial crt0.o code, and this can be achieved without compiling the code into separate sections.
We have done this in our code. The benefit is probably not optimal, but it lets us continue development with a good memory footprint and saves a lot of unused code, without ever having to make the compiler work that out. I always use this rule: if the feature is not necessary, don't link it in.
You can use the strip binary on an object file (e.g. an executable) to strip all symbols from it.
Note: it changes the file itself and doesn't create a copy.
While not strictly about symbols, if going for size, always compile with the -Os and -s flags. -Os optimizes the resulting code for minimum executable size and -s removes the symbol table and relocation information from the executable.
Sometimes, if small size is desired, playing around with different optimization flags may or may not make a difference. For example, toggling -ffast-math and/or -fomit-frame-pointer may at times save you even dozens of bytes.
strip --strip-unneeded only operates on the symbol table of your executable. It doesn't actually remove any executable code.
The standard libraries achieve the result you're after by splitting all of their functions into separate object files, which are combined using ar. If you then link the resultant archive as a library (i.e. give the option -l your_library to ld) then ld will only include the object files, and therefore the symbols, that are actually used.
You may also find some of the responses to this similar question of use.
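The one-function-per-object-file layout described above could be sketched roughly as follows. The file, function and library names are made up for illustration, and the commands in the comment are an assumed build, not something taken from the question:

```c
/* mylib_add.c: hypothetical source file holding a single function, so that
 * it becomes its own archive member (mylib_add.o).
 *
 * Assumed build steps (mylib_mul.c would be a sibling file of the same kind):
 *   gcc -Os -c mylib_add.c mylib_mul.c
 *   ar rcs libmylib.a mylib_add.o mylib_mul.o
 *   gcc -Os main.c -L. -lmylib -o app
 *
 * If main.c only calls mylib_add, ld extracts mylib_add.o from the archive
 * and leaves mylib_mul.o out of the executable.
 */

int mylib_add(int a, int b) {
    return a + b;
}
```

The granularity is the object file: anything else defined in mylib_add.c would be pulled in together with mylib_add, which is why each file is kept this small.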
I don't know if this will help with your current predicament as this is a recent feature, but you can specify the visibility of symbols in a global manner. Passing -fvisibility=hidden -fvisibility-inlines-hidden at compilation can help the linker to later get rid of unneeded symbols. If you're producing an executable (as opposed to a shared library) there's nothing more to do.
More information (and a fine-grained approach for e.g. libraries) is available on the GCC wiki.
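For the fine-grained library case, a minimal sketch might look like the following. It assumes a shared library built from a single C file; API_EXPORT, scale and api_entry are made-up names, and the build command in the comment is an assumption (-fvisibility-inlines-hidden is left out because it only applies to C++):

```c
/* api.c: hypothetical shared-library source.
 *
 * Assumed build:
 *   gcc -Os -fPIC -fvisibility=hidden -shared api.c -o libapi.so
 */

/* API_EXPORT is a made-up macro name for this sketch. */
#define API_EXPORT __attribute__((visibility("default")))

/* Gets hidden visibility under -fvisibility=hidden: not exported. */
int scale(int x) {
    return x * 2;
}

/* Explicitly marked default, so it stays visible to users of the library. */
API_EXPORT int api_entry(int x) {
    return scale(x) + 1;
}
```

Only the functions marked API_EXPORT need to remain in the library's exported symbol set; everything else becomes a candidate for removal or more aggressive optimization.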
From the GCC 4.2.1 manual, section -fwhole-program:
The answer is -flto. You have to pass it to both your compilation and link steps, otherwise it doesn't do anything.
It actually works very well: it reduced the size of a microcontroller program I wrote to less than 50% of its previous size!
Unfortunately it did seem a bit buggy: I had instances of things not being built correctly. It may have been due to the build system I'm using (QBS; it's very new), but in any case I'd recommend you only enable it for your final build if possible, and test that build thoroughly.
Programming habits could help too; e.g. add static to functions that are not accessed outside a specific file; use shorter names for symbols (can help a bit, likely not too much); use const char x[] where possible; ... this paper, though it talks about dynamic shared objects, can contain suggestions that, if followed, can help to make your final binary output size smaller (if your target is ELF).
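A small sketch of the two habits mentioned, static for file-local functions and const char x[] instead of a pointer. All names here (banner, banner_length, report_length) are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* An array rather than a pointer: only the characters are stored,
 * with no separate pointer object that refers to them. */
static const char banner[] = "hello";

/* static gives this helper internal linkage, so it is not exported from
 * the object file and the compiler may inline or drop it freely. */
static size_t banner_length(void) {
    return strlen(banner);
}

/* The only symbol this file makes visible to other translation units. */
size_t report_length(void) {
    return banner_length();
}
```

With everything except report_length marked static, the compiler can see all uses of the helpers within this one file, which is what lets it remove or merge them.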
It seems to me that the answer provided by Nemo is the correct one. If those instructions do not work, the issue may be related to the version of gcc/ld you're using. As an exercise, I compiled an example program using the instructions detailed here.
Then I compiled the code using progressively more aggressive dead-code removal switches:
These compilation and linking parameters produced executables of size 8457, 8164 and 6160 bytes, respectively, the most substantial contribution coming from the 'strip-all' declaration. If you cannot produce similar reductions on your platform, then maybe your version of gcc does not support this functionality. I'm using gcc (4.5.2-8ubuntu4) and ld (2.21.0.20110327) on Linux Mint 2.6.38-8-generic x86_64.
If this thread is to be believed, you need to supply -ffunction-sections and -fdata-sections to gcc, which will put each function and data object in its own section. Then you give --gc-sections to GNU ld to remove the unused sections.
You'll want to check your docs for your version of gcc & ld:
However for me (OS X gcc 4.0.1) I find these for ld
And this helpful option
There's also a note in the gcc/g++ man that certain kinds of dead code elimination are only performed if optimization is enabled when compiling.
While these options/conditions may not hold for your compiler, I suggest you look for something similar in your docs.
For GCC, this is accomplished in two stages:
First compile the data but tell the compiler to separate the code into separate sections within the translation unit. This will be done for functions, classes, and external variables by using the following two compiler flags:
Link the translation units together using the linker optimization flag (this causes the linker to discard unreferenced sections):
So if you had one file called test.cpp that had two functions declared in it, but one of them was unused, you could omit the unused one with the following command to gcc(g++):
(Note that -Os is an additional compiler flag that tells GCC to optimize for size)
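The two stages might be sketched as follows, using a C file as a stand-in for the test.cpp mentioned above. The function names, the separate main.o, and the exact command lines in the comment are assumptions:

```c
/* test.c: hypothetical C stand-in for test.cpp.
 *
 * Stage 1, compile with one section per function and data object:
 *   gcc -Os -fdata-sections -ffunction-sections -c test.c -o test.o
 * Stage 2, link and let ld discard unreferenced sections
 * (main.o is assumed to hold a main() that calls only used_function):
 *   gcc test.o main.o -Wl,--gc-sections -o test
 */

/* Referenced from main(), so its section is kept. */
int used_function(int x) {
    return x + 1;
}

/* Referenced from nowhere, so its section can be garbage-collected. */
int unused_function(int x) {
    return x * 100;
}
```

Without the stage 1 flags, both functions would share the object file's single .text section, and keeping used_function would keep unused_function as well.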