GCC / build-time optimization
We have a project that uses gcc and makefiles. The project also consists of one big subproject (an SDK) and a lot of relatively small subprojects that use that SDK and some shared framework.
We use precompiled headers, but that only helps make recompilation faster.
Are there any known techniques and tools to help with build-time optimization? Or do you know of any articles/resources about this or related topics?
10 Answers
You can tackle the problem from two sides: refactor the code to reduce the complexity the compiler sees, or speed up the compiler's execution.
Without touching the code, you can throw more compilation power at it. Use ccache to avoid recompiling files you have already compiled, and distcc to distribute the build among more machines. Use make -jN, where N is the number of cores + 1 if you compile locally, or a bigger number for distributed builds; that flag runs more than one compiler in parallel.
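A sketch of how those pieces combine (host names and core count are made up; CCACHE_PREFIX is ccache's documented way of chaining to distcc):

    # Chain ccache -> distcc: cache hits stay local, misses are distributed
    export CCACHE_PREFIX=distcc
    export DISTCC_HOSTS='localhost buildbox1 buildbox2'   # hypothetical LAN hosts

    # One compile job per core, plus one (e.g. on an 8-core machine)
    make CC='ccache gcc' CXX='ccache g++' -j9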
To refactor the code: prefer forward declarations to includes (simple), and decouple as much as you can to avoid dependencies (use the PIMPL idiom, as sketched below).
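A minimal PIMPL sketch (class and file names are made up): clients include only a forward declaration, so changes to the implementation's internals no longer trigger their recompilation.

    // widget.h -- what clients include; no heavy headers leak through
    #include <memory>

    class Widget {
    public:
        Widget();
        ~Widget();                 // defined in widget.cpp, where Impl is complete
        void draw();
    private:
        struct Impl;               // forward declaration only
        std::unique_ptr<Impl> pimpl;
    };

    // widget.cpp -- the only file that sees the implementation details
    #include "widget.h"
    #include <vector>              // heavy includes stay out of the header

    struct Widget::Impl { std::vector<int> state; };

    Widget::Widget() : pimpl(new Impl) {}
    Widget::~Widget() = default;
    void Widget::draw() { /* render using pimpl->state */ }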
Template instantiation is expensive: templates are recompiled in every compilation unit that uses them. If you can refactor your templates so that they are forward-declared and then instantiated in only one compilation unit, you pay that cost only once (see the sketch below).
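A sketch of that technique with C++11's extern template (names are illustrative):

    // bigtable.h
    template <typename T>
    struct BigTable {
        void insert(const T& v) { /* ...expensive-to-compile body... */ }
    };

    // Suppress implicit instantiation in every including translation unit:
    extern template struct BigTable<int>;

    // bigtable_int.cpp -- the single TU that pays the instantiation cost
    #include "bigtable.h"
    template struct BigTable<int>;   // explicit instantiation definition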
The best I can think of with make is the -j option. This tells make to run as many jobs as possible in parallel: make -j. If you want to limit the number of concurrent jobs to n, you can use make -jn. Make sure the dependencies are correct so make doesn't run jobs it doesn't have to.
Another thing to take into account is the optimization that gcc does with the -O switch. You can specify various levels of optimization; the higher the optimization, the longer the compile and link times. A project I work with takes 2 minutes to link with -O3, and half a minute with -O1. You should make sure you're not optimizing more than you need to: you could build without optimization for development builds and with optimization for deployment builds.
Compiling with debug info (gcc -g) will probably increase the size of your executable and may impact your build time. If you don't need it, try removing it to see if it makes a difference for you.
The type of linking (static vs. dynamic) should also make a difference. As far as I understand, static linking takes longer (though I may be wrong here). You should see whether this affects your build.
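A small Makefile sketch of the dev/deploy split suggested above (file names and flag choices are illustrative):

    # Unoptimized, debuggable objects by default; optimize only on demand
    CXXFLAGS ?= -O0 -g

    release: CXXFLAGS = -O2        # target-specific flags for deployment builds
    release: app

    app: main.o util.o             # recipe lines must start with a tab
    	$(CXX) $(CXXFLAGS) -o $@ $^

    %.o: %.cpp
    	$(CXX) $(CXXFLAGS) -c $< -o $@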
From the description of the project I guess that you have one Makefile per directory and use recursive make a lot. In that case, the techniques from "Recursive Make Considered Harmful" should help very much.
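The core idea from that paper, as a sketch (module names are made up): one top-level Makefile includes a small fragment per directory, so make sees the complete dependency graph at once instead of being run blindly directory by directory.

    # Makefile (top level) -- no recursive $(MAKE) -C calls
    MODULES := sdk app1 app2
    include $(patsubst %,%/module.mk,$(MODULES))   # each fragment appends to SRC

    OBJ := $(SRC:.cpp=.o)

    app: $(OBJ)                    # recipe line must start with a tab
    	$(CXX) -o $@ $^

    # sdk/module.mk would contain just:
    #   SRC += sdk/core.cpp sdk/io.cpp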
If you have multiple computers available, gcc distributes well via distcc.
You can also use ccache in addition.
All of this works with very few changes to the makefiles.
Also, you'll probably want to keep your source code files as small and self-contained as possible, i.e. prefer many smaller object files over one huge single object file.
This will also help avoid unnecessary recompilations; in addition, you can have one static library with object files for each source code directory or module, basically allowing the compiler to reuse as much previously compiled code as possible.
Something else, which wasn't mentioned in any of the previous responses, is making symbol linkage as 'private' as possible, i.e. preferring internal (static) linkage for functions and variables that don't have to be visible externally (see the sketch below).
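For instance (a minimal sketch): helpers with internal linkage never enter the object file's external symbol table, which shrinks what the linker has to resolve.

    // util.cpp -- helpers nobody else needs to see
    namespace {                    // anonymous namespace: internal linkage
        int clamp(int v, int lo, int hi) {
            return v < lo ? lo : (v > hi ? hi : v);
        }
    }

    static int call_count = 0;     // 'static' also gives internal linkage

    int next_sample(int v) {       // the only externally visible symbol here
        ++call_count;
        return clamp(v, 0, 100);
    }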
In addition, you may also want to look into using the GNU gold linker, which is much more efficient at linking C++ code for ELF targets.
Basically, I'd advise you to carefully profile your build process and check where the most time is spent; that will give you some hints as to how to optimize your build process or your project's source code structure.
You could consider switching to a different build system (which obviously won't work for everyone), such as SCons. SCons is much smarter than make: it automatically scans header dependencies, so you always have the smallest set of rebuild dependencies. By adding the line Decider('MD5-timestamp') to your SConstruct file, SCons will first look at the time stamp of a file, and if it's newer than the previously built time stamp, it will use the file's MD5 to make sure you actually changed something. This works not just on source files but on object files as well; it means, for instance, that if you change only a comment, you don't have to re-link.
The automatic scanning of header files also means that I never have to type scons --clean. It always does the right thing.
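A minimal SConstruct sketch showing where that line goes (the source layout is made up; SCons build scripts are plain Python):

    # SConstruct
    env = Environment()
    Decider('MD5-timestamp')            # timestamp check first, MD5 only on change
    env.Program('app', Glob('src/*.cpp'))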
If you have a LAN with developer machines, perhaps you should try implementing a distributed compiler solution, such as distcc.
This might not help if all of the time during the build is spent analyzing dependencies or doing some single serial task. For the raw crunch of compiling many source files into object files, parallel building obviously helps, as Nathan suggested (on a single machine); parallelizing across multiple machines can take it even further.
http://ccache.samba.org/ speeds things up big time.
I work on a middle-sized project, and that's the only thing we do to speed up compile time.
You can use the distcc distributed compiler to reduce the build time if you have access to several machines.
Here's an article from IBM developerWorks about distcc and how you can use it:
http://www.ibm.com/developerworks/linux/library/l-distcc.html
Another method to reduce build time is to use precompiled headers; gcc's documentation is a good starting point.
Also, don't forget to use -j when building with make if your machine has more than one CPU/core (2x the number of cores/CPUs is just fine).
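A sketch of gcc's precompiled-header mechanics (the header name is made up): compile the header once into a .gch file, and gcc picks it up automatically on later compiles that include it.

    # Precompile the shared header once (produces common.h.gch)
    g++ -x c++-header common.h -o common.h.gch

    # Subsequent compiles that #include "common.h" use the .gch automatically
    g++ -c main.cpp -o main.o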
Using small files may not always be a good recommendation. A disk has a minimum allocation size of 32 or 64K, and a file takes at least one such block. So 1024 files of 3K each (with small code inside) will actually take 32 or 64 MB on disk, instead of the expected 3 MB, and that's 32/64 MB the drive needs to read. If the files are dispersed around the disk, seek time increases read time even more. The disk cache helps with this, obviously, up to a limit, and precompiled headers can also help alleviate it.
So, with due respect to coding guidelines, there is no point in going outside them just to put every struct, typedef, or utility class into a separate file.