What strategies have you used to improve build times on large projects?
I once worked on a C++ project that took about an hour and a half for a full rebuild. Small edit, build, test cycles took about 5 to 10 minutes. It was an unproductive nightmare.
What are the worst build times you ever had to deal with?
What strategies have you used to improve build times on large projects?
Update:
How much do you think the language used is to blame for the problem? I think C++ is prone to massive dependencies on large projects, which often means even simple changes to the source code can result in a massive rebuild. Which language do you think copes with large project dependency issues best?
The book Large-Scale C++ Software Design has very good advice that I've used on past projects.
Powerful compilation machines and parallel compilers. We also make sure a full build is needed as rarely as possible. We don't alter the code to make it compile faster.
Efficiency and correctness are more important than compilation speed.
In Visual Studio, you can set the number of projects to compile at a time. The default value is 2; increasing it will shave off some time.
This helps if you don't want to mess with the code.
This is the list of things we did for development under Linux:
We tried creating proxy classes once.
These are really a simplified version of a class that only includes the public interface, reducing the number of internal dependencies that need to be exposed in the header file. However, they came at a heavy price: each class was spread over several files, all of which needed updating whenever the class interface changed.
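A minimal sketch of the idea, with hypothetical names (HeavyEngine, EngineProxy). In a real project the proxy would hold a pointer to the heavy class, so the proxy's header would need only a forward declaration rather than the implementation-heavy header:

```cpp
#include <cassert>

// Stand-in for an implementation-heavy class whose header would drag in
// many internal dependencies. (All names here are hypothetical.)
class HeavyEngine {
public:
    explicit HeavyEngine(int power) : power_(power) {}
    int run(int load) const { return power_ * load; }
private:
    int power_;  // imagine many internal members and #includes here
};

// The proxy exposes only the public interface; client code includes the
// proxy's header and never sees HeavyEngine's internals.
class EngineProxy {
public:
    explicit EngineProxy(int power) : engine_(power) {}
    int run(int load) const { return engine_.run(load); }
private:
    HeavyEngine engine_;  // held by value here for brevity; in practice a
                          // pointer, so only a forward declaration is needed
};
```

The downside the answer describes follows directly: every change to HeavyEngine's public interface must be mirrored in EngineProxy as well.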
In general, the large C++ projects I've worked on that had slow build times were pretty messy, with lots of interdependencies scattered through the code (the same include files used in most .cpp files, fat interfaces instead of slim ones). In those cases, the slow build time was just a symptom of the larger problem, and a minor symptom at that. Refactoring to make clearer interfaces and break code out into libraries improved the architecture as well as the build time. When you make a library, it forces you to think about what is an interface and what isn't, which (in my experience) actually ends up improving the code base. If there's no technical reason to divide the code, some programmers will, over the course of maintenance, just throw anything into any header file.
Cătălin Pitiș covered a lot of good things. Other things we do:
It's a pet peeve of mine, so even though you already accepted an excellent answer, I'll chime in:
In C++, it's not so much the language itself as the language-mandated build model, which was great back in the seventies, plus the header-heavy libraries.
The only thing wrong in Cătălin Pitiș's reply: "buy faster machines" should go first. It's the easiest way with the least impact.
My worst was about 80 minutes on an aging build machine running VC6 on W2K Professional. The same project (with tons of new code) now takes under 6 minutes on a machine with 4 hyperthreaded cores, 8 GB of RAM, Win 7 x64, and decent disks. (A similar machine with about 10-20% less processor power, 4 GB of RAM, and Vista x86 takes twice as long.)
Strangely, incremental builds are slower than full rebuilds most of the time now.
A full build takes about 2 hours. I try to avoid making modifications to the base classes, and since my work is mainly on the implementation of those base classes, I usually only need to build small components (a couple of minutes).
Create some unit test projects to test individual libraries, so that if you need to edit low-level classes that would cause a huge rebuild, you can use TDD to know your new code works before you rebuild the entire app. The John Lakos book mentioned by Themis has some very practical advice for restructuring your libraries to make this possible.
[Later Edit]
8. Buy faster machines.
My strategy is pretty simple: I don't do large projects. The whole thrust of modern computing is away from the giant and monolithic and towards the small and componentised. So when I work on projects, I break things up into libraries and other components that can be built and tested independently and that have minimal dependencies on each other. A "full build" in this kind of environment never actually takes place, so there is no problem.
One trick that sometimes helps is to include everything into one .cpp file. Since includes are processed once per file, this can save you a lot of time. (The downside to this is that it makes it impossible for the compiler to parallelize compilation)
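As a sketch, this trick is just one .cpp file that includes the others (the file names here are hypothetical):

```cpp
// unity.cpp -- compiled instead of the individual translation units.
// Each header is parsed once for the whole batch rather than once per
// .cpp file, which is where the time saving comes from.
#include "parser.cpp"
#include "lexer.cpp"
#include "codegen.cpp"
// Caveat: file-static symbols and anonymous namespaces from different
// .cpp files now share a single translation unit, so names can clash.
```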
You should be able to specify that multiple .cpp files should be compiled in parallel (-j with make on Linux, /MP on MSVC; MSVC also has an option to compile multiple projects in parallel. These are separate options, and there's no reason not to use both).
In the same vein, distributed builds (Incredibuild, for example), may help take the load off a single system.
SSD disks are supposed to be a big win, although I haven't tested this myself (but a C++ build touches a huge number of files, which can quickly become a bottleneck).
Precompiled headers can help too, when used with care. (They can also hurt you, if they have to be recompiled too often).
And finally, trying to minimize dependencies in the code itself is important. Use the pImpl idiom, use forward declarations, keep the code as modular as possible. In some cases, use of templates may help you decouple classes and minimize dependencies. (In other cases, templates can slow down compilation significantly, of course)
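A minimal pImpl sketch, collapsed into one listing here. In a real project the parts marked widget.h and widget.cpp would be separate files, so client code recompiles only when the public header changes, not when the internals do (Widget and Impl are hypothetical names):

```cpp
#include <memory>

// --- widget.h (public header): exposes no implementation details ---
class Widget {
public:
    explicit Widget(int value);
    ~Widget();                 // defined where Impl is a complete type
    void increment();
    int value() const;
private:
    struct Impl;               // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// --- widget.cpp: the only file that sees (and recompiles with) Impl ---
struct Widget::Impl {
    int value;                 // internals can change without touching widget.h
};

Widget::Widget(int value) : impl_(new Impl{value}) {}
Widget::~Widget() = default;
void Widget::increment() { ++impl_->value; }
int Widget::value() const { return impl_->value; }
```

The forward declaration of Impl is exactly the decoupling mechanism the paragraph describes: the header promises the type exists without saying what is in it.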
But yes, you're right, this is very much a language thing. I don't know of another language that suffers from the problem to this extent. Most languages have a module system that allows them to eliminate header files, which is a huge factor. C has header files, but it is such a simple language that compile times are still manageable. C++ gets the worst of both worlds: a big, complex language, and a terribly primitive build mechanism that requires a huge amount of code to be parsed again and again.
The last two brought our linking time down from around 12 minutes to 1-2 minutes. Note that this is only needed if things have huge visibility, i.e. are seen "everywhere", and if there are many different constants and classes.
Cheers
IncrediBuild
Unity Builds
Incredibuild
Pointer to implementation
forward declarations
compiling "finished" sections of the project into DLLs
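For the last item, a common pattern is an export macro so a "finished" component can be built once as a DLL/shared library and afterwards only linked against, not recompiled (all names here are hypothetical; MYLIB_API is an assumed macro, not from any particular framework):

```cpp
// mylib.h -- public header of a component built as a DLL/shared library.
// The macro marks symbols for export on Windows and makes them visible
// when building with -fvisibility=hidden on Linux.
#if defined(_WIN32)
  #define MYLIB_API __declspec(dllexport)
#else
  #define MYLIB_API __attribute__((visibility("default")))
#endif

MYLIB_API int mylib_add(int a, int b);

// mylib.cpp -- compiled into the shared library once; clients just link.
MYLIB_API int mylib_add(int a, int b) { return a + b; }
```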
ccache & distcc (for C/C++ projects) -
ccache caches compiled output, using the pre-processed file as the 'key' for finding the output. This is great because pre-processing is pretty quick, and quite often the changes that force a recompile don't actually change the source for many files. It also really speeds up a full re-compile. Another nice feature is that you can share the cache among team members, which means that only the first person to grab the latest code actually compiles anything.
distcc does distributed compilation across a network of machines. This is only good if you HAVE a network of machines to use for compilation. It goes well with ccache, and only moves the pre-processed source around, so the only thing you have to worry about on the compiler engine systems is that they have the right compiler (no need for headers or your entire source tree to be visible).
The best suggestion is to build makefiles that actually understand dependencies and do not automatically rebuild the world for a small change. But, if a full rebuild takes 90 minutes, and a small rebuild takes 5-10 minutes, odds are good that your build system already does that.
Can the build be done in parallel? Either with multiple cores, or with multiple servers?
Check in pre-compiled bits for pieces that really are static and do not need to be rebuilt every time. Third-party tools/libraries that are used but not altered are good candidates for this treatment.
Limit the build to a single 'stream' if applicable. The 'full product' might include things like a debug version, or both 32 and 64 bit versions, or may include help files or man pages that are derived/built every time. Removing components that are not necessary for development can dramatically reduce the build time.
Does the build also package the product? Is that really required for development and testing? Does the build incorporate some basic sanity tests that can be skipped?
Finally, you can re-factor the code base to be more modular and to have fewer dependencies. Large Scale C++ Software Design is an excellent reference for learning to decouple large software products into something that is easier to maintain and faster to build.
EDIT: Building on a local filesystem as opposed to an NFS-mounted filesystem can also dramatically speed up build times.