The linking phase in distcc

Posted 2024-09-24 05:44:59

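Is there any particular reason why the linking phase of a distcc build is done locally, rather than being sent to other machines the way compilation is? Reading the distcc white paper did not give a clear answer, but my guess is that the time spent linking object files is not very significant compared to compilation. Any thoughts?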

ヅ她的身影、若隐若现 2024-10-01 05:44:59

distcc works by preprocessing each input file locally until it becomes a single, self-contained translation unit. That file is then sent over the network and compiled. At that stage the remote distcc server needs only a compiler; it does not even need the project's header files. The compiled output is then sent back to the client and stored locally as an object file. Note that this means preprocessing, not just linking, is performed locally. This division of work is common to other build tools, such as ccache (preprocessing is always performed, then ccache tries to match the input against previously cached results and, if it succeeds, returns the binary without recompiling).
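The division of work described above can be sketched as a toy model. This is only an illustration, not how distcc or ccache are implemented: `HEADERS`, `preprocess`, `remote_compile` and `build_object` are hypothetical names, the "compiler" just hashes its input, and the cache lookup is a loose stand-in for the ccache behaviour mentioned in the text.

```python
import hashlib

# Hypothetical project headers; in this model only the client has them.
HEADERS = {
    "util.h": "int add(int a, int b);",
}

def preprocess(source, headers):
    """Expand #include "..." lines locally into one self-contained unit."""
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith('#include "'):
            name = stripped.split('"')[1]
            out.append(preprocess(headers[name], headers))
        else:
            out.append(line)
    return "\n".join(out)

def remote_compile(unit):
    """Stand-in for the remote compiler: it sees only the preprocessed text."""
    assert "#include" not in unit  # no header files are needed remotely
    return b"OBJ:" + hashlib.sha256(unit.encode()).digest()

def build_object(source, headers, cache):
    unit = preprocess(source, headers)               # done locally
    key = hashlib.sha256(unit.encode()).hexdigest()  # cache key from preprocessed text
    if key not in cache:                             # ccache-style lookup (assumption)
        cache[key] = remote_compile(unit)            # done remotely
    return cache[key]                                # object file stored locally
```

The point of the sketch is that `remote_compile` receives a single string and nothing else, which is why the remote side needs no project configuration, whereas a remote link step would need every object file and library.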

If you were to implement a distributed linker, you would have to either ensure that all hosts in the network have exactly the same configuration, or send all the inputs required for the operation in one batch. In the second case, distributed compilation would produce a set of object files, and all of those object files would have to be pushed over the network for a remote system to link them and return the linked file. Note that the link might also require system libraries that are referenced and present in the linker search path but do not appear on the linker command line, so a 'pre-link' step would have to determine which libraries actually need to be sent. Even if this is possible, the local system would have to guess or calculate all the real dependencies and send them, with a large impact on network traffic. It might actually slow the build down, because the cost of sending the inputs may be greater than the cost of linking, and working out the dependencies may itself be almost as expensive as linking.
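The trade-off in this paragraph can be written down as a rough break-even check. This is a minimal sketch under simplifying assumptions: transfers are limited only by a fixed bandwidth, dependency discovery is free, and all function names and figures are hypothetical.

```python
def remote_link_seconds(input_bytes, output_bytes, bandwidth_bytes_per_s, remote_link_s):
    """Time to ship all inputs, link remotely, and fetch the linked file."""
    transfer_s = (input_bytes + output_bytes) / bandwidth_bytes_per_s
    return transfer_s + remote_link_s

def remote_link_pays_off(input_bytes, output_bytes, bandwidth_bytes_per_s,
                         local_link_s, remote_link_s):
    """True only if the remote round trip beats linking on the local machine."""
    total = remote_link_seconds(input_bytes, output_bytes,
                                bandwidth_bytes_per_s, remote_link_s)
    return total < local_link_s

MB = 10**6  # decimal megabytes, chosen for round numbers

# Hypothetical figures: 400 MB of inputs, a 100 MB result, 100 MB/s of bandwidth,
# a 4 s local link and a 2 s link on a faster remote machine.
print(remote_link_pays_off(400 * MB, 100 * MB, 100 * MB,
                           local_link_s=4.0, remote_link_s=2.0))
```

With these made-up numbers the transfer alone takes longer than the local link, so even a remote machine that links twice as fast does not help.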

The project I am currently working on has a single statically linked executable of over 100 MB. The static libraries vary in size, but if a distributed system decided to link the final executable remotely, it would probably require three to five times as much network traffic as the size of the final executable (templates, inlines... these appear in every translation unit that includes them, so there would be multiple copies flying around the network).
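Why the traffic can exceed the size of the final executable is easier to see with a toy model. This is a sketch only: each object file is represented as a hypothetical mapping from symbol name to size in bytes, and the linker is assumed to keep exactly one copy of any symbol that appears in several objects, as it does for duplicated template and inline definitions.

```python
def bytes_shipped(objects):
    """Total bytes sent if every object file goes over the network as-is."""
    return sum(sum(symbols.values()) for symbols in objects)

def linked_size(objects):
    """Size once the linker keeps a single copy of each duplicated symbol."""
    merged = {}
    for symbols in objects:
        merged.update(symbols)
    return sum(merged.values())

# Hypothetical sizes: the same template instantiation is emitted in all three
# translation units, next to one function unique to each unit.
objects = [
    {"std::vector<int>::push_back": 400, "parse": 100},
    {"std::vector<int>::push_back": 400, "render": 100},
    {"std::vector<int>::push_back": 400, "main": 100},
]

print(bytes_shipped(objects), linked_size(objects))
```

Here the three objects add up to 1500 bytes while the linked result holds 700, so the inputs are already more than twice the size of the output. The ratio grows with the number of translation units that include the same templates.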
