Can make -j with distcc scale more than 5x?

Posted 2024-08-17 07:39:27

Since distcc cannot keep any state — it can only send a job plus its headers and have the servers preprocess and compile using just the data that was sent — I think the latest distcc has a scalability problem.
In my local build environment, which has approximately 10,000 C/C++ files to build, I could only get about a 2x speedup over not using distcc (but still using make -j), even with 20 build servers.
What do you think is the problem?
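
For context on that 2x number: a crude Amdahl's-law estimate (my own back-of-the-envelope figure, ignoring network and scheduling overhead) suggests how big the non-distributed part must be. If a fraction p of the build is spread across N servers, the speedup is

    S = 1 / ((1 - p) + p/N)

and S = 2 with N = 20 gives p = 10/19 ≈ 0.53. In other words, roughly half of the wall-clock work (local preprocessing, linking, non-distributable jobs, make overhead) is not being farmed out at all, and even with unlimited servers the speedup would top out near 1/(1 - p) ≈ 2.1x.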

If anyone has achieved scalability of more than 10-20x using make -j and distcc, please let me know.

The following product claims that it is impossible to scale make -j with distcc beyond 5x:
http://www.electric-cloud.com/products/electricaccelerator.php

I think this can be improved by:

  • Letting the distccd servers maintain sessions
  • Tied to those sessions, they would cache their own header directories
  • Preprocessing would be done on demand from the distccd server
  • This would be done through an LD_PRELOADed library, libdistcc.so, which would replace the stat/open syscalls and fetch the header files over the network (a rough sketch follows this list).
    ...
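
To make the LD_PRELOAD idea concrete, here is a minimal sketch of what such a shim might look like on the distccd side. This is entirely hypothetical: stock distcc ships no libdistcc.so, and fetch_header_from_client() below is just a stub standing in for the imagined per-session network fetch; a real shim would also have to wrap stat(), access(), openat() and their 64-bit variants.

    /*
     * Hypothetical preload shim for the distccd host (sketch only).
     * Build:  gcc -shared -fPIC -o libdistcc_shim.so shim.c -ldl
     * Inject into the compilers distccd spawns via LD_PRELOAD=./libdistcc_shim.so
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <string.h>
    #include <sys/types.h>

    /* Stub: a real shim would ask the client, over the session's connection,
     * to stream `path` into a per-session header cache and return the cached
     * location; here it simply reports a miss. */
    static const char *fetch_header_from_client(const char *path)
    {
        (void)path;
        return NULL;
    }

    typedef int (*open_fn)(const char *, int, ...);

    int open(const char *path, int flags, ...)
    {
        static open_fn real_open;
        if (!real_open)
            real_open = (open_fn)dlsym(RTLD_NEXT, "open");

        /* open() only carries a mode argument when O_CREAT is set. */
        mode_t mode = 0;
        if (flags & O_CREAT) {
            va_list ap;
            va_start(ap, flags);
            mode = (mode_t)va_arg(ap, int);
            va_end(ap);
        }

        int fd = real_open(path, flags, mode);

        /* On a read-only miss that looks like a header, try the client's copy. */
        if (fd < 0 && errno == ENOENT && !(flags & O_CREAT) &&
            strstr(path, ".h") != NULL) {
            const char *cached = fetch_header_from_client(path);
            if (cached)
                fd = real_open(cached, flags, mode);
        }
        return fd;
    }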

Has anyone done this kind of thing?


I think Electric Cloud does something similar, but I think there is more room to optimize:

  1. The servers should share the same source code repository over a really fast network filesystem.
  2. We should do the build-file parsing and include-header parsing in parallel.
  3. This seems to be quite tough without changing the build descriptions drastically.

Any ideas, prior art, or workarounds are welcome.

Comments (1)

丘比特射中我 2024-08-24 07:39:27

Yes, distcc can scale up well above 5x.

We have to work out what the limiting factor is in your environment.

  1. One common problem is that your makefiles won't allow it to actually dispatch more than a couple of files at a time. You can just have a look at how many compiler processes are running. If this is the problem you may need to debug your makefiles to allow more parallelism.

  2. Perhaps many of the jobs the client is running can't be distributed for some reason. The distcc client log will tell you if this is the case.

  3. Perhaps for some reason the client is overloaded and not able to pass out jobs fast enough; however, it's very likely you would get beyond 2x before hitting this.

  4. Perhaps the servers are overloaded and can't accept any more jobs. But if you have 20 servers they should be able to take at least one job each.

  5. Perhaps the network is saturated and both the client and server are blocking on it (unlikely on GbE; possible on 100Mb.)

Thinking about hacks to keep sessions open is premature until you know that the limiting factor is starting the sessions.

It's probably #1 or #2. Post an excerpt from your log.
