How do I find out why g++ takes such a long time to compile a particular file?

Posted 2024-08-17 23:40:29


I am building a lot of auto-generated code, including one particularly large file (~15K lines), using a mingw32 cross compiler on linux. Most files are extremely quick, but this one large file takes an unexpectedly long time (~15 minutes) to compile.

I have tried manipulating various optimization flags to see if they had any effect, without any luck. What I really need is some way of determining what g++ is doing that is taking so long. Are there any (relatively simple) ways to have g++ generate output about different phases of compilation, to help me narrow down what the hang-up might be?

Sadly, I do not have the ability to rebuild this cross-compiler, so adding debugging information to the compiler and stepping through it is not a possibility.

What's in the file:

  • a bunch of includes
  • a bunch of string comparisons
  • a bunch of if-then checks and constructor invocations

The file is a factory for producing a ton of different specific subclasses of a certain parent class. Most of the includes, however, are nothing terribly fancy.


The results of -ftime-report, as suggested by Neil Butterworth, indicate that the "life analysis" phase is taking 921 seconds, which takes up most of the 15 minutes.

It appears that this takes place during data flow analysis. The file itself is a bunch of conditional string comparisons, constructing an object by class name provided as a string.

We think changing this to point into a map of names to function pointers might improve things a bit, so we're going to try that.


Indeed, generating a bunch of factory functions (per object) and creating a map from the string name of the object to a pointer to its factory function reduced compile time from the original 15 minutes to about 25 seconds, which will save everyone tons of time on their builds.

Thanks again to Neil Butterworth for the tip about -ftime-report.


8 Answers

杀お生予夺 2024-08-24 23:40:29

Won't give all the details you want, but try running with the -v (verbose) and -ftime-report flags. The latter produces a summary of what the compiler has been up to.

柏林苍穹下 2024-08-24 23:40:29

It most probably includes TONNES of includes. I believe -MD will list out all the include files in a given CPP file (That includes includes of includes and so forth).

孤芳又自赏 2024-08-24 23:40:29

What slows g++ down in general are templates. For example Boost loves to use them. This means nice code, great performances but poor compiling speed.

On the other hand, 15 minutes seems extremely long. After some quick googling, it seems this is a common problem with mingw.

_蜘蛛 2024-08-24 23:40:29

I'd use #if 0 / #endif to eliminate large portions of the source file from compilation. Repeat with different blocks of code until you pinpoint which block(s) are slow. For starters, you can see if your #include's are the problem by using #if 0 / #endif to exclude everything but the #include's.

栀梦 2024-08-24 23:40:29

Another process to try is to add "progress marker" pragmas to your code to trap the portion of the code that is taking a long time. The Visual Studio compiler provides #pragma message(), although there is not a standard pragma for doing this.

Put one marker at the beginning of the code and a marker at the end of the code. The end marker could be a #error since you don't care about the remainder of the source file. Move the markers accordingly to trap the section of code taking the longest time.

Just a thought...

神魇的王 2024-08-24 23:40:29

Related to @Goz and @Josh_Kelley, you can get gcc/g++ to spit out the preprocessed source (with #includes inline) using -E. That's one way to determine just how large your source is.

And if the compiler itself is the problem, you may be able to strace the compile command that's taking a long time to see whether there's a particular file access or a specific internal action that's taking a long time.

ゞ花落谁相伴 2024-08-24 23:40:29

What the compiler sees is the output of the pre-processor, so the size of the individual source is not a good measure, you have to consider the source and all the files it includes, and the files they include etc. Instantiation of templates for multiple types generates code for each separate type used, so that could end up being a lot of code. If you have made extensive used of STL containers for many classes for example.

15K lines in one source is rather a lot, but even if split up, all that code still needs to be compiled; however, using an incremental build may mean that it does not all need compiling all the time. There really is no need for a file that large; it's just poor practice/design. I start thinking about better modularisation when a file gets to 500 lines (although I am not dogmatic about it).

熊抱啵儿 2024-08-24 23:40:29

One thing to watch during the compile is how much memory your computer has free. If the compiler allocates so much memory that the computer starts swapping, compile time will go way, way up.

If you see that happen, an easily solution is to install more RAM... or just split the file into multiple parts that can be compiled separately.
