Automatically finding the compiler options that give the fastest exe on a given machine?

Posted 2024-08-25 11:17:15, 456 words, 9 views, 0 comments

Is there a method to automatically find the best compiler options (on a given machine), which result in the fastest possible executable?

Naturally, I use g++ -O3, but there are additional flags that may make the code run faster, e.g. -ffast-math and others, some of which are hardware-dependent.

Does anyone know some code I can put in my configure.ac file (GNU autotools), so that the flags will be added to the Makefile automatically by the ./configure command?

In addition to automatically determining the best flags, I would be interested in some useful compiler flags that are good to use as a default for most optimized executables.

Update: Most people suggest simply trying different flags and selecting the best ones empirically. For that method, I have a follow-up question: is there a utility that lists all compiler flags that are usable on the machine I'm running on (e.g. one that tests whether SSE instructions are available, etc.)?

Comments (8)

懵少女 2024-09-01 11:17:15

I don't think you can do this at configure-time, but there is at least one program which attempts to optimize gcc option flags given a particular executable and machine. See http://www.coyotegulch.com/products/acovea/ for example.

You might be able to use this with some knowledge of your target machine(s) to find a good set of options for your code.

红墙和绿瓦 2024-09-01 11:17:15

Um - yes. This is possible. Look into profile-guided optimization.
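As a rough sketch of what a profile-guided build with GCC involves: the program is compiled with -fprofile-generate, run once on a representative workload, and then recompiled with -fprofile-use. The helper below only assembles those three command lines; the file names (main.cpp, demo, input.dat) and the function names are hypothetical and not from this thread.

```python
import subprocess


def pgo_commands(sources, exe="app", workload_args=(), compiler="g++"):
    """Return the three command lines of a GCC profile-guided build."""
    common = [compiler, "-O3", *sources, "-o", exe]
    return [
        common + ["-fprofile-generate"],  # 1. instrumented build
        ["./" + exe, *workload_args],     # 2. training run, writes profile data
        common + ["-fprofile-use"],       # 3. rebuild using the recorded profile
    ]


def run_pgo(sources, **kwargs):
    """Execute the three steps in order, stopping at the first failure."""
    for cmd in pgo_commands(sources, **kwargs):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical inputs, printed rather than executed.
    for cmd in pgo_commands(["main.cpp"], exe="demo", workload_args=["input.dat"]):
        print(" ".join(cmd))
```

The gain depends on how closely the training run resembles real use, so the workload passed in step 2 matters more than the script itself.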

鸵鸟症 2024-09-01 11:17:15

Some compilers provide a "-fast" option that automatically selects the most aggressive optimizations for the given compilation host; see http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler

Unfortunately, g++ does not provide a similar flag.

As a follow-up to your next question: for g++ you can use the -mtune option together with -O3, which will give you reasonably fast defaults. The challenge then is to find the processor type of your compilation host. You may want to look at the Autoconf Macro Archive to see whether somebody has already written the necessary tests. Otherwise, assuming Linux, you have to parse /proc/cpuinfo to get the processor type.
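A minimal sketch of the /proc/cpuinfo approach, assuming an x86 Linux host where the file contains a "flags" line (other architectures use different field names, in which case this returns nothing). The mapping table and function names are illustrative choices, and only a handful of SIMD extensions are covered.

```python
# Linux /proc/cpuinfo flag names paired with the matching GCC x86 options.
CPUINFO_TO_GCC = [
    ("sse", "-msse"),
    ("sse2", "-msse2"),
    ("pni", "-msse3"),  # the kernel reports SSE3 as "pni"
    ("ssse3", "-mssse3"),
    ("sse4_1", "-msse4.1"),
    ("sse4_2", "-msse4.2"),
    ("avx", "-mavx"),
    ("avx2", "-mavx2"),
]


def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_options(flags):
    """Translate detected CPU flags into GCC options, in table order."""
    return [opt for name, opt in CPUINFO_TO_GCC if name in flags]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    print(" ".join(simd_options(cpu_flags(text))))
```

The printed options describe the build host only, so a binary compiled with them may fail on an older CPU.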

夜司空 2024-09-01 11:17:15

After some googling, I found this script: gcccpuopt.

On one of my machines (32-bit), it outputs:

-march=pentium4 -mfpmath=sse

On another machine (64-bit), it outputs:

$ ./gcccpuopt 
Warning: The optimum *32 bit* architecture is reported
-m32 -march=core2 -mfpmath=sse

So, it's not perfect, but might be helpful.

半﹌身腐败 2024-09-01 11:17:15

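See also the gcc options -mcpu=native and -mtune=native. (On x86, gcc documents -mcpu as a deprecated synonym for -mtune; -march=native is the option that also enables the host CPU's instruction-set extensions.)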
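To see what "native" resolves to on a particular host, GCC can print its target-specific settings with -Q --help=target. The sketch below runs that with -march=native and pulls out the resolved -march value and the switches marked [enabled]. The parsing assumes the two-column layout that GCC prints, which may differ between versions, and the function names are made up for illustration.

```python
import shutil
import subprocess


def parse_target_help(text):
    """Extract the -march value and enabled switches from --help=target output."""
    march, enabled = None, []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # headings, descriptions, and options without a value
        option, value = parts
        if option == "-march=":
            march = value
        elif value == "[enabled]":
            enabled.append(option)
    return march, enabled


def native_target(compiler="gcc"):
    """Ask the compiler which target options -march=native selects."""
    result = subprocess.run(
        [compiler, "-march=native", "-Q", "--help=target"],
        check=True, capture_output=True, text=True,
    )
    return parse_target_help(result.stdout)


if __name__ == "__main__":
    if shutil.which("gcc"):
        try:
            march, enabled = native_target()
            print("-march resolves to:", march)
            print("enabled:", " ".join(enabled))
        except subprocess.CalledProcessError:
            print("this gcc does not accept -march=native")
```

This also partly answers the follow-up question in the post: the list of enabled switches shows which instruction-set extensions the compiler detected on the machine.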

总攻大人 2024-09-01 11:17:15

Is there a method to automatically find the best compiler options (on a given machine), which result in the fastest possible executable?

No.

You could compile your program with a large assortment of compiler options, then benchmark each and every version, then select the one that is "fastest," but that's hardly reliable and probably not useful for your program.

往昔成烟 2024-09-01 11:17:15

This is a solution that works for me, but it does take a little while to set up. In "Python Scripting for Computational Science" by Hans Petter Langtangen (an excellent book in my opinion), an example is given of using a short Python script to run numerical experiments that determine the best compiler options for your C/Fortran/... program. This is described in Chapter 1.1.11 on "Nested Heterogeneous Data Structures".

Source code for the examples from the book is freely available at http://folk.uio.no/hpl/scripting/index.html (I'm not sure of the license, so I will not reproduce any code here). In particular, TCSE3-3rd-examples.tar.gz contains code for a similar numerical test in the file src/app/wavesim2D/F77/compile.py, which you could use as a base for writing a script appropriate for a particular system/language (C++ in your case).
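For illustration only, and not the book's script: a minimal sketch of such an experiment for C++. It compiles one source file with -O3 plus every subset of a few optional flags, times each executable, and reports the fastest. The flag list, the function names, and the assumption that the program runs without arguments are all placeholders to adapt.

```python
import itertools
import os
import shutil
import subprocess
import sys
import tempfile
import time

BASE = ["-O3"]
# -ffast-math relaxes IEEE floating-point rules, so verify results as well as speed.
OPTIONAL = ["-march=native", "-ffast-math", "-funroll-loops"]


def flag_sets(base, optional):
    """Yield the base flags combined with every subset of the optional flags."""
    for r in range(len(optional) + 1):
        for combo in itertools.combinations(optional, r):
            yield base + list(combo)


def best_of(timings):
    """Return the key with the smallest measured time."""
    return min(timings, key=timings.get)


def time_build(compiler, source, flags, repeats=3):
    """Compile with the given flags and return the best wall-clock run time."""
    with tempfile.TemporaryDirectory() as tmp:
        exe = os.path.join(tmp, "bench")
        subprocess.run([compiler, *flags, source, "-o", exe], check=True)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run([exe], check=True, stdout=subprocess.DEVNULL)
            best = min(best, time.perf_counter() - start)
        return best


def main(source, compiler="g++"):
    timings = {}
    for flags in flag_sets(BASE, OPTIONAL):
        timings[" ".join(flags)] = time_build(compiler, source, flags)
    for key, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
        print(f"{seconds:8.3f}s  {key}")
    print("fastest:", best_of(timings))


if __name__ == "__main__":
    if len(sys.argv) == 2 and shutil.which("g++"):
        main(sys.argv[1])
    else:
        print("usage: script.py SOURCE.cpp (requires g++ on PATH)")
```

Taking the best of several runs reduces timing noise, but differences of a few percent are still within measurement error on a busy machine. The number of builds also doubles with each optional flag added.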

命比纸薄 2024-09-01 11:17:15

Optimizing your app is mainly your job, not the compiler's.

Here's an example of what I'm talking about.

Once you've done that, IF your app is compute-bound, with hotspots in your code (not in library code) THEN the compiler optimizations for speed will make some difference, so you can try different flag combinations.
