Can GPU acceleration be used to compile multiple programs with the GCC compiler?
Is there any way or tool to apply GPU acceleration when compiling programs with the GCC compiler? Right now I have a program that compiles a given list of programs iteratively, and it takes a few minutes. I know of a few programs, such as Pyrit, that use GPU acceleration to precompute hashes.
If no such tools are available, please advise on whether to use OpenCL or anything else to reprogram my code.
2 Answers
A. In an imperative programming language, statements are executed in sequence, and each statement may change the program's state. So analyzing a translation unit is inherently sequential.
An example: Check out how constant propagation might work -
You need to go through those statements sequentially before you figure out that the values assigned to b and c are constants at compile time. (However, separate basic blocks may possibly be compiled and optimized in parallel with each other.)
B. On top of this, different passes need to execute sequentially as well, and affect each other.
An example: Based on a schedule of instructions, you allocate registers, then you find that you need to spill a register to memory, so you need to generate new instructions. This changes the schedule again.
So you can't execute 'passes' like 'register allocation' and 'scheduling' in parallel either (actually, I think there are articles where computer scientists/mathematicians have tried to solve these two problems together, but let's not go into that).
(Again, one can achieve some parallelism by pipelining passes.)
Moreover, GPUs especially don't fit because:
GPUs are good at floating-point math, something compilers don't need or use much (except when optimizing floating-point arithmetic in the program).
GPUs are good at SIMD, i.e. performing the same operation on multiple inputs. This, again, is not something a compiler needs to do. There may be a benefit if the compiler needs to, say, optimize several hundred floating-point operations away (a wild example would be: the programmer defined several large FP arrays, assigned constants to them, and then wrote code to operate on them. A very badly written program indeed.)
So apart from parallelizing compilation of basic blocks and pipelining passes, there is not much parallelism to be had at the level of 'within compilation of a C file'. But parallelism is possible, easy to implement, and constantly used at a higher level. GNU Make, for example, has the -j N argument. Which basically means: as long as it finds N independent jobs (usually, compiling a bunch of files is what GNU Make is used for anyway), it spawns N processes (or N instances of gcc compiling different files in parallel).
If what you are asking is, "Can you automatically write GPU-accelerated code for use with GCC and LLVM?" the answer is yes. NVIDIA and Google make open-source LLVM-based compiler projects:
NVIDIA CUDA LLVM:
GOOGLE GPUCC:
If your question is, "Can I use the GPU to speed up non-CUDA generic code compilation?" the answer is currently no. The GPU is good at certain things, like parallel tasks, and bad at others, like branching, which is what compilers are all about. The good news is that you can use a network of PCs with CPUs to get a 2-10x compile speedup, depending on how optimized your code already is, and you can get the fastest multi-core CPU and a high-speed SSD for your desktop to gain speed with less hassle before you resort to network builds.
There are tools, such as distcc, that distribute C/C++/Objective-C compiler tasks to a network of computers. It was included in older versions of Xcode but has since been removed, and there is no support for using it with Swift.
There is a commercial tool similar to Distcc called Incredibuild that supports Visual Studio C/C++ and Linux development environments:
There are some good articles about real-world use of Incredibuild vs Distcc and tradeoffs compared to the incremental build support in the native compiler for making small changes like a single line in a single file without recompiling everything else. Points to consider: