GPU Programming?
I'm new to the GPU programming world. I've tried reading Wikipedia and Googling, but I still have several questions:
I downloaded some GPU examples for CUDA. There were some .cu files and some .cpp files, but all the code was normal C/C++ code apart from a few weird functions like
cudaMemcpyToSymbol
and the rest was pure C code. The question is: is the .cu code compiled with nvcc and then linked with gcc, or how is it built? And if I write something to run on the GPU, will it run on ALL GPUs, or just CUDA ones? Or is there a method to write for CUDA, a method to write for ATI, and a method to write for both?
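For example, the kind of call I mean looks roughly like this (a made-up illustration of the pattern, not one of the actual samples I downloaded):

    // Hypothetical snippet: cudaMemcpyToSymbol copies host data into a
    // __constant__ (or __device__) variable that lives in GPU memory.
    #include <cuda_runtime.h>

    __constant__ float coeffs[4];              // constant memory on the device

    __global__ void scale(float* data) {
        int i = threadIdx.x;
        data[i] *= coeffs[i % 4];              // every thread reads the constants
    }

    int main() {
        float host_coeffs[4] = {1.0f, 2.0f, 3.0f, 4.0f};
        cudaMemcpyToSymbol(coeffs, host_coeffs, sizeof(host_coeffs));  // host -> device symbol

        float* d_data;
        cudaMalloc(&d_data, 4 * sizeof(float));
        scale<<<1, 4>>>(d_data);               // launch 1 block of 4 threads
        cudaDeviceSynchronize();
        cudaFree(d_data);
        return 0;
    }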
Comments (5)
To answer your second question:
OpenCL is the (only) way to go if you want to write platform-independent GPGPU code.
ATI's website actually has a lot of resources for OpenCL if you search a little, and their example projects are very easy to modify into what you need, or just to understand the code.
The OpenCL spec and reference pages are also a very good source of knowledge:
http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/
http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
There are also a lot of talks I would recommend that explain the core concepts and how to write fast code (most of which applies to CUDA too).
To almost answer your first question:
In OpenCL, the code is compiled at runtime for the specific GPU you're using (to guarantee speed).
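To make that runtime-compilation step concrete, here is a minimal host-side sketch (plain OpenCL C API, error checking omitted; the kernel string and names are only illustrative):

    /* The kernel source is an ordinary string; it is compiled at runtime
     * for whatever GPU is actually present. */
    #include <CL/cl.h>
    #include <stdio.h>

    static const char* kernel_src =
        "__kernel void add_one(__global float* data) {\n"
        "    int i = get_global_id(0);\n"
        "    data[i] += 1.0f;\n"
        "}\n";

    int main(void) {
        cl_platform_id platform;
        cl_device_id device;
        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);

        /* Runtime compilation for the device found above: */
        cl_program prog = clCreateProgramWithSource(ctx, 1, &kernel_src, NULL, NULL);
        clBuildProgram(prog, 1, &device, NULL, NULL, NULL);

        cl_kernel kernel = clCreateKernel(prog, "add_one", NULL);
        printf("kernel compiled for the device present at runtime\n");

        /* ... clCreateBuffer / clSetKernelArg / clEnqueueNDRangeKernel would follow ... */
        clReleaseKernel(kernel);
        clReleaseProgram(prog);
        clReleaseContext(ctx);
        return 0;
    }

The same host binary can then target any device whose vendor provides an OpenCL driver.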
You probably want to do some background reading on CUDA - it's not something you can just pick up by looking at a few code samples. There are about 3 different CUDA books on Amazon now, and there is a lot of reference material at http://developer.nvidia.com.
To answer your questions:
yes, .cu files are compiled with nvcc to an intermediate form (PTX) - this is subsequently converted to GPU-specific code at run-time
the generated code will run on a subset of nVidia GPUs, the size of the subset depending on what CUDA capabilities you use in your code
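As a concrete sketch of that build flow (the file names, the kernel and the exact flags are only illustrative and may vary by toolkit version), a .cu file typically holds the kernels plus their launch code, and nvcc hands the host part to the regular host compiler:

    // kernel.cu -- nvcc compiles the device code (to PTX/SASS) and forwards
    // the host code to the system C++ compiler (e.g. g++).
    #include <cuda_runtime.h>

    __global__ void add_one(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1.0f;
    }

    // Host-side wrapper that ordinary .cpp code can call.
    extern "C" void run_add_one(float* d_data, int n) {
        int threads = 256;
        int blocks  = (n + threads - 1) / threads;
        add_one<<<blocks, threads>>>(d_data, n);
        cudaDeviceSynchronize();
    }

    // One possible build, matching the nvcc-then-gcc question above:
    //   nvcc -c kernel.cu -o kernel.o      # .cu handled by nvcc
    //   g++  -c main.cpp  -o main.o        # plain C++ translation units
    //   nvcc kernel.o main.o -o app        # final link (or g++ ... -lcudart)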
Completing the answer given by @nulvinge, I'd say that OpenCL is to GPU programming what OpenGL is to GPU rendering. But it's not the only option for multi-architecture development: you could also use DirectCompute, though I wouldn't say it's the best option, just the one to choose if you want your code running on every DirectX11-compatible GPU, which includes some Intel graphics chips too, right?
But even if you are thinking of doing some GPU programming with OpenCL, do not forget to study the architecture of the platforms that you're using. ATI CPUs, GPUs and NVIDIA GPUs have big differences, and your code needs to be tuned for each platform you're using if you want to get the most out of it...
Fortunately both NVIDIA and AMD have programming guides to help you :)
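One way to act on that tuning advice (a hedged sketch on the CUDA side; the block/grid heuristic is made up for illustration) is to query the device at runtime and adapt the launch configuration to it; OpenCL offers clGetDeviceInfo for the same purpose:

    // Sketch: read the GPU's properties and size a launch accordingly.
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);      // properties of device 0

        printf("Device: %s (compute capability %d.%d)\n",
               prop.name, prop.major, prop.minor);
        printf("SMs: %d, max threads per block: %d\n",
               prop.multiProcessorCount, prop.maxThreadsPerBlock);

        // Architecture-dependent choice: cap the block size at what the
        // device supports and scale the grid with the number of SMs.
        int threads = (prop.maxThreadsPerBlock >= 256) ? 256 : prop.maxThreadsPerBlock;
        int blocks  = prop.multiProcessorCount * 4;   // arbitrary occupancy heuristic
        printf("Would launch %d blocks of %d threads\n", blocks, threads);
        return 0;
    }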
In addition to the previous answers: for CUDA you would need an NVIDIA card/GPU, unless you have access to a remote one, for which I would recommend this course from Coursera:
Heterogeneous Parallel Programming
It not only gives an introduction to CUDA and OpenCL, the memory model, tiling, handling boundary conditions and performance considerations, but also covers directive-based languages such as OpenACC, a high-level way of expressing parallelism in your code that leaves most of the parallel programming work to the compiler (good to start with). Also, the course has an online platform where you can use their GPUs, which is a good way to start GPU programming without worrying about software/hardware setup.
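To give a flavour of that directive-based style (a minimal sketch, assuming a compiler with OpenACC support, e.g. invoked with something like nvc -acc or gcc -fopenacc), the parallelism is expressed with a pragma and the compiler generates the device code and data transfers:

    /* OpenACC sketch: no explicit kernels, memory copies or launches --
     * the compiler derives them from the directive. */
    #include <stdio.h>

    #define N (1 << 20)

    int main(void) {
        static float a[N], b[N];
        for (int i = 0; i < N; ++i) { a[i] = (float)i; b[i] = 2.0f * i; }

        #pragma acc parallel loop copy(a) copyin(b)
        for (int i = 0; i < N; ++i)
            a[i] = a[i] + b[i];

        printf("a[42] = %f\n", a[42]);   /* expected: 42 + 84 = 126 */
        return 0;
    }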
If you want to write portable code which you can execute on different GPU devices and also on CPUs, you need to use OpenCL.
Actually, to configure your kernel you also need to write host code in C. That configuration code might be shorter if you write it for CUDA kernels compared to the OpenCL one.
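To illustrate that difference in host-code length (a hedged sketch that mirrors the OpenCL host example earlier in the thread; names are made up), the CUDA side needs no platform/context/program objects because nvcc already compiled the kernel:

    // Complete CUDA host-side "configuration" for a trivial kernel.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void add_one(float* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1.0f;
    }

    int main() {
        const int n = 1024;
        float host[1024];
        for (int i = 0; i < n; ++i) host[i] = (float)i;

        float* dev;
        cudaMalloc(&dev, n * sizeof(float));
        cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

        add_one<<<(n + 255) / 256, 256>>>(dev, n);   // grid/block config in one line

        cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(dev);
        printf("host[0] = %f\n", host[0]);           // expected: 1.0
        return 0;
    }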