Can I program Nvidia's CUDA using only Python, or do I have to learn C?
I guess the question speaks for itself. I'm interested in doing some serious computations but am not a programmer by trade. I can string enough python together to get done what I want. But can I write a program in python and have the GPU execute it using CUDA? Or do I have to use some mix of python and C?
The examples on Klöckner's "PyCUDA" webpage had a mix of both Python and C, so I'm not sure what the answer is.
If anyone wants to chime in about OpenCL, feel free. I heard about this CUDA business only a couple of weeks ago and didn't know you could use your video card like this.
You should take a look at CUDAmat and Theano. Both are approaches to writing code that executes on the GPU without really having to know much about GPU programming.
I believe that, with PyCUDA, your computational kernels will always have to be written as "CUDA C Code". PyCUDA takes charge of a lot of otherwise-tedious book-keeping, but does not build computational CUDA kernels from Python code.
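To make that concrete, here is a minimal sketch of the usual PyCUDA pattern, adapted from the style of PyCUDA's introductory examples (the kernel name and array size here are illustrative, not from this thread): the kernel itself is CUDA C passed to SourceModule as a string, and Python only handles compilation, memory transfer, and the launch. Since running it needs a CUDA-capable GPU, the GPU calls are shown commented out, with a plain-Python reference of what the kernel computes.

```python
# The computational kernel is CUDA C, held in a Python string.
kernel_src = """
__global__ void double_them(float *a)
{
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    a[i] *= 2.0f;
}
"""

# On a machine with a CUDA GPU and PyCUDA installed, you would run:
# import numpy as np
# import pycuda.autoinit
# import pycuda.driver as drv
# from pycuda.compiler import SourceModule
# mod = SourceModule(kernel_src)          # compiles the CUDA C above
# double_them = mod.get_function("double_them")
# a = np.random.randn(400).astype(np.float32)
# double_them(drv.InOut(a), block=(400, 1, 1), grid=(1, 1))

def double_them_cpu(values):
    # CPU reference for what the kernel computes: double every element.
    return [v * 2.0 for v in values]
```

The point of the sketch is the division of labor: everything numerical lives in the CUDA C string, while Python never touches the math.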
pyopencl offers an interesting alternative to PyCUDA. It is described as a "sister project" to PyCUDA. It is a complete wrapper around OpenCL's API.
As far as I understand, OpenCL has the advantage of running on GPUs beyond Nvidia's.
Great answers already, but another option is Clyther. It will let you write OpenCL programs without even using C, by compiling a subset of Python into OpenCL kernels.
A promising library is Copperhead (alternative link): you just decorate the function you want the GPU to run, and you can then opt it in or out to see whether the CPU or the GPU is faster for that function.
There is a good, basic set of math constructs with compute kernels already written that can be accessed through pyCUDA's cumath module. If you want to do more involved or specific/custom stuff you will have to write a touch of C in the kernel definition, but the nice thing about pyCUDA is that it will do the heavy C-lifting for you; it does a lot of meta-programming on the back-end so you don't have to worry about serious C programming, just the little pieces. One of the examples given is a Map/Reduce kernel to calculate the dot product:

```python
dot_krnl = ReductionKernel(np.float32, neutral="0",
                           reduce_expr="a+b",
                           map_expr="x[i]*y[i]",
                           arguments="float *x, float *y")
```

The little snippets of code inside each of those arguments are C lines, but it actually writes the program for you. The ReductionKernel is a custom kernel type for map/reduce-ish type functions, but there are different types. The examples portion of the official pyCUDA documentation goes into more detail. Good luck!
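To see what those map_expr and reduce_expr snippets are doing, here is a plain-Python sketch of the same map/reduce dot product (the helper name is made up for illustration; in PyCUDA the equivalent work happens in generated CUDA C on the GPU):

```python
from functools import reduce

def map_reduce_dot(xs, ys):
    # map_expr="x[i]*y[i]": applied to each pair of elements
    mapped = [x * y for x, y in zip(xs, ys)]
    # reduce_expr="a+b" with neutral="0": fold the mapped values into one sum
    return reduce(lambda a, b: a + b, mapped, 0.0)

# map_reduce_dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]) -> 32.0  (4 + 10 + 18)
```

The "neutral" argument is the starting value of the fold, which is why an empty input yields 0 rather than an error.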
The scikits.cuda package could be a better option, given that it doesn't require any low-level knowledge or C code for any operation that can be expressed as NumPy array manipulation.
I was wondering the same thing and ran a few searches. I found the article linked below, which seems to answer your question. However, you asked this back in 2014 and the Nvidia article is not dated.
https://developer.nvidia.com/how-to-cuda-python
The video goes through the setup, an initial example and, quite importantly, profiling. However, I do not know if you can implement all of the usual general compute patterns. I would think you can, because as far as I could tell there are no limitations relative to NumPy.