Processing images with CUDA: Python (PyCUDA) or C++?
I am working on a project that processes images using CUDA. The project is simply the addition or subtraction of images.
May I ask for your professional opinion: which is better, and what are the advantages and disadvantages of each?
I appreciate everyone's opinions and suggestions, since this project is very important to me.
Comments (4)
General answer: It doesn't matter. Use the language you're more comfortable with.
Keep in mind, however, that PyCUDA is only a wrapper around the CUDA C interface, so it may not always be up to date. It also adds another potential source of bugs, …
Python is great for rapid prototyping, so I'd personally go for Python. You can always switch to C++ later if you need to.
If the rest of your pipeline is in Python, and you're already using NumPy to speed things up, PyCUDA is a good complement for accelerating expensive operations. However, depending on the size of your images and your program flow, you might not get much of a speedup from PyCUDA. Passing the data back and forth across the PCI bus involves latency that is only made up for with large data sizes.
In your case (addition and subtraction), PyCUDA has built-in operations that you can use to your advantage. However, in my experience, using PyCUDA for anything non-trivial requires knowing a lot about how CUDA works in the first place. For someone starting with no CUDA knowledge, PyCUDA might have a steep learning curve.
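As a minimal sketch of the built-in operations mentioned above, assuming PyCUDA and NumPy are installed and a CUDA-capable GPU is available: `pycuda.gpuarray` overloads `+` and `-` element-wise, so no hand-written kernel is needed. The helper name `add_and_subtract_on_gpu` and the choice of `float32` images are assumptions made for illustration.

```python
def add_and_subtract_on_gpu(img_a, img_b):
    """Return (img_a + img_b, img_a - img_b), computed element-wise on the GPU.

    Hypothetical helper: expects two NumPy arrays with the same shape and dtype.
    """
    # Imported lazily so this module can still be loaded without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray

    if img_a.shape != img_b.shape or img_a.dtype != img_b.dtype:
        raise ValueError("images must have the same shape and dtype")

    # Host -> device copies: this is the bus traffic discussed above.
    a_gpu = gpuarray.to_gpu(img_a)
    b_gpu = gpuarray.to_gpu(img_b)

    # GPUArray overloads + and -; each runs as an element-wise GPU kernel.
    total_gpu = a_gpu + b_gpu
    diff_gpu = a_gpu - b_gpu

    # Device -> host copies of both results.
    return total_gpu.get(), diff_gpu.get()


# Example usage (needs a GPU):
#   import numpy as np
#   a = np.random.rand(1080, 1920).astype(np.float32)
#   b = np.random.rand(1080, 1920).astype(np.float32)
#   total, diff = add_and_subtract_on_gpu(a, b)
```

Note that 8-bit integer images can overflow on addition, so converting to a wider type first is a common precaution. Timing this function against plain `a + b` in NumPy is a quick way to see whether the transfers eat the speedup for your image sizes.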
Take a look at OpenCV. It contains a lot of image processing functions and all the helpers to load, save, and display images and to operate cameras.
It now also supports CUDA: some of the image processing functions have been reimplemented in CUDA, and it gives you a good framework for writing your own.
Alex's answer is right. The time spent in the wrapper is minimal. Note that PyCUDA has some nice metaprogramming constructs for generating kernels, which might be useful.
If all you're doing is adding or subtracting the elements of an image, you probably shouldn't use CUDA for this at all. The time it takes to transfer the data back and forth across the PCI-E bus will dwarf the savings you get from parallelism.
Any time you deal with CUDA, it's useful to think about the CGMA ratio (computation to global memory access ratio). Your addition/subtraction is only 1 floating-point operation for 2 memory accesses (1 read and 1 write). That is a very poor ratio from a CUDA perspective.
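A back-of-the-envelope sketch of this argument, in plain Python. The bandwidth figures (`pcie_gbps`, `gpu_mem_gbps`) are illustrative assumptions, not measurements, and the function names are hypothetical; substitute numbers for your own hardware. The estimate assumes `c = a + b` on two `float32` images, which moves two inputs to the device and one result back.

```python
def cgma_ratio(flops_per_element, mem_accesses_per_element):
    """Computation to global memory access ratio for one output element."""
    return flops_per_element / mem_accesses_per_element


def estimate_times(width, height, bytes_per_pixel=4,
                   pcie_gbps=12.0, gpu_mem_gbps=300.0):
    """Rough transfer vs. kernel time in seconds for c = a + b.

    pcie_gbps and gpu_mem_gbps are assumed bandwidths in GB/s (hypothetical
    defaults). Latency and kernel launch overhead are ignored.
    """
    image_bytes = width * height * bytes_per_pixel
    # Two input images go host -> device, one result comes device -> host.
    transfer_bytes = 3 * image_bytes
    # The kernel reads two inputs and writes one output in global memory.
    kernel_bytes = 3 * image_bytes
    transfer_s = transfer_bytes / (pcie_gbps * 1e9)
    kernel_s = kernel_bytes / (gpu_mem_gbps * 1e9)
    return transfer_s, kernel_s


# Uses the count quoted above: 1 operation per 2 memory accesses.
ratio = cgma_ratio(flops_per_element=1, mem_accesses_per_element=2)
transfer_s, kernel_s = estimate_times(1920, 1080)

print(f"CGMA ratio: {ratio:.2f}")
print(f"transfer ~{transfer_s * 1e3:.2f} ms, kernel ~{kernel_s * 1e3:.3f} ms")
print(f"transfer / kernel = {transfer_s / kernel_s:.0f}x")
```

Under these assumptions the transfer time exceeds the kernel time by the ratio of the two bandwidths, regardless of image size, which is why an operation this cheap rarely pays for the trip to the GPU unless the data is already there for other work.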