OpenCL alternative for modulus: uses, advice
In the past I have used this simple function in C++ to simulate simple forms of tessellation. The function takes a number n and a divisor d. The divisor must be a power of two minus one (1, 3, 7, 15, ...). It returns the result of n % (d+1) using a bitwise &. For unsigned n this holds for any value of n, not only for n between 0 and d.
I am fairly sure the function looks like this:
unsigned int BitwiseMod(unsigned int n, unsigned int d){ return n & d; }
I want to use this effectively in OpenCL and am wondering whether it will work as I imagine it to. In my mind, modulus is a very expensive operation on the GPU, but I am familiar with using it to form magnitude spaces and with other techniques for traversing data.
More often, I would simply write the mask inline, on the assumption that function calls carry some overhead:
x[i] = 8*(i&d)+offset[i]; //OR in other contexts,...
num = (i&d)+offset[i]; // parentheses required: + binds tighter than &
x[num] = data;
The question is: will this be useful, or will it get in the way? If it is useful, can you give me some examples of where I might apply it?
Comments (1)
On NVIDIA architectures from GT200 onward, modulo isn't particularly slow; it is no slower than a normal integer divide. See this paper for details.
However, a bitwise AND is still considerably faster. Because function calls are expensive on GPUs, OpenCL compilers inline aggressively by default to improve performance. You should be fine with a function call, as it will be inlined.