Reading a framebuffer texture like a 1D array
I am doing some GPGPU calculations with GL and want to read my results from the framebuffer.
My framebuffer texture is logically a 1D array, but I made it 2D to get a bigger area. Now I want to read a run of any given length starting at any arbitrary pixel in the framebuffer texture.
That means all calculations are already done on the GPU side, and I only need to pass certain data to the CPU; that data may wrap across the border of the texture from the end of one row to the start of the next.
Is this possible? If so, is it slower or faster than calling glReadPixels on the whole image and then cutting out what I need?
EDIT
Of course I know about OpenCL/CUDA, but I don't want to use them because I want my program to run out of the box on (almost) any platform.
I also know that glReadPixels is very slow, and one reason might be that it offers functionality I do not need (operating in 2D). That is why I am asking for a more basic function that might be faster.
Comments (2)
You can use a pixel buffer object (PBO) to transfer pixel data from the framebuffer to the PBO, then use glMapBufferARB to read the data directly: http://www.songho.ca/opengl/gl_pbo.html
Reading the whole framebuffer with glReadPixels just to discard everything except a few pixels or lines would be grossly inefficient. But glReadPixels lets you specify a rectangle within the framebuffer, so why not restrict it to fetching only the few rows of interest? You may end up fetching some extra data at the start of the first row and at the end of the last row, but I suspect that overhead is minimal compared with making multiple calls.
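As a rough sketch of that idea, the helpers below work out which rows cover a 1D range and then copy the wanted elements out of the fetched rows. The names `RowSpan`, `rows_for_range` and `extract_range` are made up for illustration, and the code assumes one float per pixel, tightly packed rows, and that element `i` lives at `x = i % width`, `y = i / width`. The GL call itself is only shown in a comment, since it needs a live context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: the block of full rows covering a 1D range. */
typedef struct {
    int first_row; /* y of the first row to fetch */
    int num_rows;  /* number of consecutive rows to fetch */
} RowSpan;

/* Rows of a `width`-wide texture that contain elements
 * [offset, offset + count) of the logical 1D array. */
RowSpan rows_for_range(size_t offset, size_t count, size_t width)
{
    assert(width > 0 && count > 0);
    size_t first = offset / width;
    size_t last = (offset + count - 1) / width;
    RowSpan s;
    s.first_row = (int)first;
    s.num_rows = (int)(last - first + 1);
    return s;
}

/* `rows` holds num_rows * width floats, fetched for example with
 *   glReadPixels(0, s.first_row, width, s.num_rows, GL_RED, GL_FLOAT, rows);
 * (assumed format/type; adjust to the actual texture format).
 * Skips the unwanted elements at the start of the first row and
 * copies the requested range into `out`. */
void extract_range(const float *rows, size_t offset, size_t count,
                   size_t width, float *out)
{
    size_t skip = offset % width;
    memcpy(out, rows + skip, count * sizeof *out);
}
```

For a texture 4 pixels wide, elements 6 to 10 fall in rows 1 and 2, so one call fetches 8 pixels and 3 of them are discarded. At most `2 * (width - 1)` pixels are ever wasted this way, regardless of the range length.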
Writing your data to the framebuffer in tiles and/or using Morton order might also help structure it so that a tighter bounding box can be found and the extra data retrieved is minimised.
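For reference, a Morton (Z-order) mapping between a 1D index and 2D texel coordinates can be sketched as below. This is the common bit-interleaving formulation, not something taken from the question; the function names are illustrative, and it assumes a square power-of-two texture with coordinates below 65536. The shader that writes the results would have to use the same mapping.

```c
#include <stdint.h>

/* Spread the low 16 bits of v onto the even bit positions. */
static uint32_t part1by1(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Inverse of part1by1: gather the even bit positions into the low 16 bits. */
static uint32_t compact1by1(uint32_t v)
{
    v &= 0x55555555u;
    v = (v | (v >> 1)) & 0x33333333u;
    v = (v | (v >> 2)) & 0x0F0F0F0Fu;
    v = (v | (v >> 4)) & 0x00FF00FFu;
    v = (v | (v >> 8)) & 0x0000FFFFu;
    return v;
}

/* 1D index of texel (x, y): x bits on even positions, y bits on odd ones. */
uint32_t morton_encode(uint32_t x, uint32_t y)
{
    return part1by1(x) | (part1by1(y) << 1);
}

/* Texel coordinates of 1D index i. */
void morton_decode(uint32_t i, uint32_t *x, uint32_t *y)
{
    *x = compact1by1(i);
    *y = compact1by1(i >> 1);
}
```

With this layout, an aligned run of 4^k consecutive indices occupies a 2^k by 2^k square, so its bounding rectangle contains no wasted pixels; unaligned runs still need a somewhat larger rectangle.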