glReadPixels() is slow when reading GL_DEPTH_COMPONENT
My application depends on reading depth information back from the framebuffer. I've implemented this with glReadPixels(0, 0, width, height, GL_DEPTH_COMPONENT, GL_FLOAT, &depth_data).
However, this runs unreasonably slowly: it takes my application from a smooth 30 fps to a laggy 3 fps. If I read back other dimensions or other data, it runs at an acceptable speed.
To give an overview:
- No glReadPixels -> 30 frames per second
- glReadPixels(0, 0, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &depth_data); -> 20 frames per second, acceptable
- glReadPixels(0, 0, width, height, GL_RED, GL_FLOAT, &depth_data); -> 20 frames per second, acceptable
- glReadPixels(0, 0, width, height, GL_DEPTH_COMPONENT, GL_FLOAT, &depth_data); -> 3 frames per second, not acceptable
Why is the last call so slow compared to the others? Is there any way to remedy it?
width x height is approximately 100 x 1000, and the call gets increasingly slower as I increase the dimensions.
I've also tried using pixel buffer objects, but this has no significant effect on performance; it only delays the slowness until the glMapBuffer() call.
(I've tested this on a MacBook Air with NVIDIA 320M graphics running OS X 10.6. Strangely enough, my old MacBook with an Intel GMA X3100 got ~15 fps reading the depth buffer.)
UPDATE: Leaving GLUT_MULTISAMPLE out of the glutInitDisplayMode options made a world of difference, bringing the application back to a smooth 20 fps. I don't know what the option does in the first place; can anyone explain?
Comments (3)
If your main framebuffer is MSAA-enabled (GLUT_MULTISAMPLE is present), then two actual framebuffers are created: one with MSAA and one regular.
The first one is the one you render into. It contains the front and back color surfaces, plus depth and stencil. The second one only has to contain the color produced by resolving the corresponding MSAA surface.
However, when you read depth using glReadPixels, the driver is forced to resolve the MSAA-enabled depth surface too, which probably causes your slowdown.
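As a rough mental model of the extra work a resolve implies, the sketch below reduces the samples of each pixel to a single value on the CPU. The function name resolve_box_filter, the interleaved sample layout, and the plain average are assumptions made for illustration only; a real driver resolves on the GPU, and how it reduces depth samples is implementation-dependent.

```c
#include <stddef.h>

/* Conceptual CPU model of an MSAA resolve (hypothetical helper, not a GL call).
 * Each pixel stores `samples` consecutive values in `msaa`; the resolve
 * writes one value per pixel into `resolved`.
 * A box filter (plain average) is assumed here purely for illustration. */
void resolve_box_filter(const float *msaa, size_t pixels, size_t samples,
                        float *resolved)
{
    for (size_t p = 0; p < pixels; ++p) {
        float sum = 0.0f;
        for (size_t s = 0; s < samples; ++s)
            sum += msaa[p * samples + s];
        resolved[p] = sum / (float)samples;
    }
}
```

The point of the sketch is only that a resolve touches every sample of every pixel, so a depth readback from a multisampled framebuffer involves an extra full pass over the depth surface before any data can be copied out.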
What storage format did you choose for your depth buffer?
If it is not GLfloat, then you're asking GL to convert every single depth value in the depth buffer to float when reading it. (The same goes for your third bullet, with GL_RED: was your color buffer a float buffer?)
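To make the per-value conversion concrete, here is a minimal sketch of turning 24-bit unsigned normalized depth values into floats, using the usual normalized-integer rule f = c / (2^24 - 1). The function name depth24_to_float and the assumption that each value sits in the low 24 bits of a 32-bit word are hypothetical; the actual in-memory layout of a depth buffer is driver-specific.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert 24-bit unsigned normalized depth values to floats in [0, 1].
 * Assumption for illustration: each depth value occupies the low 24 bits
 * of a 32-bit word; the upper 8 bits are ignored. */
void depth24_to_float(const uint32_t *src, size_t count, float *dst)
{
    const float max24 = 16777215.0f; /* 2^24 - 1 */
    for (size_t i = 0; i < count; ++i)
        dst[i] = (float)(src[i] & 0xFFFFFFu) / max24;
}
```

A conversion like this runs once per pixel on every readback, which is the cost this answer is pointing at when the requested type does not match the stored format.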
Whether the type is GL_FLOAT or GL_UNSIGNED_BYTE, glReadPixels is still very slow. If you use a PBO to read RGB values, it is very fast.
When using a PBO to read RGB values, CPU usage is 4%, but it rises to 50% when reading depth values. I've tried GL_FLOAT, GL_UNSIGNED_BYTE, GL_UNSIGNED_INT, and GL_UNSIGNED_INT_24_8. So I conclude that a PBO does not help when reading depth values.