What is the performance impact of using multisample antialiasing on iOS?
I've been experimenting with using multisampling to do full-scene antialiasing on the iPhone and iPad on iOS 4. The general mechanism uses Apple's APPLE_framebuffer_multisample extension (http://www.khronos.org/registry/gles/extensions/APPLE/APPLE_framebuffer_multisample.txt). It is described in the answer to the question "How do you activate multisampling in OpenGL ES on the iPhone?" and documented by Apple in their OpenGL ES Programming Guide.
It works as described, but the drawing performance of my test application suffers by about 50% when I set the number of samples to be 2. I'm primarily testing on an iPhone 4, using a non-retina-enabled application. I am using the other performance suggestions offered by Apple in their documentation (using glDiscardFramebufferEXT to discard the renderbuffers attached to the multisample framebuffer, using glClear to clear the entire framebuffer at the start of the frame, etc.).
The performance overhead of enabling multisampling in this manner seems surprisingly large to me. Are you guys seeing similar results or does this suggest that I'm doing something incorrectly?
Comments (2)
You mentioned that you're running this on an iPhone 4. Is your OpenGL ES layer rendering at the full 2X Retina display scale factor? That is, have you set the contentScaleFactor on the OpenGL ES hosting layer to [[UIScreen mainScreen] scale]? If so, you're pushing a large number of pixels to start with. Are you fill rate limited before you apply the multisampled antialiasing? To check, use the OpenGL ES instrument in Instruments against your running application and enable the Tiler Utilization and Renderer Utilization statistics. If your application shows a high Renderer Utilization without MSAA enabled, you are fill rate limited to begin with. Adding MSAA on top of that could significantly reduce your framerates because of this bottleneck.
In an application of mine that was geometry limited, not fill rate limited, I didn't see nearly that large a slowdown when using 4X MSAA on an iPhone 4. I'm guessing that the bottleneck in your application is in pushing pixels to the screen.
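To put rough numbers on "pushing a large number of pixels", here is a minimal Python sketch that counts the color samples held by the render target for one frame. The function name samples_per_frame is hypothetical and used only for illustration. It assumes the iPhone 4 screen of 320x480 points with a Retina scale factor of 2.0. It is a size estimate only, not a performance model: real frame cost depends on the GPU, the shaders, and the scene.

```python
def samples_per_frame(width_pts, height_pts, scale=1.0, msaa_samples=1):
    """Rough count of color samples stored for one frame.

    width_pts, height_pts: layer size in points
    scale: content scale factor (2.0 for a Retina-enabled layer on iPhone 4)
    msaa_samples: samples per pixel (1 means multisampling is off)
    """
    width_px = int(width_pts * scale)
    height_px = int(height_pts * scale)
    return width_px * height_px * msaa_samples


# Non-Retina layer, no MSAA
base = samples_per_frame(320, 480)
# Retina layer, no MSAA: four times the pixels of the non-Retina layer
retina = samples_per_frame(320, 480, scale=2.0)
# Retina layer with 4X MSAA
retina_4x = samples_per_frame(320, 480, scale=2.0, msaa_samples=4)

print(base, retina, retina_4x)
```

Under these assumptions, a Retina-scaled layer already holds four times as many pixels as a non-Retina one before any multisampling is applied, which is why checking Renderer Utilization without MSAA first is a useful baseline.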
It is not surprising that your performance suffers by about 50% when you set the number of samples to 2: you're drawing twice the samples! Multisampling means you essentially draw your scene to an off-screen buffer at a higher resolution than the screen, and then use a filtering algorithm to reduce the higher-resolution multisampled buffer to the display screen resolution. The hope is that the result has fewer aliasing artifacts, because the final picture actually includes more detail (filtered higher-resolution output) than the single-sampled version.
It is a very common (if not the most common) performance problem in graphics: the more samples you draw, the slower you go.
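The filtering step described above can be illustrated with a small Python sketch. It assumes a simple box filter (a plain average of each pixel's samples) and a flat list of grayscale values. The function name resolve is hypothetical, and the filter the hardware actually applies when the multisample buffer is resolved is implementation-defined, so treat this as an illustration of the idea rather than what the GPU does.

```python
def resolve(samples, num_samples):
    """Average each consecutive group of num_samples values into one pixel.

    samples: flat list of grayscale sample values, grouped per pixel
    num_samples: samples per pixel (for example 2 or 4)
    """
    if num_samples <= 0 or len(samples) % num_samples != 0:
        raise ValueError("samples must contain a whole number of pixels")
    return [
        sum(samples[i:i + num_samples]) / num_samples
        for i in range(0, len(samples), num_samples)
    ]


# Three pixels at 2 samples each: an edge pixel half covered by white
# geometry, a fully covered pixel, and an uncovered background pixel.
pixels = resolve([1.0, 0.0, 1.0, 1.0, 0.0, 0.0], 2)
print(pixels)
```

The half-covered pixel ends up with an intermediate gray value instead of snapping to white or black, which is the softening of edges that antialiasing is meant to provide.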