Measuring render-to-texture performance in OpenGL ES 2.0
Basically, I'm doing some image processing using a screen-sized rectangle made of two triangles and a fragment shader that does all of the processing. The effect behaves like an animation because it depends on a uniform variable called current_frame.
I'm mainly interested in measuring the performance in terms of "MPix/s". What I do is roughly this:
/* Set up everything needed, including: */
/* - getting the location of the `current_frame` uniform */
/* - creating an FBO, adding a color attachment, */
/*   and binding it as the current framebuffer */
int i;
double current_frame = 0.0;
double step = 1.0 / NUMBER_OF_ITERATIONS;
tic(); /* Start counting the time */
for (i = 0; i < NUMBER_OF_ITERATIONS; i++)
{
    /* glUniform1f takes a GLfloat, so narrow the double explicitly */
    glUniform1f(current_frame_handle, (GLfloat)current_frame);
    current_frame += step;
    /* The third argument is the vertex count (6 for two triangles) */
    glDrawArrays(GL_TRIANGLES, 0, NUMBER_OF_INDICES);
    glFinish();
}
double elapsed_time = tac(); /* Get elapsed time in seconds */
/* Calculate achieved pixels per second; multiply as double to avoid integer overflow */
double pps = ((double)OUT_WIDTH * OUT_HEIGHT * NUMBER_OF_ITERATIONS) / elapsed_time;
/* Sanity check: read the output into a buffer with glReadPixels */
/* and save that buffer to a file */
As far as theory goes, is there anything wrong with my concept?
Also, I've got the impression that glFinish() on mobile hardware doesn't necessarily wait for previous render calls and may do some optimizations.
Of course, I can always force it by doing glReadPixels() after each draw, but that would be quite slow, so it wouldn't really help.
Could you advise me as to whether my testing scenario is sensible and whether there is something more that can be done?
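For reference, the timing and throughput arithmetic described in the question can be sketched in plain C as follows. This is only a minimal sketch under stated assumptions: `now_seconds` and `mpix_per_second` are hypothetical helper names (the question's own `tic()`/`tac()` are not shown), and a POSIX system providing `clock_gettime` with `CLOCK_MONOTONIC` is assumed. The pixel count is computed in double so that large sizes and iteration counts cannot overflow an int.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic time in seconds, so the measurement
 * is not disturbed by adjustments to the system clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical helper: throughput in megapixels per second.
 * Every factor is converted to double before multiplying, so the
 * product cannot overflow a 32-bit integer. */
static double mpix_per_second(uint32_t width, uint32_t height,
                              uint32_t iterations, double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    double pixels = (double)width * (double)height * (double)iterations;
    return pixels / elapsed_seconds / 1e6;
}
```

One possible use is to take `now_seconds()` before the draw loop and again after the final glFinish(), then pass the difference to `mpix_per_second()` together with the output size and iteration count.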
Comments (2)
Concerning speed, using glDrawArrays() still duplicates the shared vertices: http://www.songho.ca/opengl/gl_vertexarray.html
Just throwing that in there to help speed up your results. As far as your timing concept, it looks fine to me. Are you getting results similar to what you had hoped?
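To illustrate the duplication mentioned above, here is a small sketch of the same full-screen quad stored both ways. The type and array names (`Vec2`, `quad_flat`, `quad_unique`, `quad_indices`, `expand_indexed`) are made up for this example, and the GL calls appear only in comments so the snippet compiles without a GL context. For a quad of two triangles the saving is only two vertices, so any speedup is likely to be small next to the fragment shader work.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;

/* Non-indexed layout: 6 vertices, two corners stored twice.
 * Would be drawn with glDrawArrays(GL_TRIANGLES, 0, 6). */
static const Vec2 quad_flat[6] = {
    {-1.f, -1.f}, { 1.f, -1.f}, { 1.f,  1.f},
    {-1.f, -1.f}, { 1.f,  1.f}, {-1.f,  1.f},
};

/* Indexed layout: 4 unique corners plus 6 indices.
 * Would be drawn with
 * glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, quad_indices). */
static const Vec2 quad_unique[4] = {
    {-1.f, -1.f}, { 1.f, -1.f}, { 1.f,  1.f}, {-1.f,  1.f},
};
static const unsigned short quad_indices[6] = { 0, 1, 2, 0, 2, 3 };

/* Expand an indexed mesh into a flat vertex list, which shows that
 * both layouts describe the same triangles. Returns the count written. */
static size_t expand_indexed(const Vec2 *verts, const unsigned short *idx,
                             size_t n_idx, Vec2 *out)
{
    for (size_t i = 0; i < n_idx; i++)
        out[i] = verts[idx[i]];
    return n_idx;
}
```

A four-vertex GL_TRIANGLE_STRIP would be another way to avoid the duplicated corners without an index buffer.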
I would precalculate all possible frames, and then use glEnableClientState() and glTexCoordPointer() to change which part of the existing texture is drawn in each frame.
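As a rough sketch of that idea, suppose the precalculated frames are packed into one texture as a grid of equally sized cells. The grid layout is an assumption made for illustration, and `UvRect` and `atlas_frame_uv` are hypothetical names. The helper below computes the texture-coordinate rectangle for a given frame. Note that glEnableClientState() and glTexCoordPointer() belong to the fixed-function OpenGL ES 1.x API; under OpenGL ES 2.0 the coordinates would more likely be supplied through glVertexAttribPointer() or as a uniform offset.

```c
#include <assert.h>

/* Texture-coordinate rectangle of one frame inside the atlas. */
typedef struct { float u0, v0, u1, v1; } UvRect;

/* Hypothetical helper: frames are assumed to be laid out row by row
 * in a cols x rows grid, with frame 0 in the cell at (u, v) = (0, 0). */
static UvRect atlas_frame_uv(int frame, int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    assert(frame >= 0 && frame < cols * rows);

    int col = frame % cols;
    int row = frame / cols;

    UvRect r;
    r.u0 = (float)col / (float)cols;
    r.v0 = (float)row / (float)rows;
    r.u1 = (float)(col + 1) / (float)cols;
    r.v1 = (float)(row + 1) / (float)rows;
    return r;
}
```

The four corners of the returned rectangle would then replace the quad's usual 0..1 texture coordinates for that frame.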