Measuring render-to-texture performance in OpenGL ES 2.0

Posted on 2024-11-23 18:00:28

Basically, I'm doing some image processing using a screen-sized rectangle made of two triangles and a fragment shader that does all of the processing. The actual effect is something like an animation, as it depends on a uniform variable called current_frame.

I'm very interested in measuring the performance in terms of "MPix/s". What I do is something like this:

/* Setup all necessary stuff, including:                 */
/* - getting the location of the `current_frame` uniform */
/* - creating an FBO, adding a color attachment          */
/*   and setting it as the current one                   */

int i;
double current_frame = 0.0;
double step = 1.0 / NUMBER_OF_ITERATIONS;

tic(); /* Start counting the time */

for (i = 0; i < NUMBER_OF_ITERATIONS; i++)
{
    /* glUniform1f takes a GLfloat, so narrow the double explicitly */
    glUniform1f(current_frame_handle, (GLfloat)current_frame);
    current_frame += step;
    /* The third argument is the number of vertices to draw */
    glDrawArrays(GL_TRIANGLES, 0, NUMBER_OF_INDICES);
    glFinish();
}

double elapsed_time = tac(); /* Get elapsed time in seconds */

/* Calculate achieved pixels per second (in double to avoid int overflow) */
double pps = ((double)OUT_WIDTH * OUT_HEIGHT * NUMBER_OF_ITERATIONS) / elapsed_time;

/* Sanity check by reading the output into a buffer      */
/* using glReadPixels and saving this buffer into a file */

As far as theory goes, is there anything wrong with my concept?

Also, I've got the impression that glFinish() on mobile hardware doesn't necessarily wait for previous render calls and may do some optimizations.

Of course, I can always force it by calling glReadPixels() after each draw, but that would be quite slow, so it wouldn't really help.

Could you advise me as to whether my testing scenario is sensible and whether there is anything more that can be done?
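
The timing helpers and the throughput formula used in the snippet above could look roughly like the following. This is only a minimal sketch, assuming a POSIX system where clock_gettime() is available: the question does not show the bodies of tic() and tac(), so the versions here are guesses at their intent, and megapixels_per_second() is a hypothetical helper name. It does the multiplication in double so that large frame sizes and iteration counts cannot overflow an int, and it divides by 1e6 so that the result is actually in MPix/s rather than pixels per second.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Start time recorded by the most recent tic() call. */
struct timespec tic_start;

/* Assumed implementation: remember the current monotonic time. */
void tic(void)
{
    clock_gettime(CLOCK_MONOTONIC, &tic_start);
}

/* Assumed implementation: seconds elapsed since the last tic(). */
double tac(void)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - tic_start.tv_sec)
         + (double)(now.tv_nsec - tic_start.tv_nsec) / 1e9;
}

/* Hypothetical helper: throughput in megapixels per second.
 * Every factor is converted to double before multiplying, so
 * width * height * iterations cannot overflow an int. */
double megapixels_per_second(int width, int height, int iterations,
                             double elapsed_seconds)
{
    assert(elapsed_seconds > 0.0);
    double pixels = (double)width * (double)height * (double)iterations;
    return pixels / elapsed_seconds / 1e6;
}
```

With these in place, the last step of the benchmark would become something like megapixels_per_second(OUT_WIDTH, OUT_HEIGHT, NUMBER_OF_ITERATIONS, elapsed_time). A monotonic clock is used here so that wall-clock adjustments during the run do not distort the measurement.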


Comments (2)

箹锭⒈辈孓 2024-11-30 18:00:28

Concerning speed, using glDrawArrays() still duplicates the shared vertices.

glDrawElements() is the solution to reduce the number of vertices in
the array, so it allows transferring less data to OpenGL.

http://www.songho.ca/opengl/gl_vertexarray.html

Just throwing that in there to help speed up your results. As far as your timing concept goes, it looks fine to me. Are you getting results similar to what you had hoped for?
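
To make the glDrawArrays() versus glDrawElements() comparison concrete, here is a rough sketch of the data for one full-screen quad. The Vec2 type, the array names, and expand_indexed() are hypothetical and not from the question; the example assumes 2D positions in normalized device coordinates and 16-bit indices. It only builds and compares the arrays on the CPU; with a GL context, the indexed form would be drawn with glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, quad_indices).

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;

/* The four unique corners of a full-screen quad. */
const Vec2 quad_vertices[4] = {
    { -1.0f, -1.0f },  /* 0: bottom left  */
    {  1.0f, -1.0f },  /* 1: bottom right */
    { -1.0f,  1.0f },  /* 2: top left     */
    {  1.0f,  1.0f },  /* 3: top right    */
};

/* Two counter-clockwise triangles sharing corners 1 and 2. */
const unsigned short quad_indices[6] = { 0, 1, 2,  2, 1, 3 };

/* Expand an indexed mesh into the flat vertex list that
 * glDrawArrays(GL_TRIANGLES, ...) would need. Shared corners are
 * copied once per use. Returns the number of vertices written. */
size_t expand_indexed(const Vec2 *vertices, const unsigned short *indices,
                      size_t index_count, Vec2 *out)
{
    assert(vertices != NULL && indices != NULL && out != NULL);
    for (size_t i = 0; i < index_count; i++)
        out[i] = vertices[indices[i]];
    return index_count;
}
```

For a single quad the indexed form stores 4 vertices plus 6 indices, while the flat form stores 6 vertices, so the difference is only a few bytes. In a benchmark dominated by the fragment shader, the effect on the measured MPix/s is therefore likely to be small.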

-残月青衣踏尘吟 2024-11-30 18:00:28

I would precalculate all possible frames, and then use glEnableClientState() and glTexCoordPointer() to change which part of the existing texture is drawn in each frame.
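
One way to read this suggestion is that the precalculated frames are packed into a single texture as a grid, and each frame selects a different sub-rectangle through its texture coordinates. The sketch below works under that assumption; the grid layout, the UvRect type, and frame_uv_rect() are all hypothetical. Note also that glEnableClientState() and glTexCoordPointer() belong to the fixed-function pipeline and are not part of OpenGL ES 2.0, where the coordinates would instead be passed as a generic vertex attribute (for example via glVertexAttribPointer()).

```c
#include <assert.h>

/* Texture-coordinate rectangle of one frame inside the atlas. */
typedef struct { float u0, v0, u1, v1; } UvRect;

/* Hypothetical helper: frames are assumed to be stored row by row
 * in a grid of cols x rows equally sized cells. Returns the
 * normalized [0, 1] texture coordinates of the given frame. */
UvRect frame_uv_rect(int frame, int cols, int rows)
{
    assert(cols > 0 && rows > 0);
    assert(frame >= 0 && frame < cols * rows);

    int col = frame % cols;  /* column of the cell within its row */
    int row = frame / cols;  /* row of the cell within the atlas  */

    UvRect r;
    r.u0 = (float)col / (float)cols;
    r.v0 = (float)row / (float)rows;
    r.u1 = (float)(col + 1) / (float)cols;
    r.v1 = (float)(row + 1) / (float)rows;
    return r;
}
```

Drawing from precalculated frames replaces the per-pixel shader work with a texture lookup, so a benchmark built this way would measure texture sampling rather than the original image-processing shader.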
