OpenGL - Fast Textured Quads?
I am trying to display as many textured quads as possible at random positions in 3D space. So far, I cannot display even a couple of thousand of them without the frame rate dropping well below 30 fps (my camera movement script becomes laggy).
Right now I am following an ancient tutorial. After initializing OpenGL:
glEnable(GL_TEXTURE_2D);
glShadeModel(GL_SMOOTH);
glClearColor(0, 0, 0, 0);
glClearDepth(1.0f);
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LEQUAL);
glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_NICEST);
I set the viewport and perspective:
glViewport(0,0,width,height);
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(60.0f,(GLfloat)width/(GLfloat)height,0.1f,100.0f);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
Then I load some textures:
glGenTextures(TEXTURE_COUNT, &texture[0]);
for (int i...){
glBindTexture(GL_TEXTURE_2D, texture[i]);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MAG_FILTER,GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D,GL_TEXTURE_MIN_FILTER,GL_LINEAR_MIPMAP_NEAREST);
gluBuild2DMipmaps(GL_TEXTURE_2D,3,TextureImage[0]->w,TextureImage[0]->h,GL_RGB,GL_UNSIGNED_BYTE,TextureImage[0]->pixels);
}
And finally I draw my GL_QUADS using:
glBindTexture(GL_TEXTURE_2D, q);
glTranslatef(fDistanceX,fDistanceZ,-fDistanceY);
glBegin(GL_QUADS);
glNormal3f(a,b,c);
glTexCoord2f(d, e); glVertex3f(x1, y1, z1);
glTexCoord2f(f, g); glVertex3f(x2, y2, z2);
glTexCoord2f(h, k); glVertex3f(x3, y3, z3);
glTexCoord2f(m, n); glVertex3f(x4, y4, z4);
glEnd();
glTranslatef(-fDistanceX,-fDistanceZ,fDistanceY);
I find all that code fairly self-explanatory. Unfortunately, as far as I know, this way of doing things is deprecated. I have read some vague things about PBOs and vertex arrays on the internet, but I did not find any tutorial on how to use them. I don't even know whether these objects are suited to what I am trying to do here (a billion quads on the screen without a lag). Could anyone here give me a definitive suggestion of what I should use to achieve the result? And if you happen to have one more minute of spare time, could you give me a short summary of how these functions are used (just as I did with the deprecated ones above)?
Comments (2)
What is "the result"? You have not explained very well what exactly it is that you're trying to accomplish. All you've said is that you're trying to draw a lot of textured quads. What are you trying to do with those textured quads?
For example, you seem to be creating the same texture, with the same width and height, given the same pixel data. But you store these in different texture objects. OpenGL does not know that they contain the same data. Therefore, you spend a lot of time swapping textures needlessly when you render quads.
If you're just randomly drawing them to test performance, then the question is meaningless. Such tests are pointless, because they are entirely artificial. They test only this artificial scenario where you're changing textures every time you render a quad.
Without knowing what you are ultimately trying to render, the only thing I can do is give general performance advice. In order (i.e., do the first before you do the later ones):
1. Stop changing textures for every quad. You can package multiple images together in the same texture, then render all of the quads that use that texture at once, with only one glBindTexture call. The texture coordinates of the quad specify which image within the texture it uses.
2. Stop using glTranslate to position each individual quad. You can use it to position groups of quads, but you should do the math yourself to compute each quad's vertex positions. Once those glTranslate calls are gone, you can put multiple quads within the space of a single glBegin/glEnd pair.
3. Assuming that your quads are static (fixed position in model space), consider using a buffer object to store and render with your quad data (see http://www.opengl.org/wiki/Vertex_Specification).
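As a rough sketch of the first two points (this is not code from the answer itself): assume a hypothetical atlas laid out as a regular grid of equally sized images. The names Atlas, atlas_uv and write_quad are made up for illustration. They compute each quad's texture rectangle and bake the quad's position into its vertices, so no per-quad glBindTexture or glTranslatef is needed.

```c
#include <assert.h>
#include <stddef.h>

/* One vertex is stored as position (x, y, z) followed by texcoord (u, v). */
enum { FLOATS_PER_VERTEX = 5, VERTS_PER_QUAD = 4 };

/* Hypothetical atlas layout: a grid of cols x rows equally sized images. */
typedef struct { int cols, rows; } Atlas;

/* Compute the UV rectangle of image `tile` (row-major index) in the atlas. */
static void atlas_uv(Atlas a, int tile,
                     float *u0, float *v0, float *u1, float *v1)
{
    assert(a.cols > 0 && a.rows > 0 && tile >= 0 && tile < a.cols * a.rows);
    int col = tile % a.cols;
    int row = tile / a.cols;
    *u0 = (float)col / (float)a.cols;
    *v0 = (float)row / (float)a.rows;
    *u1 = (float)(col + 1) / (float)a.cols;
    *v1 = (float)(row + 1) / (float)a.rows;
}

/* Write one quad of half-size `h`, lying in the XY plane and centred at
 * (cx, cy, cz), into `out` (4 vertices, 20 floats). Adding the centre here
 * replaces the per-quad glTranslatef from the question. */
static void write_quad(float *out, float cx, float cy, float cz, float h,
                       Atlas a, int tile)
{
    float u0, v0, u1, v1;
    atlas_uv(a, tile, &u0, &v0, &u1, &v1);

    /* Corners in counter-clockwise order, with matching texcoords. */
    const float corner[4][2] = { { -h, -h }, { h, -h }, { h, h }, { -h, h } };
    const float uv[4][2]     = { { u0, v0 }, { u1, v0 }, { u1, v1 }, { u0, v1 } };

    for (int i = 0; i < VERTS_PER_QUAD; ++i) {
        float *v = out + i * FLOATS_PER_VERTEX;
        v[0] = cx + corner[i][0];
        v[1] = cy + corner[i][1];
        v[2] = cz;
        v[3] = uv[i][0];
        v[4] = uv[i][1];
    }
}
```

With an array filled this way, the draw loop can bind the atlas texture once and emit every quad inside a single glBegin(GL_QUADS)/glEnd pair, or hand the same array to a buffer object later.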
Did you try the OpenGL Wiki, which has a pretty good list of tutorials (as well as general information on OpenGL)? In the interest of full disclosure, I did write one of them.
On "a billion quads on the screen": actually it's in the millions. I presume you're German: "Milliarde" translates into "billion" in English.
Following an ancient tutorial is your main problem. Contemporary OpenGL applications don't use those old rendering methods. You're using immediate mode, which means you go through several function calls just to submit a single vertex. This is highly inefficient. Modern applications, like games, can reach such high triangle counts because they neither spend CPU time on that many function calls nor waste CPU→GPU bandwidth streaming the vertex data.
To render such high triangle counts in real time, you must place all the geometry data in "fast memory", i.e. in the RAM on the graphics card. The technique OpenGL offers for this is called "Vertex Buffer Objects" (VBOs). Using a VBO, you can draw large batches of geometry with a single draw call (glDrawArrays, glDrawElements and their relatives).
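A minimal sketch of what the CPU side of this could look like, under some assumptions of mine: vertices are interleaved as x, y, z, u, v, and each quad is drawn as two indexed triangles (GL_QUADS is not available in the core profile). The helper names build_quad_indices and vertex_buffer_bytes are hypothetical. The GL calls are shown only in the comment, since they need a live OpenGL context.

```c
#include <assert.h>
#include <stddef.h>

enum { FLOATS_PER_VERTEX = 5, VERTS_PER_QUAD = 4, INDICES_PER_QUAD = 6 };

/* Fill `idx` with two triangles, (0,1,2) and (2,3,0), for every quad.
 * Quad q uses vertices 4*q .. 4*q+3 of the vertex array. */
static void build_quad_indices(unsigned int *idx, size_t quad_count)
{
    static const unsigned int pattern[INDICES_PER_QUAD] = { 0, 1, 2, 2, 3, 0 };
    assert(idx != NULL || quad_count == 0);
    for (size_t q = 0; q < quad_count; ++q)
        for (size_t k = 0; k < INDICES_PER_QUAD; ++k)
            idx[q * INDICES_PER_QUAD + k] =
                (unsigned int)(q * VERTS_PER_QUAD) + pattern[k];
}

/* Size in bytes of the interleaved vertex array for `quad_count` quads. */
static size_t vertex_buffer_bytes(size_t quad_count)
{
    return quad_count * VERTS_PER_QUAD * FLOATS_PER_VERTEX * sizeof(float);
}

/* Upload once, then draw every frame. Assumes a current OpenGL context
 * (1.5 or later) and arrays `verts` and `idx` already filled for n quads:
 *
 *   GLuint vbo, ibo;
 *   glGenBuffers(1, &vbo);
 *   glBindBuffer(GL_ARRAY_BUFFER, vbo);
 *   glBufferData(GL_ARRAY_BUFFER, vertex_buffer_bytes(n), verts,
 *                GL_STATIC_DRAW);
 *   glGenBuffers(1, &ibo);
 *   glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
 *   glBufferData(GL_ELEMENT_ARRAY_BUFFER, n * 6 * sizeof(unsigned int), idx,
 *                GL_STATIC_DRAW);
 *
 *   glEnableClientState(GL_VERTEX_ARRAY);
 *   glEnableClientState(GL_TEXTURE_COORD_ARRAY);
 *   glVertexPointer(3, GL_FLOAT, 5 * sizeof(float), (void *)0);
 *   glTexCoordPointer(2, GL_FLOAT, 5 * sizeof(float),
 *                     (void *)(3 * sizeof(float)));
 *   glDrawElements(GL_TRIANGLES, (GLsizei)(n * 6), GL_UNSIGNED_INT,
 *                  (void *)0);
 */
```

The point of the layout is that the whole set of quads goes to the GPU once and is then drawn with one glDrawElements call per texture, instead of several function calls per vertex.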
After getting the geometry out of the way, you must be nice to the GPU. GPUs don't like it if you switch textures or shaders often. Switching a texture invalidates the contents of the cache(s); switching a shader means stalling the GPU pipeline and, worse, invalidating the execution path prediction statistics (the GPU keeps statistics on which execution paths of a shader are most likely to be executed and which memory access patterns it exhibits; these are used to iteratively optimize shader execution).
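One common way to keep texture switches down is to sort the draw list by texture before rendering. The sketch below assumes opaque quads with depth testing enabled (as in the question), so draw order does not change the image. The Batch record and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical draw record: a run of quads that all use one texture. */
typedef struct {
    unsigned int texture;   /* OpenGL texture object name */
    int first_quad;         /* index of the first quad in the vertex array */
    int quad_count;         /* number of quads in this run */
} Batch;

/* qsort comparator: order batches by texture name. */
static int by_texture(const void *a, const void *b)
{
    unsigned int ta = ((const Batch *)a)->texture;
    unsigned int tb = ((const Batch *)b)->texture;
    return (ta > tb) - (ta < tb);
}

/* Sort the draw list so batches sharing a texture become adjacent. */
static void sort_batches(Batch *batches, size_t n)
{
    qsort(batches, n, sizeof *batches, by_texture);
}

/* Number of glBindTexture calls a draw loop needs if it only rebinds
 * when the texture differs from the previous batch. */
static int count_binds(const Batch *batches, size_t n)
{
    int binds = 0;
    for (size_t i = 0; i < n; ++i)
        if (i == 0 || batches[i].texture != batches[i - 1].texture)
            ++binds;
    return binds;
}
```

For example, a draw list using textures 2, 1, 2, 1, 3, 1 in that order needs six binds as submitted, but only three once sorted. The same idea applies to shader switches.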