How do I draw 1000+ particles (with unique rotation, scale, and alpha) in an iPhone OpenGL ES particle system without slowing down the game?
I am developing a game for iPhone using OpenGL ES 1.1. In this game, I have blood particles which emit from characters when they are shot, so there can be 1000+ blood particles on the screen at any one time. The problem is that when I have over 500 particles to render, the game's frame rate drops immensely.
Currently, each particle renders itself using glDrawArrays(..), and I know this is the cause for the slow down. All particles share the same texture atlas.
So what is the best option to reduce slow down from drawing many particles? Here are the options I found:
- Group all the blood particles together and render them using a single glDrawArrays(..) call. If I use this method, is there a way for each particle to have its own rotation and alpha? Or do all of them HAVE to have the same rotation when this method is used? If I can't render particles with unique rotation, then I cannot use this option.
- Use point sprites in OpenGL ES 2.0. I am not using OpenGL ES 2.0 yet because I need to meet a deadline I have set for releasing my game on the App Store. Using OpenGL ES 2.0 would require preliminary research that I unfortunately do not have time for. I will upgrade to OpenGL ES 2.0 in a later release, but for the first one I only want to use 1.1.
Here is each particle rendering itself. This is my original particle-rendering method, which caused a significant drop in frame rate once 500+ particles were being rendered.
// original method: each particle renders itself.
// slow when many particles must be rendered
[[AtlasLibrary sharedAtlasLibrary] ensureContainingTextureAtlasIsBoundInOpenGLES:self.containingAtlasKey];
glPushMatrix();
// translate
glTranslatef(translation.x, translation.y, translation.z);
// rotate
glRotatef(rotation.x, 1, 0, 0);
glRotatef(rotation.y, 0, 1, 0);
glRotatef(rotation.z, 0, 0, 1);
// scale
glScalef(scale.x, scale.y, scale.z);
// alpha
glColor4f(1.0, 1.0, 1.0, alpha);
// load vertices
glVertexPointer(2, GL_FLOAT, 0, texturedQuad.vertices);
glEnableClientState(GL_VERTEX_ARRAY);
// load uv coordinates for texture
glTexCoordPointer(2, GL_FLOAT, 0, texturedQuad.textureCoords);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
// render
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glDisableClientState(GL_VERTEX_ARRAY);
glDisableClientState(GL_TEXTURE_COORD_ARRAY);
glPopMatrix();
Then I used method 1, but particles can't have unique rotation, scale, or alpha using this method (that I know of).
// this is method 1: group all particles and call glDrawArrays(..) once
// declare vertex and uv-coordinate arrays
int numParticles = 2000;
// GLfloat always matches GL_FLOAT; CGFloat is a double on 64-bit devices
GLfloat *vertices = (GLfloat *) malloc(2 * 6 * numParticles * sizeof(GLfloat));
GLfloat *uvCoordinates = (GLfloat *) malloc(2 * 6 * numParticles * sizeof(GLfloat));
// ...build vertex arrays based on particle vertices and uv-coordinates.
// ...this part works fine.
// get ready to render the particles
glPushMatrix();
glLoadIdentity();
// if the particles' texture atlas is not already bound in OpenGL ES, then bind it
[[AtlasLibrary sharedAtlasLibrary] ensureContainingTextureAtlasIsBoundInOpenGLES:((Particle *)[particles objectAtIndex:0]).containingAtlasKey];
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glDisableClientState(GL_NORMAL_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, vertices);
glTexCoordPointer(2, GL_FLOAT, 0, uvCoordinates);
// render
glDrawArrays(GL_TRIANGLES, 0, vertexIndex);
glPopMatrix();
I'll reiterate my question:
How do I render 1000+ particles without the frame rate dropping drastically, while still letting each particle have its own rotation, alpha, and scale?
Any constructive advice would really help and would be greatly appreciated!
Thanks!
Comments (2)
Use about 1 to 10 textures, each made of, say, 200 red blood dots on a transparent background, and then draw each of them about 3 to 10 times. Then you have your thousands of dots. You draw all the images in a spherical pattern, etc., exploding in layers.
You can't always do a 1 to 1 correspondence with reality when gaming. Take a really close look at some games that run on an old Xbox or iPad, etc. There are shortcuts you need to take, and they often look great when done.
There is significant overhead with each OpenGL ES API call, so it's not a surprise that you're seeing a slowdown here with hundreds of passes through that drawing loop. It's not just glDrawArrays() that will get you here, but the individual glTranslatef(), glRotatef(), glScalef(), and glColor4f() calls as well. glDrawArrays() may appear to be the hotspot, due to the way that deferred rendering works on these GPUs, but those other calls will also hurt you.
You should group these particle vertices together in one array (preferably a VBO so that you can take advantage of streaming updated data to the GPU more efficiently). You definitely can replicate the effects of individual rotation, scale, etc. in your combined vertex array, but you're going to need to perform the calculations as to where the vertices should be as they are rotated, scaled, etc. This will place some burden on the CPU for every frame, but that could be offset a bit by using the Accelerate framework to do some vector processing of this.
Color and alpha can be provided per-vertex in an array as well, so you can control that for each one of your particles.
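As a rough sketch of the per-vertex color array, assuming white particles whose only varying component is alpha and six vertices per particle (two triangles). `buildParticleColors` and its parameters are hypothetical names, not part of the question's code.

```c
#include <stddef.h>

/* Writes one RGBA color (4 bytes) per vertex, 6 vertices per particle.
   `alphas` holds one value in [0, 1] per particle; `out` must hold at
   least 24 * count bytes. Returns the number of vertices written. */
size_t buildParticleColors(const float *alphas, size_t count, unsigned char *out)
{
    size_t v = 0;
    for (size_t i = 0; i < count; i++) {
        float a = alphas[i];
        if (a < 0.0f) a = 0.0f;   /* clamp to the valid range */
        if (a > 1.0f) a = 1.0f;
        unsigned char alphaByte = (unsigned char)(a * 255.0f + 0.5f);
        for (int k = 0; k < 6; k++) {
            out[4 * v + 0] = 255;        /* red   */
            out[4 * v + 1] = 255;        /* green */
            out[4 * v + 2] = 255;        /* blue  */
            out[4 * v + 3] = alphaByte;  /* per-particle alpha */
            v++;
        }
    }
    return v;
}
```

In OpenGL ES 1.1 this array would be enabled with glEnableClientState(GL_COLOR_ARRAY) and supplied with glColorPointer(4, GL_UNSIGNED_BYTE, 0, colors) before the single draw call, taking the place of the per-particle glColor4f() call.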
However, I think you're right in that OpenGL ES 2.0 could provide an even better solution for this by letting you write a custom shader program. You could send static vertices in a VBO for all your points, then only have to update matrices to manipulate each particle and the alpha values for each particle vertex. I do something similar to generate procedural impostors as stand-ins for spheres. I describe this process here, and you can download the source code to the application here.