Optimizing dynamic tile display in OpenGL
I am working on a tile-based, top-down 2D game with dynamically generated terrain, and I have started (re)writing the graphics engine in OpenGL. The game is written in Java using LWJGL, and I'd prefer it to stay relatively platform-independent and playable on older computers too.
Currently I'm using immediate mode for drawing, but obviously this is too slow for anything but the simplest scenes.
There are two basic types of objects that are drawn: tiles, which make up the world, and sprites, which are pretty much everything else (entities, items, effects, etc.).
The tiles are 20*20 px, and are stored in chunks (40*40 tiles). Terrain generation is done in full chunks, like in Minecraft.
The method I use now is iterating over the 9 chunks near the player, and then iterating over each tile inside, drawing one quad for the tile texture, and optional extra quads for features depending on the material.
This ends up quite slow, but a simple out-of-view check gives a 5-10x FPS boost.
To optimize this, I looked into using VBOs and quad strips, but I run into a problem when the terrain changes. This doesn't happen every frame, but it isn't a very rare event either.
A simple method would be dropping and rebuilding a chunk's VBO every time it changes. This doesn't seem like the best way, though. I read that VBOs can be "dynamic", allowing their content to be changed. How can this be done, and what data can be changed inside them efficiently? Are there any other ways of drawing the world efficiently?
The other type, sprites, is currently drawn as a quad with a texture mapped from a sprite sheet. So by changing the texture coordinates, I can even animate them later. Is this the correct way to do the animation, though?
Currently even a very high number of sprites won't slow the game down much, and by understanding VBOs, I'll be able to speed them up even more, but I haven't seen any solid and reliable tutorials for an efficient way of doing this. Does anyone know one perhaps?
Thanks for the help!
Comments (1)
I disagree. Unless you are drawing a lot of tiles (tens of thousands per frame), immediate mode should be just fine for you.
The key is something you will have to be doing to get good performance anyway: texture atlases. All of your tiles should be stored in a single texture. You use texture coordinates to pull different tiles out of that texture when rendering. So if this is what your render loop looks like now:
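A loop of that kind might look like the following sketch. It is only an illustration under assumptions: `TileGl` is a hypothetical stand-in for the handful of fixed-function calls involved (with LWJGL these would be the static `GL11.glBindTexture`, `glBegin`, `glTexCoord2f`, `glVertex2f` and `glEnd`), and `CountingTileGl` and `PerTileRenderer` are made-up names that only count calls so the cost can be seen without a GL context.

```java
// Hypothetical stand-in for the fixed-function calls used below; with LWJGL
// these would be static methods on GL11 (glBindTexture, glBegin, glEnd, ...).
interface TileGl {
    void bindTexture(int textureId);
    void begin();
    void texCoord(float u, float v);
    void vertex(float x, float y);
    void end();
}

// Counts calls instead of drawing, so the loop can be inspected without a GL context.
class CountingTileGl implements TileGl {
    int binds, batches, vertices;
    public void bindTexture(int textureId) { binds++; }
    public void begin() { batches++; }
    public void texCoord(float u, float v) { }
    public void vertex(float x, float y) { vertices++; }
    public void end() { }
}

class PerTileRenderer {
    static final int TILE_SIZE = 20; // pixels, as in the question

    // textureIds[row][col] holds the (hypothetical) texture id of each tile.
    static void draw(TileGl gl, int[][] textureIds) {
        for (int row = 0; row < textureIds.length; row++) {
            for (int col = 0; col < textureIds[row].length; col++) {
                float x = col * TILE_SIZE;
                float y = row * TILE_SIZE;
                gl.bindTexture(textureIds[row][col]); // one bind per tile
                gl.begin();                           // one begin/end pair per tile
                gl.texCoord(0f, 0f); gl.vertex(x, y);
                gl.texCoord(1f, 0f); gl.vertex(x + TILE_SIZE, y);
                gl.texCoord(1f, 1f); gl.vertex(x + TILE_SIZE, y + TILE_SIZE);
                gl.texCoord(0f, 1f); gl.vertex(x, y + TILE_SIZE);
                gl.end();
            }
        }
    }
}
```

For a single 40x40 chunk this issues 1,600 texture binds and 1,600 begin/end pairs for 6,400 vertices.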
You can convert it into this:
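The atlas-based version might look like the sketch below. It assumes the atlas is a regular grid of equally sized tiles, which the answer does not specify; `AtlasGl`, `CountingAtlasGl`, `AtlasRenderer` and `atlasUv` are hypothetical names, and `AtlasGl` again stands in for LWJGL's `GL11` calls.

```java
// Hypothetical stand-in for the fixed-function calls used below; with LWJGL
// these would be static methods on GL11 (glBindTexture, glBegin, glEnd, ...).
interface AtlasGl {
    void bindTexture(int textureId);
    void begin();
    void texCoord(float u, float v);
    void vertex(float x, float y);
    void end();
}

// Counts calls instead of drawing, so the loop can be inspected without a GL context.
class CountingAtlasGl implements AtlasGl {
    int binds, batches, vertices;
    public void bindTexture(int textureId) { binds++; }
    public void begin() { batches++; }
    public void texCoord(float u, float v) { }
    public void vertex(float x, float y) { vertices++; }
    public void end() { }
}

class AtlasRenderer {
    static final int TILE_SIZE = 20; // pixels, as in the question

    // Returns {u0, v0, u1, v1} for a tile in an atlas laid out as a
    // tilesPerRow x tilesPerCol grid (an assumed layout).
    static float[] atlasUv(int tileIndex, int tilesPerRow, int tilesPerCol) {
        int col = tileIndex % tilesPerRow;
        int row = tileIndex / tilesPerRow;
        float du = 1f / tilesPerRow;
        float dv = 1f / tilesPerCol;
        return new float[] { col * du, row * dv, (col + 1) * du, (row + 1) * dv };
    }

    // tileIndices[row][col] is the index of each tile's image inside the atlas.
    static void draw(AtlasGl gl, int atlasTextureId, int[][] tileIndices,
                     int tilesPerRow, int tilesPerCol) {
        gl.bindTexture(atlasTextureId); // single bind for the whole terrain
        gl.begin();                     // single begin/end pair
        for (int row = 0; row < tileIndices.length; row++) {
            for (int col = 0; col < tileIndices[row].length; col++) {
                float[] uv = atlasUv(tileIndices[row][col], tilesPerRow, tilesPerCol);
                float x = col * TILE_SIZE;
                float y = row * TILE_SIZE;
                gl.texCoord(uv[0], uv[1]); gl.vertex(x, y);
                gl.texCoord(uv[2], uv[1]); gl.vertex(x + TILE_SIZE, y);
                gl.texCoord(uv[2], uv[3]); gl.vertex(x + TILE_SIZE, y + TILE_SIZE);
                gl.texCoord(uv[0], uv[3]); gl.vertex(x, y + TILE_SIZE);
            }
        }
        gl.end();
    }
}
```

The same 40x40 chunk now needs one bind and one begin/end pair; only the texture coordinates differ from tile to tile.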
If you are already using a texture atlas and still aren't getting acceptable performance, then you can move on to buffer objects and the like. But you won't get any better performance from buffer objects if you don't do this first.
If all of your tiles cannot fit into a single texture, then you will need to do one of two things: use multiple textures (rendering as many tiles with each texture in one glBegin/glEnd pair as possible), or use a texture array. Texture arrays are available in OpenGL 3.0-level hardware only. That means any Radeon HDxxxx or GeForce 8xxxx or better.
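For the multiple-texture case, one way to render as many tiles as possible per bind is to reorder the tiles so that those sharing a texture are adjacent. The sketch below is one possible approach, not something the answer prescribes; `TextureBatcher` is a hypothetical helper, and the texture values are assumed to be small indices `0..textureCount-1` rather than GL texture names.

```java
class TextureBatcher {
    // Returns tile indices reordered so that tiles sharing a texture are adjacent.
    // tileTexture[i] is the texture index (0..textureCount-1) used by tile i.
    // This is a stable counting sort, so it runs in linear time.
    static int[] orderByTexture(int[] tileTexture, int textureCount) {
        int[] start = new int[textureCount + 1];
        for (int t : tileTexture) {
            start[t + 1]++;                      // count tiles per texture
        }
        for (int t = 0; t < textureCount; t++) {
            start[t + 1] += start[t];            // prefix sums: first slot of each group
        }
        int[] order = new int[tileTexture.length];
        for (int i = 0; i < tileTexture.length; i++) {
            order[start[tileTexture[i]]++] = i;  // place tile in its group
        }
        return order;
    }

    // Number of texture binds needed when drawing in the given order,
    // binding only when the texture differs from the previous tile's.
    static int bindsNeeded(int[] tileTexture, int[] order) {
        int binds = 0;
        int current = -1;
        for (int i : order) {
            if (tileTexture[i] != current) {
                current = tileTexture[i];
                binds++;
            }
        }
        return binds;
    }
}
```

Drawing in the returned order needs one bind and one begin/end pair per texture actually in use, instead of one every time the texture changes between neighbouring tiles.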
You mentioned that you sometimes render "features" on top of tiles. These features likely use blending and different glTexEnv modes from regular tiles. In this case, you need to find ways to group similar features into a single glBegin/glEnd pair.
As you may be gathering from this, the key to performance is minimizing the number of times you call glBindTexture and glBegin/glEnd. Do as much work as possible in each glBegin/glEnd.
If you wish to proceed with a buffer-based approach (and you should only bother if the texture atlas approach didn't get your performance up to par), it's fairly simple. Put all of your tile "chunks" into a single buffer object. Don't make a buffer for each one; there's no real reason to do so, and 40x40 tiles' worth of vertex data is only 12,800 bytes. You can put 81 such chunks in a single 1MB buffer. This way, you only have to call glBindBuffer once for your terrain, which again saves you performance.
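The arithmetic behind those figures can be written down as a small layout helper. This is a sketch that takes the 12,800 bytes per chunk at face value and reads "1MB" as 1,048,576 bytes, which is an assumption; `TerrainBufferLayout` and its members are hypothetical names.

```java
class TerrainBufferLayout {
    static final int TILES_PER_CHUNK = 40 * 40;  // 1,600 tiles
    static final int BYTES_PER_CHUNK = 12_800;   // figure quoted in the answer
    static final int BYTES_PER_TILE = BYTES_PER_CHUNK / TILES_PER_CHUNK; // 8
    static final int BUFFER_BYTES = 1024 * 1024; // "1MB" read as 1,048,576 bytes (assumption)

    // How many whole chunks fit in the shared buffer.
    static int chunkCapacity() {
        return BUFFER_BYTES / BYTES_PER_CHUNK;
    }

    // Byte offset at which the chunk stored in the given slot begins.
    static long chunkOffset(int slot) {
        if (slot < 0 || slot >= chunkCapacity()) {
            throw new IllegalArgumentException("slot out of range: " + slot);
        }
        return (long) slot * BYTES_PER_CHUNK;
    }
}
```

With these numbers `chunkCapacity()` is 81 and the last slot starts at byte 1,024,000, so the 9 chunks around the player use only a small part of the buffer.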
I would need to know more about these "features" you sometimes use to suggest a way to optimize them. But as for dynamic buffers, I wouldn't worry. Just use glBufferSubData to update the part of the buffer in question. If this turns out to be slow, there are several options for making it faster that you can employ. But you shouldn't bother unless you know that it is necessary, since they're complex.
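As a rough illustration of updating only part of the buffer, the sketch below computes the byte range of a single tile and overwrites just that range in a CPU-side `byte[]` standing in for the buffer object. The row-major tile order and the 8 bytes per tile (12,800 bytes / 1,600 tiles) are assumptions about the layout, and `TileRangeUpdate` is a hypothetical name. With LWJGL and the terrain buffer bound, the corresponding upload would go through `GL15.glBufferSubData` with the same offset.

```java
class TileRangeUpdate {
    static final int CHUNK_TILES_PER_SIDE = 40;
    static final int BYTES_PER_TILE = 8; // 12,800 bytes / 1,600 tiles; exact layout is an assumption
    static final int BYTES_PER_CHUNK =
            CHUNK_TILES_PER_SIDE * CHUNK_TILES_PER_SIDE * BYTES_PER_TILE; // 12,800

    // Byte offset of one tile, assuming chunks are stored back to back
    // and tiles within a chunk are stored row by row.
    static long tileOffset(int chunkSlot, int tileX, int tileY) {
        int tileIndex = tileY * CHUNK_TILES_PER_SIDE + tileX;
        return (long) chunkSlot * BYTES_PER_CHUNK + (long) tileIndex * BYTES_PER_TILE;
    }

    // Stand-in for a sub-range upload: overwrites only the bytes of one tile.
    static void updateTile(byte[] buffer, int chunkSlot, int tileX, int tileY, byte[] tileData) {
        if (tileData.length != BYTES_PER_TILE) {
            throw new IllegalArgumentException("expected " + BYTES_PER_TILE + " bytes");
        }
        int offset = (int) tileOffset(chunkSlot, tileX, tileY);
        System.arraycopy(tileData, 0, buffer, offset, tileData.length);
        // With LWJGL, the equivalent GPU-side update would be roughly:
        // GL15.glBufferSubData(GL15.GL_ARRAY_BUFFER, offset, data); // data: a ByteBuffer
    }
}
```

Changing one tile then touches 8 bytes rather than the 12,800 bytes of the whole chunk; several changes in the same chunk could also be collected and uploaded as one contiguous range.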
Sprites are probably something that benefits the absolute least from a buffer object approach. There's really nothing to be gained by it over immediate mode. Even if you're rendering hundreds of them, each one will have its own transformation matrix. Which means that each one will have to be a separate draw call. So it may as well be glBegin/glEnd.