Typical rendering strategies for many different complex objects in DirectX?
I am learning DirectX. It provides a huge amount of freedom in how to do things, but presumably different strategies perform differently, and it provides little guidance as to what well-performing usage patterns might be.
When using DirectX, is it typical to have to swap in a bunch of new data multiple times on each render?
The most obvious, and probably really inefficient, way to use it would be like this.
Strategy 1
On every single render
Load everything for model 0 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything for model 1 (textures included) and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
I am guessing you can make this partly more efficient if the biggest things to load are given dedicated slots, e.g. if the texture for model 0 is really complicated, don't reload it on each step, just load it into slot 1 and leave it there. Of course, since I'm not sure how many slots (registers) of each type are guaranteed to exist in DX11, this is complicated (can anyone point to documentation on that?)
Strategy 2
Choose some texture slots for loading and others for perpetual storage of your most complex textures.
Once only
Load most complicated models, shaders and textures into slots dedicated for perpetual storage
On every single render
Load everything not already present for model 0 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
Load everything not already present for model 1 using slots you set aside for loading and render it (IASetVertexBuffers, VSSetShader, PSSetShader, PSSetShaderResources, PSSetConstantBuffers, VSSetConstantBuffers, Draw)
etc...
Strategy 3
I have no idea, but the above are probably all wrong because I am really new at this.
What are the standard strategies for rendering as efficiently as possible on DirectX (specifically DX11)?
Comments (2)
DirectX manages the resources for you and tries to keep them in video memory as long as it can to optimize performance, but can only do so up to the limit of video memory in the card. There is also overhead in every state change even if the resource is still in video memory.
A general strategy for optimizing this is to minimize the number of state changes during the rendering pass. Commonly this means drawing all polygons that use the same texture in one batch, and all objects that use the same vertex buffers in one batch. So generally you would try to draw as many primitives as you can before changing the state to draw more primitives.
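As a rough illustration of that batching idea, here is a minimal sketch that sorts draw calls by the state they need and then counts how many binds a render loop would issue. It is not real D3D11 code: `DrawItem`, `sortByState` and `countStateChanges` are hypothetical names, and plain non-negative integer ids stand in for the shader, texture and vertex buffer objects you would actually bind.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one draw call. In real D3D11 code these ids
// would be pointers to shader, shader resource view and buffer objects.
// Ids are assumed to be non-negative.
struct DrawItem {
    int shaderId;
    int textureId;
    int vertexBufferId;
};

// Order items so that draws sharing a shader, then a texture, then a
// vertex buffer end up next to each other.
inline void sortByState(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) {
                  return std::tie(a.shaderId, a.textureId, a.vertexBufferId) <
                         std::tie(b.shaderId, b.textureId, b.vertexBufferId);
              });
}

// Walk the list the way a render loop would, "binding" a piece of state
// only when it differs from what is currently bound, and count the binds.
inline std::size_t countStateChanges(const std::vector<DrawItem>& items) {
    std::size_t changes = 0;
    int shader = -1, texture = -1, vertexBuffer = -1; // -1: nothing bound yet
    for (const DrawItem& item : items) {
        if (item.shaderId != shader) {             // VSSetShader / PSSetShader
            shader = item.shaderId;
            ++changes;
        }
        if (item.textureId != texture) {           // PSSetShaderResources
            texture = item.textureId;
            ++changes;
        }
        if (item.vertexBufferId != vertexBuffer) { // IASetVertexBuffers
            vertexBuffer = item.vertexBufferId;
            ++changes;
        }
        // The Draw call for this item would be issued here.
    }
    return changes;
}
```

For four draws that alternate between two models, `{0,0,0}, {1,1,1}, {0,0,0}, {1,1,1}`, the unsorted list costs 12 binds in this model, while the sorted list costs 6. The sort order (shader first, then texture, then vertex buffer) is only an assumption about which changes are most expensive; profiling should decide the real order.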
This often will make the rendering code a little more complicated and harder to maintain, so you will want to do some profiling to determine how much optimization you are willing to do.
Generally you will get bigger performance gains from higher-level algorithmic changes that are beyond the scope of this question. Some examples would be reducing polygon counts for distant objects and using occlusion queries. A popular and true saying is "the fastest polygons are the ones you don't draw". Here are a couple of quick links:
http://msdn.microsoft.com/en-us/library/bb147263%28v=vs.85%29.aspx
http://www.gamasutra.com/view/feature/3243/optimizing_direct3d_applications_.php
http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter06.html
Other answers are better answers to the question per se, but by far the most relevant thing I have found since asking is this discussion on gamedev.net, in which some big-title games are profiled for state changes and draw calls.
What comes out of it is that big-name games don't appear to worry too much about this. In other words, writing code that addresses this sort of issue can take significant time, and that time probably isn't worth the delay in getting your application finished.