Is there any way to overcome the graphics API CPU-bound bottleneck on PC?

Posted on 2024-12-19 07:33:36

Recently, I have been spending a lot of my time researching GPUs, and I have come across several articles about how PC games have a hard time staying ahead of console games because of limitations in the graphics APIs. For example, on the Xbox 360, it is my understanding that games run in kernel mode and that, because the hardware is always the same, games can be programmed "closer to the metal" and the DirectX API has less abstraction. On PC, however, making the same number of draw calls with DirectX or OpenGL may take more than twice as long as on a console, due to switching to kernel mode and the additional layers of abstraction. I am interested in hearing possible solutions to this problem.

I have heard of a few solutions, such as programming directly on the hardware. However, from what I understand, while ATI has released the specifications of their low-level API, nVidia keeps theirs secret, so that wouldn't work too well, not to mention the added development time of maintaining different profiles.

Would programming an entire "software rendering" solution in OpenCL and running that on a GPU be any better? My understanding is that games with a lot of draw calls are CPU bound and that the calls are single-threaded (on PC, that is), so is OpenCL a viable option?

So the question is:
What are possible methods to increase the efficiency of, or even remove the need for, graphics APIs such as OpenGL and DirectX?


Comments (3)

自控 2024-12-26 07:33:36

The general solution is to not make as many draw calls. Texture atlases via array textures, instancing, and various other techniques make this possible.
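As a rough illustration of the idea, here is a minimal CPU-side sketch in Python. It assumes a hypothetical scene of `Sprite` objects (the class, mesh names, and texture names are all made up for the example) and shows how grouping by mesh, with each texture mapped to a layer of one array texture, turns one draw call per object into one instanced draw per mesh. It does not talk to a GPU; in a real renderer each batch would correspond to a single instanced draw such as OpenGL's `glDrawElementsInstanced`.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sprite:
    mesh: str        # hypothetical mesh name
    texture: str     # hypothetical texture name
    position: tuple  # (x, y, z) world position


def build_instanced_batches(sprites):
    """Group sprites by mesh; textures become layer indices of one array texture."""
    # Each distinct texture gets a layer in a single (hypothetical) array texture,
    # so a texture change no longer forces a separate draw call.
    layer_of = {}
    for s in sprites:
        layer_of.setdefault(s.texture, len(layer_of))

    # One batch per mesh; per-instance data carries position and texture layer.
    batches = defaultdict(list)
    for s in sprites:
        batches[s.mesh].append((s.position, layer_of[s.texture]))
    return dict(batches), layer_of


sprites = [
    Sprite("quad", "grass", (0, 0, 0)),
    Sprite("quad", "stone", (1, 0, 0)),
    Sprite("quad", "grass", (2, 0, 0)),
    Sprite("cube", "stone", (0, 1, 0)),
]
batches, layer_of = build_instanced_batches(sprites)
naive_calls = len(sprites)    # one draw call per object
batched_calls = len(batches)  # one instanced draw call per mesh
print(naive_calls, batched_calls)
```

The per-instance list is what would be uploaded as an instance buffer; the shader would then pick the texture layer from that per-instance value instead of relying on a texture bind between draws.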

Or to just use the fact that modern computers have a lot more CPU performance than consoles. Or even better, make yourself GPU bound. After all, if your CPU is your bottleneck, then that means you have GPU power to spare. Use it.
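To make the "GPU power to spare" point concrete, here is a toy frame-time model in Python. It assumes CPU submission and GPU execution overlap, so the frame time is roughly the larger of the two; every number in it (cost per draw call, GPU time per frame) is invented purely for illustration and is not a measurement of any real system.

```python
def frame_time_us(draw_calls, cpu_us_per_call, gpu_us):
    """Toy model: CPU and GPU work overlap, so the slower side sets the frame time."""
    cpu_us = draw_calls * cpu_us_per_call
    bound = "CPU" if cpu_us > gpu_us else "GPU"
    return max(cpu_us, gpu_us), bound


# Hypothetical numbers: 4 microseconds of CPU time per draw call.
baseline = frame_time_us(5000, 4, 8000)        # many calls, light GPU work
heavier_gpu = frame_time_us(5000, 4, 16000)    # double the GPU work per frame
fewer_calls = frame_time_us(500, 4, 8000)      # batch down to a tenth of the calls

print(baseline, heavier_gpu, fewer_calls)
```

Under these assumptions, doubling the GPU work leaves the frame time unchanged while the frame is CPU bound, which is the sense in which the spare GPU capacity is free to use; cutting the draw-call count is what moves the bottleneck to the GPU.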

OpenCL is not a "solution" to anything related to this. OpenCL has no access to any of the many things one would need in order to actually use a GPU for rendering. To use OpenCL for graphics, you would have to do without the GPU's rasterizer/clipper, its specialized buffers for transferring information from stage to stage, the post-T&L cache, and the blending/depth comparison/stencil/etc. hardware. All of that is fixed-function, extremely fast, and specialized, and it is completely unavailable to OpenCL.

And even then, it wouldn't actually stop you from being CPU bound. You still have to marshal what you're rendering and so forth. And you probably won't have access to the graphics FIFO, so you'll have to find another way to feed your shaders.

Or, to put it another way, this is a "problem" that doesn't need solving.

瑾夏年华 2024-12-26 07:33:36

If you try to write a renderer in OpenCL, you will end up with something resembling OpenGL and DirectX. You will also most likely end up with something much slower than these APIs which were developed by many experts over many years. They are specialized to handle efficient rasterizing and use internal hooks not available to OpenCL. It could be a fun project, but definitely not a useful one.

Nicol Bolas already gave you some good techniques to increase the load on the GPU relative to the CPU. The final answer is, of course, that the best technique will depend on your specific domain and constraints. For example, if your rendering calls for lots of pixel overdraw with complicated shaders and lots of textures, the CPU will not be the bottleneck. However, the most important general rule with modern hardware is to limit the number of OpenGL calls through better batching.
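One common way to batch better is to sort draw commands by render state so that commands sharing a shader and texture can be submitted together. The Python sketch below is only an illustration: the command tuples and all names in them are hypothetical, and it assumes draw order does not matter (for example, opaque geometry only), which is not true for blended or transparent objects.

```python
from itertools import groupby


def state_of(command):
    """A command is a (shader, texture, mesh) tuple; its state is the first two fields."""
    return (command[0], command[1])


def count_state_changes(commands):
    """Count how often the (shader, texture) state differs from the previous command."""
    changes = 0
    previous = None
    for command in commands:
        state = state_of(command)
        if state != previous:
            changes += 1
            previous = state
    return changes


def batch_by_state(commands):
    """Sort by state, then merge runs of equal state into one batch each."""
    ordered = sorted(commands, key=state_of)  # stable sort keeps submission order per state
    return [(state, [c[2] for c in group])
            for state, group in groupby(ordered, key=state_of)]


# Hypothetical frame: shader name, texture name, mesh name.
commands = [
    ("lit", "brick", "wall_a"),
    ("unlit", "sky", "dome"),
    ("lit", "brick", "wall_b"),
    ("lit", "wood", "door"),
    ("lit", "brick", "wall_c"),
]
unsorted_changes = count_state_changes(commands)
batches = batch_by_state(commands)
print(unsorted_changes, len(batches))
```

In this made-up frame the submission order forces a state change on every command, while the sorted version needs only one state setup per distinct shader/texture pair.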

§对你不离不弃 2024-12-26 07:33:36

APIs. For example, on Xbox 360, it is my understanding that the games run in kernel mode, and that because the hardware will always be the same, the games can be programmed "closer to the metal" and the DirectX API has less abstraction. On PC however, making the same number of draw calls with DirectX or OpenGL may take more than twice as long as on console due to switching to kernel mode and more layers of abstraction.

The benefits of close-to-metal operation on consoles are more than offset on PCs by their much greater CPU performance and available memory. Add to this that console HDDs are not nearly as fast as modern PC ones (SATA-1 vs. SATA-3, or even just PATA), and that many games load their content from an optical drive, which is slower still.

The PS3, for example, offers only 256 MiB of memory for game logic and another 256 MiB of RAM for graphics, and that is all you get to work with. The Xbox 360 offers 512 MiB of unified RAM, so you have to squeeze everything into that. Now compare this with a low-end PC, which easily comes with 2 GiB of RAM for the program alone. Even the cheapest graphics cards offer at least 512 MiB of RAM. A gamer's machine will have several GiB of RAM, and its GPU will offer somewhere between 1 GiB and 2 GiB.

These console constraints severely limit what a game developer can do, and many PC gamers lament that so many games are "consoleish" even though their PCs could do so much more.
