Why not use GDI to repeatedly fill a window with RGB data from an array?
This is a follow-up to this question. I'm currently writing a simple game and am looking for the fastest way to (repeatedly) display an array of RGB data in a Win32 window, without flickering or other artifacts.
Several different approaches were recommended in the answers to the previous question, but there was no consensus on which would be the fastest. So, I threw together a test program. The code simply displays a framebuffer on the screen repeatedly, as fast as possible.
These are the results I obtained, for 32-bit data running in a 32-bit video mode - they may surprise some people:
- Direct3D (1): 500 fps
- Direct3D (2): 650 fps
- DirectDraw (3): 1100 fps
- DirectDraw (4): 800 fps
- GDI (SetDIBitsToDevice): 2000 fps
Given these figures:
- Why are many people adamant that GDI is simply too slow for this operation?
- Is there any reason to prefer DirectDraw or Direct3D over SetDIBitsToDevice?
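For reference, the GDI path being compared here boils down to one call per frame. The following is only a minimal sketch, not the original test program: `FrameBuffer`, `dibHeightForTopDown` and `presentWithGdi` are hypothetical names, and it assumes a 32-bit top-down buffer drawn at the window's top-left corner. The Win32 part is guarded so the helper types still compile elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical frame buffer: 32-bit pixels, rows stored top to bottom.
struct FrameBuffer {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;  // 0x00RRGGBB per pixel

    FrameBuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h) {
        assert(w > 0 && h > 0);
    }
};

// A DIB with a positive height is stored bottom-up, so a top-down buffer
// is described to GDI with a negative height.
inline long dibHeightForTopDown(int height) {
    return -static_cast<long>(height);
}

#ifdef _WIN32
// Copies the whole frame to the top-left corner of the device context.
// Returns true if GDI reports that every scan line was set.
inline bool presentWithGdi(HDC hdc, const FrameBuffer& fb) {
    BITMAPINFO bmi = {};
    bmi.bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    bmi.bmiHeader.biWidth = fb.width;
    bmi.bmiHeader.biHeight = dibHeightForTopDown(fb.height);
    bmi.bmiHeader.biPlanes = 1;
    bmi.bmiHeader.biBitCount = 32;
    bmi.bmiHeader.biCompression = BI_RGB;

    const int lines = SetDIBitsToDevice(
        hdc, 0, 0, fb.width, fb.height,  // destination origin and size
        0, 0, 0, fb.height,              // source origin, first scan line, line count
        fb.pixels.data(), &bmi, DIB_RGB_COLORS);
    return lines == fb.height;
}
#endif
```

In a real window this would typically be called with the `HDC` obtained from `GetDC` or from `BeginPaint` inside `WM_PAINT`.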
Here is a brief summary of the calls made by each of the Direct* codepaths. If anyone knows a more efficient way to use DirectDraw/Direct3D, please comment.
1. CreateTexture(D3DUSAGE_DYNAMIC, D3DPOOL_DEFAULT);
   LockRect(); memcpy(); UnlockRect(); DrawPrimitive()
2. CreateTexture(0, D3DPOOL_SYSTEMMEM); CreateTexture(0, D3DPOOL_DEFAULT);
   LockRect(); memcpy(); UnlockRect(); UpdateTexture(); DrawPrimitive()
3. CreateSurface(); SetSurfaceDesc(lpSurface = &frameBuffer[0]);
   memcpy(); primarySurface->Blt();
4. CreateSurface();
   Lock(); memcpy(); Unlock(); primarySurface->Blt();
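The timing loop itself is not shown in the question. A rough sketch of how such a measurement might look, assuming each display path is wrapped in a callable, is below. `measureFps` is a hypothetical helper, not part of the original test program.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Calls `present` `frames` times back to back and returns the average
// number of calls per second. `present` stands in for whichever display
// path is under test (GDI, DirectDraw, Direct3D).
inline double measureFps(const std::function<void()>& present, int frames) {
    assert(frames > 0);
    using Clock = std::chrono::steady_clock;

    const auto start = Clock::now();
    for (int i = 0; i < frames; ++i) {
        present();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return frames / elapsed.count();
}
```

A call such as `measureFps([&] { /* present one frame */ }, 1000)` would then give one figure per code path. Note that this only times how long the calls take to return; a path that queues work for the GPU may look faster or slower than it really is unless the queue is allowed to drain.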
There are a couple of things to keep in mind here. First of all, a lot of "common knowledge" is based on some facts that no longer really apply.
In the days of AGP, when the CPU talked directly to the GPU, it always used the base PCI protocol, which happened at the "1x" rate (always and inevitably). AGP 2x/4x/8x only applied when the GPU was talking to the memory controller directly. In other words, depending on when you looked, it was up to 8 times as fast to have the GPU load a texture from memory as it was for the CPU to send the same data directly to the GPU. Of course, the CPU also had a great deal more bandwidth to memory than the PCI bus supported.
When things switched to PCI-E, however, that changed completely. While there can be differences in bandwidth depending on path, there's no general rule that memory->GPU will be faster than CPU->GPU. The one generalization that's (mostly) safe is that if you have a dedicated graphics card, then the GPU will almost always have more bandwidth to the memory on the graphics card than it does to main memory on the motherboard.
In your case, that doesn't matter much though -- you're talking about moving data from CPU space to GPU space regardless. The main speed difference with using DirectX (or OpenGL) happens when you keep all (or most) of the computation on the GPU, and avoid using the CPU (or main memory) at all. They don't (now that AGP is history) provide any substantial improvement in memory->display bandwidth.
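To put the frame rates in bandwidth terms, the arithmetic is simply bytes per frame times frames per second. The question does not state the window size, so the 640x480 used in the note below is purely an assumption for illustration, and `bytesPerSecond` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second that must reach the display to show `fps` frames of
// width x height pixels, each pixel taking `bytesPerPixel` bytes.
inline std::uint64_t bytesPerSecond(std::uint32_t width, std::uint32_t height,
                                    std::uint32_t bytesPerPixel, std::uint32_t fps) {
    assert(width > 0 && height > 0 && bytesPerPixel > 0);
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel * fps;
}
```

Under that assumed 640x480 size with 32-bit pixels, 2000 fps works out to 2,457,600,000 bytes per second (about 2.46 GB/s) and 500 fps to 614,400,000 bytes per second (about 0.61 GB/s). A different window size scales both figures proportionally.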
Jerry Coffin makes some good points. The thing to bear in mind is what the DI stands for in SetDIBitsToDevice: Device Independent. That means you were ALWAYS at the mercy of drivers. Some drivers used to be complete rubbish, and that affected performance massively. DirectDraw suffered from similar issues as well ... but you also had access to the hardware blitters, so it was generally more useful. IHVs also tended to put more time into writing proper drivers for DirectDraw because of its gaming association. Who wants to be at the bottom of the performance pile when the hardware is quite capable of doing better?
These days many graphics cards can accept the bit data directly so no conversion happens. If it does need to be swizzled this is also INCREDIBLY quick in this day and age.
The reason your Direct3D performance is so terrible, by comparison, is that Direct3D is meant to be used entirely within the GPU, so it uses odd and complex internal formats to improve cache performance and so forth.
Couple that with the fact that you aren't testing like for like (with DDraw and D3D): you create a texture/surface, lock it, copy, unlock, and then draw over the back buffer (via various methods). For best performance you'd be best off locking the back buffer directly using a DISCARD lock, memcpy'ing straight into the returned buffer, and then unlocking. This will bring your performance much closer to that of SetDIBitsToDevice. I would still expect D3D to be slower than DDraw, however, for the reasons outlined above.
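A rough sketch of that direct back buffer copy follows. It is an illustration under assumptions rather than tested production code: it assumes Direct3D 9, a device created with D3DPRESENTFLAG_LOCKABLE_BACKBUFFER, and a 32-bit back buffer the same size as the frame. `copyRows` and `copyFrameToBackBuffer` are hypothetical names. The row-by-row copy is there because the pitch of a locked surface is not guaranteed to equal width * 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#ifdef _WIN32
#include <d3d9.h>
#endif

// Copies `rows` rows of `rowBytes` bytes each. Source and destination may
// have different pitches (bytes from the start of one row to the next).
inline void copyRows(void* dst, std::size_t dstPitch,
                     const void* src, std::size_t srcPitch,
                     std::size_t rowBytes, std::size_t rows) {
    assert(rowBytes <= dstPitch && rowBytes <= srcPitch);
    auto* d = static_cast<std::uint8_t*>(dst);
    const auto* s = static_cast<const std::uint8_t*>(src);
    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(d + y * dstPitch, s + y * srcPitch, rowBytes);
    }
}

#ifdef _WIN32
// Locks the back buffer with D3DLOCK_DISCARD, copies the frame in, unlocks.
// Assumption: the device was created with D3DPRESENTFLAG_LOCKABLE_BACKBUFFER.
inline bool copyFrameToBackBuffer(IDirect3DDevice9* device,
                                  const std::uint32_t* frame,
                                  UINT width, UINT height) {
    IDirect3DSurface9* backBuffer = nullptr;
    if (FAILED(device->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &backBuffer))) {
        return false;
    }

    D3DLOCKED_RECT locked = {};
    const bool ok = SUCCEEDED(backBuffer->LockRect(&locked, nullptr, D3DLOCK_DISCARD));
    if (ok) {
        const std::size_t rowBytes = width * sizeof(std::uint32_t);
        copyRows(locked.pBits, static_cast<std::size_t>(locked.Pitch),
                 frame, rowBytes, rowBytes, height);
        backBuffer->UnlockRect();
    }
    backBuffer->Release();
    return ok;
}
#endif
```

After a successful copy the frame would be shown with the device's usual `Present` call, with no texture or `DrawPrimitive` involved.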
The reason you will hear people trounce GDI is that it used to be just the old Windows API calls. The newer versions of it (called GDI+ when I last looked at them) are, as I understand it, just an API placed on top of DirectX calls. So using GDI may seem fairly simple programming-wise at times, but adding a layer between things always slows things down. As mentioned in the response from Jerry Coffin, your examples are about moving the data, and that is where the time goes. I am a bit surprised that DirectX is that much slower, though, and I cannot be of much more help without digging through the DirectX documentation (which has been pretty awesome for quite some time, really). You might want to check out www.codesampler.com. I have always found good starting places there, and, while I may be insane for saying this, I would swear the improvements to the DirectX SDK docs and examples were based on this guy's work!
As for the DirectDraw vs. Direct3D (as opposed to GDI) discussion, I would say go with Direct3D. I believe DirectDraw has been deprecated since 8.0 or so, and 9.0 has been around for quite a long while. At the end of the day all of DirectX is 3D; it just varies in how many helpful 2D APIs are around, and you may find you can do some very interesting things in a 2D environment when you are actually using 3D space. (I had a pretty neat randomly generated lightning weapon for a Space Invaders clone at one time :))
Anywho, hope this helped!
PS: It should be noted that DirectX is not always the fastest. For keyboard input (unless this has changed in 10 or 11) it has pretty much always been recommended to use the Windows events, as DirectInput was actually just a wrapper for that system! XInput, however, is -awesome-!!