Understanding the behavior of IDirect3DDevice9::Present when it blocks on vsync

Posted on 2024-09-24 15:53:21

I'm developing a scientific application that has to estimate, as accurately as possible, the time difference between an object being drawn in the video back buffer and the point at which that object actually becomes visible on the screen. In other words, I need to understand how DirectX on Windows XP+ deals with the monitor's vertical refresh cycle.

I'll start by saying that my video routines are based on the SDL 1.3 library. As a result, I don't have immediate access to the DirectX API, but this could be changed if necessary. DirectX is being initialized with D3DSWAPEFFECT_DISCARD, D3DPRESENT_INTERVAL_ONE, and BackBufferCount = 1 in full-screen mode. Those seem to be the most critical parameters, but I'm happy to dig through the rest of the SDL code if more information is needed.

The D3DPRESENT_INTERVAL_ONE flag ensures that back and front buffers are swapped no more than once per refresh cycle, and never in the middle of a refresh (it basically enables vsync). Indeed, if I have a simple loop that just continually calls IDirect3DDevice9::Present (SDL_RenderPresent, in my case), this function will block for the number of milliseconds between two refresh cycles (16.67ms with 60Hz, 10ms with 100Hz, etc.).
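The figures quoted above follow from period = 1000 / refresh rate. A minimal sketch of that arithmetic, together with a helper for timing how long one present call blocks, might look like this. The names `refresh_period_ms` and `time_present_ms` are hypothetical, and `present` stands in for whatever wraps SDL_RenderPresent in the real application.

```cpp
#include <cassert>
#include <chrono>

// Duration of one refresh cycle in milliseconds for a given refresh rate.
double refresh_period_ms(double refresh_hz) {
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}

// Measures how long a single call to `present` blocks, in milliseconds.
// `present` is a placeholder (an assumption of this sketch) for a callable
// that wraps SDL_RenderPresent or IDirect3DDevice9::Present.
template <typename PresentFn>
double time_present_ms(PresentFn&& present) {
    const auto start = std::chrono::steady_clock::now();
    present();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Comparing `time_present_ms` against `refresh_period_ms` over many iterations shows which calls block for a full cycle and which return early.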

Here's my question... Suppose I draw a white square in the back buffer and call SDL_RenderPresent, which blocks for 16.67 ms (assuming 60Hz refresh). What can I conclude about the state of the visible image on the monitor when the call to SDL_RenderPresent returns? Here are the possibilities, as I see it:

  1. The white square was just drawn on the monitor.
  2. The white square is about to be drawn (in less than 1 ms).
  3. The previous front buffer was just drawn; it will take another refresh cycle (16.67 ms) before my white square appears (calling SDL_RenderPresent again will get me to case 1).
  4. The previous front buffer was drawn in the last 16.67 ms, my white square is next, but the exact time till the next refresh is unknown.

From all the reading that I've done, I'm leaning toward option 3, but I can't find any guarantees against 4. In my configuration, the Present function should block only if it is being called for the second time during a pause between two refresh cycles. Since the goal is to swap the front and back buffers, the earliest point at which the second call can do this is just after the monitor was refreshed (previous front buffer was just drawn). It is at that point that the back buffer containing my white square can be moved to the front, but it must wait for (at most) 16.67 ms before the monitor will actually read and display the buffer contents. Ideally, I'd like to hear that the function should always return as soon as the previous refresh cycle is finished.

Can anyone more experienced with DirectX provide any insight on this topic? Are my assumptions correct or am I missing something? Will these assumptions always be correct for any system that has DirectX support, or could the logic change depending on the video card, monitor, or some other things?

As a final minor question, going back to my loop that just calls SDL_RenderPresent over and over again, I noticed that the first 3 or 4 calls return immediately, while all subsequent ones wait for the refresh cycle. Am I correct in assuming that the D3DPRESENT_INTERVAL_ONE restriction is simply being ignored prior to the first refresh (as opposed to some sort of queuing taking place with more than 2 buffers that I'm expecting to have)?

In other words, suppose the loop is entered with ~8ms to go until the next refresh cycle. It might be able to swap the front and back buffers 4 times during this period. Until that first refresh happens, SDL_RenderPresent will return immediately (since we technically don't have any front buffers for now, only 2 back buffers), but the blocking will start to take place as soon as one of those buffers is shown on the screen. Is this a valid explanation or not?

[edit]

Based on the replies below, it's clear that my approach using vsync and Present would not work. I think I found another way to achieve the desired result, so I'm posting it here in case someone can spot errors in my thinking, or just for the information of anyone else working on a similar problem.

The first step is to get rid of D3DPRESENT_INTERVAL_ONE by using D3DPRESENT_INTERVAL_IMMEDIATE instead. That disables vsync and ensures that any call to SDL_RenderPresent will return immediately. Next, you can use IDirect3DDevice9::GetRasterStatus to get information about the current monitor state. It fills a D3DRASTER_STATUS structure with a boolean field (InVBlank) that is set to true during the pause between two refresh cycles, and another field (ScanLine) that tells you the current scanline during an active refresh. Using these two pieces of information, it's possible to implement your own vertical synchronization routines, albeit by running a loop that constantly polls the monitor status and thus consumes 100% of one CPU core. This is acceptable for my needs.
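A minimal sketch of such a polling loop is shown here. To keep it testable without a Direct3D device, the raster query is passed in as a callable. `RasterStatus`, `RasterPoller` and `wait_for_vblank_start` are hypothetical names, and in a real program the callable would be assumed to wrap IDirect3DDevice9::GetRasterStatus and copy the InVBlank and ScanLine fields of D3DRASTER_STATUS.

```cpp
#include <cassert>
#include <functional>

// Mirrors the two D3DRASTER_STATUS fields this approach relies on.
struct RasterStatus {
    bool in_vblank;      // corresponds to InVBlank
    unsigned scan_line;  // corresponds to ScanLine
};

// Hypothetical abstraction: returns the current raster status when called.
using RasterPoller = std::function<RasterStatus()>;

// Busy-waits for the next transition from active scanout into vertical blank.
// Returns the number of polls performed, which is handy for diagnostics.
unsigned long wait_for_vblank_start(const RasterPoller& poll) {
    unsigned long polls = 0;
    // If we are already inside a blank, let it finish so we catch a fresh edge.
    for (;;) {
        ++polls;
        if (!poll().in_vblank) break;
    }
    // Now spin until the blank begins.
    for (;;) {
        ++polls;
        if (poll().in_vblank) break;
    }
    return polls;
}
```

Waiting for the edge rather than just for "in blank" avoids returning near the end of a blank interval that is already in progress.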

There is still the question of buffering - how do I know which frame is to be drawn on the screen when I call SDL_RenderPresent? I think I found a way to determine this, which relies on my ability to know what line on the monitor is currently being drawn. Here's the basic logic:

  1. Wait for a new refresh cycle to start (pause = false, scanline = 0).
  2. Fill the next back buffer with red color and call Present.
  3. Wait for scanline to reach 32.
  4. Fill the next back buffer with green and call Present.

And so on... In my demo implementation I used red, green, blue, and finally black. The idea is that you would see the RGB color pattern only if GetRasterStatus provides accurate information about the refresh status, and the front and back buffers are flipped immediately when SDL_RenderPresent is called. If either of those conditions is not met, you may not see anything, the colors could be swapped or overlapping, etc. If, on the other hand, you see a constant RGB pattern at the top of the screen for each frame, then this proves that you have direct control over the drawn image.
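The band test described above can be sketched as follows. This is an illustration under assumptions, not the demo implementation itself: `draw_calibration_frame`, `kBandColors`, `poll` and `fill_and_present` are hypothetical names, the raster query is assumed to wrap GetRasterStatus, and `fill_and_present` is assumed to clear the back buffer to the given color and then call Present.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct RasterStatus {
    bool in_vblank;      // corresponds to D3DRASTER_STATUS::InVBlank
    unsigned scan_line;  // corresponds to D3DRASTER_STATUS::ScanLine
};

// Band colors as 0xRRGGBB: red, green, blue, black.
constexpr std::array<std::uint32_t, 4> kBandColors = {
    0xFF0000, 0x00FF00, 0x0000FF, 0x000000};

// Drives one frame of the pattern. Returns false if the frame ended before
// all bands were presented, which suggests the polling loop was too slow.
template <typename PollFn, typename PresentFn>
bool draw_calibration_frame(PollFn&& poll, PresentFn&& fill_and_present,
                            unsigned band_height = 32) {
    assert(band_height > 0);
    // Step 1: wait for a blank, then for active scanout to begin.
    while (!poll().in_vblank) {}
    while (poll().in_vblank) {}
    for (std::size_t i = 0; i < kBandColors.size(); ++i) {
        const unsigned threshold = static_cast<unsigned>(i) * band_height;
        // Wait until the beam has reached the top of band i.
        for (;;) {
            const RasterStatus s = poll();
            if (s.in_vblank) return false;  // ran out of frame
            if (s.scan_line >= threshold) break;
        }
        fill_and_present(kBandColors[i]);
    }
    return true;
}
```

Each color is presented just after the beam passes the start of its band, so the previous color remains visible only in the band above it.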

I should add that I tested this theory on several computers at work today. Most did display the pattern, but at least one had the entire screen painted red. A few would have the color bands jump up and down, indicating some inconsistency in swapping the buffers. This usually happened on older machines. I think this is a good calibration test to determine if the hardware is suitable for our testing purposes.

玉环 2024-10-01 15:53:21

I highly recommend you look at Microsoft's GPUView. Here is a web page by one of its authors that introduces the tool.

  • D3D will typically buffer more than one frame's worth of rendering commands (including presents). For an example, see slide 25, where we can see ~3 frames being buffered on the BumpEarth Device queue. This explains why your first 3-4 calls return immediately (the Present packets are the crossed ones): they just get queued.
  • Unless you're doing full-screen rendering, the OS needs to do some compositing (the same slide shows the compositing happening on vsync, marked by the blue vertical line).
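To illustrate the queuing effect, here is a toy model, not a description of how any real driver works. It assumes rendering costs no time, the queue holds a fixed number of pending presents, and each vsync retires exactly one of them. The function name and all parameters are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Simulates a tight loop of present calls against a fixed-depth queue.
// Returns the simulated time (in ms, starting at 0) at which each call returns.
std::vector<double> simulate_present_returns(int num_calls, int queue_depth,
                                             double first_vsync_ms,
                                             double period_ms) {
    assert(queue_depth > 0 && period_ms > 0.0);
    std::vector<double> returns;
    double now = 0.0;
    double next_vsync = first_vsync_ms;
    int queued = 0;
    for (int i = 0; i < num_calls; ++i) {
        if (queued == queue_depth) {
            // Queue full: block until the next vsync retires one frame.
            now = next_vsync;
            next_vsync += period_ms;
            --queued;
        }
        ++queued;  // this present is accepted into the queue
        returns.push_back(now);
    }
    return returns;
}
```

With a queue depth of 3, a first vsync 8 ms away and a 10 ms period, the first three calls return at time 0 and the following ones at 8, 18 and 28 ms, which matches the pattern of a few immediate returns followed by one return per refresh.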

Some consequences:

  • Present returning gives you no guarantee at all as to when the rendering commands you just sent will appear on screen.
  • The time your commands will take to render a frame is not easy to figure out. I've seen applications rely on the timings of previously rendered frames, smoothed to prevent ping-pong rendering changes.
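One common way to smooth previous frame timings is an exponential moving average. The answer does not say which filter those applications used, so the sketch below is only one possible choice, and `FrameTimeEstimator` is a hypothetical name.

```cpp
#include <cassert>

// Keeps a smoothed estimate of frame duration from past measurements.
class FrameTimeEstimator {
public:
    // alpha in (0, 1]: higher values react faster, lower values smooth more.
    explicit FrameTimeEstimator(double alpha) : alpha_(alpha) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Feeds one measured frame duration and returns the updated estimate.
    double add_sample(double frame_ms) {
        if (!has_value_) {
            estimate_ms_ = frame_ms;  // the first sample seeds the estimate
            has_value_ = true;
        } else {
            estimate_ms_ += alpha_ * (frame_ms - estimate_ms_);
        }
        return estimate_ms_;
    }

    double estimate_ms() const { return estimate_ms_; }

private:
    double alpha_;
    double estimate_ms_ = 0.0;
    bool has_value_ = false;
};
```

A small alpha damps the oscillation between alternating fast and slow frames, at the cost of reacting more slowly to a genuine change in workload.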

As additional comments:

  • I've witnessed ~1.5 frames' worth of command buffering in real-life workloads.
  • Even when the vsync happens and the video card updates the front buffer, the monitor can still do some buffering internally (more so since we left CRTs behind).

I've got to ask: why do you need to control exactly when the frame shows on screen?
