Performance weirdness with C++ OpenGL

Posted on 2024-11-14 11:26:15

I am rewriting some C rendering code in C++. The old C code computes everything it needs and renders it every frame. The new C++ code instead precomputes what it needs and stores it as a linked list.

The actual rendering operations are translations, colour changes and calls to GL display lists.

Executing the operations in the linked list should be pretty straightforward, yet the resulting method call appears to take longer than the old version, which computes everything each time (I have of course made sure that the new version isn't recomputing).
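For readers trying to picture the setup, here is a minimal sketch of what such a precomputed list and its replay loop might look like. The post does not show its data layout, so every name here (`RenderOp`, `OpKind`, `replay`, `RecordingBackend`) is hypothetical. It also assumes C++11 or later, which is newer than what gcc 4.4.5 fully supports. A real backend would forward to `glTranslatef`, `glColor3f` and `glCallList`; the recording backend only exists so the loop can be run without a GL context.

```cpp
#include <cstddef>
#include <list>

// Hypothetical operation record; the original post does not show its layout.
enum class OpKind { Translate, Colour, CallList };

struct RenderOp {
    OpKind kind;
    float x, y, z;     // translation offsets or RGB components
    unsigned list_id;  // display-list name, used by CallList only
};

// Per-type counters, similar in spirit to the tracing counters in the post.
struct OpCounts {
    std::size_t translate = 0, colour = 0, call_list = 0;
};

// Backend is any type with translate/colour/call_list members. A real one
// would forward to glTranslatef, glColor3f and glCallList respectively.
template <typename Backend>
OpCounts replay(const std::list<RenderOp>& ops, Backend& gl) {
    OpCounts n;
    for (const RenderOp& op : ops) {
        switch (op.kind) {
        case OpKind::Translate: gl.translate(op.x, op.y, op.z); ++n.translate; break;
        case OpKind::Colour:    gl.colour(op.x, op.y, op.z);    ++n.colour;    break;
        case OpKind::CallList:  gl.call_list(op.list_id);       ++n.call_list; break;
        }
    }
    return n;
}

// Stand-in backend that only records what it was asked to do.
struct RecordingBackend {
    std::size_t calls = 0;
    unsigned last_list = 0;
    void translate(float, float, float) { ++calls; }
    void colour(float, float, float)    { ++calls; }
    void call_list(unsigned id)         { ++calls; last_list = id; }
};
```

Keeping the GL calls behind a backend type also makes it easy to count or log the issued operations when comparing the old and new versions.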

The weird thing? It executes fewer OpenGL operations than the old version. But it gets weirder. When I added counters for each type of operation, and a good old printf at the end of the method, it got faster; both gprof and manual measurements confirm this.

I also took a look at the assembly code generated by G++ in both cases (with and without tracing), and there is no major change (a major change was my initial suspicion). The only differences are a few more stack words allocated for the counters, the increments of those counters, and the setup for printf followed by a jump to it.

Also, this holds true with both -O2 and -O3. I am using gcc 4.4.5 and gprof 2.20.51 on Ubuntu Maverick.

I guess my question is: what's happening? What am I doing wrong? Is something throwing off both my measurements and gprof?

Comments (2)

多彩岁月 2024-11-21 11:26:15

By spending time in printf, you may be avoiding stalls in your next OpenGL call.
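One way to check this would be to time the submission of the calls separately from their completion. The sketch below is only an illustration: `FrameTiming` and `time_frame` are hypothetical names, and the `sync` callable is assumed to wrap something like `glFinish()`, which blocks until previously issued GL commands have completed. Both steps are passed in as callables so the helper compiles without GL headers.

```cpp
#include <chrono>

struct FrameTiming {
    double submit_ms;  // time spent issuing the calls
    double total_ms;   // submission plus waiting at the sync point
};

// `submit` issues the rendering calls; `sync` blocks until they are done.
// With OpenGL, `sync` would typically wrap glFinish() (an assumption here).
template <typename Submit, typename Sync>
FrameTiming time_frame(Submit submit, Sync sync) {
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;
    const auto t0 = clock::now();
    submit();
    const auto t1 = clock::now();
    sync();
    const auto t2 = clock::now();
    return FrameTiming{ms(t1 - t0).count(), ms(t2 - t0).count()};
}
```

If `total_ms` is much larger than `submit_ms`, most of the frame is spent waiting rather than issuing calls, which would be consistent with a stall.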

咆哮 2024-11-21 11:26:15

Without more information, it is difficult to know what is happening here, but here are a few hints:

  • Are you sure the OpenGL calls are the same? You can use a call-tracing tool to compare the calls issued. Make sure that no state change was introduced by the possibly different order in which things are done.
  • Have you tried to use a profiler at runtime? If you have many objects, the simple fact of chasing pointers while looping over the list could introduce cache misses.
  • Have you identified a particular bottleneck, either on the CPU side or GPU side?
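To test the pointer-chasing hint, one option is to copy the list into contiguous storage once and compare the two traversals under a profiler. The following is a minimal sketch assuming C++11; `Op`, `flatten` and `checksum` are hypothetical names, and the sketch makes no claim about which layout wins on a given machine.

```cpp
#include <list>
#include <vector>

// Hypothetical operation layout; the post does not show the real one.
struct Op { int kind; float x, y, z; unsigned list_id; };

// Copy node-based storage into one contiguous block, preserving order.
inline std::vector<Op> flatten(const std::list<Op>& ops) {
    return std::vector<Op>(ops.begin(), ops.end());
}

// Same traversal for either container; only the memory layout differs.
template <typename Container>
unsigned long checksum(const Container& ops) {
    unsigned long sum = 0;
    for (const Op& op : ops) sum += op.list_id;
    return sum;
}
```

Running `checksum` over both containers gives identical results, so any timing difference between them comes from memory layout rather than from the work done.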

Here is my own guess on what could be going wrong. The calls you send to your GPU take some time to complete. The previous code, by mixing CPU operations and GPU calls, made the CPU and GPU work in parallel. The new code, on the contrary, first makes the CPU compute many things while the GPU is idling, then feeds the GPU all the work to get done while the CPU has nothing left to do.
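The guess can be made concrete with a toy timeline model. This is only a sketch under strong assumptions: every item costs the same fixed CPU and GPU time, the GPU runs commands in order and can start one as soon as it is submitted, and submission itself is free. The function names and parameters are made up for illustration.

```cpp
#include <algorithm>

// Toy model: n items, each needing cpu_ms of CPU work and gpu_ms of GPU work.

// CPU computes an item, submits it, then moves on; the GPU overlaps with it.
inline double interleaved_ms(int n, double cpu_ms, double gpu_ms) {
    double cpu_done = 0.0, gpu_done = 0.0;
    for (int i = 0; i < n; ++i) {
        cpu_done += cpu_ms;                                // item i is ready
        gpu_done = std::max(gpu_done, cpu_done) + gpu_ms;  // GPU starts when free
    }
    return gpu_done;
}

// CPU finishes all of its work first, then the GPU processes everything.
inline double batched_ms(int n, double cpu_ms, double gpu_ms) {
    return n * cpu_ms + n * gpu_ms;
}
```

With 4 items costing 1 ms on each side, the model gives 5 ms interleaved against 8 ms batched. Real drivers queue and synchronize in more complicated ways, so this shows the shape of the effect, not its size.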
