Weird performance with C++ OpenGL
I am rewriting some rendering C code in C++. The old C code basically computes everything it needs and renders it at each frame. The new C++ code instead pre-computes what it needs and stores that as a linked list.
Now, actual rendering operations are translations, colour changes and calls to GL lists.
While executing the operations in the linked list should be pretty straightforward, it would appear that the resulting method call takes longer than the old version (which computes everything each time - I have of course made sure that the new version isn't recomputing).
The weird thing? It executes fewer OpenGL operations than the old version. But it gets weirder. When I added counters for each type of operation, and a good old printf at the end of the method, it got faster - both gprof and manual measurements confirm this.
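For readers following along, here is a minimal sketch of what such a replay loop with per-operation counters might look like. The original code is not shown, so every name here (`RenderOp`, `OpKind`, `OpCounts`, `NullBackend`, `replay`, `print_counts`) is hypothetical, and the sketch assumes C++11, which is newer than what gcc 4.4.5 supports by default. The backend is a stand-in so that the example runs without a GL context; a real one would presumably forward to `glTranslatef`, `glColor3f` and `glCallList`.

```cpp
#include <cassert>
#include <cstdio>
#include <forward_list>

// Hypothetical operation record; field names are illustrative only.
enum class OpKind { Translate, Colour, CallList };

struct RenderOp {
    OpKind kind;
    float x, y, z;     // translation offset, or RGB colour components
    unsigned list_id;  // display list id, only meaningful for CallList
};

// One counter per operation type, as described in the question.
struct OpCounts {
    int translate = 0;
    int colour = 0;
    int call_list = 0;
};

// Stand-in backend (an assumption) so the sketch runs without OpenGL.
struct NullBackend {
    void translate(float, float, float) {}
    void colour(float, float, float) {}
    void call_list(unsigned) {}
};

// Walks the precomputed list once, issues each operation and counts it.
template <typename Backend>
OpCounts replay(const std::forward_list<RenderOp>& ops, Backend& gl) {
    OpCounts counts;
    for (const RenderOp& op : ops) {
        switch (op.kind) {
        case OpKind::Translate:
            gl.translate(op.x, op.y, op.z);
            ++counts.translate;
            break;
        case OpKind::Colour:
            gl.colour(op.x, op.y, op.z);
            ++counts.colour;
            break;
        case OpKind::CallList:
            assert(op.list_id != 0);  // 0 is not a valid display list name
            gl.call_list(op.list_id);
            ++counts.call_list;
            break;
        }
    }
    return counts;
}

// The "good old printf" at the end of the method.
void print_counts(const OpCounts& c) {
    std::printf("translate=%d colour=%d call_list=%d\n",
                c.translate, c.colour, c.call_list);
}
```

Calling `replay(ops, backend)` followed by `print_counts(...)` once per frame reproduces the traced variant; dropping the `print_counts` call gives the untraced one.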
I also bothered to take a look at the assembly code generated by G++ in both cases (with and without trace), and there is no major change (which was my initial suspicion) - the only differences are a few more stack words allocated for counters, increasing said counters, and preparing for printf followed by a jump to it.
Also, this holds true with both -O2 and -O3. I am using gcc 4.4.5 and gprof 2.20.51 on Ubuntu Maverick.
I guess my question is: what's happening? What am I doing wrong? Is something throwing off both my measurements and gprof?
Comments (2)
By spending time in printf, you may be avoiding stalls in your next OpenGL call.
Without more information, it is difficult to know what is happening here, but here are a few hints:
Here is my own guess on what could be going wrong. The calls you send to your GPU take some time to complete: the previous code, by mixing CPU operations and GPU calls, made the CPU and GPU work in parallel; the new code, on the contrary, first makes the CPU compute many things while the GPU is idling, then feeds the GPU all the work to get done while the CPU has nothing left to do.
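To make that guess concrete, here is a toy timing model, not a measurement of the asker's program. It assumes that each item costs a fixed amount of CPU time to prepare and a fixed amount of GPU time to draw, that the GPU runs asynchronously, and that it processes submitted items in order. The function names and the cost parameters are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Interleaved: the CPU prepares one item, submits it, then prepares the
// next one while the GPU is still drawing. Returns the time at which the
// GPU finishes the last item, in the same arbitrary units as the costs.
double interleaved_finish(int items, double cpu_cost, double gpu_cost) {
    assert(items >= 0 && cpu_cost >= 0.0 && gpu_cost >= 0.0);
    double cpu_done = 0.0;  // time at which the CPU submits the current item
    double gpu_done = 0.0;  // time at which the GPU finishes the current item
    for (int i = 0; i < items; ++i) {
        cpu_done += cpu_cost;
        // The GPU starts an item once it is free and the item was submitted.
        gpu_done = std::max(gpu_done, cpu_done) + gpu_cost;
    }
    return gpu_done;
}

// Batched: the CPU prepares everything first while the GPU idles, then
// the GPU draws everything while the CPU has nothing left to do.
double batched_finish(int items, double cpu_cost, double gpu_cost) {
    assert(items >= 0 && cpu_cost >= 0.0 && gpu_cost >= 0.0);
    return items * cpu_cost + items * gpu_cost;
}
```

With 4 items, a CPU cost of 2 and a GPU cost of 3, the interleaved schedule finishes at 14 and the batched one at 20, even though both issue exactly the same work. When timing real code by hand, calling `glFinish()` before stopping the timer makes the measurement include GPU work that is still queued.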