What specific C++ or STL features cause performance hogs in game programming?

Posted on 2024-09-08 02:01:12

My question is mostly about the STL rather than the rest of C++, which (I guess) can be as fast as C as long as classes aren't used at every corner.

The STL is standard in games and in engines like OGRE3D. Its features are nice to use, but I don't really know how they work, so I'd like to know which features can cause serious performance hogs before I use them.

I'm very excited to begin game programming school, and apparently there is no way I am going to avoid using those advanced features.


Comments (11)

謌踐踏愛綪 2024-09-15 02:01:12

In many cases, using the STL tends to generate code that is as good as, if not more efficient than, handwritten code.

Use a profiler to see where you have problems.

Even where the C++ STL might perform worse, its code is likely to be less error-prone. So only write your own replacement if the profiler shows there is an issue.

夕嗳→ 2024-09-15 02:01:12

1) Debug builds. These severely slow down many STL containers due to excessive error checking. At least on Microsoft compilers.
2) Excessive dynamic memory allocation. If you have a routine that contains a std::vector and you call it a few thousand times per frame, it will be very slow, and the bottleneck will be somewhere within operator new or another memory allocation routine. If you turn this vector into some kind of static buffer (so you don't have to recreate it every time), it will be much faster. Memory allocation is slow. If you have a buffer, it is normally better to reuse it than to make a new one during each call.
3) Using inefficient algorithms. For example, using linear search instead of binary search, or using the wrong sort algorithm (quicksort and heapsort are faster than bubble sort on unsorted data, but insertion sort can be faster than quicksort on partially sorted data). Searching instead of using std::map, and so on.
4) Using the wrong kind of container. For example, std::vector isn't suitable for inserting elements at random places. std::deque, while comparable to std::vector (random access) and allowing fast push_front and push_back, can be 10 times slower than std::vector if you subtract two iterators (on MSVC, again).
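As a rough illustration of point 2, here is a minimal sketch of reusing a buffer instead of recreating it on every call. The function names and the "collect positive values" task are hypothetical, chosen only for illustration; the relevant fact is that std::vector::clear() leaves the capacity unchanged, so a reused vector stops allocating once it has grown large enough.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame helper: collects the indices of all positive values.
// Fresh variant: builds a new std::vector on every call, which normally means
// at least one heap allocation per call as soon as anything is pushed.
std::vector<std::size_t> collect_positive_fresh(const std::vector<float>& values) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < values.size(); ++i)
        if (values[i] > 0.0f) out.push_back(i);
    return out;
}

// Reusing variant: the caller owns the scratch buffer and passes it in.
// clear() destroys the elements but keeps the capacity, so once the buffer
// has grown large enough, push_back no longer needs to allocate.
void collect_positive_reuse(const std::vector<float>& values,
                            std::vector<std::size_t>& scratch) {
    scratch.clear();
    for (std::size_t i = 0; i < values.size(); ++i)
        if (values[i] > 0.0f) scratch.push_back(i);
    assert(scratch.size() <= values.size());  // sanity check on the result
}
```

In a game loop the scratch vector would typically live in the calling object or be a long-lived local outside the per-frame loop. Whether this matters in practice is something only a profiler can confirm.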

月朦胧 2024-09-15 02:01:12

Unless you're building the entire engine from scratch, you're not going to notice a difference between using and not using C++ classes or the STL. In general, most of the time the CPU is going to be running code that you didn't even write. Plus, the overhead imposed by any scripting language you embed (Lua, Python, etc.) will eclipse any small performance penalty you may incur by using C++/STL.

In short, don't worry about it. Writing good OOP code is better than trying to write super-fast code from the get-go. "Premature optimization is the root of much programming evil."

温馨耳语 2024-09-15 02:01:12

Actually, you can use classes everywhere and still get performance as good as C (and often better than typical C).

Most of the STL is designed to do its "tricky" parts at compile time, so the run-time performance is excellent. The main thing to look out for (especially if you write for things like game consoles or mobile phones, which have less capable graphics hardware) is structuring your data to work well with the cache.
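One common way to structure data for the cache is a struct-of-arrays layout. The sketch below is only an assumption about what such a layout might look like. The particle types and field names are hypothetical, and the answer above does not prescribe this particular technique.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structs: all fields of one particle sit next to each other, so a
// loop that only needs x and vx still pulls the other fields into the cache.
struct ParticleAoS {
    float x, y, z;
    float vx, vy, vz;
    float mass;
    int   material_id;
};

// Struct-of-arrays: each field is stored contiguously on its own, so a loop
// over x and vx reads only the bytes it actually uses.
struct ParticlesSoA {
    std::vector<float> x;
    std::vector<float> vx;
    // ... the remaining fields would follow the same pattern
};

void advance_x(std::vector<ParticleAoS>& ps, float dt) {
    for (ParticleAoS& p : ps) p.x += p.vx * dt;
}

void advance_x(ParticlesSoA& ps, float dt) {
    assert(ps.x.size() == ps.vx.size());  // parallel arrays must stay in sync
    for (std::size_t i = 0; i < ps.x.size(); ++i) ps.x[i] += ps.vx[i] * dt;
}
```

Both functions compute the same result. Which layout is faster depends on the access pattern and the hardware, so it is worth measuring before restructuring anything.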

顾铮苏瑾 2024-09-15 02:01:12

Here are, in my opinion, the key points for writing performant C++/STL code:

  • Learn what memory allocation strategies are for each STL container,
  • Learn what algorithms work best with what iterator categories,
  • Learn run-time polymorphism vs compile-time polymorphism.
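To make the last point concrete, here is a minimal sketch contrasting the two kinds of polymorphism. The Shape, Square and PlainSquare types are hypothetical examples, not taken from any library.

```cpp
#include <cassert>
#include <vector>

// Run-time polymorphism: area() is dispatched through a virtual call, which
// the compiler can inline only if it manages to devirtualize it.
struct Shape {
    virtual ~Shape() {}
    virtual float area() const = 0;
};

struct Square : Shape {
    float side;
    explicit Square(float s) : side(s) {}
    float area() const override { return side * side; }
};

float total_area_virtual(const std::vector<const Shape*>& shapes) {
    float sum = 0.0f;
    for (const Shape* s : shapes) {
        assert(s != nullptr);
        sum += s->area();
    }
    return sum;
}

// Compile-time polymorphism: any type with an area() member works, the call
// is resolved when the template is instantiated, and objects can be stored
// by value in one contiguous vector.
struct PlainSquare {
    float side;
    float area() const { return side * side; }
};

template <typename T>
float total_area_static(const std::vector<T>& shapes) {
    float sum = 0.0f;
    for (const T& s : shapes) sum += s.area();
    return sum;
}
```

The STL algorithms and containers lean on the second style, which is a large part of why they can compete with handwritten loops.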

Good starting points are:

爱她像谁 2024-09-15 02:01:12

I recommend Effective STL by Scott Meyers. The biggest performance hog in the STL is a lack of understanding of it. Please learn it well!

Also see Optimizing software in C++ by Agner Fog for C++-specific performance topics.

傲影 2024-09-15 02:01:12

The other answers are all accurate: the problems with the STL in game programming mostly come from misuse.

My general approach is the following:
1. Write it with the STL, carefully choosing the appropriate algorithm, container, etc.
2. Profile for bottlenecks.
3. If the STL is causing the problem, replace it.

Optimizing too early can really slow you down and only cause more problems later.

Of course, it depends on the platform as well. Sometimes you have to write all of your own stuff because you simply can't afford the extra CPU/RAM overhead of the STL.

一紙繁鸢 2024-09-15 02:01:12

In his book The C++ Programming Language (3rd edition), Stroustrup talks about the design and performance of the STL in general, and specifically about the different performance characteristics of the various container types.

粉红×色少女 2024-09-15 02:01:12

I don't have experience in gaming, but Electronic Arts developed their own (non-conforming) implementation of the STL. An extensive article explains the motives and design of the library.

Note that in many cases you will be better off using the STL that comes with your implementation, then measuring, then measuring again, and making sure you understand what is going on and what is really a performance problem. Then, and only then, and only if the problem is within the STL (and not in how the STL is used), would I use non-standard libraries.

不回头走下去 2024-09-15 02:01:12

These rarely become performance hogs if used correctly. A profiler should always be your primary means of finding bottlenecks in your code, short of obvious algorithmic inefficiencies (and even then it's still good practice to confirm with a profiler if you are on a tight deadline).

There are some legitimate efficiency concerns, however, if STL usage does show up as a profiler hotspot.

vector<ExpensiveElement> v;
// insert a lot of elements to v
v.push_back(ExpensiveElement(...) );

The push_back above has the worst-case scenario of having to linearly copy every ExpensiveElement inserted so far (if we've exceeded the current capacity). Even in the best case, we still copy the ExpensiveElement once unnecessarily.

We can mitigate the issue by making vector store shared_ptr, but now we pay for two additional heap allocations per ExpensiveElement inserted (one for the reference counter in boost::shared_ptr, and one for ExpensiveElement) along with the overhead of a pointer indirection each time we want to access ExpensiveElement stored in the vector.

To mitigate the memory allocation/deallocation overhead (generally more likely to be a hotspot than an additional level of indirection), we can implement a fast memory allocator for ExpensiveElement. Nevertheless, imagine if std::vector provided an alloc_back method:

new (v.alloc_back()) ExpensiveElement(...);

This would avoid any copy-constructor overhead, but it is unsafe and prone to abuse. Nevertheless, that's exactly what I did with our vector clone in response to hotspots. Note that I work in raytracing, a field where performance is often one of the highest measures of quality (other than correctness), and we profile our code daily, so it's not as if we decided out of the blue that vector wasn't efficient enough for our purposes.

We also had no choice but to implement a vector clone because we provide a software development kit where other people's std::vector implementations may be incompatible with our own. I don't want to give you the wrong idea: explore these kinds of solutions only if your profiler sessions really call for it!
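For readers on C++11 or later, the standard library now covers much of the push_back copying cost described above: reserve() avoids regrowth, and emplace_back() constructs the element in place, which is similar in spirit to the hypothetical alloc_back but without the manual placement new. The sketch below is not the vector clone mentioned in this answer. The ExpensiveElement stand-in and its counters are hypothetical, added only so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ExpensiveElement that counts copies and moves.
struct ExpensiveElement {
    static int copies;
    static int moves;
    int payload;
    explicit ExpensiveElement(int p) : payload(p) {}
    ExpensiveElement(const ExpensiveElement& o) : payload(o.payload) { ++copies; }
    ExpensiveElement(ExpensiveElement&& o) noexcept : payload(o.payload) { ++moves; }
};
int ExpensiveElement::copies = 0;
int ExpensiveElement::moves = 0;

std::vector<ExpensiveElement> build(std::size_t n) {
    std::vector<ExpensiveElement> v;
    v.reserve(n);                  // one allocation up front, no regrowth below
    assert(v.capacity() >= n);
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back(static_cast<int>(i));  // built in place from the ctor argument
    return v;                      // the vector itself is moved or elided, not its elements
}
```

With the capacity reserved in advance, no element is copied or moved while the vector is filled. If the final size is not known up front, regrowth still relocates the existing elements, although types with a noexcept move constructor are moved rather than copied.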

Another common source of inefficiency is the linked STL containers: set, multiset, map, multimap, and list. However, that's not necessarily their fault, but rather the fault of the default std::allocator. These containers perform a separate memory allocation/deallocation per node, so the default allocator can be pretty slow for them, especially across multiple threads (thread contention; you're better off with per-thread memory pools). You can really get a speed boost by writing your own memory allocator (though this is not trivial, and don't forget alignment if you do).
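As a rough idea of what a custom allocator for node-based containers can look like, here is a minimal bump-arena sketch using the C++11 allocator interface. Arena and ArenaAllocator are hypothetical names, and this is a simplification under stated assumptions: it never reuses freed nodes, it is not thread-safe (one arena per thread would be the usual answer to contention), and it only handles alignments up to what operator new guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical fixed-size arena: hands out memory by bumping an offset and
// releases everything at once when it is destroyed.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : begin_(static_cast<char*>(::operator new(bytes))), size_(bytes), used_(0) {}
    ~Arena() { ::operator delete(begin_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(std::size_t bytes, std::size_t align) {
        assert(align != 0);
        // Round the offset up to the requested alignment. This assumes align
        // does not exceed the alignment of the block returned by operator new.
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + bytes > size_) throw std::bad_alloc();
        used_ = start + bytes;
        return begin_ + start;
    }
    std::size_t used() const { return used_; }

private:
    char* begin_;
    std::size_t size_;
    std::size_t used_;
};

// Minimal C++11 allocator that forwards to an Arena.
template <typename T>
struct ArenaAllocator {
    typedef T value_type;
    Arena* arena;

    explicit ArenaAllocator(Arena& a) : arena(&a) {}
    template <typename U>
    ArenaAllocator(const ArenaAllocator<U>& other) : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) {}  // no-op: the arena frees everything at once
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.arena == b.arena;
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}
```

A container then takes the allocator as a template argument and constructor argument, for example std::list<int, ArenaAllocator<int>> nodes(alloc); where alloc is an ArenaAllocator<int> bound to an Arena that outlives the list. A production allocator would also need to handle freeing and reuse, which is where most of the difficulty lies.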

I can't emphasize enough that these kinds of optimizations should only be applied in response to the profiler. You'll make your code harder to use and maintain this way, so you should be doing it only in exchange for a solid, demonstrable boost in your application's performance.

我还不会笑 2024-09-15 02:01:12

This book covers the issues you face when using C++ in games.
