To GC or not to GC
I've recently seen two really nice and educational language talks:
This first one by Herb Sutter, presents all the nice and cool features of C++0x, why C++'s future seems brighter than ever, and how M$ is said to be a good guy in this game. The talk revolves around efficiency and how minimizing heap activity very often improves performance.
This other one, by Andrei Alexandrescu, motivates a transition from C/C++ to his new game-changer D. Most of D's stuff seems really well motivated and designed. One thing, however, surprised me, namely that D pushes for garbage collection and that all classes are created solely by reference. Even more confusing, The D Programming Language Reference Manual, specifically in the section about Resource Management, states the following, quote:
Garbage collection eliminates the tedious, error prone memory allocation tracking code necessary in C and C++. This not only means much faster development time and lower maintenance costs, but the resulting program frequently runs faster!
This conflicts with Sutter's constant talk about minimizing heap activity. I strongly respect both Sutter's and Alexandrescu's insights, so I feel a bit confused about these two key questions:
Doesn't creating class instances solely by reference result in a lot of unnecessary heap activity?
In which cases can we use Garbage Collection without sacrificing run-time performance?
To directly answer your two questions:
Yes, creating class instances by reference does result in a lot of heap activity, but:
a. In D, you have struct as well as class. A struct has value semantics and can do everything a class can, except polymorphism.
b. Polymorphism and value semantics have never worked well together due to the slicing problem.
c. In D, if you really need to allocate a class instance on the stack in some performance-critical code and don't care about the loss of safety, you can do so without unreasonable hassle via the scoped function.
GC can be comparable to or faster than manual memory management if:
a. You still allocate on the stack where possible (as you typically do in D) instead of relying on the heap for everything (as you often do in other GC'd languages).
b. You have a top-of-the-line garbage collector (D's current GC implementation is admittedly somewhat naive, though it has seen some major optimizations in the past few releases, so it's not as bad as it was).
c. You're allocating mostly small objects. If you allocate mostly large arrays and performance ends up being a problem, you may want to switch a few of these to the C heap (you have access to C's malloc and free in D) or, if it has a scoped lifetime, some other allocator like RegionAllocator. (RegionAllocator is currently being discussed and refined for eventual inclusion in D's standard library).
d. You don't care that much about space efficiency. If you make the GC run too frequently to keep the memory footprint ultra-low, performance will suffer.
The reason creating an object on the heap is slower than creating it on the stack is that the memory allocation methods need to deal with things like heap fragmentation. Allocating memory on the stack is as simple as incrementing the stack pointer (a constant-time operation).
With a compacting garbage collector, however, you don't have to worry about heap fragmentation, so heap allocations can be as fast as stack allocations. The Garbage Collection page for the D Programming Language explains this in more detail.
The assertion that GC'd languages run faster probably assumes that many programs allocate memory on the heap much more often than on the stack. Assuming that heap allocation could be faster in a GC'd language, it follows that you have just optimized a huge part of most programs (heap allocation).
An answer to 1):
As long as your heap is contiguous, allocating on it is just as cheap as allocating on the stack.
On top of that, while you allocate objects that lie next to each other, your memory caching performance will be great.
As long as you don't have to run the garbage collector, no performance is lost, and the heap stays contiguous.
That's the good news :)
Answer to 2):
GC technology has advanced greatly; garbage collectors even come in real-time flavors nowadays. That means that guaranteeing contiguous memory is a policy-driven, implementation-dependent issue. So, with the right collector and policy for your application, you may end up with better performance.
Answer to unasked question:
If developers are freed from memory-management issues, they may have more time to spend on real performance and scalability aspects in their code. That's a non-technical factor coming into play, too.
It's not a choice between "garbage collection" and "tedious, error-prone" handwritten code. Smart pointers that are truly smart can give you stack semantics and mean you never type "delete", yet you aren't paying for garbage collection. Here's another video by Herb that makes the point - safe and fast - that's what we want.
Another point to consider is the 80:20 rule. It is likely that the vast majority of the places you allocate are irrelevant, and you won't gain much over a GC even if you could push the cost there to zero. If you accept that, then the simplicity you gain by using a GC can outweigh the cost of using it. This is particularly true if you can avoid doing copies. What D provides is a GC for the 80% cases and access to stack allocation and malloc for the 20%.
Even if you had an ideal garbage collector, it would still be slower than creating things on the stack. So you have to have a language that allows both at the same time. Furthermore, the only way to achieve the same performance with a garbage collector as with manually managed memory allocation (done the right way) is to make it do the same things with memory as an experienced developer would have done, and in many cases that would require garbage-collector decisions to be made at compile time and executed at run time. Usually, garbage collection makes things slower: languages working only with dynamic memory are slower, and the predictability of execution of programs written in those languages is low while their execution latency is higher. Frankly, I personally don't see why one would need a garbage collector. Managing memory manually is not hard, at least not in C++. Of course, I wouldn't mind the compiler generating code that cleans everything up for me, just as I would have done, but that doesn't seem possible at the moment.
In many cases a compiler can optimize a heap allocation into a stack allocation. This is the case if your object doesn't escape the local scope: a decent compiler will almost certainly stack-allocate a local object such as x when it can prove the object never leaves the function. What the compiler does is called escape analysis.
Also, D could in theory have a moving GC, which means potential performance improvements by improved cache usage when the GC compacts your heap objects together. It also combats heap fragmentation as explained in Jack Edmonds' answer. Similar things can be done with manual memory management, but it's extra work.
An incremental, low-priority GC will collect garbage while high-priority tasks are not running. The high-priority threads will run faster, since no memory deallocation will be done on them.
This is the idea behind Henriksson's RT Java GC; see http://www.oracle.com/technetwork/articles/javase/index-138577.html
Garbage collection does in fact slow code down. It adds extra functionality to the program that has to run in addition to your code. There are other problems with it as well, such as the GC not running until memory is actually needed, which can result in small memory leaks. Another issue is that if a reference is not removed properly, the GC will not pick the object up, once again resulting in a leak. My other issue with GC is that it kind of promotes laziness in programmers. I'm an advocate of learning the low-level concepts of memory management before jumping into the higher level. It's like mathematics: you learn how to solve for the roots of a quadratic, or how to take a derivative by hand first, then you learn how to do it on the calculator. Use these things as tools, not crutches.
If you don't want to hit your performance, be smart about the GC and your heap vs stack usage.
My point is that GC is inferior to malloc when you do normal procedural programming. You just go from procedure to procedure, allocate and free, use global variables, and declare some functions _inline or _register. This is C style.
But once you go up an abstraction layer, you need at least reference counting. So you pass by reference, count the references, and free once the counter reaches zero. This is good, and superior to malloc once the number and hierarchy of objects become too difficult to manage manually. This is C++ style. You will define constructors and destructors to increment and decrement counters, and you will copy on modify, so a shared object splits in two once some part of it is modified by one party while another party still needs the original value. This way you can pass huge amounts of data from function to function without thinking about whether you need to copy the data here or just send a pointer there. The ref-counting makes those decisions for you.
Then comes a whole new world: closures, functional programming, duck typing, circular references, asynchronous execution. Code and data start mixing; you find yourself passing functions as parameters more often than normal data. You realize that metaprogramming can be done without macros or templates. Your code starts to soar into the sky and lose its solid ground, because you are executing something inside callbacks of callbacks of callbacks, data becomes unrooted, things become asynchronous, and you get addicted to closure variables. This is where a timer-based, memory-walking GC is the only possible solution; otherwise closures and circular references are not possible at all. This is the JavaScript way.
You mentioned D, but D is still an improved C++, so malloc or ref counting in constructors, stack allocation, and global variables (even if they are complicated trees of entities of all kinds) are probably what you would choose.