在纯函数式语言中,数据是不可变的。 通过引用计数,创建引用循环需要更改已创建的数据。 看起来纯函数式语言可以使用引用计数,而不必担心循环的可能性。 我是对的吗? 如果是这样,他们为什么不呢?
我知道在很多情况下引用计数比 GC 慢,但至少它减少了暂停时间。 在暂停时间很糟糕的情况下,如果能够选择使用引用计数,那就太好了。
In purely functional languages, data is immutable. With reference counting, creating a reference cycle requires changing already created data. It seems like purely functional languages could use reference counting without worrying about the possibility of cycles. Am is right? If so, why don't they?
I understand that reference counting is slower than GC in many cases, but at least it reduces pause times. It would be nice to have the option to use reference counting in cases where pause times are bad.
发布评论
评论(6)
相对于 Java 和 C# 等其他托管语言,纯函数式语言的分配非常疯狂。 他们还分配不同大小的对象。 已知最快的分配策略是从连续的可用空间(有时称为“托儿所”)进行分配,并保留硬件寄存器以指向下一个可用的可用空间。 从堆分配的速度与从堆栈分配的速度一样快。
引用计数从根本上与这种分配策略不兼容。 引用计数将对象放入空闲列表并再次将其删除。 当创建新对象时,引用计数还需要大量的开销来更新引用计数(如上所述,纯函数式语言确实很疯狂)。
引用计数在以下情况下往往表现得非常好:
要了解当今最好的高性能引用计数系统的工作原理,请查阅 大卫·培根和埃雷兹·佩特兰克。
Relative to other managed languages like Java and C#, purely functional languages allocate like crazy. They also allocate objects of different sizes. The fastest known allocation strategy is to allocate from contiguous free space (sometimes called a "nursery") and to reserve a hardware register to point to the next available free space. Allocation from the heap becomes as fast as allocation from a stack.
Reference counting is fundamentally incompatible with this allocation strategy. Ref counting puts objects on free lists and takes them off again. Ref counting also has substantial overheads required for updating ref counts as new objects are created (which, as noted above, pure functional languages do like crazy).
Reference counting tends to do really well in situations like these:
To understand how the best high-performance ref-counting systems work today, look up the work of David Bacon and Erez Petrank.
你的问题是基于一个错误的假设。 完全有可能存在循环引用和不可变数据。 请考虑以下 C# 示例,该示例使用不可变数据创建循环引用。
这种类型的技巧可以在许多函数语言中完成,因此任何收集机制都必须处理循环引用的可能性。 我并不是说引用计数机制对于循环引用是不可能的,只是说必须对其进行处理。
编辑 ephemient
作为对评论的回应...这在 Haskell 中是微不足道的
,在 SML 中几乎没有任何更多的努力。
不需要突变。
Your question is based on a faulty assumption. It's perfectly possible to have circular references and immutable data. Consider the following C# example which uses immutable data to create a circular reference.
This type of trick can be done in many functional languages and hence any collection mechanism must deal with the possibility of circular references. I'm not saying a ref counting mechanism is impossible with a circular reference, just that it must be dealt with.
Edit by ephemient
In response to the comment... this is trivial in Haskell
and barely any more effort in SML.
No mutation required.
我认为有几件事。
(曾几何时,我想我可能真的“知道”这一点,但现在我正在尝试记住/推测,所以不要将此视为任何权威。)
There are a few things, I think.
(Once upon a time I think I maybe really 'knew' this, but now I am trying to remember/speculate, so don't take this as any authority.)
不完全的。 您只需同时定义相互递归值,即可使用纯函数式编程创建循环数据结构。 例如,在 OCaml 中:
但是,可以定义一些语言,使其无法通过设计创建循环结构。 结果被称为单向堆,其主要优点是垃圾收集可以像引用计数一样简单。
有些语言确实禁止循环并使用引用计数。 Erlang 和 Mathematica 就是例子。
例如,在 Mathematica 中,当您引用一个值时,您会对其进行深层复制,因此更改原始值不会更改副本:
Not quite. You can create cyclic data structures using purely functional programming simply by defining mutually-recursive values at the same time. For example, in OCaml:
However, it is possible to define languages that make it impossible to create cyclic structures by design. The result is known as a unidirectional heap and its primary advantage is that garbage collection can be as simple as reference counting.
Some languages do prohibit cycles and use reference counting. Erlang and Mathematica are examples.
For example, in Mathematica when you reference a value you make a deep copy of it so mutating the original does not mutate the copy:
考虑一下这个寓言讲述了David Moon,Lisp 机器:
Consider this allegory told about David Moon, an inventor of the Lisp Machine:
引用计数比 GC 慢得多,因为它对 CPU 不利。 并且GC大部分时间可以等待空闲时间,并且GC可以并发(在另一个线程上)。 这就是问题所在 - GC 是最不邪恶的,并且大量的尝试表明了这一点。
Reference counting is MUCH slower than GC because it's not good for CPU. And GC most of the time can wait for idle time and also GC can be concurrent (on another thread). So that's the problem - GC is least evil and lots of tries shown that.