为什么纯函数式语言不使用引用计数?

发布于 2024-07-18 05:17:28 字数 188 浏览 6 评论 0 原文

在纯函数式语言中,数据是不可变的。 通过引用计数,创建引用循环需要更改已创建的数据。 看起来纯函数式语言可以使用引用计数,而不必担心循环的可能性。 我是对的吗? 如果是这样,他们为什么不呢?

我知道在很多情况下引用计数比 GC 慢,但至少它减少了暂停时间。 在暂停时间很糟糕的情况下,如果能够选择使用引用计数,那就太好了。

In purely functional languages, data is immutable. With reference counting, creating a reference cycle requires changing already created data. It seems like purely functional languages could use reference counting without worrying about the possibility of cycles. Am is right? If so, why don't they?

I understand that reference counting is slower than GC in many cases, but at least it reduces pause times. It would be nice to have the option to use reference counting in cases where pause times are bad.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

聊慰 2024-07-25 05:17:28

相对于 Java 和 C# 等其他托管语言,纯函数式语言的分配非常疯狂。 他们还分配不同大小的对象。 已知最快的分配策略是从连续的可用空间(有时称为“托儿所”)进行分配,并保留硬件寄存器以指向下一个可用的可用空间。 从堆分配的速度与从堆栈分配的速度一样快。

引用计数从根本上与这种分配策略不兼容。 引用计数将对象放入空闲列表并再次将其删除。 当创建新对象时,引用计数还需要大量的开销来更新引用计数(如上所述,纯函数式语言确实很疯狂)。

引用计数在以下情况下往往表现得非常好:

  • 几乎所有堆内存都用于保存活动对象。
  • 相对于其他操作,分配和指针赋值并不常见。
  • 可以在另一个处理器或计算机上管理引用。

要了解当今最好的高性能引用计数系统的工作原理,请查阅 大卫·培根埃雷兹·佩特兰克

Relative to other managed languages like Java and C#, purely functional languages allocate like crazy. They also allocate objects of different sizes. The fastest known allocation strategy is to allocate from contiguous free space (sometimes called a "nursery") and to reserve a hardware register to point to the next available free space. Allocation from the heap becomes as fast as allocation from a stack.

Reference counting is fundamentally incompatible with this allocation strategy. Ref counting puts objects on free lists and takes them off again. Ref counting also has substantial overheads required for updating ref counts as new objects are created (which, as noted above, pure functional languages do like crazy).

Reference counting tends to do really well in situations like these:

  • Almost all heap memory is used to hold live objects.
  • Allocation and pointer assignment are infrequent relative to other operations.
  • References can be managed on another processor or computer.

To understand how the best high-performance ref-counting systems work today, look up the work of David Bacon and Erez Petrank.

岁月蹉跎了容颜 2024-07-25 05:17:28

你的问题是基于一个错误的假设。 完全有可能存在循环引用和不可变数据。 请考虑以下 C# 示例,该示例使用不可变数据创建循环引用。

class Node { 
  public readonly Node other;
  public Node() { 
    other = new Node(this);
  }
  public Node(Node node) {
    other = node;
  }
}

这种类型的技巧可以在许多函数语言中完成,因此任何收集机制都必须处理循环引用的可能性。 我并不是说引用计数机制对于循环引用是不可能的,只是说必须对其进行处理。

编辑 ephemient

作为对评论的回应...这在 Haskell 中是微不足道的

data Node a = Node { other :: Node a }
recursiveNode = Node { other = recursiveNode }

,在 SML 中几乎没有任何更多的努力。

datatype 'a node = NODE of unit -> 'a node
val recursiveNode : unit node =
    let fun mkRecursiveNode () = NODE mkRecursiveNode
    in mkRecursiveNode () end

不需要突变。

Your question is based on a faulty assumption. It's perfectly possible to have circular references and immutable data. Consider the following C# example which uses immutable data to create a circular reference.

class Node { 
  public readonly Node other;
  public Node() { 
    other = new Node(this);
  }
  public Node(Node node) {
    other = node;
  }
}

This type of trick can be done in many functional languages and hence any collection mechanism must deal with the possibility of circular references. I'm not saying a ref counting mechanism is impossible with a circular reference, just that it must be dealt with.

Edit by ephemient

In response to the comment... this is trivial in Haskell

data Node a = Node { other :: Node a }
recursiveNode = Node { other = recursiveNode }

and barely any more effort in SML.

datatype 'a node = NODE of unit -> 'a node
val recursiveNode : unit node =
    let fun mkRecursiveNode () = NODE mkRecursiveNode
    in mkRecursiveNode () end

No mutation required.

葬﹪忆之殇 2024-07-25 05:17:28

我认为有几件事。

  • 循环:许多语言中的“let rec”确实允许创建“循环”结构。 除此之外,不变性通常意味着没有循环,但这打破了规则。
  • 引用计数在列表中很糟糕:我不知道引用计数集合是否能很好地处理例如您在 FP 中经常发现的长单链表结构(例如缓慢,需要确保尾部)递归,...)
  • 其他策略有好处:正如您所提到的,其他 GC 策略通常对于内存局部性来说仍然更好

(曾几何时,我想我可能真的“知道”这一点,但现在我正在尝试记住/推测,所以不要将此视为任何权威。)

There are a few things, I think.

  • There are cycles: "let rec" in many languages does allow "circular" structures to be created. Apart from this, immutability does usually imply no cycles, but this breaks the rule.
  • Ref-counts are bad at lists: I don't know that reference-counted collection works well with e.g. long singly-linked-list structures you often find in FP (e.g. slow, need to ensure tail-recursive, ...)
  • Other strategies have benefits: As you allude to, other GC strategies are still usually better for memory locality

(Once upon a time I think I maybe really 'knew' this, but now I am trying to remember/speculate, so don't take this as any authority.)

暮年慕年 2024-07-25 05:17:28

我说得对吗?

不完全的。 您只需同时定义相互递归值,即可使用纯函数式编程创建循环数据结构。 例如,在 OCaml 中:

let rec xs = 0::ys and ys = 1::xs

但是,可以定义一些语言,使其无法通过设计创建循环结构。 结果被称为单向堆,其主要优点是垃圾收集可以像引用计数一样简单。

如果是这样,他们为什么不呢?

有些语言确实禁止循环并使用引用计数。 Erlang 和 Mathematica 就是例子。

例如,在 Mathematica 中,当您引用一个值时,您会对其进行深层复制,因此更改原始值不会更改副本:

In[1] := xs = {1, 2, 3}
Out[1] = {1, 2, 3}

In[2] := ys = xs
Out[2] = {1, 2, 3}

In[3] := xs[[1]] = 5
Out[3] = 5

In[4] := xs
Out[4] = {5, 2, 3}

In[5] := ys
Out[5] = {1, 2, 3}

Am is right?

Not quite. You can create cyclic data structures using purely functional programming simply by defining mutually-recursive values at the same time. For example, in OCaml:

let rec xs = 0::ys and ys = 1::xs

However, it is possible to define languages that make it impossible to create cyclic structures by design. The result is known as a unidirectional heap and its primary advantage is that garbage collection can be as simple as reference counting.

If so, why don't they?

Some languages do prohibit cycles and use reference counting. Erlang and Mathematica are examples.

For example, in Mathematica when you reference a value you make a deep copy of it so mutating the original does not mutate the copy:

In[1] := xs = {1, 2, 3}
Out[1] = {1, 2, 3}

In[2] := ys = xs
Out[2] = {1, 2, 3}

In[3] := xs[[1]] = 5
Out[3] = 5

In[4] := xs
Out[4] = {5, 2, 3}

In[5] := ys
Out[5] = {1, 2, 3}
殊姿 2024-07-25 05:17:28

考虑一下这个寓言讲述了David MoonLisp 机器

有一天,一名学生来到 Moon 说:“我知道如何制作更好的垃圾收集器。我们必须保留指向每个 cons 的指针的引用计数。”

Moon耐心地给学生讲了以下故事:

<块引用>

“有一天,一名学生来到 Moon 说:‘我知道如何制作更好的垃圾收集器......

Consider this allegory told about David Moon, an inventor of the Lisp Machine:

One day a student came to Moon and said: "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons."

Moon patiently told the student the following story:

"One day a student came to Moon and said: 'I understand how to make a better garbage collector...

空袭的梦i 2024-07-25 05:17:28

引用计数比 GC 慢得多,因为它对 CPU 不利。 并且GC大部分时间可以等待空闲时间,并且GC可以并发(在另一个线程上)。 这就是问题所在 - GC 是最不邪恶的,并且大量的尝试表明了这一点。

Reference counting is MUCH slower than GC because it's not good for CPU. And GC most of the time can wait for idle time and also GC can be concurrent (on another thread). So that's the problem - GC is least evil and lots of tries shown that.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文