What is the difference between "green threads" and Erlang's processes?

Posted on 2024-08-15 21:43:10

After reading about Erlang's lightweight processes, I was pretty much sure that they were "green threads", until I read that there are differences between green threads and Erlang's processes. But I don't get it.

What are the actual differences?

Comments (3)

太阳公公是暖光 2024-08-22 21:43:10

Green threads can share data memory amongst themselves directly (although synchronization is required, of course).

Erlang doesn't use "Green Threads" but rather something closer to "Green Processes": processes do not share data memory directly but do so by "copying" it (i.e. having independent copies of the source data).
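
To make the copy semantics concrete, here is a minimal sketch. The module name copy_demo and the functions run/0 and worker/0 are made up for illustration and are not part of the answer above. One process sends a list to another, the receiver "changes" it, and the sender's value is unaffected.

```erlang
-module(copy_demo).
-export([run/0]).

%% Hypothetical demo module: copy_demo, run/0 and worker/0 are
%% illustrative names, not taken from the answer.

run() ->
    Original = [1, 2, 3],
    Worker = spawn(fun worker/0),
    %% Sending a message gives the worker its own value to work with.
    Worker ! {self(), Original},
    receive
        {Worker, Changed} ->
            %% The worker built a new list; Original is untouched.
            {Original, Changed}
    end.

worker() ->
    receive
        {From, List} ->
            %% There is no in-place update: this constructs a new list.
            From ! {self(), [0 | List]}
    end.
```

Calling copy_demo:run() should return {[1,2,3],[0,1,2,3]}: nothing the worker does can alter the sender's list.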

凉宸 2024-08-22 21:43:10

It is a simplification that goes too far to say that Erlang processes cannot share data memory directly, and that they only copy values between each other. That is more a description of how it could be implemented, and how one can pretend that it is implemented, at least for all purposes except performance.

Erlang enforces a few semantic restrictions on what you can do as a programmer. For example, values are immutable, meaning that you can't change them after they are constructed. One then realises that it would be perfectly fine for multiple Erlang processes to access the same value in memory, since none of them can change it anyway. Locks are then unnecessary.

Notable situations where this is done in Erlang/OTP are:

  • Large binaries (more than 64 bytes) are reference counted in a special binary heap, and references into this heap are passed when messaging.
  • Literal values are placed in a special memory area, and all processes referring to them refer to values in the same memory area (but as soon as the value is sent in a message, a duplicate is made in the receiving process).
  • Each node has a global atom table, and atom values are really references into this table. This makes atom equality testing very efficient (a pointer comparison instead of a string comparison).
  • The experimental erl -hybrid setting combines process heaps and a shared heap by having processes first copy values from the process heap into the shared heap when they are used in a message. I found this thread about hybrid heaps, which also explains some issues with the concept.
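
A small sketch of why this sharing is invisible to the programmer (share_demo and run/0 are hypothetical names chosen for this example): a binary well over 64 bytes is sent to another process, and an atom created at runtime is compared with a literal one. The program gets the same answers whether the runtime copies or shares the underlying memory, which is the point of the list above.

```erlang
-module(share_demo).
-export([run/0]).

%% Hypothetical demo module; the names are illustrative only.

run() ->
    %% 1024 bytes, well over the 64-byte threshold mentioned above.
    Big = binary:copy(<<0>>, 1024),
    Parent = self(),
    Pid = spawn(fun() ->
        receive
            {bin, B} -> Parent ! {size, byte_size(B)}
        end
    end),
    %% According to the answer, only a reference to Big is passed here;
    %% the code cannot tell the difference from a full copy.
    Pid ! {bin, Big},
    Size = receive {size, S} -> S end,
    %% Both atoms resolve to the same entry in the node's atom table.
    Same = (ok =:= list_to_atom("ok")),
    {Size, Same}.
```

share_demo:run() should return {1024,true}.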

Another trick is to actually mutate values while making sure that the mutation isn't visible. This further illustrates that immutability is a semantic restriction.

These are some examples where Erlang/OTP will actually mutate values:

  • "Recent" (R12) optimisations in the handling of the binary syntax allow you to append to the end of a binary without actually constructing a complete new binary with the new tail added.
  • It has been said that a newly constructed tuple followed immediately by a setelement call can be, or once was, translated by the compiler to change the element of the tuple in place.
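
The following sketch shows the two code shapes these bullets refer to. The module mutate_demo and its functions are hypothetical, and whether the runtime updates anything in place for this particular code is an assumption that the example does not and cannot check; it only shows that the observable results follow immutable semantics.

```erlang
-module(mutate_demo).
-export([run/0, append_bytes/1]).

%% Hypothetical demo module; the names are illustrative only.

%% Appends each byte in the list to the end of an accumulator binary,
%% the pattern targeted by the binary append optimisation.
append_bytes(Bytes) ->
    lists:foldl(fun(B, Acc) -> <<Acc/binary, B>> end, <<>>, Bytes).

run() ->
    Bin = append_bytes([1, 2, 3]),
    T0 = {a, b, c},
    %% setelement/3 returns a tuple with element 2 replaced;
    %% T0 itself still reads {a, b, c} afterwards.
    T1 = setelement(2, T0, x),
    {Bin, T0, T1}.
```

mutate_demo:run() should return {<<1,2,3>>,{a,b,c},{a,x,c}}, regardless of what the runtime does underneath.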

These optimisations rest on the theory that "if a tree falls in the forest, and nobody is there to hear it, does it really make a sound?". That is, no reference to the object being mutated may have escaped, because otherwise the change could be observed.

And this is really what Erlang semantics are about: things should not change as a side effect of what some other process is doing. We would call that shared state, and we don't like it at all.

Another simplification that goes too far is to say that Erlang has no side effects. But that is for another question, if it is ever asked.

生生漫 2024-08-22 21:43:10

When people object to calling Erlang's processes "green threads", they aren't objecting to the "green" part, they are objecting to the "threads" part.

The difference between threads and processes is basically that threads have only their own instruction pointer but share everything else (especially state, memory, and address space). Processes, on the other hand, are completely isolated and share nothing.

Erlang's processes share nothing, thus, they are true processes. However, they are usually implemented in a "green" manner. So, technically, they are "green processes".

I usually call them "green threads" when I want to emphasize the lightweight implementation, and call them "processes" when I want to emphasize the shared-nothing semantics. That way I don't have to explain what I mean by "green processes".
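
As a rough illustration of the "green" part, the sketch below spawns many Erlang processes that communicate only by messages. The module green_demo and the function run/1 are hypothetical names invented for this example.

```erlang
-module(green_demo).
-export([run/1]).

%% Hypothetical demo module; the names are illustrative only.

%% Spawns N processes. Each one sends its index back to the parent
%% and then exits. No state is shared; replies arrive as messages.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), I} end)
            || I <- lists:seq(1, N)],
    %% Collect exactly one reply per spawned process.
    Replies = [receive {Pid, I} -> I end || Pid <- Pids],
    lists:sum(Replies).
```

green_demo:run(10000) should return 50005000. Spawning ten thousand processes this way is routine in Erlang, which is what the "lightweight" label refers to.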
