Thread ring benchmark
Today I was doing the thread ring exercise from the Programming Erlang book and googled for other solutions to compare. I found that the language shootout has exactly the same problem as a benchmark. I had the impression that this is an area where Erlang should be fast, but it turns out that C and C++ are again on top. My suspicion is that the C/C++ programs are not following the rules, which say "pass the token from thread to thread". After reading them, it seems that they both manipulate some shared memory and global variables, which is different from the Erlang code, but I could be wrong.
My question is: are they doing the same thing, or is the C/C++ code conceptually different from (and faster than) the Erlang one?
And another question: why is Haskell faster than Erlang when the solutions are very similar?
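For readers who have not seen the exercise, here is a minimal sketch of a message-passing thread ring in Python. It assumes simplified rules (thread 1 receives the token first, each thread decrements it and hands it to its neighbour, and the thread that receives 0 reports its id). The names `thread_ring`, `mailboxes`, and `worker` are illustrative only and are not taken from any shootout entry.

```python
import queue
import threading


def thread_ring(ring_size, token):
    """Pass `token` around `ring_size` threads; return the 1-based id of
    the thread that receives the token once it has counted down to 0."""
    # One private mailbox per thread: the only way threads communicate.
    mailboxes = [queue.Queue() for _ in range(ring_size)]
    result = queue.Queue()

    def worker(index):
        inbox = mailboxes[index]
        outbox = mailboxes[(index + 1) % ring_size]
        while True:
            value = inbox.get()
            if value is None:          # shutdown marker: forward and exit
                outbox.put(None)
                return
            if value == 0:             # token exhausted: report and shut down
                result.put(index + 1)
                outbox.put(None)
                return
            outbox.put(value - 1)      # normal hop: decrement and pass on

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(ring_size)]
    for t in threads:
        t.start()
    mailboxes[0].put(token)            # inject the token at thread 1
    winner = result.get()
    for t in threads:
        t.join()
    return winner


if __name__ == "__main__":
    print(thread_ring(5, 7))  # prints 3
```

Each thread touches only its own inbox and its neighbour's, which is roughly the shape of the Erlang solution; the mailboxes themselves still live in shared memory.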
Comments (7)
I would answer with another question: how is the Erlang runtime implemented under the hood?
Chances are it's implemented in C or a similar systems language (I doubt they did it all in assembly). Or at the very least, the concepts it expresses can be expressed as efficiently in C.
Now, why do you find it so strange that an optimized C version (the shootout certainly does not show average-level code) would beat the Erlang version, considering that Erlang adds its own level of complexity and indirection?
Whatever the type of benchmark, it's always possible for a C implementation to beat the most polished program in another language built on top of C, simply because you can take the C version it generates and then remove the bits that you don't need.
On the other hand:
Sometimes "faster" just isn't worth the cost.
Given how many really quite different approaches there are to programming concurrency, I did find it very difficult to both be inclusive enough to bring in different language implementations and yet retain some vague comparability.
Now look at the performance of the same programs measured with different run-time configurations and note how much that matters.
One thing to note in that benchmark is that there is only one token to pass around, which means that in practice it's just a single-threaded program reading from and writing to memory.
I would expect the result to come out different on a multiprocessor machine (or make that a cluster) where the threads/processes have to pass M tokens around in some random order.
Hmm. Also give the developers of the benchmark solutions an equal number of hours to finish their solutions. Then I would expect Erlang to come out on top (or close to the top at least).
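To make the single-token point concrete, here is a small Python sketch. It assumes simplified ring rules (thread 1 gets the token first, every hop decrements it, and the thread holding it at 0 wins); the function names are hypothetical. Because only one thread can ever be runnable, the whole ring collapses into a sequential loop, and even into one line of arithmetic.

```python
def final_holder_by_simulation(ring_size, token):
    """Walk the ring sequentially: no threads needed with a single token."""
    holder = 1                              # thread 1 receives the token first
    while token > 0:
        token -= 1                          # each hop decrements the token
        holder = holder % ring_size + 1     # move to the next thread, wrapping
    return holder


def final_holder_closed_form(ring_size, token):
    """Same answer without any loop: `token` hops starting from thread 1."""
    return token % ring_size + 1


if __name__ == "__main__":
    print(final_holder_by_simulation(5, 7))  # prints 3
    print(final_holder_closed_form(5, 7))    # prints 3
```

So what the benchmark really measures is the cost of handing control from one thread to the next, not parallel throughput.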
Haskell GHC is a compiled, native-code optimized language implementation with very fast threads. It is generally much faster than Erlang/HiPE. Erlang doesn't have a monopoly on lightweight threads :-)
Ultimately, message-passing on modern machines is implemented using some form of shared memory to pass the messages (along with either locks or atomic instructions). So all the C and C++ implementations are really doing is inlining the implementation of message-passing straight into their code. A similar benchmark that uses a fast message-passing library in C, also benchmarked against Haskell and Erlang, can be found in this paper: http://www.cs.kent.ac.uk/pubs/2009/2928/index.html (section 5.1)
The speed of the various approaches is really determined by the concurrent run-time systems involved. Haskell has had a lot of good work done in this area, which leaves it ahead of Erlang. Of course, measuring speed on micro-benchmarks is often misleading, and leaves out important factors like the readability of the code. A question to bear in mind might be: which of the solutions in the shoot-out would you be happy to maintain?
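As a rough illustration of message passing being layered on shared memory plus a lock, here is a minimal mailbox sketch in Python. The `Mailbox` class is hypothetical and is not how any particular runtime implements it; it only shows the general shape (a shared buffer guarded by a lock, with a condition variable to wake the receiver).

```python
import threading
from collections import deque


class Mailbox:
    """A message queue built from a shared buffer and a lock."""

    def __init__(self):
        self._items = deque()                # shared memory holding the messages
        self._cond = threading.Condition()   # a lock plus wait/notify

    def send(self, message):
        with self._cond:                     # take the lock
            self._items.append(message)
            self._cond.notify()              # wake one waiting receiver

    def receive(self):
        with self._cond:
            while not self._items:           # loop guards against spurious wakeups
                self._cond.wait()            # releases the lock while sleeping
            return self._items.popleft()


if __name__ == "__main__":
    box = Mailbox()
    sender = threading.Thread(target=lambda: [box.send(i) for i in range(3)])
    sender.start()
    print([box.receive() for _ in range(3)])  # prints [0, 1, 2]
    sender.join()
```

A hand-written C entry can skip the generic queue and keep only the lock and a single shared word, which is the "inlining" described above.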
I don't think I'd call it cheating. The primary, fundamental difference between multiple threads and multiple processes is that multiple threads share a single address space. As such, specifying multiple threads (rather than multiple processes) seems to me like tacit permission to take advantage of the shared address space (at least in the absence of some very specific definition of "passing" that prohibited this).
What it comes down to is this: Erlang doesn't really have threads, as such -- it has processes with asynchronous communications. The processes are (intentionally) isolated from each other to a large degree. On one hand, this makes development considerably easier -- in particular, one process can only affect another via specific, well-defined channels of communication. Under the hood, it uses lots of tricks (almost certainly including shared memory) to optimize its processes and take advantage of what's possible in a specific implementation/situation (such as all the processes running in a single, shared address space). Nonetheless, having to keep all the tricks hidden keeps it from being quite as efficient as something like the C version where the "tricks" are all explicit and completely exposed.
I'd use a real-life analogy to explain the difference. Think of the threads/processes as people. Erlang enforces a professional working relationship where communications are all carefully recorded in memos. The C and C++ versions are more like a husband and wife who might communicate with a single word that doesn't mean anything to anybody else, or even just a single quick glance.
The latter is extremely efficient when it works -- but it's a lot more open to subtle misunderstandings, and if the two have a fight you probably don't want to be in the same room. For the manager, people in purely professional relationships are a lot easier to manage even if their peak efficiency isn't quite as high.
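For contrast with a mailbox-style solution, here is a hedged Python sketch of the "husband and wife" style: the token lives in one shared variable, and threads only signal each other that it is their turn. This is not the shootout's C program; `shared_memory_ring`, `state`, and `turn` are illustrative names, and the ring rules are simplified assumptions (thread 1 goes first, the thread that finds the token at 0 wins).

```python
import threading


def shared_memory_ring(ring_size, token):
    """Ring where the token is a shared variable and threads only signal
    'your turn' to each other. Returns the 1-based id of the winner."""
    state = {"token": token, "winner": None}      # visible to every thread
    turn = [threading.Semaphore(0) for _ in range(ring_size)]

    def worker(index):
        nxt = (index + 1) % ring_size
        while True:
            turn[index].acquire()                 # wait for the "glance"
            if state["winner"] is not None:       # someone already won
                turn[nxt].release()               # propagate shutdown and exit
                return
            if state["token"] == 0:
                state["winner"] = index + 1
                turn[nxt].release()
                return
            state["token"] -= 1                   # mutate shared state directly
            turn[nxt].release()                   # no message, just a wake-up

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(ring_size)]
    for t in threads:
        t.start()
    turn[0].release()                             # thread 1 goes first
    for t in threads:
        t.join()
    return state["winner"]


if __name__ == "__main__":
    print(shared_memory_ring(5, 7))  # prints 3
```

Nothing is copied between threads here, which is why this style can be faster, and also why nothing stops a buggy thread from touching `state` out of turn.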
The C version is using LWP, which I think is a user-space threading library. To what extent this is "unfair" is up for debate: I'd look at things like whether it supports true pre-emptive concurrency in the sense that you can make blocking system calls in a thread without blocking all the other threads (you can do this in Haskell, can you in Erlang?).
Haskell's threads are slightly more lightweight than Erlang's, because as I understand it an Erlang thread comes with a local heap (in the standard implementation) whereas Haskell threads all share the same heap.