Process balancing in Erlang
Does anybody know if there is some sort of 'load balancer' in the Erlang standard library? I mean, if I have some really simple operations on a really large set of data, the overhead of constructing a process for every item will be larger than performing the operations sequentially. But if I can balance the work across the 'right number' of processes, it will perform better, so I'm basically asking if there is an easy way to accomplish this.
By the way, does anybody know if an OTP application does some kind of load balancing? I mean, is there a concept of a "worker process" in an OTP application (like a Java-style worker thread)?
Comments (4)
See the modules pg2 and pool. pg2 implements a quite simple distributed process pool. pg2:get_closest_pid/1 returns the "closest" pid, i.e. a random local process if one is available, otherwise a random remote process. pool implements load balancing between nodes started with the slave module.
The plists module probably does what you want. It is basically a parallel implementation of the lists module, designed to be used as a drop-in replacement. However, you can also control how it parallelizes its operations, for example by defining how many worker processes should be spawned. You would probably do this by calculating the number of workers from the length of the list, the load of the system, and so on.
From the website:
In my view there is no useful generic load-balancing tool in OTP, and perhaps it is only useful to have one in specific cases. It is easy enough to implement one yourself. plists may be useful in the same cases. I do not believe in parallel libraries as a substitute for the real thing. Amdahl will haunt you forever if you walk this path.
The right number of worker processes is equal to the number of schedulers. This may vary depending on what other work is done on the system. Query the runtime system to get the number of schedulers.
The notion of overhead when flooding the system with an abundance of worker processes is somewhat faulty. There is overhead with new processes, but not as much as with OS threads. The main overhead is message copying between processes. This can be alleviated by using binaries, since only a reference to the binary is sent (this holds for large, reference-counted binaries; binaries of 64 bytes or less are copied like any other term). With ordinary Erlang terms, the structure is first expanded and then copied to the other process.
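As a rough illustration of "easy enough to implement one yourself", here is a minimal sketch of a chunked parallel map that defaults to one worker per scheduler. The module name chunked_pmap and its functions are hypothetical, not part of OTP or plists. The worker count comes from erlang:system_info(schedulers), which is one way to ask the runtime for the scheduler count and is my assumption about what this answer had in mind.

```erlang
-module(chunked_pmap).
-export([pmap/2, pmap/3, chunk/2]).

%% Hypothetical helper: parallel map with one worker per scheduler thread.
pmap(Fun, List) ->
    pmap(Fun, List, erlang:system_info(schedulers)).

%% Split List into at most Workers chunks, map each chunk in its own
%% linked process, and collect the results in the original order.
%% Because of spawn_link, a crashing worker also takes down the caller
%% (unless the caller traps exits).
pmap(Fun, List, Workers) when is_integer(Workers), Workers > 0 ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, lists:map(Fun, Chunk)} end),
                Ref
            end || Chunk <- chunk(List, Workers)],
    %% Ref is already bound here, so each receive waits for one specific
    %% worker; this keeps the output order equal to the input order.
    lists:append([receive {Ref, Result} -> Result end || Ref <- Refs]).

%% Split List into at most N consecutive chunks of nearly equal size.
chunk(List, N) when is_integer(N), N > 0 ->
    Len = length(List),
    Size = max(1, (Len + N - 1) div N),   % ceiling of Len / N
    split(List, Len, Size).

split([], 0, _Size) ->
    [];
split(List, Len, Size) when Len =< Size ->
    [List];
split(List, Len, Size) ->
    {Chunk, Rest} = lists:split(Size, List),
    [Chunk | split(Rest, Len - Size, Size)].
```

For example, chunked_pmap:pmap(fun(X) -> X * X end, [1,2,3,4,5], 2) splits the list into [1,2,3] and [4,5] and returns [1,4,9,16,25]. Each chunk is still copied to its worker and each result is copied back, so for very cheap per-item work the sequential lists:map/2 may well remain faster.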
There is no way to predict the cost of work mechanically without measuring it, i.e. actually doing it. Someone must determine how to partition the work for a given class of tasks. Also, by 'load balancer' I understand something very different from what your question describes.
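One way to do that measuring is with timer:tc/1 from the standard library. The sketch below is only an illustration: the module measure_cost and both function names are hypothetical. It times the per-item cost of a function on a sample and, for comparison, the cost of spawning a process and getting one message back from it.

```erlang
-module(measure_cost).
-export([per_item_us/2, spawn_roundtrip_us/1]).

%% Average cost, in microseconds, of applying Fun to one item of Sample.
per_item_us(Fun, [_ | _] = Sample) ->
    {Micros, _Results} = timer:tc(fun() -> lists:map(Fun, Sample) end),
    Micros / length(Sample).

%% Average cost, in microseconds, of spawning a process and receiving
%% one small message from it, measured over N sequential round trips.
spawn_roundtrip_us(N) when is_integer(N), N > 0 ->
    Parent = self(),
    {Micros, ok} = timer:tc(fun() -> roundtrips(Parent, N) end),
    Micros / N.

roundtrips(_Parent, 0) ->
    ok;
roundtrips(Parent, N) ->
    Ref = make_ref(),
    spawn(fun() -> Parent ! Ref end),
    receive Ref -> ok end,
    roundtrips(Parent, N - 1).
```

If per_item_us/2 comes out far below spawn_roundtrip_us/1 on your machine, a process per item is unlikely to pay off and larger chunks per worker are the safer choice. The numbers depend on hardware, load, and the data being sent, so they are worth re-measuring on the target system.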