Ruby VM concurrency and parallelism
I have a general question about the Ruby VM (Ruby interpreter). How does it work with multiple processors? Regarding parallelism and concurrency in Ruby, let's say that I have 4 processors. Will the VM automatically assign tasks to the processors through the kernel? As for scaling, let's say that my Ruby process is taking a lot of CPU resources; what will happen if I add a new processor? Is the OS responsible for assigning the tasks to the processors, or will each VM work on one processor? What would be the best way to scale my Ruby application? I have tried as much as possible to separate my processes and use AMQP queuing. Any other ideas?
It would be great if you could send me links for more explanation.
Thanks in advance.
Comments (1)
Ruby Threading
The Ruby language itself supports concurrent execution through a threading model; however, the implementation dictates whether additional hardware resources get used. The "gold standard" interpreter (MRI Ruby) uses a "green threading" model in 1.8: threading is done within the interpreter, and only a single system thread is used for execution. Others (such as JRuby) leverage the Java VM to create actual system-level threads that can run in parallel. MRI Ruby 1.9 moves to native system threads, but a global VM lock (GVL) still allows only one thread to execute Ruby code at a time. The lock is handed over when a thread blocks on I/O or when its time slice expires, so extra cores do not speed up CPU-bound Ruby code in a single MRI process.
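As a rough illustration of where threads still help on MRI: a thread that is blocked on I/O lets the other threads run, so I/O-bound work overlaps. This is only a minimal sketch; `sleep` stands in for a blocking network or disk call, and `fetch`, `run_sequential`, `run_threaded` and `timed` are names made up for this example.

```ruby
# Simulated I/O-bound task; sleep stands in for a blocking call.
def fetch(id)
  sleep 0.2
  id * 10
end

# Run the tasks one after another on the main thread.
def run_sequential(ids)
  ids.map { |id| fetch(id) }
end

# Start one thread per task, then collect each thread's return value.
def run_threaded(ids)
  ids.map { |id| Thread.new { fetch(id) } }.map(&:value)
end

# Return [block result, elapsed seconds] using a monotonic clock.
def timed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - start]
end

ids = [1, 2, 3, 4]
_, seq_time = timed { run_sequential(ids) }
_, thr_time = timed { run_threaded(ids) }
puts format("sequential: %.2fs, threaded: %.2fs", seq_time, thr_time)
```

If `fetch` were a pure Ruby computation instead of a wait, the threaded version on MRI would be expected to take about as long as the sequential one.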
Advanced Threading
Typically the OS manages the assignment of threads to logical cores, since most application software doesn't actually care. In some high-performance computing cases, the software will specifically request that certain threads execute on specific logical cores for architecture-specific performance. It's highly unlikely anything written in Ruby would fall into this category.
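One common way to let the OS spread Ruby work over several cores is to run several processes and leave the placement to the scheduler. The following is a sketch under assumptions, not a recommendation for any particular app: it needs a platform with `fork` (so not Windows), the sum of squares is just a stand-in for CPU-bound work, and `sum_squares`, `split_range` and `parallel_sum_squares` are hypothetical helper names.

```ruby
require 'etc'

# CPU-bound stand-in: sum of squares over a range.
def sum_squares(range)
  range.sum { |n| n * n }
end

# Split 1..n (n >= 1) into at most `parts` contiguous ranges.
def split_range(n, parts)
  size = (n.to_f / parts).ceil
  (1..n).step(size).map { |lo| lo..[lo + size - 1, n].min }
end

# Fork one child per chunk; each child sends its partial sum back on a pipe.
# No core is requested explicitly: the OS decides where each process runs.
def parallel_sum_squares(n, workers = Etc.nprocessors)
  children = split_range(n, workers).map do |chunk|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(sum_squares(chunk).to_s)
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  children.sum do |pid, reader|
    partial = reader.read.to_i
    reader.close
    Process.wait(pid)
    partial
  end
end

puts parallel_sum_squares(1_000_000)
```

Separate processes share no memory, which is why the partial results travel over pipes here; a message queue between processes plays the same role at a larger scale.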
Refactoring
Per-application performance limits can usually be addressed by refactoring the code first. Moving the hot spot to a language or environment better suited to the specific problem is likely a better first step than immediately jumping to threading in the existing implementation.
Example
I once worked on a Ruby on Rails app that ran a massive hash-mapping step whenever data was uploaded. The initial implementation was written completely in Ruby and took ~80s to complete. After rewriting the code in ANSI C with more specific memory allocation, the execution time fell to under a second (without even using threads). The next bottleneck was inserting the massive amount of data back into MySQL, which eventually also moved out of the Ruby code and into threaded C code. I specifically went this route since the MRI Ruby interpreter binds easily to C code. In the final result, Ruby prepares the environment for the C code and calls it as a Ruby instance method on a class with parameters; a single thread of C code does the hash mapping; and an OpenMP worker-queue model generates and executes the inserts into MySQL.
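The worker-queue shape of that last stage can be sketched in plain Ruby as well. To be clear, this is not the C/OpenMP code described above: it swaps in Ruby threads and `Queue`, and `insert_batch` is a hypothetical stand-in that only sleeps instead of talking to MySQL. Since the inserts mostly wait on the database, the threads can overlap even on MRI.

```ruby
# Hypothetical stand-in for a batched database insert. A real version would
# send one multi-row INSERT per batch through a MySQL client library.
def insert_batch(batch, done)
  sleep 0.01         # simulated round trip to the database
  done << batch.size # Queue#<< is safe to call from several threads
end

# Push batches onto a shared queue and let a fixed pool of workers drain it.
# Returns the number of rows the workers reported as inserted.
def insert_all(rows, workers: 4, batch_size: 100)
  jobs = Queue.new
  done = Queue.new
  rows.each_slice(batch_size) { |batch| jobs << batch }
  workers.times { jobs << :stop } # one sentinel per worker

  pool = Array.new(workers) do
    Thread.new do
      while (batch = jobs.pop) != :stop
        insert_batch(batch, done)
      end
    end
  end
  pool.each(&:join)

  total = 0
  total += done.pop until done.empty?
  total
end

puts insert_all((1..1050).to_a, workers: 3)
```

The pool size and batch size here are arbitrary; in practice they would be tuned against what the database connection limit and row size allow.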