What is the performance difference between a JVM method call and a remote call?
I'm gathering some data about the difference in performance between a JVM method call and a remote method call using a binary protocol (in other words, not SOAP). I am developing a framework in which a method call may be local or remote at the discretion of the framework, and I'm wondering at what point it's "worth it" to evaluate the method remotely, either on a much faster server or on a compute grid of some kind. I know that a remote call is going to be much, much slower, so I'm mostly interested in understanding the order-of-magnitude difference. Is it 10 times slower, or 100, or 1,000? Does anyone have any data on this? I'll write my own benchmarks if necessary, but I'm hoping to re-use some existing knowledge. Thanks!
Comments (3)
Having developed a low-latency RMI implementation (about 20 microseconds minimum), I can say it is still roughly 1,000x slower than a direct call. If you use plain Java RMI (about 500 microseconds minimum), it can be 25,000x slower.
NOTE: This is only a very rough estimate to give you a general idea of the difference you might see. There are many complex factors which could change these numbers dramatically. Depending on what the method does, the difference could be much lower, especially if you perform RMI to the same process. If the network is relatively slow, the difference could be much larger.
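To see why the ratio depends so much on what the method does, here is a small arithmetic sketch. It assumes a simple model that is not part of the measurements above: a remote call costs the method's own work plus a fixed overhead. The 20 ns direct-call cost is only what the figures above imply (20 microseconds divided by 1,000), and the 10 ms "heavy" method is a made-up example.

```java
final class SlowdownSketch {
    /**
     * Factor by which a call gets slower once a fixed remote overhead
     * is added on top of the method's own work (assumed model).
     */
    static double slowdown(double workNanos, double overheadNanos) {
        return (workNanos + overheadNanos) / workNanos;
    }

    public static void main(String[] args) {
        double trivialCall = 20;         // ns, implied by 20 us / 1,000 (assumption)
        double heavyMethod = 10_000_000; // ns, hypothetical 10 ms computation
        double lowLatencyRmi = 20_000;   // ns, minimum quoted above
        double plainRmi = 500_000;       // ns, minimum quoted above

        System.out.println("trivial, low-latency RMI: " + slowdown(trivialCall, lowLatencyRmi));
        System.out.println("trivial, plain RMI:       " + slowdown(trivialCall, plainRmi));
        System.out.println("heavy, plain RMI:         " + slowdown(heavyMethod, plainRmi));
    }
}
```

Under this model a trivial method comes out about 1,001x and 25,001x slower, while the hypothetical 10 ms method is only about 1.05x slower over plain RMI, which is where a faster remote machine could start to pay off.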
Additionally, even a very large relative difference may not make much difference across your whole application.
To elaborate on my last comment...
Let's say you have a GUI that has to poll some data every second, and it uses a background thread to do this. Let's say that using RMI takes 50 ms, while the alternative, a direct method call to a local copy of a distributed cache, takes 0.0005 ms. That would appear to be an enormous difference: 100,000x. However, the RMI call could start 50 ms earlier and still poll every second, so the difference to the user is next to nothing.
What could matter much more is whether RMI is much simpler than the other approach (if it's the right tool for the job).
An alternative to RMI is JMS. Which is best depends on your situation.
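The polling scenario can be sketched roughly as follows. This is only an illustration: `BackgroundPoller` and `callShareOfPeriod` are hypothetical names, and the RMI call is simulated with a 50 ms sleep rather than a real remote invocation. The background thread pays the latency, and the GUI side only ever reads the most recent value.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class BackgroundPoller implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicReference<String> latest =
            new AtomicReference<>("(no data yet)");

    /** Runs slowFetch on a background thread once per period and caches the result. */
    BackgroundPoller(Supplier<String> slowFetch, long periodMillis) {
        scheduler.scheduleAtFixedRate(
                () -> latest.set(slowFetch.get()), 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** What the GUI thread calls: returns immediately, never waits on the remote call. */
    String latest() {
        return latest.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    /** Fraction of each polling period spent inside the slow call. */
    static double callShareOfPeriod(double callMillis, double periodMillis) {
        return callMillis / periodMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the 50 ms RMI call (assumption: no real remote service here).
        Supplier<String> simulatedRmi = () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "polled at " + System.currentTimeMillis();
        };
        try (BackgroundPoller poller = new BackgroundPoller(simulatedRmi, 1000)) {
            Thread.sleep(200);
            System.out.println(poller.latest());
        }
        System.out.println("share of period: " + callShareOfPeriod(50, 1000));
    }
}
```

With a 50 ms call and a 1,000 ms period, the slow call occupies 5% of each period on a thread the user never waits on, even though it is 100,000x slower than the local lookup.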
It's impossible to answer your question precisely. The ratio of execution times will depend on factors like:
But in general, direct JVM method calls are very fast, and any kind of serialization coupled with the network delay caused by RMI is going to add significant overhead. Have a look at these numbers to give you a rough estimate of the overhead:
http://surana.wordpress.com/2009/01/01/numbers-everyone-should-know/
Apart from that, you'll need to benchmark.
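As a starting point for such a benchmark, here is a naive sketch that compares a direct call with a round trip over a loopback TCP socket. It is not RMI and not a rigorous measurement: it leaves out real network latency and object serialization, the class and method names are made up for illustration, and a simple timing loop like this is easily distorted by the JIT, so a harness such as JMH would be a better choice for numbers you intend to rely on.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class CallLatencySketch {
    /** Trivial method used as both the local call and the "remote" operation. */
    static int addOne(int x) {
        return x + 1;
    }

    /** Average nanoseconds per direct call (naive loop, may be optimized by the JIT). */
    static double directCallNanos(int iterations) {
        int sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = addOne(sink);
        }
        long elapsed = System.nanoTime() - start;
        if (sink != iterations) throw new IllegalStateException("unexpected result");
        return (double) elapsed / iterations;
    }

    /** Average nanoseconds per request/response over a loopback socket. */
    static double loopbackRoundTripNanos(int iterations) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Thread echo = new Thread(() -> serve(server, iterations));
            echo.setDaemon(true);
            echo.start();
            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                client.setTcpNoDelay(true);
                DataOutputStream out = new DataOutputStream(
                        new BufferedOutputStream(client.getOutputStream()));
                DataInputStream in = new DataInputStream(
                        new BufferedInputStream(client.getInputStream()));
                int value = 0;
                long start = System.nanoTime();
                for (int i = 0; i < iterations; i++) {
                    out.writeInt(value);   // send the argument
                    out.flush();
                    value = in.readInt();  // wait for the result
                }
                long elapsed = System.nanoTime() - start;
                if (value != iterations) throw new IllegalStateException("unexpected result");
                return (double) elapsed / iterations;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Server side: reads an int, applies addOne, writes the result back. */
    private static void serve(ServerSocket server, int iterations) {
        try (Socket s = server.accept()) {
            s.setTcpNoDelay(true);
            DataInputStream in = new DataInputStream(
                    new BufferedInputStream(s.getInputStream()));
            DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(s.getOutputStream()));
            for (int i = 0; i < iterations; i++) {
                out.writeInt(addOne(in.readInt()));
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("direct call ns:   " + directCallNanos(10_000_000));
        System.out.println("loopback trip ns: " + loopbackRoundTripNanos(10_000));
    }
}
```

The absolute numbers will vary by machine and JVM. The point is only to get a first feel for the gap before adding a real network and a real serialization layer.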
One piece of advice: make sure you use a really good binary serialization library (Avro, Protocol Buffers, Kryo, etc.) coupled with a decent communications framework (e.g. Netty). These tools are far better than the standard Java serialization/IO facilities, and probably better than anything you can code yourself in a reasonable amount of time.
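To give a feel for the overhead of the standard facilities, the sketch below compares the size of a tiny object written with `ObjectOutputStream` against a hand-written fixed layout using `DataOutputStream`. The `Quote` class is hypothetical, and the hand-rolled encoding is only a stand-in for a compact binary format. It is not how Avro, Protocol Buffers, or Kryo encode data, and it does not use their APIs.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationSizeSketch {
    /** Hypothetical payload: one int and one double. */
    static final class Quote implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final double price;

        Quote(int id, double price) {
            this.id = id;
            this.price = price;
        }
    }

    /** Bytes produced by standard Java serialization (includes class metadata). */
    static int javaSerializedSize(Quote q) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(q);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Bytes produced by writing just the two fields: 4 for the int, 8 for the double. */
    static int handRolledSize(Quote q) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(q.id);
                out.writeDouble(q.price);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Quote q = new Quote(1, 9.5);
        System.out.println("java serialization bytes: " + javaSerializedSize(q));
        System.out.println("hand-rolled bytes:        " + handRolledSize(q));
    }
}
```

The fixed layout is 12 bytes, while the standard stream also carries a header and a class descriptor, so it is several times larger for an object this small. Dedicated libraries aim for the compact end of that range while still handling schemas and versioning for you.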
No one can tell you the answer, because the decision of whether or not to distribute is not about speed. If it were, you would never make a distributed call, because it will always be slower than the same call made in-memory.
You distribute components so multiple clients can share them. If the sharing is what's important, it outweighs the speed hit.
Your break-even point has to do with how valuable it is to share functionality, not with method call speed.