Why is the proxy pattern so slow?
At least in Java, the proxy pattern has a lot of overhead - I don't remember the exact figures, but when wrapping tiny methods the proxy takes something like 50 times as long as the wrapped method. This is, for example, why java.awt.image.BufferedImage.setRGB and getRGB are really slow; there are about three proxies wrapping the actual byte[].
Why 50 times?! Why doesn't the proxy just double the time?
Edit: =(
As seems usual for SO, I got a bunch of answers telling me that my question was wrong. It's not. Check out BufferedImage, or some other real proxy pattern, not those microbenchmarks. In fact, if you have to do a lot of pixel manipulation of a BufferedImage and you know its structure, you can achieve said enormous speedups by manually undoing the proxying; see this answer.
Oh, and here's my source for 50x. As the article details, proxies don't have a noticeable penalty when what they wrap takes a long time, but they do have major painful overhead if you're wrapping a tiny method.
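To make the "manually undoing the proxying" remark concrete, here is a minimal sketch. It assumes a freshly created TYPE_INT_RGB image, whose pixels sit in one int[] with one element per pixel; the byte[] mentioned above would apply to byte-backed image types and would go through DataBufferByte instead. The class and method names are hypothetical, and this is not the code from the linked answer; getRaster(), getDataBuffer() and DataBufferInt.getData() are standard java.awt.image calls.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferInt;

final class DirectPixelAccess {

    // Fills the image one pixel at a time through the public API,
    // so every write passes through the wrapping layers.
    static void fillViaSetRGB(BufferedImage image, int rgb) {
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                image.setRGB(x, y, rgb);
            }
        }
    }

    // Fills the image by writing straight into the backing array.
    // Assumption: TYPE_INT_RGB created via the plain constructor, so the
    // array holds width * height packed 0x00RRGGBB values, row by row.
    static void fillViaRaster(BufferedImage image, int rgb) {
        if (image.getType() != BufferedImage.TYPE_INT_RGB) {
            throw new IllegalArgumentException("sketch only handles TYPE_INT_RGB");
        }
        int[] pixels = ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
        int width = image.getWidth();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < width; x++) {
                pixels[y * width + x] = rgb & 0xFFFFFF;
            }
        }
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(640, 480, BufferedImage.TYPE_INT_RGB);
        fillViaSetRGB(a, 0x112233);
        fillViaRaster(b, 0x112233);
        // Both routes should leave the same pixel value behind.
        System.out.println("same pixel: " + (a.getRGB(10, 10) == b.getRGB(10, 10)));
    }
}
```

The direct route only works when you know the image type and layout, which is exactly the "you know its structure" condition above; for other image types the index arithmetic and array element type differ.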
Comments (3)
I don't know where that "50 times" figure comes from, but it's pretty suspect. It may be that a specific proxy is markedly slower than what it's proxying, depending on what each of them is doing, but to generalize from that to say that "the proxy pattern is so slow" is to take a very dramatic and highly-questionable leap in logic.
Try this:
Thingy.java:
ThingyProxy.java:
WithoutProxy.java:
WithProxy.java:
Simple trials on my machine:
Slightly slower? Yes. 50x slower? No.
Now, timing the JVM is notoriously difficult and simple experiments like the above are necessarily suspect. But I think a 50x difference probably would have shown up.
Edit: I should have mentioned that the above with a very, very small number of loops posts numbers like this:
...which gives you an idea of VM startup time in the environment. That is, the timings above are not mostly VM startup with just a microsecond of difference in actual execution time; they're mostly execution time.
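The listings for this experiment did not survive on this page, so what follows is only a sketch of what such a comparison could look like, not the original code. The names Thingy and ThingyProxy come from the file names above; the interface shape, RealThingy, ProxyTiming, the compute method and the loop counts are all assumptions made for illustration.

```java
// Hypothetical harness: times the same loop against a plain object and
// against the same object wrapped in three forwarding proxies.
final class ProxyTiming {
    // Written on every run so the loop result stays observable.
    static volatile long sink;

    // Calls the tiny method in a loop and returns the accumulated result.
    static long run(Thingy thingy, int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += thingy.compute(i);
        }
        return sum;
    }

    // Returns elapsed nanoseconds for one run of the loop.
    static long timeNanos(Thingy thingy, int iterations) {
        long start = System.nanoTime();
        sink = run(thingy, iterations);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Thingy direct = new RealThingy();
        Thingy wrapped = new ThingyProxy(new ThingyProxy(new ThingyProxy(direct)));
        int iterations = 50_000_000;
        // A few rounds, so later rounds run after the JIT has warmed up.
        for (int round = 0; round < 5; round++) {
            long d = timeNanos(direct, iterations);
            long w = timeNanos(wrapped, iterations);
            System.out.println("direct " + d / 1_000_000 + " ms, proxied " + w / 1_000_000 + " ms");
        }
    }
}

// Assumed shape of the proxied type: one tiny method.
interface Thingy {
    int compute(int x);
}

// The "real" implementation; deliberately trivial.
final class RealThingy implements Thingy {
    public int compute(int x) {
        return x + 1;
    }
}

// A pure forwarding proxy that adds no behaviour of its own.
final class ThingyProxy implements Thingy {
    private final Thingy target;

    ThingyProxy(Thingy target) {
        this.target = target;
    }

    public int compute(int x) {
        return target.compute(x);
    }
}
```

As said above, hand-rolled JVM timings are suspect, so treat whatever ratio this prints as a rough indication only.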
When the code has been compiled into native code, the byte array accesses would be something like three 1-cycle instructions (as long as the source and destination data are hot in the cache and unaligned byte accesses are not penalized; YMMV depending on platform).
Adding a method call to store the four bytes will (depending on platform, but something like this) add pushing registers to the stack, a call instruction, the array access instructions, a return instruction and popping the registers from the stack. The push/call/return/pop sequence will be added for each layer of proxy, and most of these instructions don't execute in 1 cycle. If the compiler fails to inline these methods (which can happen rather easily) you'd run into a quite hefty penalty.
The proxies add functionality to convert between color depths and so on, adding extra overhead.
Also, sequential array accesses can be further optimized by the compiler (e.g. turning the store operations into multi-byte access operations, up to 8 bytes at a time while still taking only 1 cycle), whereas the proxy calls make that hard.
50x sounds a bit high, but not unreasonably so depending on the actual code.
BufferedImage in particular adds plenty of overhead. While the proxy pattern in itself might add no discernible overhead, usage of BufferedImage probably does. Note in particular that setRGB() is synchronized, which might have severe performance implications in certain circumstances.
One place I have seen them make a difference is on code which doesn't do anything. The JVM can detect code which doesn't do anything and can eliminate it. However, using method calls can confuse this check, and the code is not eliminated. If you compare the timing with and without methods in such examples, you can get any ratio you wish; however, if you look at how the no-methods test is going, you will see that the code has been eliminated and is going unreasonably fast, e.g. much faster than one clock cycle per loop.
Trivial methods, like getters and setters, are inlined. They can result in no impact on performance at all. I very much doubt the 50x claim for a real program. I would expect closer to no-difference-what-so-ever when tested properly.
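A rough sketch of that pitfall, with hypothetical class and method names: the first loop throws its result away, so the JIT is free to remove the work (whether it actually does depends on the JVM and its settings), while the second returns a checksum that the caller uses, which keeps the work live.

```java
final class KeepResultsLive {

    // A tiny method of the kind that normally gets inlined.
    static long square(int x) {
        return (long) x * x;
    }

    // Result discarded: nothing observable depends on the loop body,
    // so an optimizing JIT may eliminate it entirely.
    static void discarded(int n) {
        for (int i = 0; i < n; i++) {
            square(i);
        }
    }

    // Result folded into a checksum that the caller consumes,
    // so the calls cannot simply be dropped.
    static long consumed(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 100_000_000;
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            discarded(n);
            long t1 = System.nanoTime();
            long checksum = consumed(n);
            long t2 = System.nanoTime();
            System.out.println("discarded " + (t1 - t0) / 1_000 + " us, consumed "
                    + (t2 - t1) / 1_000 + " us, checksum " + checksum);
        }
    }
}
```

If the "discarded" timing comes out far below one clock cycle per iteration, that is the sign described above that the loop was eliminated rather than measured.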