Synchronized code performs faster than unsynchronized code
I got this surprising result and I have absolutely no idea what causes it.
I have two methods, which can be shortened to:
    private static final ConcurrentHashMap<Double, Boolean> mapBoolean =
            new ConcurrentHashMap<Double, Boolean>();
    private static final ConcurrentHashMap<Double, LinkedBlockingQueue<Runnable>> map =
            new ConcurrentHashMap<Double, LinkedBlockingQueue<Runnable>>();

    protected static <T> Future<T> execute(final Double id, Callable<T> call) {
        // id is the ID number of each thread
        synchronized (id) {
            mapBoolean.get(id); // then do something with the result
            map.get(id);        // then do something with the result
        }
        // ... (rest of the method, including the returned Future, is omitted)
    }

    protected static <T> Future<T> executeLoosely(final Double id, Callable<T> call) {
        mapBoolean.get(id); // then do something with the result
        map.get(id);        // then do something with the result
        // ... (rest of the method, including the returned Future, is omitted)
    }
When profiling with over 500 threads, each calling each of the above methods 400 times, I found that execute(..) performs at least 500 times better than executeLoosely(..). This is weird, because executeLoosely is not synchronized and hence more threads can process the code simultaneously.
Any idea why?
Comments (2)
You are running 500 threads on a machine which I assume doesn't have 500 cores, with tasks that take about 100-1000x as long as a lookup on a Map, to execute code which the JVM could detect doesn't do anything. A test like this is likely to produce a random outcome. ;)
Another problem you could have is that a test which is faster when performed by one thread can benefit from using synchronized, because it biases access towards one thread. i.e. it turns your multi-threaded test back into a single-threaded one, which is the fastest in the first place.
You should compare the timings you get with those of a single thread doing the same work in a loop. If the single thread is faster (which I believe it would be), then it's not a useful multi-threaded test.
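To make that comparison concrete, here is a minimal sketch of a single-thread baseline next to a multi-threaded run. The class name BaselineComparison, the map contents and the helper names are all made up for illustration; only the 500 threads and 400 calls come from the question. Each run returns a hit count, so the lookups have an observable result that the JIT cannot simply discard.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BaselineComparison {
    // Hypothetical stand-in for mapBoolean in the question: keys 0.0 .. 499.0.
    static final ConcurrentHashMap<Double, Boolean> MAP = new ConcurrentHashMap<>();
    static {
        for (int i = 0; i < 500; i++) {
            MAP.put((double) i, i % 2 == 0);
        }
    }

    // Does `lookups` reads and counts the true values, so the result is used.
    static long doLookups(int lookups) {
        long hits = 0;
        for (int i = 0; i < lookups; i++) {
            Boolean value = MAP.get((double) (i % 500));
            if (value != null && value) {
                hits++;
            }
        }
        return hits;
    }

    // Baseline: one thread does the work of all tasks in a plain loop.
    static long singleThreaded(int tasks, int lookupsPerTask) {
        long hits = 0;
        for (int t = 0; t < tasks; t++) {
            hits += doLookups(lookupsPerTask);
        }
        return hits;
    }

    // Same total work, submitted to a pool with up to one thread per task.
    static long multiThreaded(int tasks, int lookupsPerTask) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        AtomicLong hits = new AtomicLong();
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> hits.addAndGet(doLookups(lookupsPerTask)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long single = singleThreaded(500, 400);
        long t1 = System.nanoTime();
        long multi = multiThreaded(500, 400);
        long t2 = System.nanoTime();
        System.out.printf("single thread: %d ns (hits=%d)%n", t1 - t0, single);
        System.out.printf("500 threads:   %d ns (hits=%d)%n", t2 - t1, multi);
    }
}
```

If the single-thread line comes out faster on your machine, the multi-threaded version is mostly measuring thread overhead rather than the map lookups.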
My guess is that you are running the synchronized code after the unsynchronized code, i.e. after the JVM has warmed up a little. Swap the order in which you perform these tests, run them many times, and you will get different results.
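As a rough sketch of what swapping the order could look like, the snippet below times a synchronized and an unsynchronized variant over several rounds and alternates which one goes first. Everything in it (OrderSwapTiming, LOCK, sink, the map contents) is hypothetical and single-threaded for brevity; it only illustrates the warm-up effect and is not a rigorous benchmark. A dedicated harness such as JMH would be a better fit for real measurements.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntToLongFunction;

class OrderSwapTiming {
    // Hypothetical test data: keys 0.0 .. 499.0, all mapped to TRUE.
    static final ConcurrentHashMap<Double, Boolean> MAP = new ConcurrentHashMap<>();
    static final Object LOCK = new Object();
    // Results are written here so the timed work is never unused.
    static volatile long sink;
    static {
        for (int i = 0; i < 500; i++) {
            MAP.put((double) i, Boolean.TRUE);
        }
    }

    // Unsynchronized variant: n plain lookups, returns the number of hits.
    static long looseLookups(int n) {
        long hits = 0;
        for (int i = 0; i < n; i++) {
            if (Boolean.TRUE.equals(MAP.get((double) (i % 500)))) {
                hits++;
            }
        }
        return hits;
    }

    // Synchronized variant: the same lookups, each inside a synchronized block.
    static long syncLookups(int n) {
        long hits = 0;
        for (int i = 0; i < n; i++) {
            synchronized (LOCK) {
                if (Boolean.TRUE.equals(MAP.get((double) (i % 500)))) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Times a single call of `work` in nanoseconds.
    static long timeNanos(IntToLongFunction work, int n) {
        long start = System.nanoTime();
        sink = work.applyAsLong(n);
        return System.nanoTime() - start;
    }

    // Returns one row per round: {synchronized nanos, unsynchronized nanos}.
    // Even rounds run the synchronized variant first, odd rounds the other.
    static long[][] run(int rounds, int n) {
        long[][] results = new long[rounds][2];
        for (int r = 0; r < rounds; r++) {
            if (r % 2 == 0) {
                results[r][0] = timeNanos(OrderSwapTiming::syncLookups, n);
                results[r][1] = timeNanos(OrderSwapTiming::looseLookups, n);
            } else {
                results[r][1] = timeNanos(OrderSwapTiming::looseLookups, n);
                results[r][0] = timeNanos(OrderSwapTiming::syncLookups, n);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        long[][] results = run(10, 200_000);
        for (int r = 0; r < results.length; r++) {
            System.out.printf("round %d: synchronized=%d ns, unsynchronized=%d ns%n",
                    r, results[r][0], results[r][1]);
        }
    }
}
```

I would expect the first round or two to look quite different from the later ones, which is the warm-up effect described above.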
In the non-synchronized scenario:
1) wait to acquire the lock on a segment of the map, lock, perform the operation on the map, unlock, wait to acquire the lock on a segment of the other map, lock, perform the operation on the other map, unlock.
Segment-level locking is performed only for concurrent writes to the segment, which doesn't look to be the case in your example.
In the synchronized scenario:
1) wait to lock, perform both operations, unlock.
Could the time taken for context switching have an impact? How many cores does the machine running the test have?
How are the maps structured? Do they use the same sort of keys?