Garbage collection overload, Java
The problem is that, because of garbage collection timing, I am seeing a trade-off in my performance. The issue can be generalized as:
public void loop(BlockingQueue<Runnable> queue) throws InterruptedException
{
    int j = queue.size();
    for (int i = 0; i < j; i++) //line2
    {
        Runnable runnable = queue.take();
        runnable.run(); //line4
        if (Math.random() > 0.9) System.gc(); //line5
    }
    //line7 //will 'runnable = null;' answer the question, logically it looks right
}
The queue passed as the argument will normally contain more than 40,000 elements.
Because I am iterating over the queue in a loop, the objects that have already been run are out of scope, yet they are still not available for garbage collection because they are in the invisible state. So if I leave out line 5, there will suddenly be a huge load on the garbage collector when the method is popped off the stack. Imagine many threads calling the method concurrently.
My questions:
- Is line 5 needed? Is there any substitute for it?
- If I have to keep line 5: I found that performance was very, very bad compared to not having it.
Garbage collection ultimately has to happen, doesn't it? I am unable to figure out when it should happen.
PS: JavaScript is disabled on my computer, so I can't comment on answers. I will edit the post here to reply to comments:
@amit: I have changed the code. I think you have understood the essence of the problem. The code is just a sample.
@Tobi: Thanks, but how will setting a bigger heap size solve the problem? That only delays the GC. So you think it will perform best without manual GC?
Further, http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html says that only once the method is taken off the stack does the object become available for garbage collection. I also tried with finalize() (by adding a print statement; not the right way, but it should fire at least once for 100,000 objects), and there was absolutely no GC.
@Paolo: Thanks. What I am trying to implement is a pipelining model in which every thread has a message queue: basically a framework where any thread can post a Runnable to another thread (if it has some work for that thread), and the other thread will execute it after some time (when it is idle).
And when I said overload when the method comes off the stack, I meant that garbage collection will eventually happen; if it happens later, clearing 40,000 elements will take a lot of time.
@Joachim Sauer: System.gc can collect invisible objects; it is just that the garbage collector doesn't collect them automatically. When forced, it does, as per: http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html
Comments (4)
The call to System.gc() is unnecessary. In fact, calling System.gc() is positively harmful for throughput. Just delete line 5. There is no need to replace it with anything.
As for the performance being very bad with line 5 in place: this is entirely as I would expect. See above.
In fact:
- there can be at most one object in the "invisible" state, referenced by an out-of-scope runnable variable, and
- forcing garbage collection won't reclaim it anyway.
The claim that System.gc() can collect invisible objects is false. In fact, the problem with invisible objects is precisely that the GC can't collect them. Indeed, the article you linked to says this very clearly:
(Emphasis added.)
Frankly, I think you are trying to solve a problem that does not exist. Or if it does exist, it has nothing to do with invisible objects.
And if there really is a problem with invisible objects, then the solution is simply to assign null to the variable at the end of the loop.
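To make the suggestion concrete, here is a minimal sketch of the loop with line 5 removed and the optional null assignment at the end of the loop body. The class name QueueDrainer, the method name drain, and the returned count are illustrative assumptions, not part of the question's code. It also assumes a single consumer, since take() would block if another thread emptied the queue first.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueDrainer {

    /** Runs the tasks that were queued when the call started; returns how many ran. */
    static int drain(BlockingQueue<Runnable> queue) throws InterruptedException {
        int pending = queue.size();
        int executed = 0;
        for (int i = 0; i < pending; i++) {
            Runnable runnable = queue.take();
            runnable.run();
            executed++;
            // Optional: drop the reference so the finished task is not kept
            // reachable through this local variable. No System.gc() call.
            runnable = null;
        }
        return executed;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 40_000; i++) {
            queue.add(() -> { }); // placeholder task for the demo
        }
        System.out.println("executed " + drain(queue) + " tasks");
    }
}
```

The JVM remains free to collect the finished tasks whenever it chooses; nothing in the loop tries to schedule that.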

Fundamentally, you are not supposed to figure out when to collect garbage; that is up to the JVM. Line 5 is unnecessary and, as you've spotted, is detrimental to your application's performance.
Trying to force the GC repeatedly means there is something wrong with the design of your program.
Not necessarily: the GC doesn't run every time objects go out of scope or a stack frame is popped. The JVM determines when to do a GC run, which may be immediately or some time in the future.
What are you trying to achieve in this method? It looks like you have a collection of tasks you want to execute in parallel. If this is the case, you should be looking at ExecutorService to put tasks on a thread pool for you.
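As a rough illustration of the ExecutorService suggestion, the sketch below submits the tasks to a fixed-size thread pool instead of draining a queue by hand. The class name PoolExample, the method runAll, the counter task, and the pool size of 4 are all assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolExample {

    /** Submits taskCount trivial tasks to a pool and returns how many completed. */
    static int runAll(int taskCount, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                // The executor owns the work queue; each task is released
                // by the pool once it has run.
                pool.execute(completed::incrementAndGet);
            }
        } finally {
            pool.shutdown(); // accept no new tasks, let queued ones finish
        }
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed " + runAll(40_000, 4) + " tasks");
    }
}
```

For the per-thread message queue described in the question, Executors.newSingleThreadExecutor() may be a closer fit, since it pairs one worker thread with its own queue.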

Let the garbage collector do its work and, if necessary (and only then), tune it using VM arguments, but do not put a call to System.gc() in an inner loop like this. Such a call is pretty much guaranteed to kill performance, because it typically triggers a full garbage collection cycle, whereas normally the garbage collector 1) only collects when the heap is nearly full (so you can reduce or increase the number of collections by giving the VM a larger or smaller heap), and 2) only collects the youngest objects most of the time (i.e., generational garbage collection).
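Before tuning anything, it can help to measure how often the collector actually runs. The following is only a sketch using the standard java.lang.management API. The class name GcStats, the helper totalCollections, and the throwaway allocation workload are assumptions for illustration, and the counts will differ between JVMs and heap settings (for example, when the heap size is changed with -Xms and -Xmx).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {

    /** Sums the collection counts of all collectors; undefined counts (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();

        // Throwaway workload: allocate many short-lived objects.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[128];
            checksum += garbage.length;
        }

        long after = totalCollections();
        System.out.println("collections during workload: " + (after - before)
                + " (checksum " + checksum + ")");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same workload with different heap sizes gives a first impression of how heap size affects the number of collections, without any explicit System.gc() call.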
As for your particular case, the invisible state means that the object referenced by the variable runnable may still be strongly referenced at line 7, even though the local variable runnable is out of scope at that point. However, the VM reserves only one slot for the variable runnable, and that slot is reused in each iteration of your loop. As soon as you take the next element from the queue and store it in runnable, it overwrites the reference to the previous element, making the previous element eligible for garbage collection. In other words, at most one element is likely ever to be in this invisible state (per call to loop).

Just some thoughts that may or may not be of help.