Parallel garbage collector in C: how to avoid blocking the main thread
I am developing a parallel garbage collector. It is a tri-color marking collector that does the usual white -> grey -> black. When the collector moves an object from grey to black it descends into the object in order to mark the children grey. At that point it needs to take a lock to prevent the object from changing in the main thread while it is being read. Since it would be an insane memory requirement to give each object an independent lock, I have a single lock (per non-GC thread) that must be held before modifying an object. The GC will take that thread's lock before reading the object.
So the GC will be iterating over a thread's objects and taking the lock before reading children, then releasing it before the next iteration. I want to make sure the GC does not hog the lock too much. To me the obvious solution seems to be a 'yield' just after releasing the lock, so that the main thread can continue if it is waiting on the lock. The garbage collector is not a priority thread; it doesn't matter if it takes a long time to get its work done.
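For concreteness, the per-object step would look roughly like this (a minimal sketch; the object layout and the num_children/children/mark_grey/mark_black helpers are placeholders, not real code from my collector):

    #include <pthread.h>
    #include <sched.h>
    #include <stddef.h>

    /* Hypothetical object layout and marking helpers -- placeholders,
       not the collector's real API. */
    struct gc_object;
    size_t             num_children(struct gc_object *o);
    struct gc_object **children(struct gc_object *o);
    void               mark_grey(struct gc_object *o);
    void               mark_black(struct gc_object *o);

    /* One lock per mutator thread: the mutator takes it before modifying
       any of its objects, the GC takes it before reading them. */
    extern pthread_mutex_t mutator_lock;

    void blacken(struct gc_object *obj)
    {
        pthread_mutex_lock(&mutator_lock);   /* object can't change under us */
        size_t n = num_children(obj);
        for (size_t i = 0; i < n; i++)
            mark_grey(children(obj)[i]);     /* white -> grey for each child */
        mark_black(obj);                     /* grey -> black */
        pthread_mutex_unlock(&mutator_lock);

        /* The part in question: give a waiting mutator a chance to run
           before the GC loops around and takes the lock again. */
        sched_yield();
    }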
However, I am using pthreads (Linux), and when I google the sched_yield() function I find that it is considered harmful. Most of the results are arguments over what it should even be doing. In short, the consensus seems to be that if you're using sched_yield(), you are doing something wrong.
http://www.technovelty.org/code/c/sched_yield.html seems to propose an alternative, but I am having trouble grasping the key point of the algorithm, or specifically how to apply it to my needs.
1 Answer
On the subject of having a per-object lock, one approach I've used to keep the space requirements constrained is to have a circular array of locks (with some large but manageable number of locks in it, like say 8096). Then you put some arbiter in front of it that associates a given object with the next lock in the array (or if the object is already associated with a lock in the array, then the arbiter promotes that lock back to the front of the array).
This gives you the performance benefits of keeping a separate lock for each object, without the insane space requirements of actually having a distinct lock object for every distinct object instance. Of course, you'll have to tune the number of locks to your algorithm to ensure that the time it takes to cycle through the entire circular array of locks is always less than the amount of time required to process an object.
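For what it's worth, a simpler fixed mapping in the same spirit is plain lock striping: hash the object's address into a fixed array of mutexes. It drops the promote-to-front behaviour described above, but the space trade-off is the same (the stripe count and the pointer hash below are just illustrative):

    #include <pthread.h>
    #include <stdint.h>

    #define NUM_STRIPES 8192   /* power of two near the count suggested above */

    static pthread_mutex_t stripes[NUM_STRIPES];

    /* Call once at startup, before any object locking happens. */
    void stripes_init(void)
    {
        for (int i = 0; i < NUM_STRIPES; i++)
            pthread_mutex_init(&stripes[i], NULL);
    }

    static pthread_mutex_t *stripe_for(const void *obj)
    {
        /* Cheap pointer hash; two objects sharing a stripe is harmless,
           it just costs occasional extra contention. */
        uintptr_t h = (uintptr_t)obj;
        h ^= h >> 16;
        h *= 0x9E3779B1u;
        return &stripes[h % NUM_STRIPES];
    }

    void lock_object(const void *obj)   { pthread_mutex_lock(stripe_for(obj)); }
    void unlock_object(const void *obj) { pthread_mutex_unlock(stripe_for(obj)); }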
Or maybe a better approach would be to have the arbiter work more like a resource manager for a pool of locks that get "checked out" and "checked in". Then, when someone requests to check out a lock, the arbiter can immediately issue one so long as one is available in the pool (or one is already checked out for the same object instance); otherwise the request has to block until some other thread checks a lock back in.
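A compressed sketch of that check-out/check-in idea (the pool size, names, and binding policy are all illustrative): each slot binds one object to one mutex while it is in use, and a condition variable makes requesters wait when the pool is exhausted.

    #include <pthread.h>
    #include <stddef.h>

    #define POOL_SIZE 64

    struct lock_slot {
        const void     *owner;   /* object this slot is currently bound to */
        int             users;   /* how many threads have it checked out   */
        pthread_mutex_t mutex;   /* the actual per-object lock             */
    };

    static struct lock_slot pool[POOL_SIZE];
    static pthread_mutex_t  pool_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t   pool_cond  = PTHREAD_COND_INITIALIZER;

    /* Call once at startup. */
    void pool_init(void)
    {
        for (int i = 0; i < POOL_SIZE; i++) {
            pool[i].owner = NULL;
            pool[i].users = 0;
            pthread_mutex_init(&pool[i].mutex, NULL);
        }
    }

    /* Check a lock out for `obj`, blocking if the pool is exhausted. */
    struct lock_slot *checkout(const void *obj)
    {
        pthread_mutex_lock(&pool_mutex);
        for (;;) {
            struct lock_slot *free_slot = NULL;
            for (int i = 0; i < POOL_SIZE; i++) {
                if (pool[i].owner == obj) {        /* already checked out */
                    pool[i].users++;
                    pthread_mutex_unlock(&pool_mutex);
                    return &pool[i];
                }
                if (pool[i].users == 0 && !free_slot)
                    free_slot = &pool[i];
            }
            if (free_slot) {                       /* bind a free slot */
                free_slot->owner = obj;
                free_slot->users = 1;
                pthread_mutex_unlock(&pool_mutex);
                return free_slot;
            }
            /* Nothing free: wait for someone to check a lock back in. */
            pthread_cond_wait(&pool_cond, &pool_mutex);
        }
    }

    void checkin(struct lock_slot *slot)
    {
        pthread_mutex_lock(&pool_mutex);
        if (--slot->users == 0) {
            slot->owner = NULL;                    /* slot is free again */
            pthread_cond_signal(&pool_cond);
        }
        pthread_mutex_unlock(&pool_mutex);
    }

A caller would checkout(obj), lock the returned slot's mutex for the duration of the access, unlock it, and then checkin(slot) so the slot can be rebound to another object.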
Anyway, I'm not sure whether this is directly applicable to your question; I just wanted to point out that there are other workable options between "one lock for every object" and "only one lock ever" that may be useful within your usage model.