Is it safe to thread after forking?
I've learned that you should usually stick with either forking or threading to avoid running into very strange and extremely hard-to-debug problems, so until now that is exactly what I did. My problem is that when I stick with forking alone, creating many short-lived processes to hand chunks of work to gets more expensive the more CPU cores I want to feed, until at some point performance no longer scales reasonably. Using only threads, on the other hand, I have to be very careful about which libraries I use and generally be extremely defensive about thread safety, which eats up a lot of precious development time and forces me to give up some favourite libraries. So, even though I've been warned, the idea of mixing forking and threading does appeal to me on a number of levels.
Now, from what I've read so far, the problems always seem to arise when threads already exist at the moment the fork happens.
Suppose I design a system that starts up, daemonizes, forks off its main tiers, and never forks again after that; so far I'd be perfectly safe and robust. If some of those pre-forked tiers now started using threads to spread their workload over many CPU cores, so that no child process ever knows about another child's threads, would that still be safe? I can guarantee that each tier is itself thread-safe, and that the non-thread-safe tiers will never start a thread of their own.
While I feel fairly confident about this approach, I'd appreciate a few professional opinions on the matter, pointing out possible caveats, interesting points of view, links to further reading, and so on. The language I personally use is Perl on Debian, RedHat, SuSe, and OS X, but the topic should be general enough to apply to any language on any Un*x/BSD-like platform that behaves remotely POSIXish, maybe even Interix.
2 Answers
Not really.
However, you can use message queues instead of forking individual processes for each piece of work.
Create a pile of processes that all read from a common queue. Put their work into the queue. No more forking. Many small tasks fed from a common queue.
And no thread-safety questions.
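A minimal sketch of that pattern in Perl, assuming a SysV message queue (via the core IPC::Msg module) stands in for whatever queue mechanism you actually use; the worker count, message size, and QUIT sentinel are made-up illustration values:

```perl
#!/usr/bin/perl
# Sketch: a fixed pool of pre-forked workers all reading from one
# SysV message queue. Message delivery is atomic per message, so no
# locking is needed and no threads are involved.
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Msg;

my $WORKERS = 4;
my $queue = IPC::Msg->new(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR)
    or die "msgget failed: $!";

for my $id (1 .. $WORKERS) {
    defined(my $pid = fork()) or die "fork failed: $!";
    next if $pid;                              # parent keeps forking
    while (1) {                                # child: worker loop
        my $job;
        defined($queue->rcv($job, 8192)) or die "msgrcv failed: $!";
        last if $job eq 'QUIT';                # shutdown sentinel
        print "worker $id (pid $$) handling: $job\n";
    }
    exit 0;
}

# Parent: feed many small tasks into the one common queue, then
# tell every worker to shut down and reap them.
$queue->snd(1, "task $_") for 1 .. 20;
$queue->snd(1, 'QUIT')    for 1 .. $WORKERS;
wait() for 1 .. $WORKERS;
$queue->remove;                                # delete the queue
```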
Your approach is fine under POSIX, as long as you don't create any MAP_SHARED shared memory regions that are shared among the forked processes. Once the processes are forked, they are independent. See the POSIX documentation on fork().
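For what it's worth, here is a minimal Perl sketch of the ordering under discussion, assuming a perl built with ithreads; the tier and thread counts are made up for illustration. The parent is still single-threaded when it forks its tiers, and threads are created only inside the children, which never fork again:

```perl
#!/usr/bin/perl
# Sketch of the fork-first, thread-later order: fork the long-lived
# tiers while single-threaded, then let each child spread its own
# workload over threads.
use strict;
use warnings;
use threads;

my $TIERS            = 2;   # pre-forked worker processes
my $THREADS_PER_TIER = 4;   # threads started only after the fork

for my $tier (1 .. $TIERS) {
    defined(my $pid = fork()) or die "fork failed: $!";
    next if $pid;                        # parent: keep forking tiers

    # Child: this tier never forks again, it only threads.
    my @workers = map {
        threads->create(sub {
            my $n = shift;
            print "tier $tier (pid $$) thread $n working\n";
        }, $_);
    } 1 .. $THREADS_PER_TIER;
    $_->join for @workers;
    exit 0;
}

wait() for 1 .. $TIERS;                  # parent reaps its tiers
```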