Is it safe to access the hard drive via many different GCD queues?

Is it safe? For instance, if I create a bunch of different GCD queues that each compress (tar cvzf) some files, am I doing something wrong? Will the hard drive be destroyed?

Or does the system properly take care of such things?
3 Answers
Dietrich's answer is correct save for one detail (that is completely non-obvious).
If you were to spin off, say, 100 asynchronous tar executions via GCD, you'd quickly find that you have 100 threads running in your application (which would also be dead slow due to gross abuse of the I/O subsystem). In a fully asynchronous concurrent system with queues, there is no way to know whether a particular unit of work is blocked because it is waiting for a system resource or waiting for some other enqueued unit of work. Therefore, any time anything blocks, you pretty much have to spin up another thread and consume another unit of work, or risk locking up the application.
In such a case, the "obvious" solution is to wait a bit when a unit of work blocks before spinning up another thread to de-queue and process another unit of work with the hope that the first unit of work "unblocks" and continues processing.
Doing so, though, would mean that any asynchronous concurrent system with interaction between units of work -- a common case -- would be so slow as to be useless.
Far more effective is to limit the number of units of work that are enqueued in the global asynchronous queues at any one time. A GCD semaphore makes this quite easy: you have a single serial queue into which all units of work are enqueued. Every time you dequeue a unit of work, you increment the semaphore. Every time a unit of work is completed, you decrement the semaphore. As long as the semaphore is below some maximum value (say, 4), you enqueue a new unit of work.
If you take something that is normally I/O limited, such as tar, and run a bunch of copies in GCD, only N tasks will run at a time, which is the point of GCD, so "a billion queue entries" and "ten queue entries" give you the same thing if you have fewer than 10 threads.
Even though this question was asked back in May, it's still worth noting that GCD has now provided I/O primitives with the release of 10.7 (OS X Lion). See the man pages for dispatch_read and dispatch_io_create for examples on how to do efficient I/O with the new APIs. They are smart enough to properly schedule I/O against a single disk (or multiple disks) with knowledge of how much concurrency is, or is not, possible in the actual I/O requests.