Pipes vs. temporary files

Posted 2024-11-28 06:40:13

Is there a big performance difference between:

  • Process A writing to a temp file, and process B reading that file
  • Process A writing to a pipe, and process B reading from that pipe

I'm curious to know what the answer is for both Windows and *nix.

EDIT: I should have asked: Does the buffer cache eliminate the difference between a temp file and a pipe?

Comments (4)

寒冷纷飞旳雪 2024-12-05 06:40:13

One big difference is that with the pipe, processes A and B can be running concurrently, so that B gets to work on the output from A before A has finished producing it. Further, the size of the pipe is limited, so A won't be able to produce vastly more data than B has consumed; it will be made to wait for B to catch up.
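To make the "limited size" point concrete, here is a minimal Python sketch that fills a pipe nobody is reading from and reports how many bytes it accepted. It assumes a POSIX system (it relies on `os.set_blocking`), and `pipe_capacity` is a hypothetical helper name, not something from the answer above. The number you get depends on the OS and its settings.

```python
import os


def pipe_capacity(chunk_size: int = 4096) -> int:
    """Fill a fresh pipe that has no active reader; return the bytes accepted."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode: a full pipe raises BlockingIOError instead of stalling.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    written = 0
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The kernel buffer is full. A normal (blocking) writer would be
        # suspended at this point until the reader drains some data.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


if __name__ == "__main__":
    print(f"pipe accepted {pipe_capacity()} bytes before filling up")
```

In blocking mode, the point where this sketch catches `BlockingIOError` is exactly where process A would be made to wait for B.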

If the volume of data is big, then writing to the temporary file involves some disk activity, even if only for creating and then destroying the file. The data itself might well stay in the in-memory buffer pools (so no disk I/O there), even for surprisingly large files. Writing to the pipe 'never' involves writing to disk.
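A rough way to see the buffer-cache effect is to time a temp-file write with and without forcing the data to storage. The sketch below is Python; `time_temp_file_write` is a hypothetical helper, and the result depends heavily on the OS, the filesystem (for example, whether the temp directory is memory-backed), and the hardware, so treat it as an experiment rather than a benchmark.

```python
import os
import tempfile
import time


def time_temp_file_write(payload: bytes, force_to_disk: bool) -> float:
    """Return the seconds spent creating, writing, and deleting a temp file."""
    start = time.perf_counter()
    with tempfile.NamedTemporaryFile() as tmp:  # file is created here
        tmp.write(payload)
        tmp.flush()  # hand the bytes from Python's buffer to the OS
        if force_to_disk:
            # Ask the OS to push its cached pages to the storage device.
            os.fsync(tmp.fileno())
    # Leaving the "with" block closes and deletes the file.
    return time.perf_counter() - start


if __name__ == "__main__":
    data = b"x" * (16 * 1024 * 1024)
    print("cached write:", time_temp_file_write(data, force_to_disk=False))
    print("synced write:", time_temp_file_write(data, force_to_disk=True))
```

Without the `fsync` call, the write normally returns once the data sits in the OS cache, which is the situation the paragraph above describes.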

貪欢 2024-12-05 06:40:13

The big difference is that the first method actually uses on-disk storage, whereas a pipe will use memory (unless you get really pedantic and start thinking about swap space).

Performance-wise, memory is faster than disk (almost always). This should be generally true for all operating systems.

The only time when using a temp file really makes sense is if process B has to examine the data in multiple passes (like certain kinds of video encoding). For this use, the whole data stream would need to be buffered, and if there were enough data, that would probably negate the in-memory advantage. So for multi-pass (seek-bound) operations, go with a temp file.
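The multi-pass point comes down to seekability: a reader can rewind a file but not a pipe. A small Python sketch, assuming a POSIX system (where seeking on a pipe fails with ESPIPE); `can_rewind` and `demo` are hypothetical names used only for illustration:

```python
import os
import tempfile


def can_rewind(fd: int) -> bool:
    """Return True if the descriptor can be repositioned to its start."""
    try:
        os.lseek(fd, 0, os.SEEK_SET)
        return True
    except OSError:
        return False  # on POSIX, pipes fail here with ESPIPE ("Illegal seek")


def demo() -> tuple:
    """Return (pipe_is_seekable, temp_file_is_seekable)."""
    read_fd, write_fd = os.pipe()
    try:
        pipe_ok = can_rewind(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)

    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"frame data")
        tmp.flush()
        file_ok = can_rewind(tmp.fileno())
    return pipe_ok, file_ok


if __name__ == "__main__":
    print(demo())
```

A consumer that needs a second pass over piped input therefore has to keep its own copy of the stream, in memory or in a file of its own.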

却一份温柔 2024-12-05 06:40:13

Unless my understanding of pipes is completely off the wall, the answer is YES.

Writing to a temp file involves disk access, and the associated overhead.

Writing to a pipe, and reading from it, happens in memory. Much faster.

为你拒绝所有暧昧 2024-12-05 06:40:13

I thought a practical answer might help. I'm speed-optimizing a script I use that has about 4 steps. I set it up to use piping and non-piping methods. This is under Windows 7 64-bit.

I got a 3% slowdown from not using piping. That is worth it for me, because now I can stop between steps and update the window title, which I couldn't do when it was all one command.

Personally, I'll take that 3% hit for the window titles.

For the curious: I am grepping a file larger than 20M, passing the output to a specialized Perl script that modifies the results, sorting them with Windows' built-in SORT.EXE, deduplicating them with Cygwin's UNIQ.EXE, and then re-grepping those same results to get ANSI-based grep result coloring. Most of the time is spent in the sorting phase.
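For anyone who wants to try the same comparison, here is a minimal Python sketch of the two arrangements. The two steps are hypothetical stand-ins (a generator and a sorter run as child Python processes), not the actual grep/perl/sort/uniq commands above, and the function names are my own.

```python
import subprocess
import sys
import tempfile

# Hypothetical stand-ins for the real pipeline steps.
GENERATE = "import sys; sys.stdout.write('pear\\napple\\nfig\\n')"
SORT = "import sys; sys.stdout.writelines(sorted(sys.stdin))"


def run_piped() -> str:
    """Run both steps concurrently, connected by an OS pipe."""
    first = subprocess.Popen([sys.executable, "-c", GENERATE],
                             stdout=subprocess.PIPE)
    second = subprocess.Popen([sys.executable, "-c", SORT],
                              stdin=first.stdout,
                              stdout=subprocess.PIPE, text=True)
    first.stdout.close()  # the parent no longer needs its copy of the pipe
    output, _ = second.communicate()
    first.wait()
    return output


def run_stepwise() -> str:
    """Run the steps one after another, with a temp file in between."""
    with tempfile.TemporaryFile() as tmp:
        subprocess.run([sys.executable, "-c", GENERATE],
                       stdout=tmp, check=True)
        # The pause between steps is where something like a window-title
        # update could go.
        tmp.seek(0)  # rewind so the next step reads from the beginning
        done = subprocess.run([sys.executable, "-c", SORT], stdin=tmp,
                              stdout=subprocess.PIPE, text=True, check=True)
    return done.stdout


if __name__ == "__main__":
    print(run_piped() == run_stepwise())
```

Wrapping each function in a timer gives a comparison like the 3% figure, though with a blocking step such as sort in the chain, most of the time will likely be spent there either way.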
