popen performance in C
I'm designing a program I plan to implement in C and I have a question about the best way (in terms of performance) to call external programs. The user is going to provide my program with a filename, and then my program is going to run another program with that file as input. My program is then going to process the output of the other program.
My typical approach would be to redirect the other program's output to a file and then have my program read that file when it's done. However, I understand I/O operations are quite expensive and I would like to make this program as efficient as possible.
I did a little bit of looking and I found the popen function for running system commands and grabbing the output. How does the performance of this approach compare to the performance of the approach I just described? Does popen simply write the external program's output to a temporary file, or does it keep the program output in memory?
Alternatively, is there another way to do this that will give better performance?
Comments (5)

On Unix systems, popen will pass data through an in-memory pipe. Assuming the data isn't swapped out, it won't hit disk. This should give you just about as good performance as you can get without modifying the program being invoked.

popen does pretty much what you are asking for: it does the pipe-fork-exec idiom and gives you a file pointer that you can read from or write to. Note that POSIX popen is one-directional: the mode is either "r" or "w", so a single call gives you the child's output or its input, not both.
However, there is a limitation on the size of the pipe buffer (~4K iirc; 64 KiB by default on current Linux), and if you aren't reading quickly enough, the other process could block.
Do you have access to shared memory as a mount point? [On Linux systems there is a /dev/shm mountpoint.]
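For readers who have not seen the pipe-fork-exec idiom mentioned above, here is a minimal sketch of roughly what popen(cmd, "r") does for you. The helper name run_and_capture and its buffer-based interface are made up for illustration; a real popen implementation returns a FILE * and handles more corner cases (signals, EINTR, already-closed standard descriptors).

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): run `sh -c cmd`, copy up to
 * cap-1 bytes of its stdout into buf, NUL-terminate it, and return the number
 * of bytes stored, or -1 on error. */
ssize_t run_and_capture(const char *cmd, char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make the pipe's write end its stdout, then exec the shell. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: close the unused write end so read() can see end-of-file. */
    close(fds[1]);

    char scratch[4096];
    size_t used = 0;
    ssize_t n;
    /* Keep draining even after buf is full, so the child never blocks
     * on a full pipe. */
    while ((n = read(fds[0], scratch, sizeof scratch)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(buf + used, scratch, take);
        used += take;
    }
    close(fds[0]);
    buf[used] = '\0';

    int status;
    waitpid(pid, &status, 0); /* reap the child so it does not become a zombie */
    return n == -1 ? -1 : (ssize_t)used;
}
```

Calling run_and_capture("printf 'hello'", buf, sizeof buf) with a 64-byte buf should leave "hello" in buf. Nothing touches the disk here: the data only travels through the kernel's pipe buffer.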

1) popen keeps the program output in memory. It actually uses pipes to transfer data between the processes.
2) popen looks, IMHO, like the best option for performance. It also has an advantage over files: lower latency. That is, your program can consume the other program's output on the fly, while it is being produced. If this output is large, you don't have to wait until the other program has finished to start processing it.
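To illustrate the "on the fly" point, here is a small sketch that handles each line of the child's output as soon as it arrives. The function name count_output_lines is hypothetical, and counting newlines simply stands in for whatever per-line processing your program needs.

```c
#define _POSIX_C_SOURCE 200809L /* expose popen/pclose under strict -std= modes */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the newline-terminated lines that `cmd` writes
 * to stdout, handling each piece as it arrives. Returns -1 on error. */
long count_output_lines(const char *cmd)
{
    FILE *fp = popen(cmd, "r"); /* "r": we read the child's stdout */
    if (fp == NULL)
        return -1;

    char line[4096];
    long lines = 0;
    /* fgets returns as soon as a full line (or a full buffer) is available,
     * so work starts while the child is still running. */
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n')
            lines++; /* real per-line processing would go here */
    }

    /* pclose waits for the child and returns its wait status, or -1 on error. */
    return pclose(fp) == -1 ? -1 : lines;
}
```

For example, count_output_lines("printf 'a\\nb\\nc\\n'") should return 3. Lines longer than the buffer arrive in several fgets calls, which is why only pieces ending in a newline are counted.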

The problem with having your subcommand redirect to a file is that it's potentially insecure, while popen communication can't be intercepted by another process. Plus, you need to make sure the filename is unique if you're running several instances of your master program (and thus of your subcommand). The popen solution doesn't suffer from this.
The performance of popen is just fine as long as you don't read/write one-byte chunks. Always read/write multiples of 512 (like 4096). But that applies to file operations as well.
popen connects your process and the child process through a pipe, so if you don't read, the pipe fills up and the child can't write, and vice versa. So all the exchanged data is in memory, but only a small amount at any one time.
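As a sketch of the chunked-read advice, the following reads the child's output in 4096-byte blocks with fread. The name count_output_bytes is made up for illustration, and summing byte counts is a placeholder for real processing. Since a FILE * from popen is already buffered by stdio, the chunk size probably matters most if you drop down to raw read() calls on the descriptor.

```c
#define _POSIX_C_SOURCE 200809L /* expose popen/pclose under strict -std= modes */
#include <stdio.h>

/* Hypothetical helper: read everything `cmd` writes to stdout in 4096-byte
 * chunks and return the total byte count, or -1 on error. */
long count_output_bytes(const char *cmd)
{
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    char chunk[4096]; /* a multiple of 512, as suggested above */
    long total = 0;
    size_t n;
    /* Each fread returns up to one full chunk; a short or zero count
     * means end-of-file or an error. */
    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long)n; /* real chunk processing would go here */

    int failed = ferror(fp);
    if (pclose(fp) == -1 || failed)
        return -1;
    return total;
}
```

For example, count_output_bytes("printf 'hello world'") should return 11. Because the loop keeps reading until end-of-file, the child is never left blocked on a full pipe.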

(Assuming Unix or Linux)
Writing to the temp file may be slow if the file is on a slow disk. It also means the entire output will have to fit on the disk.
popen connects to the other program using a pipe, which means that output will be sent to your program incrementally. As it is generated, it is copied to your program chunk by chunk.