Identify whether the programs "before" and "after" a tool in a pipeline are from the same "toolset"
Say I am writing a toolset in which every tool operates on the same textual data stream: it parses the stream, performs some operation on it, and returns a textual stream using the same syntax as the original input. The tools can be combined (together with other UNIX tools, scripts, or whatever) in a pipeline. Because processing (parsing) the textual input is quite expensive, I would like to avoid it when two or more tools from the toolset sit right after one another in the pipeline, and use binary streams instead (to load directly into an in-memory struct, without useless "extra" parsing). Is it possible to know (using some trick, inter-process communication, or whatever else) whether the tool "before" or "after" any tool in a pipeline is part of the toolset? I guess the UNIX environment is not prepared for this sort of "signalling" (AFAIK). Thanks for your ideas...
Comments (3)
Another way would be to have all the tools read either textual or binary representations, perhaps indicated by a magic number at the beginning of the file. And a command-line option could select the output format.
Depending on the usage, it may be preferable to make binary the "default", and select text-output with an option.
vs.
You don't need a magic number for the text format if the binary magic number consists of non-ASCII bytes.
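The magic-number idea above could be sketched roughly as follows. Everything named here is an assumption made for illustration: the `MAGIC` value, the toy `key=value` text syntax, and the use of `pickle` as a stand-in for whatever binary layout the toolset would really use (and `pickle` should only ever be loaded from trusted upstream tools). The first magic byte is `0x89`, which cannot start valid UTF-8 text, so the text format needs no marker of its own.

```python
import pickle

# Hypothetical magic number; 0x89 is not ASCII and cannot begin UTF-8 text.
MAGIC = b"\x89TSB1"

def parse_text(text):
    """Toy stand-in for the expensive parser: 'key=value' lines to tuples."""
    return [tuple(line.split("=", 1)) for line in text.splitlines() if line]

def format_text(records):
    """Inverse of parse_text: tuples back to 'key=value' lines."""
    return "".join(f"{key}={value}\n" for key, value in records)

def read_records(stream, parser=parse_text):
    """Read records from a binary stream, skipping the parser for binary input."""
    head = stream.read(len(MAGIC))
    if head == MAGIC:
        # Upstream neighbour wrote the binary form: no text parsing needed.
        return pickle.load(stream)
    # Not binary: the bytes already consumed belong to the text, so put them back.
    return parser((head + stream.read()).decode("utf-8"))

def write_records(stream, records, binary):
    """Write records either as MAGIC + binary payload or as plain text."""
    if binary:
        stream.write(MAGIC)
        pickle.dump(records, stream)
    else:
        stream.write(format_text(records).encode("utf-8"))
```

In a real tool the streams would be `sys.stdin.buffer` and `sys.stdout.buffer`, and the `binary` argument would come from the command-line option mentioned above.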
No, processes that are piped together have no means of two-way communication. If the parsing is really so expensive that this is necessary (I'd guess it isn't, but profile it), then you have two options that I can think of:
If users are expected to be knowledgeable enough, have each tool accept flags telling it to expect binary input and to give binary output, so that users can chain them like:
    tool1 -o | tool2 -i -o | tool3 -i -o | tool4 -i
where -o means give binary output and -i means accept binary input.
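A minimal sketch of a tool that honours the -i and -o flags described in this answer might look like the following. Only the two flag names come from the answer; the `run` function, the toy `key=value` syntax, and `pickle` as the binary encoding are assumptions for illustration, and `pickle` input should only be accepted from trusted tools.

```python
import argparse
import pickle

def build_parser():
    parser = argparse.ArgumentParser(description="toolset member (sketch)")
    parser.add_argument("-i", dest="binary_in", action="store_true",
                        help="accept binary input")
    parser.add_argument("-o", dest="binary_out", action="store_true",
                        help="give binary output")
    return parser

def parse_text(text):
    """Toy stand-in for the expensive parser: 'key=value' lines to tuples."""
    return [tuple(line.split("=", 1)) for line in text.splitlines() if line]

def format_text(records):
    """Inverse of parse_text."""
    return "".join(f"{key}={value}\n" for key, value in records)

def run(argv, stdin, stdout, transform):
    """Read records (binary or text), apply transform, write them back out."""
    args = build_parser().parse_args(argv)
    data = stdin.read()
    if args.binary_in:
        records = pickle.loads(data)          # no text parsing at all
    else:
        records = parse_text(data.decode("utf-8"))
    records = transform(records)
    if args.binary_out:
        stdout.write(pickle.dumps(records))
    else:
        stdout.write(format_text(records).encode("utf-8"))

# A real tool would end with something like:
#   run(sys.argv[1:], sys.stdin.buffer, sys.stdout.buffer, my_transform)
```

The cost of this design is that a mistake in the flags (say, -o feeding a tool outside the toolset) produces garbage rather than an error, which is the reason the answer restricts it to knowledgeable users.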
You can certainly have the processes in the tool chain talk, but it requires a bit of work. One idea is to have each process in the toolset use the pgid (the pgid for each process in the pipeline is the same) to determine a shared memory name, and then write its pid and the inodes of its input streams into the shared memory. Then each process in the toolset will know which other processes in the pipeline are also in the toolset. If a process finds that the inode of its own pipe matches one recorded by another process, it knows that its neighbor is in the toolset.
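The pgid-and-inode idea could be sketched along these lines. This is not the answer's exact mechanism: marker files in a per-pgid temporary directory stand in for shared memory, and the helper names (`pipe_id`, `registry_dir`, `announce`, `peer_announced`) are made up for the example. It assumes Linux, where `fstat` reports the same inode for both ends of a pipe, and it ignores the startup race (a neighbor may not have announced itself yet when a tool checks).

```python
import os
import stat
import tempfile

def pipe_id(fd):
    """Return (device, inode) if fd is a pipe, else None."""
    info = os.fstat(fd)
    if not stat.S_ISFIFO(info.st_mode):
        return None
    return (info.st_dev, info.st_ino)

def registry_dir(pgid=None):
    """Per-pipeline rendezvous directory, named after the process group id."""
    pgid = os.getpgrp() if pgid is None else pgid
    path = os.path.join(tempfile.gettempdir(),
                        f"toolset-{os.getuid()}-{pgid}")
    os.makedirs(path, mode=0o700, exist_ok=True)
    return path

def announce(fd, role, directory):
    """Record that a toolset member holds fd as its 'reader' or 'writer' end."""
    ident = pipe_id(fd)
    if ident is None:
        return None
    marker = os.path.join(directory, f"{role}-{ident[0]}-{ident[1]}")
    with open(marker, "w") as handle:
        handle.write(str(os.getpid()))
    return marker

def peer_announced(fd, peer_role, directory):
    """True if a toolset member announced the other end of the pipe behind fd."""
    ident = pipe_id(fd)
    if ident is None:
        return False
    marker = os.path.join(directory, f"{peer_role}-{ident[0]}-{ident[1]}")
    return os.path.exists(marker)
```

A tool would call `announce(0, "reader", d)` and `announce(1, "writer", d)` at startup, then switch to binary input if `peer_announced(0, "writer", d)` holds and to binary output if `peer_announced(1, "reader", d)` holds. Stale directories would also need cleaning up, which the sketch leaves out.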