Using parallel AWK - has anyone heard of it?

Posted 2024-10-06 17:38:05

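Does such a thing exist? Can anyone explain it? I have been using AWK for simple tasks such as printing columns and merging large data files, but not for computation. I am wondering whether AWK can be run in parallel across all the nodes and CPUs of my computer or network. If so, how? And what is the main purpose of using a parallel AWK?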

Thanks for your input.

After posting the question, I found that Parallel AWK does exist. You can find more information here: http://www.parallel-awk.org/

萌面超妹 2024-10-13 17:38:05

The problem with a parallel awk implementation is that awk's semantics explicitly assume that records are processed in order. For example:

awk '{print NR, $0}'

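gives you output akin to cat -n. The difficulty with running this in parallel is that NR is the running total of records read so far across all input, not just the count within the current file (that is FNR), so a worker cannot compute it without knowing how many records precede its share of the input.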
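
To make the NR point concrete, here is a minimal sketch. It numbers four lines in one pass, then again after splitting the input into two chunks handled by separate awk processes. The chunks run one after another here to keep the output deterministic; they only stand in for parallel workers. The sample data, the file names and the offset variable are made up for illustration.

```shell
#!/bin/sh
# Sample input; the contents are arbitrary.
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$tmp/all.txt"

# Sequential run: NR counts every record read so far.
seq_out=$(awk '{print NR, $0}' "$tmp/all.txt")

# Chunked run: each chunk gets its own awk process,
# so NR restarts at 1 in every chunk.
head -n 2 "$tmp/all.txt" > "$tmp/chunk1.txt"
tail -n 2 "$tmp/all.txt" > "$tmp/chunk2.txt"
par_out=$(
  awk '{print NR, $0}' "$tmp/chunk1.txt"
  awk '{print NR, $0}' "$tmp/chunk2.txt"
)

# Global numbering comes back only if each worker is told
# how many records precede its chunk.
fixed_out=$(
  awk -v offset=0 '{print NR + offset, $0}' "$tmp/chunk1.txt"
  awk -v offset=2 '{print NR + offset, $0}' "$tmp/chunk2.txt"
)

printf '%s\n' "sequential:" "$seq_out" "chunked:" "$par_out" \
  "chunked with offsets:" "$fixed_out"
rm -rf "$tmp"
```

The offsets have to be known before the workers start, which means counting the records in every earlier chunk first. That extra pass is the sequential dependency described above.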

Also, there are more complicated tricks involving commands like getline that cannot be parallelized (for example, a script can be short-circuited to emulate the GNU nextfile extension).
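
As a small illustration of why getline resists parallelization, the sketch below joins each record with the one that follows it. It is not the nextfile trick itself, just a simpler case of the same dependency: getline pulls the next record from the same input stream, so the result depends on where the stream starts. The input values and the variable names are assumptions chosen for the example.

```shell
#!/bin/sh
# Join each record with the one after it. getline reads the next record
# from the same stream; at end of input it fails and $0 is left unchanged.
pair='{ first = $0; if ((getline) > 0) print first, $0; else print first }'

# One pass over the whole input pairs 1 with 2 and 3 with 4.
whole=$(printf '1\n2\n3\n4\n' | awk "$pair")

# Splitting the same input after the first line shifts every pair boundary.
split=$(
  printf '1\n' | awk "$pair"
  printf '2\n3\n4\n' | awk "$pair"
)

printf '%s\n' "whole input:" "$whole" "split input:" "$split"
```

The same program gives different pairings depending on where the input is cut, so a parallel runner could not split the input at arbitrary line boundaries without changing the result.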
