How do I insert a new line before concatenating?
I have about 80000 files which I am trying to concatenate. This one:
cat files_*.raw >> All
is extremely fast whereas the following:
for f in `ls files_*.raw`; do cat $f >> All; done;
is extremely slow. For this reason, I am trying to stick with the first option, except that I need to insert a newline after each file is concatenated to All. Is there a fast way of doing this?
What about
That will insert an extra newline at the end of each file as you concatenate them.
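The command this answer refers to is not shown here. As a rough sketch of one way to get the described effect (not necessarily what the answerer wrote), GNU sed can append an empty line after the last line of every input file in a single process. This assumes GNU sed, since `-s` is a GNU extension, and the scratch directory and sample files below are made up for illustration.

```shell
# Scratch directory with two stand-in input files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\n' > files_1.raw
printf 'beta\n'  > files_2.raw

# -s treats every file as a separate stream, so "$" matches the last
# line of each file. G appends a newline plus the (empty) hold space,
# which prints as one extra empty line after that last line.
sed -s '$G' files_*.raw >> All
```

Because only one sed process runs and All is opened once, this should behave much like the fast `cat files_*.raw >> All` case. Empty input files produce no output at all, so they get no separator.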
And a parallel version if you don't care about the order of concatenation:
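The parallel command is also missing from the text. The following is only a sketch of how such a variant might look, assuming an `xargs` that supports `-0` and `-P` (GNU and BSD do, POSIX does not). The batch size, worker count, `part.*` file names and sample files are all arbitrary choices for illustration. Each worker appends to its own part file so that a file's data and its trailing newline cannot be separated by another worker's output.

```shell
# Scratch directory with six stand-in input files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 1 2 3 4 5 6; do
  printf 'data %s\n' "$i" > "files_$i.raw"
done

# Up to 3 workers run at once, each handling a batch of 2 files.
# Every worker appends to a part file named after its own PID, so
# concurrent workers never write to the same file.
printf '%s\0' files_*.raw |
  xargs -0 -n 2 -P 3 sh -c 'for f in "$@"; do cat "$f"; echo; done >> "part.$$"' sh

# Stitch the parts together; the order of files in All is not guaranteed.
cat part.* >> All
rm -f part.*
```

Whether this is actually faster than a sequential run depends on the disk and the file sizes, since the work is mostly I/O.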
The second command might be slow because you are opening the 'All' file for append 80000 times vs. 1 time in the first command. Try a simple variant of the second command:
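The variant itself is not shown, but the idea described above can be sketched as follows: move the redirection outside the loop so that All is opened once for the whole loop rather than once per file. The `echo` supplies the newline asked for in the question. The scratch directory and sample files are stand-ins for illustration.

```shell
# Scratch directory with two stand-in input files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\n' > files_1.raw
printf 'beta\n'  > files_2.raw

# The glob is expanded by the shell directly, so no `ls` is needed.
# The single ">> All" after "done" applies to the whole loop.
for f in files_*.raw; do
  cat "$f"
  echo
done >> All
```

This still starts one `cat` process per file, so it may remain slower than a single `cat files_*.raw`, but it avoids reopening All 80000 times.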
I don't know why it would be slow, but I don't think you have much choice:
Each time awk opens another file to process, FNR (the per-file record number) starts over, so the first record of each file has FNR equal to 1, so:
Note, it's all done in one awk process. Performance should be close to the cat command from the question.
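The awk program from this answer is not shown. A minimal sketch of the idea, assuming the inputs are line-oriented text, is to print an empty line whenever a new file starts (FNR is 1 but the overall record counter NR is greater than 1), and once more at the end so the last file also gets its newline. The scratch directory and sample files are stand-ins for illustration.

```shell
# Scratch directory with two stand-in input files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\n' > files_1.raw
printf 'beta\n'  > files_2.raw

# FNR == 1 && NR > 1: first record of every file except the first one.
# The END rule adds the separator after the final file.
awk 'FNR == 1 && NR > 1 { print "" }
     { print }
     END { if (NR > 0) print "" }' files_*.raw >> All
```

Two caveats: awk's `print` always ends a record with a newline, so a file that lacks a trailing newline gains one, and an empty file contributes no records and therefore no separator. If the .raw files hold binary data, a line-based tool like awk may not reproduce them byte for byte.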