Adding TREC-format tags to thousands of files
I need to add tags (for example, around the text of the file) to thousands of files in a directory. I tried doing this with cat, redirecting its output to a file:
for file in *
do
    cat ../gau > temp        # gau contains the format I need to add to each file
    echo "$file" >> temp
    cat ../gau_ >> temp      # contains </DOCID>
    cat "$file" >> temp
    cat ../gau1 >> temp      # contains the last line, </DOC>
    cat temp > "$file"
done
But this is very slow. Can you please suggest a better, more efficient way to do it? Is it possible to do this in C? How can we open files in batches, process them, and write them back? That might speed up the process, since I suppose opening and writing files is the bottleneck.
Is there any ready-made program (efficient and fast) that does this job? We are short on time.
Comments (2)
This is some quick Python code. Try it; it should run faster than your batch script:
I haven't tested it, though.
Don't use "cat temp > $file"; just use "mv temp $file". You don't need to rewrite the file, just rename it. That is certainly one of the causes of the bad performance.
You might also want to choose more descriptive filenames than "gau", "gau_" and "gau1".
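To illustrate the rename suggestion, here is a minimal sketch of the loop rewritten so that each file is assembled once in a temporary file and then moved into place. The function name wrap_trec_files and its argument order are hypothetical, chosen only for this example; the three fragment files stand in for ../gau, ../gau_ and ../gau1 from the question. It has not been benchmarked against the original loop.

```shell
#!/bin/sh
# Sketch only: wrap_trec_files is a hypothetical helper, not an existing tool.
# Arguments: directory, header fragment, middle fragment, footer fragment
# (../gau, ../gau_ and ../gau1 in the question).
wrap_trec_files() {
    dir=$1
    header=$2
    middle=$3
    footer=$4
    for file in "$dir"/*; do
        [ -f "$file" ] || continue      # skip directories and an unmatched glob
        # Hidden temp file in the same directory, so mv is a plain rename
        # and the * glob never picks it up.
        tmp=$(mktemp "$dir/.wrap.XXXXXX") || return 1
        {
            cat "$header"
            printf '%s\n' "${file##*/}" # bare file name, like `echo $file`
            cat "$middle" "$file" "$footer"
        } > "$tmp" && mv "$tmp" "$file" || { rm -f "$tmp"; return 1; }
    done
}
```

Run from inside the directory of documents, the call would look like this:
wrap_trec_files . ../gau ../gau_ ../gau1
This opens the output once per file instead of five times and replaces the final copy with a rename. One caveat: mktemp creates the temporary file with restrictive permissions, so the resulting files may end up with different permissions than the originals.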