Splitting a concatenated file based on header text
I have a few very large files which are basically a concatenation of several small files and I need to split them into their constituent files. I also need to name the files the same as the original files.
For example, the files QMAX123 and QMAX124 have been concatenated to:
;QMAX123 - Student
... file content ...
;QMAX124 - Course
... file content ...
I need to recreate the file QMAX123 as
;QMAX123 - Student
... file content ...
and QMAX124 as
;QMAX124 - Course
... file content ...
The original file's header, ;QMAX<some number>, is unique and only appears as a header in the file.
I used the script below to split the content of the files, but I haven't been able to adapt it to get the file names right.
awk '/^;QMAX/{close("file"f);f++}{print $0 > "file"f}' <filename>
So I can either adapt that script to name the file correctly or I can rename the split files created using the script above based on the content of the file, whichever is easier.
I'm currently using cygwin bash (which has perl and awk) if that has any bearing on your answer.
Comments (2)
The following Perl should do the trick.
Note that it discards any input before the first line that carries a filename:
next unless defined $F ;
You may care to generate an error or add a default file instead. Let me know and I can change it.
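For illustration, the "default file" option could look roughly like this in awk, which the asker's cygwin setup also has. This is only a sketch and not the Perl from this answer. The fallback name preamble.txt, the input name combined_with_preamble.txt, and the sample lines are all made up for the example.

```shell
# Hypothetical sample input: two stray lines, then one headed section.
printf '%s\n' 'stray line 1' 'stray line 2' ';QMAX123 - Student' 'student data' > combined_with_preamble.txt

# out starts as a fallback file name (an assumption), so lines that
# appear before the first header are kept instead of being discarded.
awk -v out="preamble.txt" '
    /^;QMAX/ { close(out); out = substr($1, 2) }  # ";QMAX123" -> "QMAX123"
    { print > out }
' combined_with_preamble.txt
```

With this input, the two stray lines end up in preamble.txt and the header plus its content end up in QMAX123. If the input starts with a header, the fallback file is never created.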
With Awk, it's as simple as
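One way to write it, as a sketch and not necessarily the exact command meant here: take the output name from the header line itself instead of a counter. It assumes the file name is the first whitespace-delimited field of the header with the leading ";" removed. The input name combined.txt and the sample content are made up for the example.

```shell
# Hypothetical sample input mirroring the layout in the question.
printf '%s\n' ';QMAX123 - Student' 'student data' ';QMAX124 - Course' 'course data' > combined.txt

awk '
    # On a header line, close the previous output file and derive the
    # new file name from the first field with the leading ";" removed.
    /^;QMAX/ { if (out != "") close(out); out = substr($1, 2) }
    # Copy every line, header included; lines before the first header are skipped.
    out != "" { print > out }
' combined.txt
```

This writes QMAX123 and QMAX124 into the current directory, each starting with its own header line. Closing the previous file on every header keeps the number of open files at one, which matters when a large input contains many sections.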