Split concatenated files based on header text

Posted on 2024-10-07 01:09:24

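I have a few very large files, each of which is basically a concatenation of several small files, and I need to split them back into their constituent files. Each output file also needs to have the same name as the original file.

For example, the files QMAX123 and QMAX124 have been concatenated to:

    ;QMAX123 - Student

    ... file content ...

    ;QMAX124 - Course

    ... file content ...

I need to recreate the file QMAX123 as

    ;QMAX123 - Student

    ... file content ...

and QMAX124 as

    ;QMAX124 - Course

    ... file content ...

Each original file's header, ;QMAX<some number>, is unique and only appears as a header in the file.

I used the script below to split the content of the files, but I haven't been able to adapt it to get the file names right.

    awk '/^;QMAX/{close("file"f);f++}{print $0 > "file"f}' <filename>

So I can either adapt that script to name the files correctly, or rename the split files it creates based on their content, whichever is easier.

I'm currently using cygwin bash (which has perl and awk), if that has any bearing on your answer.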

揽月 2024-10-14 01:09:24

The following Perl should do the trick

    use warnings;
    use strict;

    my $F;    # current output filehandle
    while (<>) {
      if (/ ^ ; (\S+) /x) {
        my $filename = $1;    # header text after ';', e.g. QMAX123
        close $F if defined $F;
        open $F, '>', $filename or die "can't open $filename: $!";
      }
      next unless defined $F;    # skip input before the first header
      print $F $_ or warn "can't write: $!";    # header line is written too
    }

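Note that it discards any input that comes before the first header line; that is what `next unless defined $F;` does. You may prefer to raise an error or write those lines to a default file instead. Let me know and I can change it.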

疯了 2024-10-14 01:09:24

With Awk, it's as simple as

    awk '/^;QMAX/ {filename = substr($1,2)} {print >> filename}' input_file
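
A few caveats with the one-liner above: it never closes the files it opens (some awk implementations limit how many can be open at once), `>>` appends to any output left over from an earlier run, and a line before the first header would leave `filename` empty. Here is a minimal sketch of a more defensive variant, assuming the headers look exactly like the ones in the question. The function name `split_by_header` and the variable `out` are made up for illustration and are not part of the original answer.

```shell
# split_by_header FILE
# Write each ";QMAX..." section of FILE to a file named after its header,
# e.g. ";QMAX123 - Student" goes to a file called QMAX123.
# (Hypothetical helper name, for illustration only.)
split_by_header() {
  awk '
    /^;QMAX/ {
      if (out != "") close(out)   # release the previous output file
      out = substr($1, 2)         # first field without the leading ";"
    }
    out != "" { print > out }     # lines before the first header are skipped
  ' "$1"
}
```

Call it as `split_by_header input_file` from the directory where the split files should be created. Because each header is unique, every output file is opened only once, so `>` simply replaces stale output from a previous run.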

About the author: 时光磨忆
