Bash code to split every 4 lines, then merge

Posted on 2024-12-17 06:51:57

My title may not fully explain what I'm after.
I have a list of data like this:

@HWI-ST150_0129:3:8:21208:93107#0/1
TGTCTAGTTTTTATAGGAAGATATTTCCTTTTCTACCTTTGACTTCAAAGCGGCTGAAATCTCCACTTGCAAATTCCACAAAAAGAGTGTTACAAGTCT
+
Yeeeeeeeeeceed]dddddd^YdceeeedaeeddYccccc\ddceeYeYY`[`bcYc^_XY^_]d^dd`abdddee\e\ddLb]`_`cTbbbYbaM_]
@HWI-ST150_0129:3:8:21208:93107#0/2
TTTGTAAAGTCTGCACGTGGATAACTTGACCACTTAGAGGCCTTCGTTGGAAACGGGTTTTTTTCATGTAAGGCTAGACAGAAGAATTCTCAGTAACTTCAAGTTACTGAGAATTCTTCTGTCTAGCCTTACATGAAAAAAACCCGTTTCCAACGAAGGCCTCTAAGTGGTCAAGTTATCCACGTGCAGACTTTACAAA
+
ffcaefffcdeeeeeeeeeedff^f`\\eeedaec^d^d`deaffeeTecb^bbbddadYcccW[X\MZ\XaU_UTI\]TZ]K[VQX^aIb`b`^X^YSYHWI-ST150_0129:3:8:21208:93107#0

The first and fifth lines are both header/name lines, ending in either #0/1 or #0/2. I'd like to treat every 4 lines as one record, then collect all records whose header ends in #0/1 into one file and all records whose header ends in #0/2 into another.

One output file should look like:

@HWI....#0/1
TTCCGC
+
cffccc
@HWI....#0/1
CCGGGG
+
abbcgg
....

and the other file like:
@HWI....#0/2
ATTCCG
+
fccfcc
@HWI....#0/2
CGCCGG
+
gbbcaa

I know how to do this with a simple Python script, but I'm wondering whether it can be done with some fairly simple bash code.
Thanks

Comments (2)

猫七 2024-12-24 06:51:57

sed -n '1,${p;n;n;n;}' should work for getting the first line of every 4-line group (lines 1, 5, 9, ...):

$ cat blah | sed -n '1,${p;n;n;n;}'
@HWI-ST150_0129:3:8:21208:93107#0/1
@HWI-ST150_0129:3:8:21208:93107#0/2

$ cat blah
@HWI-ST150_0129:3:8:21208:93107#0/1
TGTCTAGTTTTTATAGGAAGATATTTCCTTTTCTACCTTTGACTTCAAAGCGGCTGAAATCTCCACTTGCAAATTCCACAAAAAGAGTGTTACAAGTCT
+
Yeeeeeeeeeceed]dddddd^YdceeeedaeeddYccccc\ddceeYeYY`[`bcYc^_XY^_]d^dd`abdddee\e\ddLb]`_`cTbbbYbaM_]
@HWI-ST150_0129:3:8:21208:93107#0/2
TTTGTAAAGTCTGCACGTGGATAACTTGACCACTTAGAGGCCTTCGTTGGAAACGGGTTTTTTTCATGTAAGGCTAGACAGAAGAATTCTCAGTAACTTCAAGTTACTGAGAATTCTTCTGTCTAGCCTTACATGAAAAAAACCCGTTTCCAACGAAGGCCTCTAAGTGGTCAAGTTATCCACGTGCAGACTTTACAAA
+
ffcaefffcdeeeeeeeeeedff^f`\\eeedaec^d^d`deaffeeTecb^bbbddadYcccW[X\MZ\XaU_UTI\]TZ]K[VQX^aIb`b`^X^YSYHWI-ST150_0129:3:8:21208:93107#0

Useful One-Line Scripts For sed
man sed
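
The sed command above only prints the header lines. As a minimal sketch of the full split the question asks for, the same "every fourth line is a header" idea can be written in awk: on each header line, pick an output file from the header's suffix, then write every line to the file currently selected. The file names reads.fastq, reads_1.fastq, reads_2.fastq and reads_other.fastq, and the shortened sample data, are assumptions for illustration only; the sketch also assumes the input really is made of complete 4-line records.

```shell
# Build a tiny sample input (hypothetical data: two 4-line records).
cat > reads.fastq <<'EOF'
@HWI-ST150_0129:3:8:21208:93107#0/1
TTCCGC
+
cffccc
@HWI-ST150_0129:3:8:21208:93107#0/2
ATTCCG
+
fccfcc
EOF

# On each header line (lines 1, 5, 9, ...), choose the output file from the
# header's suffix. Every line, header included, then goes to that file.
awk '
  NR % 4 == 1 {
    if      ($0 ~ /#0\/1$/) out = "reads_1.fastq"
    else if ($0 ~ /#0\/2$/) out = "reads_2.fastq"
    else                    out = "reads_other.fastq"
  }
  { print > out }
' reads.fastq
```

Checking the suffix only on lines where NR % 4 == 1 matters here: a quality line may contain almost any character, so testing every line against the pattern could misclassify one.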

倒带 2024-12-24 06:51:57

I'm not sure I understand you; however, getting every 4th line (lines 1, 5, 9, ...) is trivial with GNU sed:

sed '1~4!d' file
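
The first~step address form (here 1~4) is a GNU sed extension, so it may not be available in other sed implementations. A portable sketch of the same selection, assuming awk is available, keeps the lines whose number leaves remainder 1 when divided by 4. The file names sample.txt and headers.txt and the placeholder lines are made up for the demonstration.

```shell
# Sample input: eight placeholder lines standing in for two 4-line records.
printf '%s\n' h1 s1 + q1 h2 s2 + q2 > sample.txt

# GNU sed only: delete every line that is not at position 1, 5, 9, ...
#   sed '1~4!d' sample.txt

# Portable equivalent: print lines whose line number is 1 modulo 4.
awk 'NR % 4 == 1' sample.txt > headers.txt
cat headers.txt
```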

To group four lines, by which I presume you mean reduce 4 lines to one:

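sed '/#0\/[12]$/{N;N;N;s/\n/ /g}' file

This uses the regex you mentioned above, i.e. it matches a line ending in #0/1 or #0/2, appends the next three lines to it, and replaces the embedded newlines with spaces so that each record ends up on one line.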
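
Once each record sits on a single line, ordinary line tools can do the split. Here is a rough sketch of that round trip using paste instead of sed: join every four lines with tabs, keep the records whose header field ends in #0/1 or #0/2, then turn the tabs back into newlines. It assumes the data contains no tab characters and consists of complete 4-line records; the file names and shortened sample data are hypothetical.

```shell
# Tiny sample input (hypothetical data: two 4-line records).
cat > reads.fastq <<'EOF'
@HWI-ST150_0129:3:8:21208:93107#0/1
TTCCGC
+
cffccc
@HWI-ST150_0129:3:8:21208:93107#0/2
ATTCCG
+
fccfcc
EOF

# A literal tab character, used inside the grep patterns below.
TAB=$(printf '\t')

# paste - - - - reads four input lines per output line, joined by tabs.
# grep keeps records whose first field (the header) ends in the wanted suffix.
# tr restores the original one-field-per-line layout.
paste - - - - < reads.fastq | grep "^[^${TAB}]*#0/1${TAB}" | tr '\t' '\n' > reads_1.fastq
paste - - - - < reads.fastq | grep "^[^${TAB}]*#0/2${TAB}" | tr '\t' '\n' > reads_2.fastq
```

Anchoring the pattern to the first tab-separated field keeps a quality string that happens to contain #0/1 from selecting the wrong record.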
