How do I replace lines starting with > with their 15th column using awk?

Posted 2025-01-05 09:24:19 · 1766 characters · 0 views · 0 comments

I have a file that looks like this:

 >gi|358482566|ref|NW_003766328.1| Gallus gallus breed Red Jungle fowl, inbred line UCD001 unplaced genomic scaffold, Gallus_gallus-4.0 ChrUn_7180000961607, whole genome shotgun sequence
 TCTGTCTCTTGTCACTGTATTGTAGTGTGAACCCCTTAAAGGGAAGACCTGCTCTCCTTTGAAAATGCTT
 GCTCATCTATATGCCTCATGCATACCCTCACTGGCAAAGGAGAGCTGAAGTAATTTTAGGACAGAGGAGT
 ACTAGATTGTA
 >gi|358482565|ref|NW_003766329.1| Gallus gallus breed Red Jungle fowl, inbred line UCD001 unplaced genomic scaffold, Gallus_gallus-4.0 ChrUn_7180000961609, whole genome shotgun sequence
 TTTGACCAATGCATTTCAGCATGTTTTTTGACACTAGGTATGCCATTTGGGATGACAATATCAGTTTCCA
 TTTCCATTAGAGGAAAATAAGGTT 

I want to replace every line that starts with > with its 15th column. I don't know how to replace a whole line with one of its columns, so I tried replacing every column of that line with the 15th column.

The output I expect is:

     >ChrUn_7180000961607
     TCTGTCTCTTGTCACTGTATTGTAGTGTGAACCCCTTAAAGGGAAGACCTGCTCTCCTTTGAAAATGCTT
     GCTCATCTATATGCCTCATGCATACCCTCACTGGCAAAGGAGAGCTGAAGTAATTTTAGGACAGAGGAGT
     ACTAGATTGTA
     >ChrUn_7180000961609
     TTTGACCAATGCATTTCAGCATGTTTTTTGACACTAGGTATGCCATTTGGGATGACAATATCAGTTTCCA
     TTTCCATTAGAGGAAAATAAGGTT 

These are my commands:

 awk '{if ($1 ~ />/) for (i=1; i<=19; i++) gsub ($i, $15)}' test.fa

When I use this, I get some changes in the file, but not what I want: column 15 is removed.

 awk '{if ($1 ~ />/) for (i=1; i<=19; i++) a= $15 gsub($i, a)}' gga_ref_Gallus_gallus-4.0_unplaced.fa

And when I use this one, I get this error:

awk: (FILENAME=gga_ref_Gallus_gallus-4.0_unplaced.fa FNR=1) fatal: sub_common: buf: can't allocate 521711124992 bytes of memory (Cannot allocate memory)

So what I want is to replace all the lines that start with > with the 15th column, while still keeping the > at the beginning.

Comments (2)

黑色毁心梦 2025-01-12 09:24:19

I think this will do what you want:

awk '$0 ~ /^>/ { print ">" $15; next } 1'

It leaves all lines that do not start with > unchanged. This is accomplished by using next to tell awk to skip to the next record for the case of lines starting with >. The 1 is there because it is always true, so the default action of printing the line is invoked for any line that does not start with >.
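One detail worth checking against the sample data: in the headers shown in the question, field 15 is `ChrUn_7180000961607,` with a trailing comma, while the expected output has none. A minimal sketch of a variant that strips that comma follows. The function name `rename_headers` is hypothetical, and the sketch assumes the ID is always the 15th whitespace-separated field, as in the sample.

```shell
# Hypothetical helper: rewrite FASTA header lines to ">" plus field 15.
# Assumes the ID is always the 15th whitespace-separated field.
rename_headers() {
  awk '
    /^>/ {
      id = $15              # e.g. "ChrUn_7180000961607,"
      sub(/,$/, "", id)     # drop a trailing comma, if there is one
      print ">" id
      next                  # skip the default print for header lines
    }
    { print }               # sequence lines pass through unchanged
  ' "$@"
}
```

It could be called as `rename_headers test.fa > renamed.fa`, writing the result to a new file rather than editing the input in place.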

メ斷腸人バ 2025-01-12 09:24:19

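This might work for you (GNU sed, since `\s` and `\S` are GNU extensions). The `>` is written back in the replacement because it is matched outside the capture groups and would otherwise be dropped:

 sed 's/^\(\s*\)>\(\S*\s*\)\{15\}.*/\1>\2/;s/,\s*$//' file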
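Because `\s` and `\S` are GNU extensions, the same idea can be sketched with POSIX character classes for other sed implementations. This is an illustrative variant, not the answerer's exact command. The function name `rename_headers_sed` is hypothetical, and the replacement writes the `>` back explicitly, since the question asks to keep it.

```shell
# Hypothetical helper: same approach as the sed one-liner, in POSIX BRE syntax.
# \1 = leading whitespace, \2 = the 15th field plus its trailing whitespace.
rename_headers_sed() {
  sed -e 's/^\([[:space:]]*\)>\([^[:space:]]*[[:space:]]*\)\{15\}.*/\1>\2/' \
      -e 's/,[[:space:]]*$//' "$@"
}
```

The second expression removes a trailing comma and any whitespace after it. It runs on every line, which is harmless here because sequence lines contain no commas.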