What do $f, tr, \t and \n mean when linearizing FASTA using awk?
I am trying to linearize a FASTA file using awk. I am totally new to it. I have this script:

awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < $f | tr "\t" "\n" > ${f/.fasta/_lin.fasta}

I don't understand anything in the < $f | tr "\t" "\n" > ${f/.fasta/_lin.fasta} part. What is $f? What are tr, \t and \n? Where exactly am I supposed to give the input file? Can someone please elaborate?
Let's step through that code piece by piece. First, I'll add some white space to make it more legible:
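A whitespace-expanded layout of the same command might look like the following. This is a sketch that keeps the behaviour of the one-liner from the question; the file name myfile.fasta and its two sample records are made up so the example can be run, and the ${f/.../...} substitution assumes bash.

```shell
#!/usr/bin/env bash
# Hypothetical sample input, created only so the example is runnable.
f=myfile.fasta
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\n' > "$f"

awk '
  /^>/ {
    # Header line: newline before every header except the first,
    # then the header itself followed by a tab.
    printf("%s%s\t", (N > 0 ? "\n" : ""), $0)
    N++
    next    # skip the remaining rules for this line
  }
  {
    # Sequence line: print without a trailing newline so lines are joined.
    printf("%s", $0)
  }
  END {
    printf("\n")    # terminate the last record
  }
' < "$f" | tr "\t" "\n" > "${f/.fasta/_lin.fasta}"

cat "${f/.fasta/_lin.fasta}"
```

With this sample, each record ends up as a header line followed by a single sequence line in myfile_lin.fasta.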
Okay. First, $f is your input file. The code's author expects its name to contain .fasta, presumably at the end, like myfile.fasta. The < operator is redundant in this particular case, since awk could open the file itself if given the name as an argument (unless the filename contains an equals sign, which awk may interpret as a variable assignment). It simply makes the shell feed the contents of that file to awk on standard input.

AWK then comes in and matches lines that start with >. On those lines, it prints a newline (if N > 0) or else nothing, followed by the contents of the line and a tab. It then increments N and, because of next, skips the remaining commands for that line. Other lines are printed as they're seen, without a trailing newline, so consecutive sequence lines are joined. After reading all of the lines of $f, a final newline is printed.

This awk code is not very legible. It could be rewritten so that N++ itself serves as part of the condition. The only tricky piece there is that N is initially zero, so when you say N++ the first time, it returns the value before incrementing (zero = false) and therefore that condition does not trigger. When you say it the second time, it returns the value before the next increment (one = true) and therefore that condition triggers. Anything that is not an empty string or a zero evaluates as true.

On one line, and more golfed, that could be
awk '/^>/&&N++{printf"\n"}1;END{printf"\n"}'
(1; triggers the default action, which is to print the line.) Note that the default action prints each line with its own newline, so this short form illustrates the N++ idiom but does not by itself join sequence lines the way the printf-based original does.

After awk, the output is passed to tr to translate all tabs (\t) into newlines (\n). The result is then redirected with the > operator into a file named by the shell substitution ${f/.fasta/_lin.fasta}, which replaces the first instance of .fasta in $f with _lin.fasta, so our example input file myfile.fasta is transformed to output file myfile_lin.fasta.
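To see where the input file goes and what the output name becomes, the shell parts can be tried on their own. A minimal sketch, assuming bash (the ${var/pattern/replacement} form is not part of plain POSIX sh) and a hypothetical file name myfile.fasta:

```shell
#!/usr/bin/env bash
# The input file is supplied by setting f before running the pipeline.
f=myfile.fasta                 # hypothetical name, for illustration only

# Replace the first ".fasta" in $f with "_lin.fasta".
out=${f/.fasta/_lin.fasta}
echo "$out"

# "< file" makes the shell connect the file to standard input.
# Here both forms hand the same data to awk.
printf '>seq1\nACGT\n' > "$f"
from_stdin=$(awk '{ print }' < "$f")
from_arg=$(awk '{ print }' "$f")
```

In other words, running f=myfile.fasta before the command from the question is what "gives" it the input file, and the output name is derived from that same variable.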
You should probably get and understand the user manual for a tool before attempting to use it, unless you have already decided that any resulting damage is acceptable.

This is the tr command with two arguments. Most Linux commands come with a manual, which you can access with the man command (for example, man tr). Popular ones also have online versions, such as the tr manpage, where the meaning of \t and \n can be found.
I'm guessing the OP is trying to do something like this, combining both the awk and tr commands:
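One way to fold the tr step into awk, offered as a sketch and not necessarily the command this answer originally showed: print a newline after each header instead of a tab, so nothing needs translating afterwards. The file name sample.fasta and its records are hypothetical, and the output name again assumes bash.

```shell
#!/usr/bin/env bash
# Hypothetical sample input, created only so the example is runnable.
f=sample.fasta
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\n' > "$f"

awk '
  # Header: N++ is 0 (false) the first time, so no leading newline then.
  /^>/ { printf("%s%s\n", (N++ ? "\n" : ""), $0); next }
  # Sequence line: no trailing newline, so lines of one record are joined.
       { printf("%s", $0) }
  END  { printf("\n") }
' "$f" > "${f/.fasta/_lin.fasta}"

cat "${f/.fasta/_lin.fasta}"
```

Here the file name is passed to awk as an argument, so the < redirection is not needed either.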