Using the alternation operator "|" with grep
The following is a sample of a large file named AT5G60410.gff:
Chr5 TAIR10 gene 24294890 24301147 . + . ID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410
Chr5 TAIR10 mRNA 24294890 24301147 . + . ID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1
Chr5 TAIR10 protein 24295226 24300671 . + . ID=AT5G60410.1-Protein;Name=AT5G60410.1;Derives_from=AT5G60410.1
Chr5 TAIR10 exon 24294890 24295035 . + . Parent=AT5G60410.1
Chr5 TAIR10 five_prime_UTR 24294890 24295035 . + . Parent=AT5G60410.1
Chr5 TAIR10 exon 24295134 24295249 . + . Parent=AT5G60410.1
Chr5 TAIR10 five_prime_UTR 24295134 24295225 . + . Parent=AT5G60410.1
Chr5 TAIR10 CDS 24295226 24295249 . + 0 Parent=AT5G60410.1,AT5G60410.1-Protein;
Chr5 TAIR10 exon 24295518 24295598 . + . Parent=AT5G60410.1
I am having some trouble extracting specific lines from this using grep. I wanted to extract all lines whose type, given in the third column, is "gene" or "exon". I was surprised when this did not work:
grep 'gene|exon' AT5G60410.gff
No results are returned. Where have I gone wrong?
Comments (5)
You need to escape the | character. In grep's default (basic) regular expression syntax an unescaped | is an ordinary character, while GNU grep treats \| as the alternation operator.
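A minimal sketch of the escaped form, assuming GNU grep (\| as alternation in basic regular expressions is a GNU extension). The three-line sample.gff is a hypothetical stand-in for the real AT5G60410.gff.

```shell
# Hypothetical three-line stand-in for AT5G60410.gff.
cat > sample.gff <<'EOF'
Chr5 TAIR10 gene 24294890 24301147
Chr5 TAIR10 mRNA 24294890 24301147
Chr5 TAIR10 exon 24294890 24295035
EOF

# Unescaped | is a literal character here, so grep looks for the text "gene|exon".
grep 'gene|exon' sample.gff || echo "no match"

# Escaped \| means "gene OR exon": prints the gene line and the exon line.
grep 'gene\|exon' sample.gff
```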
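By default, grep uses basic regular expressions, in which some of the typical special characters (including |, +, ?, parentheses and braces) are treated as normal characters unless they are escaped. So you could escape the | in your pattern. However, you can also change grep's mode so that an unescaped | is the alternation operator, which does what you were expecting.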
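One way to change the mode is the -E option, which selects extended regular expressions; the older egrep command does the same thing but is marked obsolescent in recent GNU grep releases. The sketch below uses a hypothetical sample.gff rather than the real file.

```shell
# Hypothetical three-line stand-in for AT5G60410.gff.
cat > sample.gff <<'EOF'
Chr5 TAIR10 gene 24294890 24301147
Chr5 TAIR10 mRNA 24294890 24301147
Chr5 TAIR10 exon 24294890 24295035
EOF

# With -E, a bare | is alternation, so the original pattern works unchanged.
grep -E 'gene|exon' sample.gff
```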
This is a different way of grepping for a few choices: the -e switch specifies different patterns to match.
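As a sketch of the -e form, again with a hypothetical sample.gff: each -e adds one pattern, and a line is printed if it matches any of them, so no alternation operator is needed.

```shell
# Hypothetical three-line stand-in for AT5G60410.gff.
cat > sample.gff <<'EOF'
Chr5 TAIR10 gene 24294890 24301147
Chr5 TAIR10 mRNA 24294890 24301147
Chr5 TAIR10 exon 24294890 24295035
EOF

# Two separate patterns; lines matching either one are printed.
grep -e gene -e exon sample.gff
```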
This will work:
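The command below is not necessarily the one this answer had in mind. It is a sketch of a different technique, awk instead of grep, that tests only the third column, which is what the question asks for. A pattern such as 'gene|exon' also matches text elsewhere on the line, for example protein_coding_gene in the attributes. The sample.gff file and the Note attribute on its mRNA line are hypothetical.

```shell
# Hypothetical sample; the mRNA line carries "gene" only inside its attributes.
cat > sample.gff <<'EOF'
Chr5 TAIR10 gene 24294890 24301147 . + . ID=AT5G60410;Note=protein_coding_gene
Chr5 TAIR10 mRNA 24294890 24301147 . + . ID=AT5G60410.1;Note=protein_coding_gene
Chr5 TAIR10 exon 24294890 24295035 . + . Parent=AT5G60410.1
EOF

# Matches anywhere on the line, so the mRNA line is printed too (3 lines).
grep -E 'gene|exon' sample.gff

# Compares field 3 exactly, so only the gene and exon lines are printed (2 lines).
awk '$3 == "gene" || $3 == "exon"' sample.gff
```

awk splits fields on runs of spaces or tabs by default, so the same command applies to a tab-separated GFF file.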
I found this question while googling for a particular problem I was having with a command piped into a grep command that used the alternation operator in a regex, so I thought I would contribute my more specialized answer.
The error I faced turned out to involve the pipe operator earlier in the command line (i.e. |), not the alternation operator in the grep regex (also written |, identical to the pipe operator). The answer for me was to properly escape or quote special shell characters such as & before assuming the issue was with the grep regex that involved the alternation operator.
For example, a command I executed on my local machine resulted in an error, which I corrected by enclosing the & character in double quotes. The answer had nothing to do with the alternation operator at all.