Using the "|" alternation operator with grep

Posted on 2024-11-25 17:05:27 · 1079 words · 9 views · 0 comments

The following is a sample of a large file named AT5G60410.gff:

Chr5    TAIR10  gene    24294890    24301147    .   +   .   ID=AT5G60410;Note=protein_coding_gene;Name=AT5G60410
Chr5    TAIR10  mRNA    24294890    24301147    .   +   .   ID=AT5G60410.1;Parent=AT5G60410;Name=AT5G60410.1;Index=1
Chr5    TAIR10  protein 24295226    24300671    .   +   .   ID=AT5G60410.1-Protein;Name=AT5G60410.1;Derives_from=AT5G60410.1
Chr5    TAIR10  exon    24294890    24295035    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  five_prime_UTR  24294890    24295035    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  exon    24295134    24295249    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  five_prime_UTR  24295134    24295225    .   +   .   Parent=AT5G60410.1
Chr5    TAIR10  CDS 24295226    24295249    .   +   0   Parent=AT5G60410.1,AT5G60410.1-Protein;
Chr5    TAIR10  exon    24295518    24295598    .   +   .   Parent=AT5G60410.1

I am having some trouble extracting specific lines from this file using grep. I want to extract all lines whose type, given in the third column, is "gene" or "exon". I was surprised when this did not work:

grep 'gene|exon' AT5G60410.gff

No results are returned. Where have I gone wrong?

Comments (5)

缘字诀 2024-12-02 17:05:27

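You need to escape the |. In grep's default basic regular expression mode a bare | is an ordinary character, while GNU grep treats \| as the alternation operator. The following should do the job.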

grep "gene\|exon" AT5G60410.gff
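A minimal sketch of why the original command matched nothing, assuming GNU grep (the \| form of alternation in basic regular expressions is a GNU extension, so other grep implementations may behave differently). The file name sample.gff and its three shortened lines are made up for illustration and are not the full data from the question.

```shell
# Build a tiny tab-separated sample (hypothetical file, shortened lines).
printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\n'  > sample.gff
printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\n' >> sample.gff
printf 'Chr5\tTAIR10\texon\t24294890\t24295035\n' >> sample.gff

# Basic regex mode: a bare | is a literal character, so this searches for the
# literal text "gene|exon". grep exits non-zero on no match, hence "|| true".
literal_hits=$(grep -c 'gene|exon' sample.gff || true)

# Escaped \| is alternation in GNU grep's basic regex mode.
alt_hits=$(grep -c 'gene\|exon' sample.gff)

echo "literal: $literal_hits, alternation: $alt_hits"
```

The -c option only counts matching lines; drop it to print the lines themselves.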

独留℉清风醉 2024-12-02 17:05:27

By default, grep uses basic regular expressions, in which characters such as |, + and ? are ordinary characters unless they are escaped with a backslash (in GNU grep, \| then means alternation). So you could use the following:

grep 'gene\|exon' AT5G60410.gff

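However, you can switch grep to extended regular expressions, where an unescaped | is the alternation operator, with either of the following forms. Recent GNU grep releases mark egrep as obsolescent, so grep -E is the safer choice: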

egrep 'gene|exon' AT5G60410.gff
grep -E 'gene|exon' AT5G60410.gff
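Since the question is about the feature type column, here is a hedged sketch of how extended mode also allows grouping, so the alternation can be limited to a whole tab-delimited field instead of any substring. It assumes the real file is tab-separated, as GFF files normally are. The file sample_ere.gff and its attribute values are invented for illustration.

```shell
# Hypothetical sample: the mRNA line mentions "gene" only in its last column.
printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\tID=AT5G60410\n'  > sample_ere.gff
printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\tNote=protein_coding_gene\n' >> sample_ere.gff
printf 'Chr5\tTAIR10\texon\t24294890\t24295035\tParent=AT5G60410.1\n' >> sample_ere.gff

# Extended regex: a bare | is alternation. This matches the substring anywhere
# on the line, so the mRNA line is counted as well.
loose_hits=$(grep -E -c 'gene|exon' sample_ere.gff)

# Parentheses group the alternation; requiring a tab on both sides restricts
# the match to a complete field.
tab=$(printf '\t')
field_hits=$(grep -E -c "${tab}(gene|exon)${tab}" sample_ere.gff)

echo "substring: $loose_hits, whole field: $field_hits"
```

Storing the tab in a variable keeps the command portable across POSIX shells; the pattern still matches any field equal to gene or exon, not only the third one.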

羁客 2024-12-02 17:05:27

This is a different way of grepping for a few choices:

grep -e gene -e exon AT5G60410.gff

Each -e switch specifies one pattern; a line is printed if it matches any of them.
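As a quick check, the sketch below compares the multiple -e form with the extended-regex alternation on a small made-up file (sample_e.gff is hypothetical). Both should select the same lines.

```shell
# Hypothetical three-line sample with only the first three columns.
printf 'Chr5\tTAIR10\tgene\n'  > sample_e.gff
printf 'Chr5\tTAIR10\tmRNA\n' >> sample_e.gff
printf 'Chr5\tTAIR10\texon\n' >> sample_e.gff

# Two separate patterns, no regex alternation involved.
multi_e=$(grep -e gene -e exon sample_e.gff)

# One extended-regex pattern using alternation.
ere=$(grep -E 'gene|exon' sample_e.gff)

if [ "$multi_e" = "$ere" ]; then
    echo "same lines selected"
fi
```

The -e form avoids quoting and escaping questions around | entirely, which can be convenient when the patterns come from shell variables.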

舞袖。长 2024-12-02 17:05:27

This will work:

grep "gene\|exon" AT5G60410.gff
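All of the grep commands in this thread match "gene" or "exon" anywhere on the line. If the match must be limited to the third column, as the question describes, one alternative is to swap grep for awk. This is a sketch under the assumption that the file is tab-separated; sample_awk.gff and its attribute values are invented for illustration.

```shell
# Hypothetical sample: the mRNA line contains "gene" only in its last column.
printf 'Chr5\tTAIR10\tgene\t24294890\t24301147\tNote=protein_coding_gene\n'  > sample_awk.gff
printf 'Chr5\tTAIR10\tmRNA\t24294890\t24301147\tNote=next_to_gene\n' >> sample_awk.gff
printf 'Chr5\tTAIR10\texon\t24294890\t24295035\tParent=AT5G60410.1\n' >> sample_awk.gff

# Count the lines whose third tab-separated field is exactly "gene" or "exon".
col3_hits=$(awk -F '\t' '$3 == "gene" || $3 == "exon" { n++ } END { print n + 0 }' sample_awk.gff)

echo "lines with gene or exon in column 3: $col3_hits"
```

To print the matching lines instead of counting them, the awk program can be reduced to the condition alone: awk -F '\t' '$3 == "gene" || $3 == "exon"' sample_awk.gff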

假扮的天使 2024-12-02 17:05:27

I found this question while googling for a particular problem I was having involving a piped command to a grep command that used the alternation operator in a regex, so I thought that I would contribute my more specialized answer.

The error I faced turned out to come from the shell's pipe operator (|) earlier in the command line, not from the alternation operator (written with the same | character) in the grep regex. The answer for me was to properly escape or quote special shell characters such as & before assuming the issue was with the alternation in my grep regex.

For example, the command I executed on my local machine was:

get http://localhost/foobar-& | grep "fizz\|buzz"

This command resulted in the following error:

-bash: syntax error near unexpected token `|'

This error was corrected by changing my command to:

get "http://localhost/foobar-&" | grep "fizz\|buzz"

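By quoting the & character with double quotes I was able to resolve my issue: unquoted, & ends the command and runs it in the background, which leaves | with no command on its left. The answer had nothing to do with the alternation operator at all.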
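The difference can be reproduced without a web server. In this sketch echo stands in for the get command (an assumption made only so the example is self-contained), and the unquoted variant is run through bash -c so that its parse error can be captured instead of aborting the script. The exact wording of the error message depends on the bash version.

```shell
# Unquoted &: bash ends the first command at & (background operator), so the
# following | has no command on its left and parsing fails before anything runs.
unquoted_msg=$(bash -c 'echo http://localhost/foobar-& | grep "foobar"' 2>&1 || true)

# Quoted &: the whole URL is one ordinary argument and the pipeline runs.
quoted_out=$(echo "http://localhost/foobar-&" | grep "foobar")

echo "$unquoted_msg"
echo "$quoted_out"
```

Single quotes around the URL would work as well, and they also protect characters such as $ that double quotes still expand.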
