Regex capture group works in regex101 but not in sed

Posted 2025-01-10 10:36:10

I found some other similarly titled questions but didn't find the answer.

My text is:

##bcftools_mergeCommand=merge --force-samples -m none -O v -o analysis/STUDY1/hg19/exome/merged.vcf --threads 4 analysis/STUDY1/hg19/exome/varscan_norm.vcf.gz analysis/STUDY1/hg19/exome/gatk_norm.vcf.gz analysis/STUDY1/hg19/exome/samtools_norm.vcf.gz analysis/STUDY1/hg19/exome/freebayes_norm.vcf.gz

I want the names of the .vcf.gz files.

I run this sed command:

echo "##bcftools_mergeCommand=merge --force-samples -m none -O v -o analysis/STUDY1/hg19/exome/merged.vcf --threads 4 analysis/STUDY1/hg19/exome/varscan_norm.vcf.gz analysis/STUDY1/hg19/exome/gatk_norm.vcf.gz analysis/STUDY1/hg19/exome/samtools_norm.vcf.gz analysis/STUDY1/hg19/exome/freebayes_norm.vcf.gz" | sed -En 's/\/([^\/]+\.vcf\.gz)/\1/g'

but the output is blank.

Regex101 gives:

https://regex101.com/r/h3OGvN/1

Comments (3)

猫腻 2025-01-17 10:36:11

This might work for you (GNU sed):

sed -E 'y/ \//\n\n/;/\`\S+\.vcf\.gz$/MP;D' file

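The y command translates every space and / to a newline, so each path component ends up on its own line in the pattern space. The P;D idiom then walks through those lines: P prints the first line when it ends in .vcf.gz, and D deletes that first line and restarts the cycle on what remains.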

你列表最软的妹 2025-01-17 10:36:10

Why not use grep?

$ data='##bcftools_mergeCommand=merge --force-samples -m none -O v -o analysis/STUDY1/hg19/exome/merged.vcf --threads 4 analysis/STUDY1/hg19/exome/varscan_norm.vcf.gz analysis/STUDY1/hg19/exome/gatk_norm.vcf.gz analysis/STUDY1/hg19/exome/samtools_norm.vcf.gz analysis/STUDY1/hg19/exome/freebayes_norm.vcf.gz'
$ echo "$data" | grep -Eo '[^/]+\.vcf\.gz'
varscan_norm.vcf.gz
gatk_norm.vcf.gz
samtools_norm.vcf.gz
freebayes_norm.vcf.gz

  • -E: Interpret patterns as extended regular expressions.
  • -o: Print only the matched (non-empty) parts.

江南月 2025-01-17 10:36:10

The regex dialect supported by Regex101 is different from the one sed understands.

Concretely, the script prints nothing because -n suppresses sed's automatic output and the s command has no p flag. To fix this specific script, take out the superfluous g flag and add a p flag to print the matching lines. In the general case, don't rely on a tool that doesn't directly support the one you actually want to use.
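A minimal sketch of the -n / p interaction, using a shortened, made-up line in the same shape as the question's input. The last part is one possible way to get just the file names with sed -E; the helper name vcf_names is hypothetical, and it assumes the paths contain no spaces.

```shell
# Shortened, hypothetical sample line.
line='##cmd=merge -o out/merged.vcf --threads 4 out/a_norm.vcf.gz out/b_norm.vcf.gz'

# With -n and no p flag the substitution happens, but nothing is printed.
printf '%s\n' "$line" | sed -En 's,/([^/]+\.vcf\.gz),\1,g'

# With p the line is printed, but it is still the whole line: the command
# only removes the slash in front of the first matching file name.
printf '%s\n' "$line" | sed -En 's,/([^/]+\.vcf\.gz),\1,p'

# To keep only the names: put one token per line, then replace each whole
# token with its captured file name and print only the lines that matched.
vcf_names() {
  tr ' ' '\n' | sed -En 's,.*/([^/]+\.vcf\.gz)$,\1,p'
}
printf '%s\n' "$line" | vcf_names
```

A comma is used as the s delimiter here so the slashes in the pattern need no escaping.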
